What is your rollback strategy if an Istio upgrade degrades traffic?
Istio & Service Mesh · Advanced level
Answer
If an Istio upgrade degrades traffic, I halt the rollout, point the affected namespaces back at the previous revision or revision tag, and restart the affected workloads so they are re-injected with the old sidecar. If gateways are impacted, I roll back the gateway deployments or traffic routing first, because ingress traffic depends on them. I keep the old control plane and its manifests in place until the rollback window closes.
Technical explanation
The fastest safe rollback depends on where the fault lies: the sidecar data plane, the gateway data plane, the control plane, CRD/API behavior, or mesh config compatibility.
Revision-based upgrades make rollback targeted instead of cluster-wide: the old and new control planes run side by side, and each namespace selects one through its istio.io/rev label.
Before upgrading, I define objective rollback triggers such as a p99 latency threshold, 5xx burn rate, proxy crash loops, or mTLS failures.
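The triggers above can be encoded as a go/no-go check that runs during the upgrade window. This is a minimal sketch, not an Istio feature: the function name `should_rollback` and both thresholds are hypothetical, and the three inputs (p99 latency in ms, 5xx percentage, number of crash-looping proxies) are assumed to come from your own monitoring.

```shell
#!/usr/bin/env bash
# Hypothetical thresholds; replace with values derived from your SLOs.
P99_MAX_MS=500      # p99 latency ceiling in milliseconds
ERR_MAX_PCT=1.0     # 5xx responses as a percentage of requests

# should_rollback P99_MS ERR_PCT CRASHLOOPS
# Returns 0 (true) when any trigger fires, 1 otherwise.
# awk is used so that fractional percentages compare numerically.
should_rollback() {
  awk -v p99="$1" -v err="$2" -v crash="$3" \
      -v p99max="$P99_MAX_MS" -v errmax="$ERR_MAX_PCT" \
      'BEGIN { exit !(p99 > p99max || err > errmax || crash > 0) }'
}
```

For example, `should_rollback 820 0.3 0 && echo "start rollback"` fires on latency alone. Deciding the thresholds before the upgrade keeps the rollback call from turning into a debate during an incident.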
Hands-on example
Rollback runbook:
$ istioctl tag set stable --revision old --overwrite
$ kubectl label ns payments istio.io/rev=stable --overwrite
$ kubectl rollout restart deploy -n payments
$ istioctl proxy-status | grep payments
For a gateway issue:
$ kubectl rollout undo deploy/istio-ingressgateway -n istio-system
Verify SLO recovery before resuming the upgrade.
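To confirm that the restart actually moved every pod back, one option is to compare each pod's sidecar image tag against the previous Istio version. A minimal sketch: `stale_pods` is a hypothetical helper that reads `pod<TAB>image` lines on stdin, the version `1.21.0` is a placeholder, and the jsonpath query in the usage comment assumes the proxy runs as a regular container named `istio-proxy` (with native sidecars it would be listed under `initContainers` instead).

```shell
#!/usr/bin/env bash
# stale_pods EXPECTED_VERSION
# Reads "pod<TAB>image" lines on stdin and prints the pods whose image tag
# does not start with EXPECTED_VERSION (tags may carry a suffix such as
# "-distroless"). Pods with no sidecar image are skipped.
stale_pods() {
  awk -F'\t' -v want="$1" '
    $2 != "" {
      n = split($2, parts, ":")      # the tag is the text after the last colon
      if (index(parts[n], want) != 1) print $1
    }'
}

# Usage (assumes kubectl access to the cluster):
#   kubectl get pods -n payments -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[?(@.name=="istio-proxy")].image}{"\n"}{end}' \
#     | stale_pods 1.21.0
```

Empty output means no pod in the namespace is still running a proxy from the new version. Images pinned by digest rather than tag would need a different comparison.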