What is your rollback strategy if an Istio upgrade degrades traffic?

Istio & Service Mesh · Advanced level

Answer

If an Istio upgrade degrades traffic, I stop the rollout from expanding, point the affected namespaces back at the previous revision or revision tag, and restart their workloads so the sidecars are re-injected from that revision. If gateways are affected, I roll back the gateway deployments or traffic routing first. I keep the old control plane and its manifests installed until the rollback window closes.

Technical explanation

The fastest safe rollback depends on where the problem lies: the sidecar data plane, the gateway data plane, the control plane, CRD/API behavior, or mesh config compatibility.

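Revision-based upgrades make rollback targeted instead of cluster-wide, because each namespace selects its control plane through the istio.io/rev label and can be moved back on its own.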

Before upgrading, I define objective rollback triggers such as p99 latency, 5xx burn rate, proxy crash loops, or mTLS failures.
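The triggers above can be turned into a single pass/fail check. The following is a minimal sketch, not a definitive implementation: the function name should_rollback, the threshold values, and the choice of a plain 5xx error ratio in place of a full burn-rate calculation are all assumptions for illustration. The inputs are expected to come from your own monitoring system, which is not shown here. A count of mTLS failures could be added as a fourth input in the same way.

```shell
# Hypothetical thresholds (assumptions); tune them to your own SLOs.
P99_MS_MAX="${P99_MS_MAX:-500}"        # maximum acceptable p99 latency in ms
ERR_RATIO_MAX="${ERR_RATIO_MAX:-0.01}" # maximum acceptable 5xx ratio (1%)
CRASHLOOP_MAX="${CRASHLOOP_MAX:-0}"    # maximum tolerated crash-looping proxies

# should_rollback P99_MS ERR_RATIO CRASHLOOP_PODS
# Returns 0 (roll back) if any trigger is breached, 1 otherwise.
# awk does the comparison because the shell has no floating-point arithmetic.
should_rollback() {
  awk -v p99="$1" -v err="$2" -v crash="$3" \
      -v p99max="$P99_MS_MAX" -v errmax="$ERR_RATIO_MAX" -v crashmax="$CRASHLOOP_MAX" \
      'BEGIN {
         breached = (p99 + 0 > p99max + 0) || (err + 0 > errmax + 0) || (crash + 0 > crashmax + 0)
         exit !breached
       }'
}
```

A caller would pass the current measurements, for example: if should_rollback 900 0.002 0; then echo "rollback"; fi. Agreeing on the thresholds before the upgrade means the rollback decision does not have to be argued during the incident.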

Hands-on example

Rollback runbook (old stands for the name of the previous revision, payments for the affected namespace):

$ istioctl tag set stable --revision old --overwrite

$ kubectl label ns payments istio.io/rev=stable --overwrite

$ kubectl rollout restart deploy -n payments

$ istioctl proxy-status | grep payments

For a gateway issue:

$ kubectl rollout undo deploy/istio-ingressgateway -n istio-system

Verify SLO recovery before resuming the upgrade.
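The namespace steps of the runbook can be wrapped in a small script so they can be rehearsed before an incident. This is a sketch under stated assumptions: the helper names run and rollback_namespace and the DRY_RUN variable are hypothetical, and the script only issues the kubectl commands already shown in the runbook.

```shell
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
# (Hypothetical helper, used so the runbook can be rehearsed safely.)
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# rollback_namespace NAMESPACE TAG
# Points the namespace back at the given revision or revision tag, then
# restarts its deployments so new pods are injected from that revision.
rollback_namespace() {
  ns="$1"
  tag="$2"
  run kubectl label ns "$ns" "istio.io/rev=$tag" --overwrite
  run kubectl rollout restart deploy -n "$ns"
}
```

With DRY_RUN=1 set, calling rollback_namespace payments stable only prints the two kubectl commands, which makes it easy to review the exact steps in a change ticket before running them for real.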
