What is your rollback strategy if an Istio upgrade degrades traffic?

Accepted Answer

If an Istio upgrade degrades traffic, my rollback strategy is to stop the rollout so no further namespaces move to the new revision, point the affected namespaces back at the previous revision or revision tag, and restart the affected workloads so their sidecars are re-injected from the old control plane. If gateways are impacted, I roll back the gateway deployments or traffic routing first. I keep the old control plane and its manifests in place until the rollback window closes. The fastest safe rollback depends on where the problem sits: sidecar data plane, gateway data plane, control plane, CRD/API behavior, or mesh config compatibility. Revision-based upgrades make the rollback targeted instead of cluster-wide. Before upgrading, I define objective rollback triggers such as p99 latency, 5xx burn rate, proxy crash loops, or mTLS failures, so the decision to roll back does not depend on judgment calls during an incident.
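As a rough illustration of the trigger-then-rollback flow described above, here is a minimal Python sketch. It assumes the metrics have already been collected from somewhere (for example Prometheus) and only decides whether a trigger is breached and which commands to run. The threshold values, the `payments` namespace, and the `1-20-0` revision name are hypothetical placeholders, not recommendations. The generated commands use the `istio.io/rev` namespace label and `kubectl rollout restart`, which applies when namespaces are labeled with a revision directly.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical rollback triggers; the defaults are placeholders only."""
    p99_latency_ms: float = 500.0
    error_5xx_ratio: float = 0.01
    proxy_restarts: int = 3
    mtls_failures: int = 0


@dataclass(frozen=True)
class Observed:
    """Metrics sampled after the upgrade (how they are collected is up to you)."""
    p99_latency_ms: float
    error_5xx_ratio: float
    proxy_restarts: int
    mtls_failures: int


def breached_triggers(obs: Observed, limits: Thresholds) -> list[str]:
    """Return the name of every trigger whose observed value exceeds its limit."""
    checks = {
        "p99_latency_ms": obs.p99_latency_ms > limits.p99_latency_ms,
        "error_5xx_ratio": obs.error_5xx_ratio > limits.error_5xx_ratio,
        "proxy_restarts": obs.proxy_restarts > limits.proxy_restarts,
        "mtls_failures": obs.mtls_failures > limits.mtls_failures,
    }
    return [name for name, breached in checks.items() if breached]


def rollback_commands(namespaces: list[str], previous_revision: str) -> list[str]:
    """Build the commands that move namespaces back to the previous revision.

    Each namespace is relabeled first, then its deployments are restarted so
    that new pods are injected by the previous control plane.
    """
    commands = []
    for ns in namespaces:
        commands.append(
            f"kubectl label namespace {ns} istio.io/rev={previous_revision} --overwrite"
        )
        commands.append(f"kubectl rollout restart deployment -n {ns}")
    return commands


if __name__ == "__main__":
    # Example values are made up for illustration.
    obs = Observed(p99_latency_ms=820.0, error_5xx_ratio=0.004,
                   proxy_restarts=0, mtls_failures=2)
    breached = breached_triggers(obs, Thresholds())
    if breached:
        print("Rollback triggers breached:", ", ".join(breached))
        for cmd in rollback_commands(["payments"], "1-20-0"):
            print(cmd)
```

The sketch only prints the commands rather than executing them, so a human or a pipeline step can review them before anything changes in the cluster. If namespaces reference a revision tag instead of a revision, the equivalent step would be repointing the tag at the old revision (for example with `istioctl tag set`, syntax depending on the Istio version) before restarting workloads.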
