How do you debug high tail latency introduced after enabling the mesh?
Istio & Service Mesh · Advanced level
Answer
To debug high tail latency after enabling the mesh, I compare before/after latency at each hop: client, ingress gateway, source proxy, destination proxy, and application. I look for retries, connection-pool limits, mTLS CPU cost, DNS issues, telemetry overhead, EnvoyFilter cost, and saturation in the services being called (the "upstream" hosts in Envoy's terminology).
Technical explanation
Tail latency is often amplified by retries, queueing, or connection limits rather than average proxy overhead.
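To make the retry and queueing effect concrete, here is a toy model, not a description of how Envoy schedules requests. It assumes the backend behaves like an M/M/1 queue (whose response time is exponentially distributed with rate `service_rps - arrival_rps`) and that each attempt fails with a fixed, load-independent probability. The function names and all numbers are hypothetical.

```python
import math


def effective_rate(base_rps, failure_prob, max_retries):
    """Offered load when every failed attempt is retried, up to max_retries times."""
    # Expected attempts per request: 1 + p + p^2 + ... + p^max_retries.
    attempts_per_request = sum(failure_prob ** k for k in range(max_retries + 1))
    return base_rps * attempts_per_request


def mm1_p99_ms(arrival_rps, service_rps):
    """p99 response time of an M/M/1 queue in milliseconds (infinite if saturated)."""
    if arrival_rps >= service_rps:
        return math.inf
    return math.log(100) / (service_rps - arrival_rps) * 1000


if __name__ == "__main__":
    capacity_rps = 1000.0  # assumed backend capacity
    base_rps = 850.0       # assumed client traffic before retries
    for retries in (0, 2):
        load = effective_rate(base_rps, failure_prob=0.10, max_retries=retries)
        print(f"retries={retries} load={load:.1f} rps p99={mm1_p99_ms(load, capacity_rps):.1f} ms")
```

Under these assumptions, two retries on a 10% failure rate add only about 11% more load, yet the modeled p99 rises from roughly 31 ms to roughly 82 ms because the queue is already close to capacity. The average proxy overhead never enters the calculation.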
Separate application latency from proxy-added latency using access logs, traces, and metrics from both source and destination.
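One way to approximate proxy-added latency from access logs is sketched below. It assumes the sidecar logs are JSON-encoded and carry `duration` and `upstream_service_time` fields in milliseconds; verify the field names against your own mesh's access log format. The difference is only a rough upper bound on proxy overhead, since it also includes time spent streaming the response body. The helper names and sample lines are hypothetical.

```python
import json
import math


def proxy_overhead_ms(log_lines):
    """Return duration minus upstream_service_time (ms) for each parsable entry."""
    overheads = []
    for line in log_lines:
        try:
            entry = json.loads(line)
            total = float(entry["duration"])
            upstream = float(entry["upstream_service_time"])
        except (ValueError, KeyError, TypeError):
            # Skip non-JSON lines and entries with no upstream timing ("-" or null).
            continue
        overheads.append(total - upstream)
    return overheads


def nearest_rank(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical sample; in practice, feed in lines from
    # `kubectl logs deploy/frontend -n app -c istio-proxy`.
    sample = [
        '{"duration": 12, "upstream_service_time": "10"}',
        '{"duration": 30, "upstream_service_time": "21"}',
        "plain text line that is not JSON",
    ]
    print("p99 proxy-added ms:", nearest_rank(proxy_overhead_ms(sample), 99))
```

Running the same calculation on the source and destination sidecars for one request path shows which side contributes the extra time.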
Check resource throttling on istio-proxy; CPU limits can cause sharp p99 latency jumps.
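Throttling can be quantified from the cAdvisor counters `container_cpu_cfs_throttled_periods_total` and `container_cpu_cfs_periods_total` for the `istio-proxy` container. The sketch below assumes you have two scrapes of each counter; the function name and sample values are hypothetical, and what counts as "too much" throttling is left to you.

```python
def throttle_ratio(periods_start, periods_end, throttled_start, throttled_end):
    """Fraction of CFS scheduling periods in which the container was throttled."""
    elapsed_periods = periods_end - periods_start
    if elapsed_periods <= 0:
        # No periods elapsed (or a counter reset): nothing meaningful to report.
        return 0.0
    return (throttled_end - throttled_start) / elapsed_periods


if __name__ == "__main__":
    # Hypothetical counter values taken a few minutes apart.
    ratio = throttle_ratio(periods_start=1000, periods_end=1600,
                           throttled_start=40, throttled_end=190)
    print(f"istio-proxy throttled in {ratio:.0%} of CFS periods")
```

A ratio that rises together with p99 latency suggests the sidecar's CPU limit, rather than the application, is the bottleneck.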
Hands-on example
Debug steps:
$ kubectl top pod -n app --containers
$ istioctl proxy-config clusters deploy/frontend -n app | grep backend
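PromQL: compare p99 of istio_request_duration_milliseconds by source and destination workload, and by reporter (source proxy vs. destination proxy):
histogram_quantile(0.99, sum by (le, reporter, source_workload, destination_workload) (rate(istio_request_duration_milliseconds_bucket{destination_workload="backend"}[5m])))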
Temporarily disable new retries or filters in staging to isolate the regression.
More Istio & Service Mesh interview questions
- What is Istio, and what are the core capabilities it provides?
- What is the difference between the Istio control plane and data plane?
- What is istiod, and what does it do?
- What is Envoy, and what role does it play in Istio?
- What is the sidecar pattern, and how does Istio inject the proxy?
- How does automatic sidecar injection work (namespace label, webhook)?
- What is the Istio ambient (sidecarless) mode, and how does it differ from sidecar mode?
- What is the difference between ztunnel and a waypoint proxy in ambient mode?