What metrics would you alert on for the mesh itself?
Istio & Service Mesh · Advanced level
Answer
I alert on control-plane health (istiod availability, xDS push errors, rejected config, stale proxy sync), sidecar injection failures, certificate expiration, gateway health and 5xx error rate, mTLS and authorization failures, high proxy CPU or memory, and abnormal request latency added at the proxy layer.
Technical explanation
Control-plane alerts tell us whether the mesh can accept changes and support scaling events.
Data-plane alerts tell us whether user traffic is affected.
Gateway alerts need special attention because gateways are shared choke points.
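The three layers above can be kept as separate query groups so that each alert is routed and triaged by the layer it belongs to. The following is a minimal sketch in Python, assuming Prometheus scrapes istiod and the proxies with Istio's standard metrics. The `job="istiod"` label, the `istio-ingressgateway` workload name, and the `ALERT_QUERIES` and `queries_for` names are assumptions for illustration; metric names should be checked against your Istio version.

```python
# Hypothetical grouping of PromQL expressions by mesh layer.
# Label values (job, workload names) are assumptions; adjust to your install.
ALERT_QUERIES = {
    "control_plane": {
        # No istiod target is reporting as up.
        "istiod_down": 'absent(up{job="istiod"} == 1)',
        # Proxies are rejecting config pushed by istiod.
        "xds_rejects": "sum(rate(pilot_total_xds_rejects[5m])) > 0",
        # istiod is failing to send xDS pushes.
        "xds_send_errors": 'sum(rate(pilot_xds_pushes{type=~".*_senderr"}[5m])) > 0',
    },
    "data_plane": {
        # Share of requests answered with 5xx, as seen by the receiving proxy.
        "error_ratio": (
            'sum(rate(istio_requests_total{reporter="destination",response_code=~"5.."}[5m]))'
            ' / sum(rate(istio_requests_total{reporter="destination"}[5m]))'
        ),
        # p99 request latency per destination workload, in milliseconds.
        "p99_latency_ms": (
            "histogram_quantile(0.99, sum(rate("
            'istio_request_duration_milliseconds_bucket{reporter="destination"}[5m]'
            ")) by (le, destination_workload))"
        ),
    },
    "gateway": {
        # 5xx rate for traffic leaving the shared ingress gateway.
        "gateway_5xx_rate": (
            'sum(rate(istio_requests_total{reporter="source",'
            'source_workload="istio-ingressgateway",response_code=~"5.."}[5m]))'
        ),
    },
}


def queries_for(layer: str) -> list[tuple[str, str]]:
    """Return (alert name, PromQL expression) pairs for one mesh layer."""
    return sorted(ALERT_QUERIES[layer].items())


if __name__ == "__main__":
    for layer in ALERT_QUERIES:
        for name, expr in queries_for(layer):
            print(f"{layer}/{name}: {expr}")
```

Keeping gateway queries in their own group reflects the point that gateways are shared choke points: a gateway alert usually deserves a higher severity than the same symptom on a single sidecar.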
Hands-on example
Alert examples:
1. istiod unavailable or no ready replicas.
2. Proxy sync stale for more than 5 minutes.
3. Ingress gateway 5xx burn rate exceeds SLO.
4. Certificate remaining lifetime below threshold.
5. Envoy memory near limit or OOMKilled.
6. Spike in RBAC denied traffic after a policy deploy.
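The burn-rate alert in item 3 can be made concrete with a little arithmetic: burn rate is the observed error ratio divided by the error budget (1 minus the SLO target). The sketch below is one possible way to evaluate it, assuming a 99.9% availability SLO and the commonly used multi-window pattern where a page fires only if both a long and a short window exceed the threshold. The `WindowCounts` type, the function names, and the 14.4 threshold are illustrative choices, not values taken from this answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowCounts:
    """Request counts observed at the ingress gateway over one time window."""
    total: float   # all requests in the window
    errors: float  # requests that returned 5xx in the window


def error_ratio(window: WindowCounts) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    if window.total == 0:
        return 0.0
    return window.errors / window.total


def burn_rate(window: WindowCounts, slo_target: float) -> float:
    """How many times faster than allowed the error budget is being spent."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    error_budget = 1.0 - slo_target
    return error_ratio(window) / error_budget


def should_page(
    long_window: WindowCounts,
    short_window: WindowCounts,
    slo_target: float = 0.999,  # assumed SLO
    threshold: float = 14.4,    # assumed fast-burn threshold
) -> bool:
    """Page only if both windows burn above the threshold.

    The short window keeps the alert from firing long after recovery.
    """
    return (
        burn_rate(long_window, slo_target) >= threshold
        and burn_rate(short_window, slo_target) >= threshold
    )


if __name__ == "__main__":
    last_hour = WindowCounts(total=100_000, errors=2_000)
    last_5m = WindowCounts(total=8_000, errors=200)
    print(should_page(last_hour, last_5m))
```

In practice the counts would come from gateway request metrics rather than hard-coded values. Requiring both windows trades a slightly slower page for far fewer alerts on spikes that have already ended.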
More Istio & Service Mesh interview questions
- What is Istio, and what are the core capabilities it provides?
- What is the difference between the Istio control plane and data plane?
- What is istiod, and what does it do?
- What is Envoy, and what role does it play in Istio?
- What is the sidecar pattern, and how does Istio inject the proxy?
- How does automatic sidecar injection work (namespace label, webhook)?
- What is the Istio ambient (sidecarless) mode, and how does it differ from sidecar mode?
- What is the difference between ztunnel and a waypoint proxy in ambient mode?