What metrics would you alert on for the mesh itself?

Istio & Service Mesh · Advanced level

Answer

I group mesh alerts by layer. For the control plane: istiod availability, xDS push errors, configuration rejected by proxies, stale proxy sync, sidecar injection failures, and certificate expiration. For the data plane: mTLS or authorization failures, high proxy CPU or memory, and abnormal request latency added at the proxy layer. For gateways: overall health and the 5xx error rate.

Technical explanation

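Control-plane alerts tell us whether the mesh can accept changes and support scaling events. New or restarted proxies depend on istiod for configuration and certificates, so a degraded control plane can leave existing traffic flowing while blocking rollouts and scale-ups.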

Data-plane alerts tell us whether user traffic is affected.

Gateway alerts need special attention because gateways are shared choke points: a failure there affects every service routed through that gateway.
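
One way to make the three layers concrete is a small catalogue of PromQL expressions keyed by layer. The sketch below is illustrative only. The dictionary `MESH_ALERTS`, the helpers `to_rule` and `all_rules`, the thresholds, and the label values (namespace `istio-system`, workload `istio-ingressgateway`) are assumptions. The metric names are taken from Istio's standard metrics, kube-state-metrics, and should be checked against the versions you actually run.

```python
# Hypothetical catalogue of mesh alerts grouped by layer.
# Thresholds and label values are assumptions; verify metric names
# against your Istio and kube-state-metrics versions before use.
GATEWAY_SELECTOR = 'reporter="source",source_workload="istio-ingressgateway"'

MESH_ALERTS = {
    "control_plane": {
        "IstiodNoReadyReplicas": (
            "kube_deployment_status_replicas_available"
            '{namespace="istio-system",deployment="istiod"} < 1'
        ),
        "XdsPushSendErrors": (
            'sum(rate(pilot_xds_pushes{type=~".*_senderr"}[5m])) > 0'
        ),
        "XdsConfigRejected": "sum(rate(pilot_total_xds_rejects[5m])) > 0",
        "SidecarInjectionFailures": (
            "sum(rate(sidecar_injection_failure_total[5m])) > 0"
        ),
    },
    "data_plane": {
        "ProxyOOMKilled": (
            "kube_pod_container_status_last_terminated_reason"
            '{container="istio-proxy",reason="OOMKilled"} == 1'
        ),
        # 403s seen by the destination proxy, used here as a rough
        # stand-in for authorization denials (assumed threshold).
        "AuthorizationDeniedSpike": (
            "sum(rate(istio_requests_total"
            '{reporter="destination",response_code="403"}[5m])) > 1'
        ),
    },
    "gateway": {
        "IngressGateway5xxRatio": (
            "sum(rate(istio_requests_total{"
            + GATEWAY_SELECTOR
            + ',response_code=~"5.."}[5m])) / '
            "sum(rate(istio_requests_total{"
            + GATEWAY_SELECTOR
            + "}[5m])) > 0.01"
        ),
    },
}


def to_rule(layer, name, hold="5m"):
    """Build a dict shaped like one Prometheus alerting rule."""
    return {
        "alert": name,
        "expr": MESH_ALERTS[layer][name],
        "for": hold,
        "labels": {"layer": layer},
    }


def all_rules(hold="5m"):
    """Flatten the catalogue into a list of rule dicts."""
    return [
        to_rule(layer, name, hold)
        for layer, alerts in MESH_ALERTS.items()
        for name in alerts
    ]
```

Tagging each rule with a `layer` label keeps routing simple: control-plane alerts can go to the platform team, while gateway and data-plane alerts can follow the paging path for user-facing impact.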

Hands-on example

Alert examples:

1. istiod unavailable or no ready replicas.

2. Proxy sync stale for more than 5 minutes.

3. Ingress gateway 5xx burn rate exceeds SLO.

4. Certificate time-to-expiry below threshold.

5. Envoy sidecar memory near its container limit, or the container OOMKilled.

6. Spike in RBAC denied traffic after a policy deploy.
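
The conditions behind examples 2, 3, and 4 can be sketched as plain functions. This is a minimal illustration, not a production evaluator: the function names are hypothetical, the 5-minute staleness window comes from the list above, and the 14.4 burn-rate threshold and 7-day certificate margin are assumed defaults that you would tune to your own SLO and rotation policy.

```python
from datetime import datetime, timedelta, timezone


def burn_rate(errors, total, slo_target):
    """Return how fast the error budget is being consumed.

    A value of 1.0 means errors arrive at exactly the rate the SLO allows.
    """
    if total == 0:
        return 0.0
    error_budget = 1.0 - slo_target
    return (errors / total) / error_budget


def should_page(short_window_burn, long_window_burn, threshold=14.4):
    """Multi-window check: both windows must exceed the threshold.

    The threshold is an assumed default, not a value from the source.
    """
    return short_window_burn > threshold and long_window_burn > threshold


def sync_is_stale(last_ack, now, max_age=timedelta(minutes=5)):
    """True if a proxy has not acknowledged config within max_age."""
    return now - last_ack > max_age


def cert_expires_soon(not_after, now, min_remaining=timedelta(days=7)):
    """True if the certificate has less than min_remaining left."""
    return not_after - now < min_remaining


if __name__ == "__main__":
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    # 20 gateway 5xx responses out of 1000 against a 99.9% SLO.
    rate = burn_rate(errors=20, total=1000, slo_target=0.999)
    print("page:", should_page(rate, rate))
    print("stale:", sync_is_stale(now - timedelta(minutes=6), now))
    print("cert soon:", cert_expires_soon(now + timedelta(days=4), now))
```

Requiring both a short and a long window to exceed the threshold reduces pages from brief spikes while still reacting quickly to sustained gateway errors.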
