Interview Observability

How do you choose good SLIs for a service? [Basic]

Answer

Good SLIs are user-centric, measurable, attributable, and hard to game. I choose SLIs that represent the experience users actually care about: some combination of availability, latency, correctness, freshness, and durability, depending on the service.

Technical explanation

For synchronous APIs, good SLIs are the success ratio (successful requests over eligible requests) and the proportion of requests served within a latency threshold.

For pipelines, good SLIs include freshness, completeness, and processing delay.
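As a rough illustration of the pipeline SLIs above, here is a minimal Python sketch. The function names, the 15-minute age limit used in the usage note, and the idea of comparing received records against an expected count are all assumptions made for illustration; a real pipeline would take these from its own metadata and agreed targets.

```python
from datetime import datetime, timedelta, timezone


def freshness_sli(last_update_times, now, max_age):
    """Fraction of datasets whose latest update is no older than max_age.

    Hypothetical helper: returns None when there is nothing to measure.
    """
    if not last_update_times:
        return None
    fresh = sum(1 for t in last_update_times if now - t <= max_age)
    return fresh / len(last_update_times)


def completeness_sli(records_received, records_expected):
    """Fraction of expected records that arrived, capped at 1.0.

    Hypothetical helper: returns None when no records were expected.
    """
    if records_expected <= 0:
        return None
    return min(records_received / records_expected, 1.0)


def processing_delay(event_time, processed_time):
    """Delay between when an event occurred and when it was processed."""
    return processed_time - event_time
```

For example, with two datasets last updated 5 and 45 minutes ago and an assumed 15-minute limit, `freshness_sli` reports that half of them are fresh.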

Avoid SLIs that only measure internals, such as pod count or CPU, unless the user impact is direct and proven.

Hands-on example

For an order API, define a good event as a POST /orders request that returns 2xx within 750 ms, and exclude client 4xx validation errors from the eligible total. In Prometheus, create a numerator for good requests and a denominator for total eligible requests, then graph the ratio by service and environment.
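The good/eligible classification for the order API can be sketched in plain Python before it is encoded as Prometheus metrics. This is only an illustration: the `Request` record and function names are hypothetical, and the sketch assumes every 4xx response counts as an excluded client error, which may be broader than "validation errors" in a real service.

```python
from dataclasses import dataclass

LATENCY_THRESHOLD_MS = 750  # threshold taken from the order API example


@dataclass(frozen=True)
class Request:
    """Hypothetical request record; field names are illustrative."""
    method: str
    path: str
    status: int
    latency_ms: float


def is_eligible(req):
    """Eligible: POST /orders, excluding client 4xx responses (assumption: all 4xx)."""
    if req.method != "POST" or req.path != "/orders":
        return False
    return not (400 <= req.status < 500)


def is_good(req):
    """Good: eligible, 2xx, and served within the latency threshold."""
    return (
        is_eligible(req)
        and 200 <= req.status < 300
        and req.latency_ms <= LATENCY_THRESHOLD_MS
    )


def sli_ratio(requests):
    """Good events divided by eligible events, or None if nothing is eligible."""
    eligible = [r for r in requests if is_eligible(r)]
    if not eligible:
        return None
    good = sum(1 for r in eligible if is_good(r))
    return good / len(eligible)
```

Note that a slow 2xx and a 5xx both count against the SLI, while a 4xx is dropped from the denominator entirely, so client mistakes neither help nor hurt the ratio.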
