Interview Observability

How would you measure observability coverage across services? [Advanced]

Answer

I measure observability coverage by checking whether every production service has a named owner and emits metrics, logs, and traces that are backed by dashboards, alerts, SLOs, runbooks, and correlation metadata. Coverage should be measured against operational outcomes, not just whether an agent is installed.

Technical explanation

Required attributes for each service include service name, owner/team, environment, version, cluster, and runbook links, so that any signal can be tied back to an owner and a specific deployment.
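
As a rough illustration of checking those attributes, the sketch below reports which required keys are missing or blank on a service's labels. The key names (service, team, environment, version, cluster, runbook_url) and the sample service are assumptions made for this example, not a standard schema.

```python
# Hypothetical label keys; rename them to match your own tagging convention.
REQUIRED_ATTRIBUTES = (
    "service", "team", "environment", "version", "cluster", "runbook_url",
)


def missing_attributes(labels):
    """Return the required keys that are absent or blank, in declaration order."""
    return [key for key in REQUIRED_ATTRIBUTES if not labels.get(key, "").strip()]


# Made-up example service: cluster is blank and runbook_url is absent.
checkout = {
    "service": "checkout",
    "team": "payments",
    "environment": "prod",
    "version": "1.4.2",
    "cluster": "",
}

print(missing_attributes(checkout))  # ['cluster', 'runbook_url']
```

Treating a blank value the same as a missing key is a deliberate choice here: an empty owner label is no more useful during an incident than no label at all.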

Coverage should include signal quality: useful labels, structured logs, trace propagation, and actionable alerts.
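
One way to put a number on signal quality is to sample log lines and measure how many are structured and carry a trace identifier. The sketch below assumes JSON-formatted logs with a trace_id field; both the field name and the sample lines are illustrative assumptions.

```python
import json


def trace_id_coverage(log_lines):
    """Fraction of lines that parse as JSON objects with a non-empty trace_id."""
    if not log_lines:
        return 0.0
    correlated = 0
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # unstructured line: counts against coverage
        if isinstance(record, dict) and record.get("trace_id"):
            correlated += 1
    return correlated / len(log_lines)


# Made-up sample: two correlated lines, one without trace_id, one unstructured.
sample = [
    '{"level": "info", "msg": "order placed", "trace_id": "abc123"}',
    '{"level": "warn", "msg": "retrying"}',
    "plain text line without structure",
    '{"level": "error", "msg": "timeout", "trace_id": "def456"}',
]

print(trace_id_coverage(sample))  # 0.5
```

A ratio like this is more informative than a yes/no flag, because a service that logs trace_id on only half of its lines still breaks correlation during an incident.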

Review coverage as part of production readiness and monthly operational reviews.

Hands-on example

Build a scorecard: for each service, mark whether RED metrics (rate, errors, duration) are present, p95/p99 latency is available, structured logs carry a trace_id, traces span dependencies, an SLO is defined, a burn-rate alert is configured, dashboard and runbook links exist, and an owner label is set. Track percent complete by team.
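
The scorecard can be sketched in a few lines. This is a minimal example under stated assumptions: the check names, the service records, and the equal weighting of checks are all hypothetical, and a real scorecard would pull results from your service catalog or monitoring APIs.

```python
from collections import defaultdict

# Hypothetical check names mirroring the scorecard items; all weighted equally.
CHECKS = (
    "red_metrics",
    "latency_percentiles",
    "structured_logs_trace_id",
    "dependency_traces",
    "slo_defined",
    "burn_rate_alert",
    "dashboard_link",
    "runbook_link",
    "owner_label",
)


def service_score(results):
    """Percent of checks passed by one service; missing checks count as failed."""
    passed = sum(1 for check in CHECKS if results.get(check, False))
    return 100.0 * passed / len(CHECKS)


def team_scores(services):
    """Average service score per team, rounded to one decimal place."""
    by_team = defaultdict(list)
    for svc in services:
        by_team[svc["team"]].append(service_score(svc["checks"]))
    return {team: round(sum(s) / len(s), 1) for team, s in by_team.items()}


# Made-up inventory for illustration only.
services = [
    {"name": "checkout", "team": "payments",
     "checks": {check: True for check in CHECKS}},
    {"name": "ledger", "team": "payments",
     "checks": {"red_metrics": True, "slo_defined": True, "owner_label": True}},
    {"name": "search", "team": "discovery",
     "checks": {check: True for check in CHECKS[:6]}},
]

for team, score in sorted(team_scores(services).items()):
    print(f"{team}: {score}% complete")
```

Averaging per-service scores keeps one well-instrumented service from hiding a neglected one only partially, so it is worth reviewing the lowest-scoring services per team alongside the team average.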
