Interview Observability

What is observability, and how is it different from traditional monitoring? [Basic]

Answer

Observability is the ability to understand the internal state of a system from the signals it emits. Traditional monitoring tells me whether known checks are healthy; observability lets me ask new questions during unknown failure modes using metrics, logs, traces, events, and context.

Technical explanation

Monitoring is usually built around predefined dashboards and thresholds such as CPU greater than 80 percent or HTTP 5xx greater than 2 percent.
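To make the "predefined thresholds" idea concrete, here is a minimal sketch of a threshold check using the two example limits above. The metric names `cpu_percent` and `http_5xx_percent`, the `Threshold` type, and the `breached` helper are hypothetical names chosen for illustration and do not come from any particular monitoring tool.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Threshold:
    metric: str   # hypothetical metric name
    limit: float  # alert when the sampled value is strictly greater


# Limits taken from the text: CPU > 80 percent, HTTP 5xx > 2 percent.
THRESHOLDS = [
    Threshold("cpu_percent", 80.0),
    Threshold("http_5xx_percent", 2.0),
]


def breached(sample: Dict[str, float],
             thresholds: List[Threshold] = THRESHOLDS) -> List[str]:
    """Return the names of metrics whose sampled value exceeds its limit.

    Metrics missing from the sample are treated as 0.0, which is an
    assumption made here for simplicity.
    """
    return [t.metric for t in thresholds
            if sample.get(t.metric, 0.0) > t.limit]
```

A check like this can only answer the questions it was written for, which is the limitation observability is meant to address.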

Observability focuses on debuggability: high-quality telemetry, useful dimensions, service ownership, correlation IDs, and enough context to explain why something is happening.
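As a rough illustration of why correlation IDs matter, the sketch below joins trace data to logs through a shared `trace_id`. The record shapes (spans as dicts with `trace_id` and `duration_ms`, logs as JSON lines) and the function name `logs_for_slow_traces` are assumptions for this example, not the format of any specific tracing backend.

```python
import json
from typing import Dict, Iterable, List


def logs_for_slow_traces(spans: Iterable[Dict],
                         log_lines: Iterable[str],
                         threshold_ms: float) -> List[Dict]:
    """Return parsed log records that belong to traces slower than threshold_ms.

    The correlation key is the trace_id field, assumed to be present on
    every span and on the log lines that were emitted inside a trace.
    """
    # Collect the IDs of traces that contain at least one slow span.
    slow_ids = {s["trace_id"] for s in spans if s["duration_ms"] > threshold_ms}
    # Keep only the log records carrying one of those IDs.
    records = (json.loads(line) for line in log_lines)
    return [r for r in records if r.get("trace_id") in slow_ids]
```

Without the shared ID, the same investigation would mean searching all logs by timestamp and guessing which lines belong to the slow requests.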

In SRE terms, monitoring is commonly treated as one part of observability rather than a competing practice. A mature platform uses both: alerts for known user-impacting symptoms and exploratory telemetry for investigation.

Hands-on example

For a checkout service, expose request_count, request_duration, and error_count metrics, emit structured JSON logs with trace_id and order_id_hash, and propagate W3C trace context. When latency spikes, start from the SLO alert, open the latency dashboard, jump to the slow traces for the checkout-to-payment call, then inspect only the logs correlated with those trace IDs.
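The logging and propagation part of this setup could be sketched as follows, using only the Python standard library. The helper names (`parse_traceparent`, `child_traceparent`, `log_line`), the `duration_ms` field, and the use of a truncated SHA-256 for `order_id_hash` are all assumptions for illustration. The header layout follows the W3C `traceparent` format (version, 32-hex trace ID, 16-hex parent ID, 2-hex flags). In practice a library such as OpenTelemetry would normally handle this.

```python
import hashlib
import json
import re
import secrets
from typing import Optional, Tuple

# W3C traceparent, version 00: 00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header: str) -> Optional[Tuple[str, str, str]]:
    """Return (trace_id, parent_id, flags), or None if the header is invalid."""
    m = TRACEPARENT_RE.match(header)
    if m is None:
        return None
    trace_id, parent_id, flags = m.groups()
    # All-zero IDs are invalid per the specification.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return trace_id, parent_id, flags


def child_traceparent(incoming: str) -> str:
    """Build the header for an outgoing call, e.g. checkout to payment.

    Keeps the incoming trace ID and flags and generates a new span ID.
    If the incoming header is invalid, a new trace is started; marking
    it as sampled ("01") is an assumption of this sketch.
    """
    parsed = parse_traceparent(incoming)
    if parsed is None:
        trace_id, flags = secrets.token_hex(16), "01"
    else:
        trace_id, _, flags = parsed
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


def log_line(event: str, trace_id: str, order_id: str, duration_ms: float) -> str:
    """Emit one structured JSON log line; the raw order ID is never logged."""
    # Truncated SHA-256 is a placeholder; a keyed hash may be preferable.
    order_id_hash = hashlib.sha256(order_id.encode("utf-8")).hexdigest()[:16]
    return json.dumps({
        "event": event,
        "trace_id": trace_id,
        "order_id_hash": order_id_hash,
        "duration_ms": duration_ms,
    }, sort_keys=True)
```

Because every log line carries the same trace_id as the propagated header, the logs for a slow checkout-to-payment trace can be pulled up by ID instead of by searching free text.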
