What is the difference between a histogram and a summary, and what are the trade-offs? [Basic]
Answer
Histograms count observations into buckets, so they can be aggregated on the server and percentiles can be estimated at query time with histogram_quantile. Summaries calculate quantiles in the client, which makes them hard to aggregate across instances. I usually prefer histograms for service latency in distributed systems.
Technical explanation
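Histograms expose cumulative bucket time series labeled by upper bound, such as le="0.5", le="1", and le="+Inf", alongside _sum and _count series.
Summaries can provide accurate client-side quantiles for one process, but quantiles from different replicas cannot be meaningfully averaged or otherwise combined.
Histogram bucket choice matters: histogram_quantile interpolates within a bucket, so bucket boundaries should align with user-relevant thresholds and SLO targets.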
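To make the averaging problem concrete, here is a minimal sketch with made-up latencies for two replicas. The replica data and the nearest_rank_quantile helper are illustrative assumptions, not part of any Prometheus client library.

```python
import math


def nearest_rank_quantile(values, q):
    """Exact nearest-rank quantile of a list of raw observations."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Hypothetical latencies in seconds.
# Replica A serves most of the traffic and is fast; replica B is slow.
replica_a = [0.1] * 100
replica_b = [2.0] * 10

# What each replica's summary would report on its own.
p95_a = nearest_rank_quantile(replica_a, 0.95)
p95_b = nearest_rank_quantile(replica_b, 0.95)

# Averaging the two reported quantiles ignores how traffic is distributed.
averaged_p95 = (p95_a + p95_b) / 2

# The real fleet-wide p95 needs the combined observations.
fleet_p95 = nearest_rank_quantile(replica_a + replica_b, 0.95)

print(f"averaged per-replica p95: {averaged_p95:.2f}s")
print(f"true fleet-wide p95:      {fleet_p95:.2f}s")
```

In this example the averaged value lands well below the true fleet-wide p95, because about 9% of all requests hit the slow replica. Histogram bucket counters avoid the problem: they are plain counts, so they can be summed across replicas before any quantile is estimated.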
Hands-on example
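PromQL: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service)). This estimates p95 latency per service across all replicas; the le label must stay in the by clause so the buckets survive the aggregation. Being able to sum buckets this way is a key reason histograms are preferred over summaries for fleet-level dashboards.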
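The query above can be approximated offline to see what it computes. The following is a simplified sketch, not Prometheus source code: the function names and bucket counts are hypothetical, and it assumes classic cumulative buckets with at least one finite bound, a +Inf bucket, and a lower bound of 0 for the first bucket.

```python
import math


def merge_buckets(per_replica):
    """Sum cumulative bucket counts across replicas, keyed by upper bound (le)."""
    merged = {}
    for buckets in per_replica:
        for le, count in buckets.items():
            merged[le] = merged.get(le, 0) + count
    return merged


def estimate_quantile(q, buckets):
    """Estimate a quantile from cumulative buckets via linear interpolation."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    bounds = sorted(buckets)
    total = buckets[bounds[-1]]  # the +Inf bucket holds the total count
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for i, le in enumerate(bounds):
        count = buckets[le]
        if count >= rank:
            if math.isinf(le):
                # Quantile falls beyond the last finite bucket:
                # the best available answer is the highest finite bound.
                return bounds[i - 1]
            if count == prev_count:
                return le
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (le - prev_bound) * fraction
        prev_bound, prev_count = le, count
    return bounds[-1]


# Hypothetical cumulative bucket counts for two replicas of one service.
replica_a = {0.5: 90, 1.0: 98, math.inf: 100}
replica_b = {0.5: 40, 1.0: 70, math.inf: 100}

merged = merge_buckets([replica_a, replica_b])
p50 = estimate_quantile(0.50, merged)
p95 = estimate_quantile(0.95, merged)
print(f"p50 ~ {p50:.3f}s, p95 ~ {p95:.3f}s")
```

With these counts the p95 rank falls in the +Inf bucket, so the estimate is capped at the highest finite bound of 1 second. That is the practical cost of poorly chosen buckets: adding a bound near the latency you care about, for example an SLO threshold, is what makes the estimate useful.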