What is a burn rate, and how do you alert on it? [Basic]
Answer
Burn rate is how fast a service is consuming its error budget compared with the allowed rate. A 2x burn rate means the service is consuming budget twice as fast as planned; a 14x burn rate means it can exhaust a 28-day budget in about two days.
Technical explanation
Burn-rate alerts are more SLO-aligned than raw error thresholds because they account for the reliability target and time window.
Fast burn detects severe incidents quickly; slow burn catches smaller issues that accumulate over hours or days.
The alert expression usually divides current error ratio by the allowed error ratio.
Hands-on example
PromQL sketch: error_ratio = sum(rate(http_requests_total{status=~'5..'}[5m])) / sum(rate(http_requests_total[5m])). For a 99.9 percent SLO, allowed error ratio is 0.001. burn_rate = error_ratio / 0.001. Page on high burn over short and medium windows.
Check how well your resume matches the role with our free resume checker— match score, ATS check, and the skills you're missing.
More Observability interview questions
- What is observability, and how is it different from traditional monitoring? [Basic]
- What are the three pillars of observability (metrics, logs, traces)? [Basic]
- What is the difference between monitoring and observability in practice? [Basic]
- What are the four golden signals of monitoring? [Basic]
- What is the difference between the USE method and the RED method? [Basic]
- When would you use the USE method versus the RED method? [Basic]
- What is an SLI, an SLO, and an SLA, and how do they relate? [Basic]
- How do you choose good SLIs for a service? [Basic]