Interview Observability

What is a burn rate, and how do you alert on it? [Basic]

Answer

Burn rate is how fast a service is consuming its error budget compared with the allowed rate. A 2x burn rate means the service is consuming budget twice as fast as planned; a 14x burn rate means it can exhaust a 28-day budget in about two days.

Technical explanation

Burn-rate alerts are more SLO-aligned than raw error thresholds because they account for the reliability target and time window.

Fast burn detects severe incidents quickly; slow burn catches smaller issues that accumulate over hours or days.

The alert expression usually divides current error ratio by the allowed error ratio.

Hands-on example

PromQL sketch: error_ratio = sum(rate(http_requests_total{status=~'5..'}[5m])) / sum(rate(http_requests_total[5m])). For a 99.9 percent SLO, allowed error ratio is 0.001. burn_rate = error_ratio / 0.001. Page on high burn over short and medium windows.

Preparing for an interview?

Check how well your resume matches the role with our free resume checker— match score, ATS check, and the skills you're missing.

More Observability interview questions

← All Observability questions