What is the difference between a symptom-based and a cause-based alert, and which is better? [Basic]
Answer
A symptom-based alert fires on user-visible impact, such as a high error rate or a missed latency SLO. A cause-based alert fires on a suspected reason for that impact, such as high CPU or a nearly full disk. For paging, symptom-based alerts are usually the better choice; cause-based alerts are more useful for tickets and diagnostics.
Technical explanation
Symptom alerts tend to be less noisy because each one corresponds to real user pain and therefore requires action.
Cause alerts can still be valuable when a condition is all but certain to become user-impacting, such as a disk projected to fill within 30 minutes.
Good alerting separates page-worthy symptoms from dashboard or ticket-worthy causes.
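The "disk full in 30 minutes" case can be made concrete with a small projection. The following is a minimal sketch, not a production rule: it assumes a simple linear extrapolation from two usage samples, and the names DiskSample, seconds_until_full, and route_disk_alert, as well as the 24-hour ticket threshold, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiskSample:
    timestamp_s: float  # seconds since an arbitrary epoch
    used_bytes: float


def seconds_until_full(older: DiskSample, newer: DiskSample,
                       capacity_bytes: float) -> Optional[float]:
    """Linearly extrapolate disk growth; return None if usage is not growing."""
    elapsed = newer.timestamp_s - older.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    growth_per_s = (newer.used_bytes - older.used_bytes) / elapsed
    if growth_per_s <= 0:
        return None  # flat or shrinking usage never fills the disk
    remaining = max(capacity_bytes - newer.used_bytes, 0.0)
    return remaining / growth_per_s


def route_disk_alert(eta_s: Optional[float],
                     page_within_s: float = 30 * 60,         # from the text: 30 minutes
                     ticket_within_s: float = 24 * 3600) -> str:  # assumed threshold
    """Page only when impact is imminent; otherwise file a ticket or do nothing."""
    if eta_s is None:
        return "none"
    if eta_s <= page_within_s:
        return "page"
    if eta_s <= ticket_within_s:
        return "ticket"
    return "none"
```

With these assumptions, a disk that grew from 80 GB to 90 GB in ten minutes on a 100 GB volume is projected to fill in about ten more minutes and would page, while the same cause signal with hours of headroom would only open a ticket.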
Hands-on example
Example: do not page just because CPU is at 85 percent. Page because the checkout error-budget burn rate is high. Put CPU, memory, throttling, DB pool, and queue depth on the runbook dashboard so the responder can find the cause once the page has fired.
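To show what "error-budget burn is high" might look like as a paging condition, here is a minimal sketch. It assumes burn rate is defined as the observed error ratio divided by the error budget (1 minus the SLO target), and it pages only when both a long and a short window exceed a threshold. The function names and the default threshold of 14.4 are illustrative assumptions, not values taken from the text above.

```python
def burn_rate(errors: int, requests: int, slo_target: float) -> float:
    """Observed error ratio divided by the error budget (1 - SLO target)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if requests == 0:
        return 0.0  # no traffic means no budget was consumed
    error_budget = 1.0 - slo_target
    return (errors / requests) / error_budget


def should_page(long_window_burn: float, short_window_burn: float,
                threshold: float = 14.4) -> bool:
    """Page only if the burn is both sustained (long window) and still
    happening (short window). The threshold is an assumed example value."""
    return long_window_burn >= threshold and short_window_burn >= threshold


# Hypothetical checkout numbers against a 99.9 percent availability SLO.
long_burn = burn_rate(errors=200, requests=10_000, slo_target=0.999)
short_burn = burn_rate(errors=25, requests=1_000, slo_target=0.999)
page = should_page(long_burn, short_burn)
```

Note that CPU does not appear as an input at all: in this sketch it stays on the dashboard and is consulted only after the symptom has paged someone.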