What is the Pushgateway, and why is it discouraged for most cases? [Basic]
Answer
The Pushgateway lets short-lived jobs push metrics that Prometheus can later scrape. It is discouraged for most cases because it bypasses normal target health semantics, can become a bottleneck, and can leave stale metrics if lifecycle cleanup is not handled.
Technical explanation
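Prometheus recommends the pull model for most services because every scrape automatically records an up series that reflects target availability. When a job pushes through the Pushgateway, up describes only the Pushgateway itself, not the job that produced the metrics.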
Pushgateway is appropriate for service-level batch job results, not per-instance machine metrics or long-running services.
If used, choose grouping keys carefully, because the Pushgateway keeps pushed series and exposes them on every scrape until they are explicitly deleted. Delete a group once its job is no longer relevant.
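The grouping-key lifecycle can be sketched with only the Python standard library. This is a minimal sketch, not a definitive client: it assumes a Pushgateway at http://localhost:9091 (a hypothetical address) and uses the Pushgateway URL layout /metrics/job/<job>/<label>/<value>. The helper names group_url, push_group and delete_group are illustrative and not part of any library.

```python
import urllib.request
from urllib.parse import quote

GATEWAY = "http://localhost:9091"  # assumed address; adjust for your setup


def group_url(job, grouping_key=None, gateway=GATEWAY):
    """Build the URL that identifies one metric group on the Pushgateway."""
    parts = [gateway.rstrip("/"), "metrics", "job", quote(job, safe="")]
    # Sorted only so the same grouping key always yields the same URL.
    for label, value in sorted((grouping_key or {}).items()):
        value = str(value)
        if value == "" or "/" in value:
            # Such values need the Pushgateway's base64 form; not handled here.
            raise ValueError(f"unsupported value for label {label!r}")
        parts += [quote(label, safe=""), quote(value, safe="")]
    return "/".join(parts)


def push_group(url, body, replace=True):
    """Send metrics in text format. PUT replaces the whole group,
    POST replaces only metrics that share a name with the pushed ones."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT" if replace else "POST",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def delete_group(url):
    """Remove every metric stored under this grouping key."""
    request = urllib.request.Request(url, method="DELETE")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping the URL builder separate means the job that pushes and the cleanup step that deletes address exactly the same group, for example group_url("reconciliation", {"env": "prod"}).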
Hands-on example
Example: a nightly reconciliation job pushes reconciliation_last_success_timestamp_seconds and reconciliation_records_processed_total to the Pushgateway. Alert if time() - reconciliation_last_success_timestamp_seconds > 27 * 3600, that is, when the last success is more than 27 hours old, which leaves a 3-hour margin over the 24-hour schedule. Do not push per-pod CPU metrics or per-request metrics through the Pushgateway.
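As a rough illustration of this example, the sketch below renders the two metrics in the Prometheus text exposition format, which is the request body a push would carry, and mirrors the 27-hour alert condition in plain arithmetic. The function names render_metrics and is_stale are hypothetical, and in practice the staleness check would be a Prometheus alerting rule rather than Python.

```python
import time

STALE_AFTER_SECONDS = 27 * 3600  # the 27-hour threshold from the example


def render_metrics(last_success_ts, records_processed):
    """Render both example metrics in the Prometheus text exposition format."""
    lines = [
        "# TYPE reconciliation_last_success_timestamp_seconds gauge",
        f"reconciliation_last_success_timestamp_seconds {last_success_ts}",
        "# TYPE reconciliation_records_processed_total counter",
        f"reconciliation_records_processed_total {records_processed}",
    ]
    # The text format requires every line, including the last, to end in a newline.
    return "\n".join(lines) + "\n"


def is_stale(last_success_ts, now=None):
    """Same condition as the alert: time() - last_success > 27 hours."""
    now = time.time() if now is None else now
    return now - last_success_ts > STALE_AFTER_SECONDS


if __name__ == "__main__":
    # A job would build this body after a successful run and push it.
    print(render_metrics(int(time.time()), 1250), end="")
```

The job should update the timestamp only after a successful run, so a failed or missing run leaves the old value in place and the alert eventually fires.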