How would you design ingestion controls to manage observability cost at scale? [Advanced]
Answer
I would design ingestion controls around budgets, quotas, sampling, filtering, retention tiers, cardinality limits, and ownership tags. The goal is to preserve high-value debugging and SLO signals while preventing uncontrolled telemetry growth.
Technical explanation
Controls should exist at source, collector, backend, and review-process levels.
Every signal should have an owner, purpose, retention class, and cost visibility.
High-cardinality labels, debug logs, and unsampled traces need explicit approval in production.
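As a rough illustration of the ownership and cardinality points above, the sketch below checks that a signal declares its metadata and flags labels whose distinct-value count passes a limit. The field names, the `CardinalityGuard` class, and the limit of 100 are assumptions made for this example, not part of any particular backend.

```python
from dataclasses import dataclass, field

# Metadata every signal is expected to declare (names are illustrative).
REQUIRED_FIELDS = ("owner", "purpose", "retention_class")


def missing_metadata(signal: dict) -> list:
    """Return the required metadata fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not signal.get(name)]


@dataclass
class CardinalityGuard:
    """Tracks distinct values per label and flags labels over a limit."""

    max_values_per_label: int = 100  # hypothetical default limit
    seen: dict = field(default_factory=dict)

    def observe(self, labels: dict) -> list:
        """Record label values; return label names now over the limit."""
        over_limit = []
        for name, value in labels.items():
            values = self.seen.setdefault(name, set())
            values.add(value)
            if len(values) > self.max_values_per_label:
                over_limit.append(name)
        return over_limit
```

A label returned by `observe` would be a candidate for the explicit approval step, or for being dropped or aggregated before it reaches the backend.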
Hands-on example
Hands-on design: require service.name, team, env, and cost_center attributes on every signal. At the collector, drop health-check logs, hash or remove PII, cap label cardinality, sample successful traces while keeping all error traces, and route audit logs to longer-retention Splunk indexes.
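The collector rules in this design can be sketched as a single decision function. This is a minimal Python model of the logic, not a real collector configuration: the record shape, the PII keys, the health-check paths, the 10% sample rate, and the destination names are all assumptions chosen for illustration.

```python
import hashlib
import random

REQUIRED_ATTRS = ("service.name", "team", "env", "cost_center")
PII_KEYS = ("user.email", "client.ip")   # hypothetical PII attribute names
HEALTH_PATHS = ("/healthz", "/ready")    # hypothetical health-check paths
SUCCESS_TRACE_SAMPLE_RATE = 0.1          # hypothetical: keep 10% of successes


def hash_value(value: str) -> str:
    """Replace a sensitive value with its SHA-256 hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def process(record: dict, rng=random.random):
    """Return (destination, record), or None if the record is dropped."""
    # Work on a copy so the caller's record is not mutated.
    attrs = dict(record.get("attributes", {}))
    for key in PII_KEYS:
        if key in attrs:
            attrs[key] = hash_value(str(attrs[key]))
    cleaned = {**record, "attributes": attrs}

    # Signals without ownership and cost attributes are set aside.
    if any(not attrs.get(key) for key in REQUIRED_ATTRS):
        return ("rejected", cleaned)

    kind = record.get("type")
    # Health-check logs carry little debugging value, so drop them.
    if kind == "log" and attrs.get("http.path") in HEALTH_PATHS:
        return None
    # Keep every error trace; keep only a sample of successful ones.
    if kind == "trace" and record.get("status") != "error":
        if rng() >= SUCCESS_TRACE_SAMPLE_RATE:
            return None
    # Audit logs go to a longer-retention destination.
    if kind == "log" and record.get("category") == "audit":
        return ("audit_long_retention", cleaned)
    return ("default", cleaned)
```

In a real pipeline the sampling decision would normally be made once per trace rather than per record, so that sampled traces stay complete, and a plain hash of low-entropy PII can be guessed, so a keyed hash or outright removal may be the safer choice.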