What is the difference between metrics-based and log-based alerting, and the cost implications? [Advanced]

Question

Accepted Answer

Metrics-based alerting evaluates pre-aggregated numeric time series, so it is usually cheaper, faster, and more reliable for paging. Log-based alerting searches raw event data; it is useful for rare conditions or specific error patterns, but it can be expensive and noisy at scale.

The cost difference comes from what each system stores and scans. Metrics are compact and purpose-built for alert evaluation: their cost typically grows with the number of active time series rather than with request volume, which makes them ideal for SLO burn rate, latency, traffic, and saturation alerts. Logs carry richer context, but every event has to be ingested, stored, and searched, so their cost grows with event volume.

Use log alerts sparingly, for conditions that cannot be represented safely as metrics, such as specific audit violations or unique fatal error signatures.
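To make the contrast concrete, here is a minimal Python sketch of both evaluation styles, plus a rough volume estimate. It is not tied to any particular monitoring product. The names `Window`, `metric_alert`, `log_alert`, `daily_metric_bytes`, and `daily_log_bytes` are hypothetical, and the byte sizes are assumptions chosen only for illustration; real storage engines compress both kinds of data.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumed sizes, for illustration only (not measured from any real system).
BYTES_PER_METRIC_SAMPLE = 16   # assumption: timestamp + float value, uncompressed
BYTES_PER_LOG_LINE = 300       # assumption: average size of one log event
SECONDS_PER_DAY = 86_400


@dataclass
class Window:
    """Counter increases over one evaluation window (already aggregated)."""
    requests: int
    errors: int


def metric_alert(window: Window, max_error_ratio: float) -> bool:
    """Metrics-based: compare a single pre-aggregated ratio to a threshold."""
    if window.requests == 0:
        return False  # no traffic, so no error ratio to evaluate
    return window.errors / window.requests > max_error_ratio


def log_alert(lines: list[str], pattern: str, min_matches: int) -> bool:
    """Log-based: scan every event in the window for a specific signature."""
    regex = re.compile(pattern)
    matches = sum(1 for line in lines if regex.search(line))
    return matches >= min_matches


def daily_metric_bytes(series: int, scrape_interval_s: int) -> int:
    """Volume grows with the number of series, not with request traffic."""
    samples_per_day = SECONDS_PER_DAY // scrape_interval_s
    return series * samples_per_day * BYTES_PER_METRIC_SAMPLE


def daily_log_bytes(events_per_second: int) -> int:
    """Volume grows linearly with the number of events emitted."""
    return events_per_second * SECONDS_PER_DAY * BYTES_PER_LOG_LINE


if __name__ == "__main__":
    # 150 errors out of 10,000 requests is 1.5%, above a 1% threshold.
    print(metric_alert(Window(requests=10_000, errors=150), 0.01))

    lines = ["INFO ok", "FATAL audit policy violated user=42", "INFO ok"]
    print(log_alert(lines, r"FATAL audit policy violated", min_matches=1))

    print(daily_metric_bytes(series=1_000, scrape_interval_s=15))
    print(daily_log_bytes(events_per_second=1_000))
```

Under these assumed sizes, 1,000 series scraped every 15 seconds come to roughly 92 MB of raw samples per day, while 1,000 log events per second come to roughly 26 GB per day. The absolute numbers are illustrative only. What matters is that the log figure scales with traffic, while the metric figure stays flat as long as the set of series does not grow.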


