What is a HorizontalPodAutoscaler, and what metrics can it scale on?

Question

Accepted Answer

The HorizontalPodAutoscaler (HPA) adjusts the replica count of a scalable workload, such as a Deployment or StatefulSet, based on observed demand. It commonly scales on CPU and memory utilization reported through metrics-server, and it can also scale on custom or external metrics such as queue depth, request rate, or business-specific load.

The HPA depends on metrics being available. Without metrics-server for resource metrics, or a custom or external metrics adapter for everything else, it has no data to act on and cannot make correct scaling decisions. Choose a signal that reflects user demand; CPU alone is a poor choice when CPU is not the bottleneck.

More broadly, Kubernetes workload controllers encode different lifecycle guarantees: interchangeable replicas, stable identities, node-local agents, or finite tasks. Storage decisions must align with durability, access mode, zone placement, backup, restore, and failover behavior. Autoscaling should therefore be designed together with metrics, scheduling constraints, PodDisruptionBudgets (PDBs), and node capacity, since new replicas only help if they can actually be scheduled.
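To make the scaling decision concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric value / target metric value), skipped when the ratio is within a tolerance of 1.0 (0.1 by default), then clamped to the configured minimum and maximum. The function names and sample numbers are hypothetical and chosen only for illustration. The real controller also handles missing metrics, pods that are not ready, and stabilization windows, all of which this sketch ignores.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric.

    Illustrative helper, not part of any Kubernetes client library.
    """
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count unchanged.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


def desired_for_metrics(current_replicas, metrics, **bounds):
    """With several metrics, the largest proposed replica count wins.

    `metrics` is a list of (current_value, target_value) pairs.
    """
    return max(
        desired_replicas(current_replicas, current, target, **bounds)
        for current, target in metrics
    )


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
    # CPU says scale modestly, queue depth per pod (500 vs. target 100) says more.
    print(desired_for_metrics(4, [(90, 60), (500, 100)], max_replicas=30))
```

Taking the maximum across metrics means the workload scales for whichever signal is under the most pressure, which is why adding a demand-oriented metric such as queue depth can help when CPU alone understates the load.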
