What is a HorizontalPodAutoscaler, and what metrics can it scale on?

Kubernetes, Docker, Helm & Podman · Basic level

Answer

The HorizontalPodAutoscaler (HPA) adjusts the replica count of a scalable workload, such as a Deployment or StatefulSet, to match observed load. It commonly scales on CPU and memory usage reported by metrics-server, and it can also scale on custom or external metrics such as queue depth, request rate, or business-specific load when a metrics adapter exposes them.
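As an illustration of how these metric types appear in practice, the sketch below builds an autoscaling/v2 HPA manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The workload name web, the replica bounds, the targets, and the external metric name queue_messages_ready are all assumptions chosen for the example, and the external metric only works if an adapter in the cluster actually serves it.

```python
import json


def build_hpa_manifest(name, min_replicas, max_replicas, cpu_target_percent,
                       queue_metric=None, queue_target=None):
    """Return an autoscaling/v2 HorizontalPodAutoscaler manifest as a dict."""
    # Resource metric: average CPU utilization relative to the Pods' CPU requests.
    metrics = [{
        "type": "Resource",
        "resource": {
            "name": "cpu",
            "target": {"type": "Utilization",
                       "averageUtilization": cpu_target_percent},
        },
    }]
    # Optional external metric (hypothetical name), averaged per Pod.
    if queue_metric is not None:
        metrics.append({
            "type": "External",
            "external": {
                "metric": {"name": queue_metric},
                "target": {"type": "AverageValue",
                           "averageValue": str(queue_target)},
            },
        })
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # Assumes a Deployment with the same name already exists.
            "scaleTargetRef": {"apiVersion": "apps/v1",
                               "kind": "Deployment", "name": name},
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": metrics,
        },
    }


manifest = build_hpa_manifest("web", 2, 10, 60,
                              queue_metric="queue_messages_ready",
                              queue_target=30)
print(json.dumps(manifest, indent=2))
```

Saving the output to a file and passing it to kubectl apply -f would create the HPA, assuming the target Deployment sets CPU requests on its containers.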

Technical explanation

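The HPA depends on a working metrics pipeline. Without metrics-server (for CPU and memory) or a custom metrics adapter (for custom and external metrics), it cannot compute a desired replica count and will not scale on that metric. For utilization targets, the Pods' containers must also declare resource requests, because utilization is calculated as a percentage of the request.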

Scale on a signal that tracks user demand. CPU alone is a poor choice when CPU is not the bottleneck, for example for a queue worker that spends most of its time waiting on I/O.
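To make the choice of signal concrete, here is a minimal sketch of the replica calculation described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric value / target metric value), with no change when the ratio is within a tolerance (0.1 by default). The function name, the replica bounds, and the sample numbers are assumptions for illustration; the real controller also handles missing metrics, unready Pods, and stabilization windows, which this sketch ignores.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA calculation for a single metric."""
    ratio = current_value / target_value
    # Within tolerance of the target: keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical I/O-bound worker: CPU is low while the queue backs up.
by_cpu = desired_replicas(4, current_value=45, target_value=60)    # CPU %
by_queue = desired_replicas(4, current_value=90, target_value=30)  # msgs per Pod
print(by_cpu, by_queue)
```

In this example CPU alone would shrink the workload from 4 to 3 replicas while the queue signal asks for the maximum of 10. When several metrics are configured, the HPA computes a count for each and uses the largest.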

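Kubernetes workload controllers encode different lifecycle guarantees: interchangeable replicas (Deployment), stable identities (StatefulSet), node-local agents (DaemonSet), or finite tasks (Job). The HPA fits workloads whose replicas can be added and removed freely; it does not apply to objects that cannot be scaled, such as a DaemonSet.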

Storage decisions must align with durability, access mode, zone placement, backup, restore, and failover behavior.

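Design autoscaling together with metrics, scheduling constraints, PodDisruptionBudgets (PDBs), and node capacity: every replica the HPA requests still has to be schedulable, and a PDB still has to leave room for node drains.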
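A rough back-of-the-envelope check can show whether these settings agree with each other. The sketch below is a deliberate simplification under stated assumptions: it considers CPU requests only, treats all nodes as identical and empty, and uses hypothetical numbers (3 nodes, 3500m allocatable CPU each, 500m request per Pod, maxReplicas of 30). It is not how the scheduler or the disruption controller actually decides.

```python
def max_schedulable_replicas(node_count, allocatable_cpu_millis,
                             pod_request_cpu_millis):
    """Upper bound on Pods that fit, counting CPU requests only."""
    pods_per_node = allocatable_cpu_millis // pod_request_cpu_millis
    return node_count * pods_per_node


def allowed_disruptions(healthy_replicas, pdb_min_available):
    """Pods that may be evicted voluntarily under a minAvailable PDB."""
    return max(0, healthy_replicas - pdb_min_available)


hpa_max_replicas = 30  # hypothetical HPA maxReplicas
fits = max_schedulable_replicas(node_count=3,
                                allocatable_cpu_millis=3500,
                                pod_request_cpu_millis=500)
unschedulable = max(0, hpa_max_replicas - fits)
print(f"fits={fits} pending_at_max={unschedulable}")

# With minReplicas equal to the PDB's minAvailable, a drain is blocked at low load.
print(allowed_disruptions(healthy_replicas=2, pdb_min_available=2))
```

Here the cluster can hold at most 21 replicas, so 9 Pods would stay Pending at peak unless nodes are added, and a PDB with minAvailable equal to minReplicas allows no voluntary evictions when the workload is scaled down to its minimum.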

Hands-on example

1. Deploy a web Deployment with kubectl apply, using a small test image such as nginx or a purpose-built app. Then configure an HPA for it and generate load, for example from a busybox Pod.

2. Inspect ownerReferences, events, Pods, PVCs, PVs, EndpointSlices, and metrics depending on the resource being tested.

3. Create a realistic disruption: delete a Pod, scale replicas, restart a node, fill a queue, or recreate storage attachment in a test environment.

4. Write the runbook entry covering expected behavior, safe rollback, and what alarms should exist.
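The steps above could be driven with a handful of kubectl commands. The sketch below only assembles and prints them so they can be reviewed before being run by hand in a test cluster. It assumes a Deployment and a Service both named web, an HPA manifest saved as hpa.json, and metrics-server installed for kubectl top; all of those names are placeholders rather than part of the original exercise.

```python
import shlex


def hpa_exercise_commands(name="web", manifest_path="hpa.json"):
    """Return kubectl invocations for the exercise as argv lists."""
    # Request loop run inside a temporary busybox Pod against the Service.
    load_loop = f"while sleep 0.01; do wget -q -O- http://{name}; done"
    return [
        # Create or update the HPA from the manifest file.
        ["kubectl", "apply", "-f", manifest_path],
        # Generate load; run this one in a separate terminal, since it blocks.
        ["kubectl", "run", "load-generator", "--rm", "-i", "--tty",
         "--image=busybox", "--restart=Never", "--",
         "/bin/sh", "-c", load_loop],
        # Watch current and target metric values and the replica count.
        ["kubectl", "get", "hpa", name, "--watch"],
        # Inspect conditions and scaling events recorded on the HPA.
        ["kubectl", "describe", "hpa", name],
        # Recent cluster events, oldest first.
        ["kubectl", "get", "events", "--sort-by=.lastTimestamp"],
        # Per-Pod CPU and memory usage from metrics-server.
        ["kubectl", "top", "pods"],
    ]


for argv in hpa_exercise_commands():
    print(shlex.join(argv))
```

Stopping the load generator and watching the replica count fall back gives the expected scale-down behavior to record in the runbook entry.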
