What is a HorizontalPodAutoscaler, and what metrics can it scale on?
Kubernetes, Docker, Helm & Podman · Basic level
Answer
The HorizontalPodAutoscaler (HPA) adjusts the replica count of a scalable workload, such as a Deployment or StatefulSet, based on observed demand. It most commonly scales on CPU and memory, which metrics-server supplies, and it can also scale on custom or external metrics such as queue depth, request rate, or business-specific load.
Technical explanation
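HPA depends on a working metrics pipeline: without metrics-server (for CPU and memory) or a custom or external metrics adapter (for everything else) it cannot compute a desired replica count and will not scale. Utilization targets also require resource requests on the containers, because utilization is measured as a percentage of the request.
Scale on a signal that tracks user demand. CPU alone is a poor signal when CPU is not the bottleneck, for example in a queue consumer that spends most of its time waiting on I/O.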
Kubernetes workload controllers encode different lifecycle guarantees: interchangeable replicas, stable identities, node-local agents, or finite tasks.
Storage decisions must align with durability, access mode, zone placement, backup, restore, and failover behavior.
Design autoscaling together with metrics, scheduling constraints, PodDisruptionBudgets (PDBs), and node capacity: replicas that HPA adds stay Pending if no node has room for them.
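To make the scaling decision concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric value / target metric value), skipped when the ratio is within a tolerance (0.1 by default), then clamped to the configured minimum and maximum. The function name and the sample numbers are illustrative assumptions, and the sketch leaves out details of the real controller such as stabilization windows, unready Pods, and multiple metrics.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric.

    Hypothetical helper for illustration only, not the controller's code.
    """
    ratio = current_value / target_value
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always bounded by minReplicas and maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


# 4 replicas at 90% average CPU against a 60% target: scale out.
print(desired_replicas(4, 90, 60))
# 63% against a 60% target is within tolerance: no change.
print(desired_replicas(4, 63, 60))
# 300% against a 60% target would need 20 replicas: capped at max_replicas.
print(desired_replicas(4, 300, 60))
```

The same arithmetic applies to custom and external metrics; only the source of current_value changes.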
Hands-on example
1. Deploy a web Deployment with kubectl apply, using a small test image such as nginx, busybox, or a purpose-built app. Set CPU requests on its containers, configure an HPA that targets it, and generate load against it.
2. Inspect ownerReferences, events, Pods, PVCs, PVs, EndpointSlices, and metrics depending on the resource being tested.
3. Create a realistic disruption: delete a Pod, scale replicas, restart a node, fill a queue, or recreate storage attachment in a test environment.
4. Write the runbook entry covering expected behavior, safe rollback, and what alarms should exist.
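For step 1, one possible way to produce the HPA object is sketched below: a short Python script that builds an autoscaling/v2 HorizontalPodAutoscaler manifest and prints it as JSON, which kubectl accepts as well as YAML. The names web and web-hpa, the replica bounds, and the 60% CPU target are assumptions chosen for the exercise, not values from the original.

```python
import json


def hpa_manifest(name, deployment, min_replicas, max_replicas, cpu_percent):
    """Build an autoscaling/v2 HPA manifest that scales a Deployment on CPU.

    Hypothetical helper; all argument values are chosen by the caller.
    """
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose replica count the HPA manages.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    # Resource metrics (cpu, memory) come from metrics-server.
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_percent,
                        },
                    },
                }
            ],
        },
    }


# Assumed example values: a Deployment named "web", 2 to 10 replicas, 60% CPU.
manifest = hpa_manifest("web-hpa", "web", 2, 10, 60)
print(json.dumps(manifest, indent=2))
```

Assuming the script is saved as hpa.py (a hypothetical filename), its output can be piped to kubectl apply -f - and the result observed with kubectl get hpa web-hpa --watch while load is generated. The target stays at unknown until metrics-server reports data and the Deployment's containers declare CPU requests.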
More Kubernetes, Docker, Helm & Podman interview questions
- What is Kubernetes, and what problem does it solve over running containers manually?
- Explain the Kubernetes control plane components (API server, etcd, scheduler, controller manager).
- What runs on a worker node (kubelet, kube-proxy, container runtime)?
- What is a Pod, and why does Kubernetes schedule Pods rather than containers?
- What is the difference between a Pod, a ReplicaSet, and a Deployment?
- How does a Deployment perform a rolling update, and how do maxSurge and maxUnavailable work?
- How do you roll back a Deployment, and how does Kubernetes track revisions?
- What is a Service, and what are the types (ClusterIP, NodePort, LoadBalancer, ExternalName)?