What is OOMKilled, and how do you diagnose and prevent it?
Kubernetes, Docker, Helm & Podman · Basic level
Answer
OOMKilled means the Linux kernel's OOM killer terminated a process in the container, either because the container exceeded its memory cgroup limit or because the node itself ran out of memory. The container exits with code 137 (SIGKILL). I diagnose it with kubectl describe pod, kubectl logs --previous, memory metrics, memory profiles, and node events, then fix the leak or resize requests and limits.
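As a rough illustration of the first diagnostic step, the sketch below scans Pod JSON (the shape returned by `kubectl get pod <name> -o json`) for containers whose current or last termination reason is OOMKilled. The function name `find_oom_killed` and the sample data are hypothetical; only the status field names come from the Pod API.

```python
import json

def find_oom_killed(pod: dict) -> list:
    """Return one record per container whose current or last termination was OOMKilled."""
    hits = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # After a restart the kill is recorded under lastState;
        # before a restart it is still under state.
        for key in ("state", "lastState"):
            term = (cs.get(key) or {}).get("terminated")
            if term and term.get("reason") == "OOMKilled":
                hits.append({
                    "container": cs["name"],
                    "exit_code": term.get("exitCode"),
                    "restarts": cs.get("restartCount", 0),
                })
                break  # report each container at most once
    return hits

# Hypothetical, trimmed-down Pod status used only for illustration.
sample = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "app",
        "restartCount": 3,
        "state": {"waiting": {"reason": "CrashLoopBackOff"}},
        "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
      },
      {
        "name": "sidecar",
        "restartCount": 0,
        "state": {"running": {}},
        "lastState": {}
      }
    ]
  }
}
""")

print(find_oom_killed(sample))
```

In practice the same information is visible under "Last State" in kubectl describe pod; a script like this is mainly useful when checking many Pods at once.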
Technical explanation
OOMKilled can come from a real leak, bad sizing, sudden load, large startup allocation, or sidecar overhead not included in planning.
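Use container_memory_working_set_bytes, application heap metrics, and previous logs to distinguish a leak from a limit that is simply too small: memory that keeps climbing under steady load suggests a leak, while memory that plateaus close to the limit suggests undersizing.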
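One way to make the leak-versus-sizing distinction concrete is to fit a straight line to working-set samples and look at the slope. The sketch below is a heuristic only: the function name `memory_trend`, the 1%-of-limit-per-hour growth threshold, and the 90% "near the limit" cutoff are all illustrative assumptions, and legitimate cache warm-up can look like a leak over a short window.

```python
def memory_trend(samples, limit_bytes, growth_threshold=0.01, near_limit=0.9):
    """Classify (seconds, bytes) samples of container_memory_working_set_bytes.

    growth_threshold: assumed cutoff, as a fraction of the limit gained per hour.
    near_limit: assumed cutoff for "peak is close to the limit".
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    # Ordinary least-squares slope, in bytes per second.
    slope = sum((t - mean_t) * (m - mean_m) for t, m in samples) / var_t
    growth_per_hour = slope * 3600 / limit_bytes
    peak = max(m for _, m in samples)
    if growth_per_hour > growth_threshold:
        return "possible leak"
    if peak >= near_limit * limit_bytes:
        return "possibly undersized"
    return "stable"

MI = 2 ** 20
limit = 512 * MI
# Hypothetical samples taken every 10 minutes.
growing = [(i * 600, (200 + 10 * i) * MI) for i in range(6)]
flat_high = [(i * 600, 480 * MI) for i in range(6)]
print(memory_trend(growing, limit), "/", memory_trend(flat_high, limit))
```

A heap profile from the application is still needed to confirm a leak; the trend only indicates where to look first.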
Health and resources are production controls, not just YAML fields; wrong settings cause outages, noisy restarts, bad rollouts, or wasted capacity.
Requests affect scheduling and node capacity planning; memory limits set the cgroup ceiling that triggers the OOM kill when exceeded; readiness affects traffic; liveness affects restart behavior.
Validate settings with real load, startup timing, memory profiles, and deployment rollout behavior.
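Once a load test has produced a peak working-set figure, turning it into a limit is simple arithmetic. The sketch below parses a subset of Kubernetes memory quantity suffixes and adds headroom to the observed peak. The helper names and the 25% headroom are assumptions for illustration, not a recommended standard; the right margin depends on how spiky the workload is.

```python
import math

# Subset of Kubernetes quantity suffixes; others (Ti, Pi, m, ...) are not handled here.
_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "k": 10**3, "M": 10**6, "G": 10**9}

def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '256Mi' or '1G' to bytes."""
    # Check two-character suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _UNITS[suffix])
    return int(quantity)  # plain number means bytes

def suggest_limit_mi(peak_bytes: int, headroom: float = 0.25) -> int:
    """Observed peak plus headroom, rounded up to whole MiB (headroom is an assumed margin)."""
    return math.ceil(peak_bytes * (1 + headroom) / 2**20)

peak = parse_memory("400Mi")  # hypothetical peak from a load test
print(f"suggested limit: {suggest_limit_mi(peak)}Mi")
```

The sidecar overhead mentioned above has to be sized separately, because each container in a Pod has its own limit.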
Hands-on example
1. Create a dedicated test namespace and deploy a small HTTP app there, so you can trigger and diagnose an OOMKilled container safely.
2. Add probes and resources in YAML, then run kubectl describe pod, kubectl top pod, and kubectl rollout status to observe behavior.
3. Introduce a controlled failure such as slow startup, bad health endpoint, CPU load, or memory spike.
4. Tune thresholds, requests, and limits until rollout and runtime behavior are stable, then document the production values and why.
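For the memory-spike case in step 3, a small allocator run inside the test container is usually enough. The sketch below grows memory in fixed steps until the limit is hit; the function name `allocate` and the step sizes are hypothetical, and it should only be run in a container with a memory limit set, inside the test namespace.

```python
import time

def allocate(step_mib: int, steps: int, pause_s: float = 1.0) -> list:
    """Hold on to `steps` chunks of `step_mib` MiB each, pausing between allocations."""
    held = []
    for i in range(steps):
        # Non-zero fill so the pages are actually written and count toward the working set.
        held.append(b"x" * (step_mib * 1024 * 1024))
        print(f"holding {(i + 1) * step_mib} MiB", flush=True)
        time.sleep(pause_s)
    return held

# Example invocation inside a container with a memory limit well below 1 GiB:
#   allocate(step_mib=16, steps=64)
# The process is expected to be killed before the loop finishes.
```

The last "holding ... MiB" line in kubectl logs --previous gives a rough idea of how much the container could allocate before the kill.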