Interview › Kubernetes, Docker, Helm & Podman
How do you troubleshoot a Pod in CrashLoopBackOff?
Kubernetes, Docker, Helm & Podman · Intermediate level
Answer
CrashLoopBackOff means a container starts, exits, and kubelet backs off before restarting it again. I check previous logs, exit code, command/args, probes, config, secrets, dependencies, and recent deployment changes.
Technical explanation
BackOff is a symptom, not a root cause; the root cause is usually visible in previous logs, exit code, events, or config diff.
Check whether probes are killing an otherwise healthy slow-starting process.
Troubleshooting starts from state and events: get, describe, logs, previous logs, events, and then node/runtime/network checks.
Separate scheduling failures, image pull failures, runtime failures, app failures, and traffic-routing failures so you do not fix the wrong layer.
Operational commands like drain and rollback must respect PDBs, probes, and workload disruption tolerance.
Hands-on example
1. In a non-production namespace, create this safe broken scenario: create a CrashLoopBackOff with bad command and use --previous logs.
2. Follow a fixed triage order: kubectl get, describe, logs or logs --previous, events, rollout status, node status, and then runtime/network checks.
3. Fix only one variable at a time so the root cause is clear rather than accidentally masked.
4. Save the commands and final diagnosis as an interview-ready incident walkthrough.
Check how well your resume matches the role with our free resume checker— match score, ATS check, and the skills you're missing.
More Kubernetes, Docker, Helm & Podman interview questions
- What is Kubernetes, and what problem does it solve over running containers manually?
- Explain the Kubernetes control plane components (API server, etcd, scheduler, controller manager).
- What runs on a worker node (kubelet, kube-proxy, container runtime)?
- What is a Pod, and why does Kubernetes schedule Pods rather than containers?
- What is the difference between a Pod, a ReplicaSet, and a Deployment?
- How does a Deployment perform a rolling update, and how do maxSurge and maxUnavailable work?
- How do you roll back a Deployment, and how does Kubernetes track revisions?
- What is a Service, and what are the types (ClusterIP, NodePort, LoadBalancer, ExternalName)?