How do you troubleshoot an EC2 instance that is unreachable over SSH?

AWS · Advanced level

Answer

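When an instance is unreachable over SSH, I work from the outside in. First, instance state and status checks. Next, the network path: the public or private IP being used, route tables, security groups, NACLs, and the internet gateway or VPN the client comes through. Then authentication: the key pair and the login user. Finally the host itself: sshd, host firewall, full disk, CPU exhaustion, and system logs. If SSM Agent and an instance role are in place, Session Manager is often the fastest recovery path because it does not depend on inbound port 22.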
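As a rough illustration of the security group step, the sketch below checks whether a set of inbound rules admits TCP port 22 from a given client address. It assumes the rules are shaped like the `IpPermissions` list returned by the EC2 DescribeSecurityGroups API and that the client is IPv4. The function name `sg_allows_ssh` and the sample addresses are hypothetical, and rules that reference other security groups or prefix lists are deliberately ignored.

```python
import ipaddress


def sg_allows_ssh(ip_permissions, source_ip, port=22):
    """Return True if any inbound rule admits TCP `port` from `source_ip`.

    Only CIDR-based IPv4 rules (IpRanges) are evaluated in this sketch.
    """
    addr = ipaddress.ip_address(source_ip)
    for rule in ip_permissions:
        proto = rule.get("IpProtocol")
        if proto == "-1":
            # "-1" means all protocols, so every port is covered.
            port_ok = True
        elif proto == "tcp":
            port_ok = rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)
        else:
            continue
        if not port_ok:
            continue
        for rng in rule.get("IpRanges", []):
            if addr in ipaddress.ip_network(rng["CidrIp"], strict=False):
                return True
    return False


# Sample rule: SSH allowed only from one documentation-range network.
rules = [
    {
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "IpRanges": [{"CidrIp": "203.0.113.0/24"}],
    }
]
inside = sg_allows_ssh(rules, "203.0.113.7")
outside = sg_allows_ssh(rules, "198.51.100.9")
```

A passing security group check does not prove reachability on its own: NACLs, route tables, and the host firewall still have to allow the same traffic.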

Technical explanation

Separate network reachability, authentication, and host-health checks so you do not chase the wrong layer.
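The SSH client's own error message is usually the quickest hint about which layer to look at. The sketch below maps common OpenSSH client messages to a layer. The function name `classify_ssh_failure` and the layer labels are made up for illustration, and the mapping is a heuristic rather than a rule: a timeout typically means packets are being dropped silently (security group, NACL, route), while a refusal means the host answered but nothing accepted the connection on that port.

```python
def classify_ssh_failure(stderr_text):
    """Guess the failing layer from the ssh client's error output."""
    text = stderr_text.lower()
    if "could not resolve hostname" in text:
        return "network"  # DNS name is wrong or does not resolve
    if "timed out" in text or "no route to host" in text:
        return "network"  # dropped packets: security group, NACL, routing
    if "connection refused" in text:
        return "host"  # host reachable, but sshd is down or blocked locally
    if "permission denied" in text:
        return "authentication"  # wrong key, wrong user, or bad file modes
    return "unknown"


timeout_layer = classify_ssh_failure(
    "ssh: connect to host 203.0.113.7 port 22: Connection timed out"
)
refused_layer = classify_ssh_failure(
    "ssh: connect to host 203.0.113.7 port 22: Connection refused"
)
auth_layer = classify_ssh_failure("ec2-user@203.0.113.7: Permission denied (publickey).")
```

Running `ssh -v` against the instance gives more detail when the message falls into the "unknown" bucket.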

Operations at scale should prefer managed access, automation, immutable infrastructure, repeatable runbooks, and auditability over manual host-by-host changes.

Troubleshooting should isolate layers: identity, network, host, application, dependency, deployment, and AWS service signals.
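For the AWS service signals, EC2 status checks already split the problem in two: the system status check reflects the underlying AWS infrastructure, and the instance status check reflects the guest operating system. The sketch below reads one entry shaped like an item in the `InstanceStatuses` list from the DescribeInstanceStatus API and suggests where to look next. The function name `triage_status` and the returned labels are hypothetical.

```python
def triage_status(status):
    """Map one DescribeInstanceStatus entry to the next layer to inspect."""
    state = status["InstanceState"]["Name"]
    if state != "running":
        # Stopped, stopping, pending, etc.: SSH cannot work yet.
        return "instance-state"
    if status["SystemStatus"]["Status"] == "impaired":
        # Problem on the AWS side (host hardware, power, network).
        return "aws-infrastructure"
    if status["InstanceStatus"]["Status"] == "impaired":
        # Problem inside the guest: boot failure, full disk, bad network config.
        return "guest-os"
    # Checks are passing or still initializing: move on to network and auth.
    return "continue"


sample = {
    "InstanceState": {"Name": "running"},
    "SystemStatus": {"Status": "ok"},
    "InstanceStatus": {"Status": "impaired"},
}
result = triage_status(sample)
```

For an impaired instance check, the instance console output or screenshot is a reasonable next stop, since it is available even when no login path works.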

Patch, access, AMI, and incident workflows must be tested and measurable so they do not depend on tribal knowledge.

Hands-on example

1. Set up a sandbox EC2 fleet with SSM Agent, IAM instance role, CloudWatch Agent, hardened AMI baseline, and no unnecessary inbound access.

2. Perform the operation through automation: Session Manager, Run Command, Patch Manager, Image Builder, ASG instance refresh, or a runbook.

3. Introduce a realistic failure and use logs, metrics, status checks, and reachability tools to troubleshoot layer by layer.

4. Update the runbook and define the alarm or compliance check that would catch the issue next time.
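For step 4, one possible alarm is a CloudWatch alarm on the `StatusCheckFailed` metric in the `AWS/EC2` namespace. The sketch below only builds the parameters as a dictionary so it runs without AWS credentials; the helper name `status_check_alarm`, the alarm naming scheme, the two-period threshold, and the instance ID and SNS topic ARN in the example are all illustrative assumptions.

```python
def status_check_alarm(instance_id, sns_topic_arn):
    """Build put_metric_alarm parameters for a failed EC2 status check."""
    return {
        "AlarmName": f"ec2-status-check-failed-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,  # seconds
        "EvaluationPeriods": 2,  # two consecutive failing minutes
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }


params = status_check_alarm(
    "i-0123456789abcdef0",  # placeholder instance ID
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",  # placeholder topic ARN
)

# With boto3 installed and credentials configured, this could be applied as:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Keeping the parameters in a plain function makes the alarm definition easy to review and to reuse across the sandbox fleet.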
