Interview › Databases & Caching
How long does an RDS Multi-AZ failover typically take, and what triggers it?
Databases & Caching · Basic level
Answer
RDS Multi-AZ is a high-availability deployment where RDS maintains a standby or alternate writer in another Availability Zone and automatically fails over when the primary cannot serve safely. Traditional Multi-AZ DB instance failover is commonly around 60 to 120 seconds, but application recovery can be longer.
Technical explanation
Multi-AZ is for availability, not read scaling in the traditional DB instance model.
Failover can be triggered by infrastructure failure, AZ disruption, maintenance, storage/network issues, or manual reboot with failover.
Applications must reconnect, retry idempotently, respect DNS, and handle in-flight transaction failure.
Hands-on example
Staging failover test:
$ while true; do date; psql "$DB_URL" -c "select now();" && echo ok || echo failed; sleep 1; done
$ aws rds reboot-db-instance --db-instance-identifier orders-stage --force-failover
Measure first failure, first success after recovery, API 5xx, p95 latency, and whether any pod restart was required.
Check how well your resume matches the role with our free resume checker— match score, ATS check, and the skills you're missing.
More Databases & Caching interview questions
- What is Amazon RDS, and what does it manage for you versus self-managed databases?
- What database engines does RDS support?
- What is the difference between RDS and Aurora?
- What is Multi-AZ in RDS, and how does automatic failover work?
- What is the difference between Multi-AZ and a read replica?
- When would you use a read replica, and can it become a standalone database?
- Can a read replica be in a different region, and why would you do that?
- What is the difference between automated backups and manual snapshots in RDS?