Interview Databases & Caching

What is Multi-AZ in RDS, and how does automatic failover work?

Databases & Caching · Basic level

Answer

RDS Multi-AZ is a high-availability deployment where RDS maintains a standby or alternate writer in another Availability Zone and automatically fails over when the primary cannot serve safely. Traditional Multi-AZ DB instance failover is commonly around 60 to 120 seconds, but application recovery can be longer.

Technical explanation

Multi-AZ is for availability, not read scaling in the traditional DB instance model.

Failover can be triggered by infrastructure failure, AZ disruption, maintenance, storage/network issues, or manual reboot with failover.

Applications must reconnect, retry idempotently, respect DNS, and handle in-flight transaction failure.

Hands-on example

Staging failover test:

$ while true; do date; psql "$DB_URL" -c "select now();" && echo ok || echo failed; sleep 1; done

$ aws rds reboot-db-instance --db-instance-identifier orders-stage --force-failover

Measure first failure, first success after recovery, API 5xx, p95 latency, and whether any pod restart was required.

Preparing for an interview?

Check how well your resume matches the role with our free resume checker— match score, ATS check, and the skills you're missing.

More Databases & Caching interview questions

← All Databases & Caching questions