How do you size an ElastiCache cluster (memory, nodes, shards)?
Databases & Caching · Advanced level
Answer
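I size ElastiCache starting from the working set. Dataset memory plus per-key overhead and fragmentation tells me how much memory the cluster needs and how to distribute it across shards. CPU, network throughput, and connection count guide the node type. Replica needs and failover headroom set the node count, and a growth forecast decides how much margin to leave.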
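The memory part of that answer can be sketched as simple arithmetic. This is a rough estimate, not an official sizing formula: every input below (key count, average sizes, per-key overhead, fragmentation ratio, headroom fraction, usable memory per node) is a hypothetical number you would replace with measurements from your own workload and the node type you are evaluating.

```python
import math


def size_cluster(
    key_count: int,
    avg_key_bytes: int,
    avg_value_bytes: int,
    per_key_overhead_bytes: int,
    fragmentation_ratio: float,
    headroom_fraction: float,
    usable_node_memory_gib: float,
    replicas_per_shard: int,
) -> dict:
    """Estimate required memory, shard count, and total node count."""
    # Logical dataset size: keys, values, and bookkeeping overhead per key.
    raw_bytes = key_count * (avg_key_bytes + avg_value_bytes + per_key_overhead_bytes)
    # Fragmentation makes resident memory larger than the logical dataset.
    resident_bytes = raw_bytes * fragmentation_ratio
    # Keep a fraction of memory free for growth and failover headroom.
    required_bytes = resident_bytes / (1.0 - headroom_fraction)
    required_gib = required_bytes / 2**30
    shards = max(1, math.ceil(required_gib / usable_node_memory_gib))
    # Each shard has one primary plus its replicas.
    nodes = shards * (1 + replicas_per_shard)
    return {"required_gib": round(required_gib, 1), "shards": shards, "nodes": nodes}


# Hypothetical workload: 50 million keys, ~540 bytes of payload per entry.
estimate = size_cluster(
    key_count=50_000_000,
    avg_key_bytes=40,
    avg_value_bytes=500,
    per_key_overhead_bytes=60,
    fragmentation_ratio=1.2,
    headroom_fraction=0.25,
    usable_node_memory_gib=10.0,
    replicas_per_shard=1,
)
print(estimate)
```

Memory is only the first pass. CPU, network, and connection limits can still force a larger node type or more shards than this estimate suggests.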
Technical explanation
Failover testing must measure the whole application recovery path, not just service events.
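One way to measure the application recovery path is to probe the cache from the application side during a failover test and compute how long requests actually failed. The sketch below assumes you already collect (timestamp, success) probe results. The function name and the sample data are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def longest_outage_seconds(probes: Iterable[Tuple[float, bool]]) -> float:
    """Return the longest gap between a first failed probe and the next success.

    An outage that is still ongoing at the last probe is not counted.
    """
    longest = 0.0
    outage_start: Optional[float] = None
    for timestamp, ok in probes:
        if not ok and outage_start is None:
            # First failure of a new outage window.
            outage_start = timestamp
        elif ok and outage_start is not None:
            # Recovery observed: close the window and record its length.
            longest = max(longest, timestamp - outage_start)
            outage_start = None
    return longest


# Hypothetical probe log: one probe per second, failures from t=2 to t=4.
probes = [
    (0.0, True), (1.0, True), (2.0, False), (3.0, False),
    (4.0, False), (5.0, True), (6.0, True),
]
observed = longest_outage_seconds(probes)
print(f"application-visible outage: {observed:.1f}s")
```

Comparing this number with the timestamps of the service's own failover events shows how much extra time DNS caching, connection pools, and client retries add.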
Connection storms are prevented with pool caps, rolling deploys, jittered startup, exponential backoff, and readiness gates.
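The backoff part of that list can be illustrated with a minimal sketch of capped exponential backoff with full jitter. The base delay, cap, and attempt count are assumed values for illustration, not recommendations for any particular client library.

```python
import random
from typing import List, Optional


def backoff_delays(
    max_attempts: int,
    base_seconds: float = 0.1,
    cap_seconds: float = 5.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one randomized delay per reconnect attempt."""
    if rng is None:
        rng = random.Random()
    delays = []
    for attempt in range(max_attempts):
        # The ceiling doubles each attempt until it reaches the cap.
        ceiling = min(cap_seconds, base_seconds * (2 ** attempt))
        # Full jitter: pick uniformly in [0, ceiling] so that clients
        # restarting together do not reconnect at the same instant.
        delays.append(rng.uniform(0.0, ceiling))
    return delays


# Seeded generator only to make this demonstration repeatable.
delays = backoff_delays(6, rng=random.Random(42))
print([round(d, 3) for d in delays])
```

In a real client you would sleep for each delay before the next reconnect attempt and also cap the pool size, since backoff alone does not limit how many connections each instance opens.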
For interview stories, state the business driver, migration plan, validation, rollback, and measurable result such as p95 latency, hit ratio, cost, or error-rate improvement.
Hands-on example
Capacity example:
Current RDS free storage = 500 GB
Growth = 20 GB/day
Days remaining = 500 GB / 20 GB/day = 25
Action threshold = 45 days, so act now.
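The same runway arithmetic as a small script, using the numbers from the example. The function and variable names are illustrative only.

```python
def days_remaining(free_gb: float, growth_gb_per_day: float) -> float:
    """Days until free storage is exhausted at a constant growth rate."""
    if growth_gb_per_day <= 0:
        # No growth (or shrinking data) means no projected exhaustion.
        return float("inf")
    return free_gb / growth_gb_per_day


ACTION_THRESHOLD_DAYS = 45

runway = days_remaining(free_gb=500, growth_gb_per_day=20)
act_now = runway < ACTION_THRESHOLD_DAYS
print(f"runway: {runway:.0f} days, act now: {act_now}")
```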
SQL size check (PostgreSQL, 20 largest tables including indexes and TOAST):
SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 20;
Plan: enable storage autoscaling with a maximum storage cap, archive old data, review index bloat, test restore time, and evaluate partitioning.
More Databases & Caching interview questions
- What is Amazon RDS, and what does it manage for you versus self-managed databases?
- What database engines does RDS support?
- What is the difference between RDS and Aurora?
- What is Multi-AZ in RDS, and how does automatic failover work?
- How long does an RDS Multi-AZ failover typically take, and what triggers it?
- What is the difference between Multi-AZ and a read replica?
- When would you use a read replica, and can it become a standalone database?
- Can a read replica be in a different region, and why would you do that?