How do you decide which columns to index?
Databases & Caching · Intermediate level
Answer
I choose indexes from real workload evidence: the columns used in WHERE predicates, JOIN keys, ORDER BY and GROUP BY clauses, uniqueness rules, and the most expensive queries. I prioritize selective columns and composite indexes built for specific queries, then verify with EXPLAIN ANALYZE.
Technical explanation
Use real workload evidence from pg_stat_statements, slow logs, Performance Insights, or traces before adding indexes.
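As a minimal sketch of gathering that evidence, assuming PostgreSQL 13 or later with the pg_stat_statements extension installed and loaded through shared_preload_libraries (older versions name the timing columns total_time and mean_time), a query along these lines lists the statements that consume the most cumulative execution time:

```sql
-- Top 10 statements by cumulative execution time.
-- Assumes PostgreSQL 13+ column names (total_exec_time, mean_exec_time).
SELECT
    calls,
    round(total_exec_time::numeric, 1) AS total_ms,
    round(mean_exec_time::numeric, 2)  AS mean_ms,
    rows,
    shared_blks_read,                  -- shared blocks the statement had to read in
    left(query, 80)                    AS query_start
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```

Sorting by total time rather than mean time tends to surface cheap queries that run very often, which can matter as much as a single slow report.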
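EXPLAIN shows the planner's chosen plan with estimated costs and row counts, without executing the query; EXPLAIN ANALYZE actually runs the query and reports actual rows and timing next to those estimates, so large gaps between estimated and actual rows point to stale or misleading statistics.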
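The difference can be sketched as follows, using the orders table from the example in this answer; the literal 42 is a hypothetical customer ID chosen only for illustration:

```sql
-- Plan only: the query is NOT executed, so costs and row counts are estimates.
EXPLAIN
SELECT * FROM orders WHERE customer_id = 42;

-- Plan plus execution: the query really runs, so each plan node also
-- reports actual rows and elapsed time alongside the estimates.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;

-- EXPLAIN ANALYZE on a data-modifying statement applies the change,
-- so wrap it in a transaction and roll it back.
BEGIN;
EXPLAIN (ANALYZE)
UPDATE orders SET created_at = created_at WHERE customer_id = 42;
ROLLBACK;
```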
Sequential scans are not always bad; for small tables, or for low-selectivity filters that match a large fraction of the rows, a sequential scan can be cheaper than an index scan.
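One way to judge selectivity before creating an index is to look at the planner's own statistics. This is a rough sketch, assuming the orders table lives in the public schema and has been analyzed recently:

```sql
-- Planner statistics for the candidate column (refreshed by ANALYZE).
SELECT
    attname,
    n_distinct,          -- negative values are a fraction of the row count (-1 = all distinct)
    null_frac,
    most_common_vals,
    most_common_freqs    -- a value with frequency near 1.0 matches most rows
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'orders'
  AND attname    = 'customer_id';
```

A column with few distinct values, or with one value dominating most_common_freqs, is usually a weak candidate for a standalone index, because filters on it still touch most of the table.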
Hands-on example
Index tuning example (replace the $1 placeholder with a real customer_id when running the query by hand):
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = $1 ORDER BY created_at DESC LIMIT 20;
CREATE INDEX CONCURRENTLY idx_orders_customer_created ON orders(customer_id, created_at DESC);
Re-run EXPLAIN (ANALYZE, BUFFERS) and confirm lower execution time, fewer buffers read, and no large sort step in the plan. For covering reads, add INCLUDE columns where appropriate.
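A covering variant might look like the sketch below. The status and total_amount columns and the index name are hypothetical, introduced only for illustration, and INCLUDE requires PostgreSQL 11 or later:

```sql
-- Hypothetical columns: status and total_amount are not part of the original example.
-- CONCURRENTLY cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY idx_orders_customer_created_cov
    ON orders (customer_id, created_at DESC)
    INCLUDE (status, total_amount);

-- Select only key and included columns so the planner can choose an Index Only Scan.
-- The literal 42 is a placeholder customer ID.
EXPLAIN (ANALYZE, BUFFERS)
SELECT created_at, status, total_amount
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 20;
```

If the plan shows an Index Only Scan, check its Heap Fetches figure: a high value suggests the visibility map is out of date and the table may need a VACUUM before the covering index pays off. Included columns also make the index larger and add write overhead, so keep the list short.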