How do you decide which columns to index?

Databases & Caching · Intermediate level

Answer

I choose indexes from real workload evidence: WHERE predicates, JOIN keys, ORDER BY and GROUP BY columns, uniqueness rules, and the most expensive queries. I prioritize selective columns and composite indexes shaped to specific queries, then verify the result with EXPLAIN ANALYZE.

Technical explanation

Use real workload evidence from pg_stat_statements, slow logs, Performance Insights, or traces before adding indexes.
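As a rough sketch of gathering that evidence, the query below lists the statements that consume the most total execution time. It assumes PostgreSQL 13 or later with the pg_stat_statements extension enabled; older versions name the timing columns total_time and mean_time instead.

```sql
-- Assumes PostgreSQL 13+ with pg_stat_statements installed and loaded.
-- Top 10 statements by cumulative execution time since the last stats reset.
SELECT query,
       calls,
       round(total_exec_time::numeric, 1) AS total_ms,  -- time across all calls
       round(mean_exec_time::numeric, 2)  AS mean_ms,   -- average time per call
       rows                                             -- total rows returned or affected
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```

Sorting by total time rather than mean time tends to surface cheap queries that run very often, which are frequently the best candidates for an index.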

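EXPLAIN shows the planner's estimated plan without running the query; EXPLAIN ANALYZE actually executes it and reports actual row counts and timing next to the estimates, so large gaps between estimated and actual rows stand out.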
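Because EXPLAIN ANALYZE really executes the statement, a data-modifying query is usually wrapped in a transaction that is rolled back. A minimal sketch, assuming a hypothetical status column on orders:

```sql
-- The UPDATE runs for real inside the transaction, then ROLLBACK discards its changes.
-- The status column and the two-year cutoff are illustrative assumptions.
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
UPDATE orders
SET status = 'archived'
WHERE created_at < now() - interval '2 years';
ROLLBACK;
```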

Sequential scans are not always bad; for small tables, or for filters that match a large share of the rows (low selectivity), reading the whole table can be cheaper than going through an index.
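One way to judge selectivity before creating an index is to look at the planner's own statistics. The sketch below assumes the table lives in the public schema and that a status column exists (both are illustrative), and that ANALYZE has run recently.

```sql
-- Planner statistics for candidate columns (populated by ANALYZE / autovacuum).
SELECT attname,
       n_distinct,         -- positive: estimated distinct values; negative: minus the fraction of rows that are distinct
       null_frac,          -- fraction of rows where the column is NULL
       most_common_vals,   -- most frequent values
       most_common_freqs   -- and the share of rows each one covers
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'orders'
  AND attname IN ('customer_id', 'status');
```

A column where a handful of values cover most rows is usually a weak standalone index candidate, while a column with many distinct values, such as customer_id here, is typically a strong one.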

Hands-on example

Index tuning example:

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 20;  -- 42 stands in for a real customer id; a bare $1 only works inside a prepared statement

CREATE INDEX CONCURRENTLY idx_orders_customer_created ON orders(customer_id, created_at DESC);

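Re-run the same EXPLAIN (ANALYZE, BUFFERS) statement and confirm lower execution time, fewer buffers read, and no explicit Sort node, since the index already returns rows in created_at DESC order for a given customer. For covering reads, add INCLUDE columns where appropriate so the query can be answered from the index alone.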
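A covering variant might look like the sketch below. It assumes PostgreSQL 11 or later (where INCLUDE is available) and two hypothetical payload columns, status and total_amount; the index name is also illustrative.

```sql
-- Key columns drive the lookup and ordering; INCLUDE columns are stored only as payload.
CREATE INDEX CONCURRENTLY idx_orders_customer_created_cover
    ON orders (customer_id, created_at DESC)
    INCLUDE (status, total_amount);

-- Select only indexed columns; SELECT * would still have to visit the table.
EXPLAIN (ANALYZE, BUFFERS)
SELECT created_at, status, total_amount
FROM orders
WHERE customer_id = 42   -- placeholder customer id
ORDER BY created_at DESC
LIMIT 20;
```

In the plan, look for an Index Only Scan node with a low Heap Fetches count. Heap fetches stay high until the table has been vacuumed recently enough for the visibility map to be current. Wider indexes also cost more on every write, so include only columns the hot query actually reads.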
