PostgreSQL Performance Optimization for Web Developers
Most slow web applications have a database problem. This guide covers the practical PostgreSQL tuning techniques that make the biggest difference — from reading query plans to index strategies and connection pooling.
When a web application slows down as it scales, the database is the culprit in the majority of cases. Network latency, unoptimised ORM queries, missing indexes, connection exhaustion, bloated tables — these are the patterns that transform a fast app into one that times out under load. PostgreSQL is a powerful, highly tuneable database system, and most web applications barely scratch the surface of its optimisation capabilities.
This guide focuses on practical techniques: the ones that produce real, measurable improvements in production applications. We will cover how to identify slow queries, how to read query execution plans, the index strategies that matter most, and the configuration and connection management changes that make a difference at scale.
Step 1: Find the Slow Queries
Optimisation without measurement is guesswork. Before changing anything, identify which queries are actually slow. PostgreSQL provides several tools for this:
pg_stat_statements
The pg_stat_statements extension is the most important query performance tool in PostgreSQL. It tracks execution statistics for every normalised query shape that has run against the database:
-- Enable the extension (run once)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Find the slowest queries by total execution time
SELECT
round(total_exec_time::numeric, 2) AS total_ms,
calls,
round(mean_exec_time::numeric, 2) AS mean_ms,
round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS pct,
query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
This query shows you the 20 queries consuming the most cumulative time on your database. Focus on the queries with the highest total time — these are your highest-value optimisation targets. A query that runs in 5ms but executes 100,000 times per day costs more than a 500ms query that runs once a day.
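After shipping an optimisation, it helps to reset the counters so you can compare a clean before/after window; the reset function ships with the extension:

-- Clear accumulated statistics to start a fresh measurement window
SELECT pg_stat_statements_reset();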
log_min_duration_statement
For development and staging environments, you can log all queries exceeding a duration threshold in postgresql.conf:
log_min_duration_statement = 100 # log queries taking longer than 100ms
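If you would rather not edit postgresql.conf by hand, the same threshold can be set from a superuser session and picked up without a restart; a minimal sketch:

-- Set the logging threshold cluster-wide, then reload the configuration
ALTER SYSTEM SET log_min_duration_statement = '100ms';
SELECT pg_reload_conf();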
Step 2: Read Query Execution Plans with EXPLAIN ANALYZE
Once you have identified a slow query, EXPLAIN ANALYZE shows you exactly how PostgreSQL executes it — and why it might be slow:
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.email, count(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2025-01-01'
GROUP BY u.email
ORDER BY order_count DESC;
Key things to look for in the output:
- Sequential Scans (Seq Scan) on large tables — indicates a missing index
- Actual rows vs Estimated rows — large differences indicate stale statistics; run ANALYZE to update them (see the example after this list)
- Nested Loop with high cost on large datasets — may benefit from a different join strategy or index
- Buffers: hit vs read — high reads indicate data not cached in memory
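When actual and estimated rows diverge badly, refreshing statistics is the cheapest fix. Assuming the users table from the example above:

-- Recompute planner statistics for one table
-- (plain ANALYZE with no argument refreshes the whole database)
ANALYZE users;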
Use explain.depesz.com
For complex query plans, paste your EXPLAIN ANALYZE output into explain.depesz.com — it provides a colour-coded visualisation that makes the expensive operations immediately obvious.
Step 3: Index Strategies That Matter
Indexes are the most impactful optimisation for read-heavy web applications. But indiscriminate index creation degrades write performance. Here are the patterns that deliver the most value:
Covering Indexes (Index-Only Scans)
If a query's SELECT list and WHERE clause reference only columns stored in an index, a covering index lets PostgreSQL satisfy the query entirely from the index without touching the table heap — dramatically faster for frequently executed queries:
-- Query: SELECT email, created_at FROM users WHERE status = 'active'
-- Covering index includes all columns referenced in the query:
CREATE INDEX idx_users_status_covering
ON users (status) INCLUDE (email, created_at);
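To confirm the optimisation worked, re-run the query under EXPLAIN and look for an Index Only Scan node. Index-only scans also depend on the table's visibility map, so they are most effective on recently vacuumed tables. A quick check, using the index defined above:

EXPLAIN (ANALYZE)
SELECT email, created_at FROM users WHERE status = 'active';
-- The plan should show: Index Only Scan using idx_users_status_covering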
Partial Indexes
When most queries filter on a common condition, a partial index indexes only the relevant rows — smaller, faster, and cheaper to maintain:
-- If 95% of queries target active users, don't index the inactive ones:
CREATE INDEX idx_users_active_email
ON users (email)
WHERE status = 'active';
-- Orders awaiting fulfilment only (not completed/cancelled history):
CREATE INDEX idx_orders_pending
ON orders (created_at, customer_id)
WHERE status = 'pending';
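One way to see the payoff is to measure the index's footprint on disk; pg_relation_size works on indexes as well as tables:

-- On-disk size of the partial index (compare against a full-column index)
SELECT pg_size_pretty(pg_relation_size('idx_users_active_email'));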
Composite Index Column Order
For composite indexes, column order matters: the index can only be used efficiently when the query filters on its leading column(s). A good rule of thumb is to put columns used in equality conditions before columns used in range conditions. An index on (status, created_at) cannot be used efficiently by a query that filters only on created_at:
-- Good: (user_id, created_at) for queries like:
-- WHERE user_id = $1 AND created_at > $2
CREATE INDEX idx_orders_user_date ON orders (user_id, created_at);
-- This index is NOT used for: WHERE created_at > $1 (no user_id filter)
-- Create a separate index for that query pattern
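For completeness, the separate index mentioned in the comment above would look like this (the name is illustrative):

-- Supports queries that filter on created_at alone
CREATE INDEX idx_orders_created_at ON orders (created_at);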
Create Indexes Concurrently
On production tables, always create indexes with CONCURRENTLY so that the build does not block writes while it runs:
CREATE INDEX CONCURRENTLY idx_orders_user_date ON orders (user_id, created_at);
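Two caveats: CREATE INDEX CONCURRENTLY cannot run inside a transaction block, and a failed build leaves behind an INVALID index that must be dropped and recreated. You can check for leftovers like this:

-- Find invalid indexes left behind by failed concurrent builds
SELECT indexrelid::regclass AS index_name
FROM pg_index
WHERE NOT indisvalid;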
Step 4: N+1 Query Problems (ORM Anti-Patterns)
The N+1 query problem is the single most common database performance issue in web applications using ORMs (Django ORM, ActiveRecord, SQLAlchemy, Eloquent). It occurs when an application loads a list of objects and then individually queries the database for related data on each item:
# Django — N+1 problem: 1 query for posts + N queries for authors
posts = Post.objects.all()[:100]
for post in posts:
    print(post.author.name)  # New DB query for EACH post!

# Fix: use select_related (JOIN) or prefetch_related (separate IN query)
posts = Post.objects.select_related('author').all()[:100]
for post in posts:
    print(post.author.name)  # No additional queries
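Under the hood, select_related collapses the N+1 pattern into a single JOIN, roughly equivalent to the following (table and column names are assumed from the Django models, not taken from a real schema):

-- One round trip instead of 101
SELECT post.*, author.name
FROM post
JOIN author ON author.id = post.author_id
LIMIT 100;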
Tools like Django Debug Toolbar, Bullet (Rails), or SQLAlchemy's query logging make N+1 problems visible. Always enable query logging in development and watch for "100 queries for a page that should need 3."
Step 5: Connection Pooling with PgBouncer
PostgreSQL creates a new process for each database connection, which is expensive. A web application with 100 concurrent workers maintaining persistent connections consumes significant memory and imposes overhead on every new connection. At scale, connection exhaustion becomes a hard ceiling on performance.
PgBouncer is a lightweight connection pooler that sits between your application and PostgreSQL. It maintains a small pool of actual database connections and multiplexes many application connections onto them:
# pgbouncer.ini
[databases]
myapp = host=localhost port=5432 dbname=myapp
[pgbouncer]
pool_mode = transaction # Best for web apps — pool at transaction level
max_client_conn = 1000 # Max connections from application
default_pool_size = 20 # Actual PostgreSQL connections
server_idle_timeout = 600
With transaction pool mode, a connection from the pool is held only for the duration of a transaction, then returned immediately — allowing 1,000 application workers to share 20 PostgreSQL connections. For most CRUD-heavy web apps this cuts database memory usage dramatically and removes connection exhaustion as a bottleneck. One caveat: transaction pooling breaks session-level features such as prepared statements held across transactions, SET commands, and advisory locks, so check your driver's settings before enabling it.
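You can watch pool utilisation from PgBouncer's built-in admin console (connect with psql to the special pgbouncer database):

-- Inside the PgBouncer admin console
SHOW POOLS;   -- active/waiting clients and server connections per pool
SHOW STATS;   -- per-database query counts and traffic totals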
Step 6: Critical postgresql.conf Tuning
Default PostgreSQL configuration is conservative — designed to run on a minimal machine. For a dedicated database server, these settings provide significant gains:
# Shared memory — set to 25% of total RAM
shared_buffers = 4GB # For 16GB RAM server
# Memory per sort/hash operation — a single query can use several multiples of this
work_mem = 64MB # Increase if EXPLAIN shows sorts spilling to disk
# Effective cache size — hint to planner (set to ~75% of RAM)
effective_cache_size = 12GB
# WAL settings for write-heavy workloads
wal_buffers = 64MB
checkpoint_completion_target = 0.9
max_wal_size = 4GB
# Autovacuum tuning (prevent table bloat)
autovacuum_vacuum_scale_factor = 0.02 # Trigger more frequently on large tables
autovacuum_analyze_scale_factor = 0.01
Use PGTune (pgtune.leopard.in.ua) to generate a baseline configuration tuned to your server's hardware profile — it provides a good starting point before manual fine-tuning.
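These settings can also be applied with ALTER SYSTEM rather than editing the file directly; most take effect on reload, but shared_buffers requires a full restart:

ALTER SYSTEM SET work_mem = '64MB';
ALTER SYSTEM SET effective_cache_size = '12GB';
SELECT pg_reload_conf(); -- shared_buffers changes still need a restart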
Step 7: Table Maintenance — VACUUM and ANALYZE
PostgreSQL's MVCC (Multi-Version Concurrency Control) architecture means old row versions ("dead tuples") accumulate in tables after updates and deletes. VACUUM removes dead tuples; ANALYZE updates statistics used by the query planner.
Autovacuum handles this automatically, but high-traffic tables may need manual intervention:
-- Check table bloat and dead tuple count
SELECT schemaname, relname,
n_dead_tup, n_live_tup,
round(n_dead_tup::numeric / NULLIF(n_live_tup, 0) * 100, 2) AS dead_pct,
last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
-- Manual vacuum analyze on a high-traffic table
VACUUM ANALYZE orders;
Tables with dead_pct above 10–20% are accumulating bloat that slows sequential scans and wastes disk space. If autovacuum can't keep up (visible when dead tuples grow continuously), either tune autovacuum's per-table settings or schedule manual VACUUM runs during off-peak hours.
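Per-table autovacuum settings override the global defaults, which is usually the right tool for one or two hot tables; a sketch:

-- Vacuum orders once ~1% of its rows are dead, rather than the
-- global default of 20%
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.01);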
Priority Checklist
If you have limited time and want the highest-impact changes first:
- Enable pg_stat_statements and identify your slowest queries
- Run EXPLAIN ANALYZE on each — look for sequential scans on large tables
- Add missing indexes (especially on foreign key columns and common WHERE conditions)
- Fix N+1 problems in your ORM layer
- Deploy PgBouncer in transaction mode if you have more than 50 concurrent app workers
- Tune shared_buffers and effective_cache_size to match your hardware
- Monitor table bloat and autovacuum health on your busiest tables
These seven steps address the vast majority of PostgreSQL performance problems in web applications. The key is measurement first — always profile before changing anything, and measure again after to confirm the impact. Database optimisation without benchmarking is engineering by intuition, and intuition is frequently wrong when it comes to query planners.