TracksSpecializations and Deep DivesAdvanced Database ConceptsDatabase Monitoring and Debugging(7 of 7)

Database Monitoring and Debugging

The database is often the bottleneck in web applications. When things slow down, the database is usually the first place to look. Good monitoring reveals problems before users notice them, and knowing how to debug database issues is an essential skill.

Key Metrics to Watch

These metrics tell you how your database is doing:

Query latency measures how long queries take. Track percentiles — p50 (median), p95, and p99. A p99 of 500ms means 1% of queries take over half a second.

Connections show how many clients are connected. Watch for connections approaching your limit, or many connections waiting for locks.

Cache hit ratio indicates how often data comes from memory versus disk. Below 95% often signals a problem.

Replication lag matters if you use replicas. High lag means replicas are falling behind, which can cause stale reads.

Disk usage and I/O reveal storage pressure. Running out of disk space is a common cause of outages.

Finding Slow Queries

Most databases can log queries that exceed a time threshold:

-- PostgreSQL: log queries taking over 1 second
ALTER SYSTEM SET log_min_duration_statement = 1000;
SELECT pg_reload_conf();

-- Find currently running queries
SELECT pid, now() - pg_stat_activity.query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active';

Once you find slow queries, use EXPLAIN ANALYZE to understand why they're slow. The execution plan shows whether indexes are being used and where time is spent.

Common Issues and Fixes

Missing indexes cause full table scans. If EXPLAIN shows "Seq Scan" on a large table, you probably need an index.

Lock contention happens when transactions hold locks too long. Look for long-running transactions and consider breaking them into smaller units.

Connection exhaustion occurs when you hit the connection limit. Use connection pooling to share connections across requests.

Table bloat accumulates from updates and deletes. Run VACUUM regularly in PostgreSQL to reclaim space.

Using AI for Database Debugging

When you're stuck, AI can help interpret execution plans:

"My database query is slow. Here's the EXPLAIN ANALYZE output: [paste output]. What's causing the slowness and how do I fix it?"

Provide the query, the execution plan, and relevant table sizes. AI can often spot missing indexes or inefficient join orders.

See More

Further Reading

You need to be signed in to leave a comment and join the discussion