
Performance Fundamentals

Performance isn't a single number — it's a collection of metrics that together describe how well your system serves users. Before optimizing anything, you need to understand what you're measuring and why certain metrics matter more than others.

The Three Pillars of Performance

Latency measures how long operations take to complete. When a user clicks a button, latency is the time until they see a response. Lower is better, but the distribution matters as much as the average.

Throughput measures how many operations your system handles per time unit — requests per second, transactions per minute, or messages per hour. Higher throughput means your system can serve more users simultaneously.

Resource usage tracks what your system consumes: CPU, memory, disk I/O, and network bandwidth. Understanding resource usage helps you identify bottlenecks and plan capacity.
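As a rough illustration, the first two pillars can be captured with nothing more than a timer. The `measure` helper and the dummy workload below are hypothetical, a single-threaded sketch rather than a real load test (which would issue requests concurrently):

```python
import time

def measure(operation, n_requests=1000):
    """Run an operation repeatedly, recording per-call latency and
    overall throughput. Single-threaded sketch for illustration only."""
    latencies = []
    start = time.perf_counter()
    for _ in range(n_requests):
        t0 = time.perf_counter()
        operation()  # the unit of work being measured
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "throughput_rps": n_requests / elapsed,                  # operations per second
        "avg_latency_ms": 1000 * sum(latencies) / len(latencies),
    }

# Hypothetical workload standing in for a real request handler.
stats = measure(lambda: sum(range(1000)))
print(stats)
```

Note that the two metrics are linked but not interchangeable: a system can improve throughput by batching work while making each individual request slower.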

Why Percentiles Matter

Averages lie. If 99 requests complete in 100ms and one takes 10 seconds, the average is 199ms — but that doesn't reflect anyone's actual experience.

Percentiles tell the real story. The p50 (median) shows what half your users experience. The p95 shows what 95% of users experience — only 5% see worse. The p99 captures the experience of your unluckiest 1%.

Consider this example: p50 at 100ms, p95 at 500ms, p99 at 2000ms, but an average of 150ms. The average looks fine, but 1 in 100 users waits 2 full seconds. For a site with millions of requests, that's thousands of frustrated users daily.
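The earlier dataset (99 requests at 100ms plus one 10-second outlier) makes the point concrete. The nearest-rank `percentile` helper below is one common definition among several; with only 100 samples, the single outlier surfaces only at the maximum, while the mean is quietly inflated:

```python
import statistics

def percentile(samples, p):
    """Nearest-rank percentile: the value at or below which
    p percent of samples fall."""
    ordered = sorted(samples)
    rank = -(-p * len(ordered) // 100)  # ceil(p/100 * n)
    return ordered[int(rank) - 1]

# 99 requests at 100 ms plus one 10-second outlier, as in the text.
latencies_ms = [100] * 99 + [10_000]

print(statistics.mean(latencies_ms))  # 199.0 -- looks acceptable
print(percentile(latencies_ms, 50))   # 100   -- the typical experience
print(percentile(latencies_ms, 99))   # 100   -- one outlier in 100 hides even here
print(max(latencies_ms))              # 10000 -- only the max reveals it
```

With realistic traffic volumes the tail is not a single sample, and p99 becomes a stable, meaningful number; that is why it is the standard dashboard metric rather than the max, which is dominated by one-off spikes.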

Focus on p95 and p99 for user-facing systems. These "tail latencies" often reveal problems that averages hide.

How Users Perceive Performance

Human perception sets your targets. Responses under 100ms feel instant — users perceive no delay. Between 100ms and 300ms, there's a slight but acceptable delay. From 300ms to 1 second, users notice the wait but stay engaged. Beyond 1 second, the flow breaks and users lose focus.

These thresholds should guide your performance goals. An API endpoint at 800ms might seem "fast enough" technically, but users feel that delay with every interaction.
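The thresholds above can be encoded as a simple lookup for labeling dashboard metrics. The function and band names below are illustrative; only the threshold values come from the text:

```python
def perceived_speed(latency_ms):
    """Map a response time to the perception bands described above.
    Thresholds follow the text; band names are illustrative."""
    if latency_ms < 100:
        return "instant"
    if latency_ms <= 300:
        return "slight delay"
    if latency_ms <= 1000:
        return "noticeable wait"
    return "flow broken"

print(perceived_speed(80))   # instant
print(perceived_speed(800))  # noticeable wait -- "fast enough", but felt
```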

Measure First, Optimize Second

The cardinal rule of performance work: never optimize based on assumptions. Profile your code, measure your systems, and let data guide your efforts. The bottleneck is rarely where you expect it.

Establish baselines before making changes. Without knowing where you started, you can't prove your optimizations helped — or accidentally made things worse.
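A baseline can be as simple as recording latencies before and after a change and comparing a robust summary statistic. The helper and the sorting workload below are hypothetical stand-ins for whatever operation you are optimizing:

```python
import time
import statistics

def baseline(operation, runs=100):
    """Record a latency baseline for an operation, in milliseconds."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        operation()
        samples.append(1000 * (time.perf_counter() - t0))
    # Keep the median: less sensitive to one-off spikes than the mean.
    return statistics.median(samples)

# Hypothetical before/after comparison around an optimization.
before = baseline(lambda: sorted(range(10_000)))
# ... apply the candidate optimization, then re-measure ...
after = baseline(lambda: sorted(range(10_000)))
print(f"median latency: {before:.3f} ms -> {after:.3f} ms")
```

For user-facing work you would track p95/p99 the same way, since an optimization can improve the median while worsening the tail.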

Last updated December 26, 2025
