performance engineering
Reading Performance Testing by Use Case
Most performance-testing programmes have a load generator. Few have a representative dataset. Almost none can inject failures while measuring user-perceived latency. The causes are several: a culture that treats performance as a release-time formality, plans that under-budget the supporting work, applications whose testability was never designed in, and a tool catalogue organised by category rather than by intent. This article reframes performance testing around seven use cases: API load testing in CI/CD, full-stack validation, microservice resilience, database benchmarking, frontend optimisation, capacity planning, and pre-production data realism. It uses them as the spine of a practical campaign-setup guide covering what each test is trying to prove, what testability hooks it requires, what it realistically costs, what cultural prerequisites it has, and which combination of tools assembles it.
Reading Observability by Intent
Tool taxonomies organise observability by metrics, traces, logs, and profiles. Practitioners organise it by intent: what am I trying to understand, debug, or prove? This article reframes the observability stack around six common intents (Golden Signals, latency propagation, high-cardinality debugging, low-overhead profiling, black-box monitoring, and cost-efficiency at scale). It covers the workflows, the right tool combinations, and the anti-patterns to avoid, and includes a dedicated treatment of how unified APM platforms (Datadog, New Relic, Dynatrace) fit into the intent-routing framing.
Coordinated Omission: Why Your Latency Numbers Lie
Most HTTP benchmarking tools quietly hide tail latency when the server slows down. The phenomenon is called coordinated omission. It shows up almost exclusively at p99 and beyond, and it has caused production incidents at organisations that thought their load tests were green. This post explains the mechanism, demonstrates it empirically with a reproducible benchmark of eight tools across a healthy control and four server pathologies, and shows how to fix it with a constant arrival-rate workload model.
A Performance Engineer's Library
An annotated bibliography of books, talks, blogs, podcasts, and academic resources for the practising performance engineer — covering systems performance, observability, capacity planning, and software performance engineering. The textual companion to the Awesome Performance Engineering tool list.