SLO-Driven Load Testing with k6: Capacity Baselines and Release Gates

Most teams treat load testing in one of two ways: they skip it entirely, or they let it devolve into a “how many RPS did we hit?” contest. The operational truth is different: what matters in production is not maximum throughput but staying within your SLO. That is why I anchor load testing to the following question:

How much load can this service sustain without breaching its target SLO (latency and error rate)?

1) Define the SLO first

Before you launch a load test, write down these three targets:

p95 latency (e.g. 250ms)
error rate (e.g. 0.5%)
a “saturation signal” you will track during the run (CPU, thread pool, DB connections, queue lag, etc.)
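One way to keep these targets from drifting apart between the written SLO and the test script is to record them once as data and derive the k6 threshold expressions from that record. The following is a minimal sketch, not a k6 feature: the `slo` object, its field names, and the `toK6Thresholds` helper are all hypothetical. Only the threshold expression syntax (`rate<...`, `p(95)<...`) comes from k6 itself.

```javascript
// Hypothetical SLO record; the field names are illustrative, not a k6 convention.
const slo = {
  p95Ms: 250,                         // p95 latency target in milliseconds
  maxErrorRate: 0.005,                // 0.5% expressed as a fraction
  saturationSignal: 'db_connections', // tracked outside k6, e.g. on a dashboard
};

// Build k6 threshold expressions from the SLO record.
function toK6Thresholds(s) {
  return {
    http_req_failed: [`rate<${s.maxErrorRate}`],
    http_req_duration: [`p(95)<${s.p95Ms}`],
  };
}

console.log(toK6Thresholds(slo));
```

The saturation signal stays in the record as a reminder of what to watch, even though k6 does not measure it on its own.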

2) k6 scenario: simple but decision-producing

A sample k6 skeleton (assuming an HTTP API):

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
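  // Stage targets are virtual users (VUs), not requests per second.
  stages: [
    { duration: '2m', target: 20 },  // ramp from 0 to 20 VUs
    { duration: '3m', target: 50 },  // ramp from 20 to 50 VUs
    { duration: '3m', target: 80 },  // ramp from 50 to 80 VUs
    { duration: '2m', target: 0 },   // ramp down
  ],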
  thresholds: {
    http_req_failed: ['rate<0.005'],       // error rate < 0.5%
    http_req_duration: ['p(95)<250'],      // p95 < 250ms
  },
};

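export default function () {
  // BASE_URL is read from the environment; pass it at run time with k6's -e flag.
  const res = http.get(`${__ENV.BASE_URL}/api/v1/health`);
  check(res, { 'status 200': (r) => r.status === 200 });
  sleep(0.2); // think time between iterations, in seconds
}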

The scenario’s purpose is not “the highest possible request count”; it is to observe the boundary at which you remain inside the SLO.
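To make “the boundary” concrete, one option is to record p95 and error rate per load level and pick the highest level that met the SLO before the first breach. The sketch below assumes you already have one result per level (for example from separate constant-load runs); `findSloBoundary`, `withinSlo`, and the result shape are hypothetical helpers, and the numbers in `exampleResults` are made up for illustration, not measurements.

```javascript
// SLO targets taken from the examples in this post.
const SLO = { p95Ms: 250, maxErrorRate: 0.005 };

// True when a single load level met both SLO targets.
function withinSlo(result, slo) {
  return result.p95Ms < slo.p95Ms && result.errorRate < slo.maxErrorRate;
}

// Return the highest VU level that met the SLO before the first breach,
// or null if even the lowest level breached it.
function findSloBoundary(results, slo) {
  const sorted = [...results].sort((a, b) => a.vus - b.vus);
  let boundary = null;
  for (const r of sorted) {
    if (!withinSlo(r, slo)) break; // stop at the first breach
    boundary = r.vus;
  }
  return boundary;
}

// Illustrative input only; replace with your own per-level results.
const exampleResults = [
  { vus: 20, p95Ms: 120, errorRate: 0.0 },
  { vus: 50, p95Ms: 210, errorRate: 0.001 },
  { vus: 80, p95Ms: 340, errorRate: 0.004 },
];

console.log(findSloBoundary(exampleResults, SLO)); // prints 50
```

Stopping at the first breach is a deliberate choice: a level that happens to pass after a lower level failed is more likely noise than capacity.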

3) Capacity baseline: compare “today” with “tomorrow”

For real operational benefit, the test cannot be a one-off. Two practices make it repeatable:

Persist the baseline as JSON (e.g. the last successful release)
Compare new test results against the baseline to catch regressions

A simple approach:

Export the k6 end-of-test summary as JSON via --summary-export
In CI, compare p95 and error metrics against the baseline
Fail the pipeline when the drift exceeds a meaningful threshold
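The comparison step can be sketched as a small Node.js function. Everything here is an assumption to adapt: the helper names `extractMetrics` and `compareToBaseline` are hypothetical, the tolerance values are placeholders for whatever “meaningful drift” means for your service, and the field layout read by `extractMetrics` should be checked against the JSON your k6 version actually writes.

```javascript
// Hypothetical helper: pull the two SLO metrics out of a parsed k6 summary export.
// ASSUMPTION: the layout below matches your export; verify against a real file.
function extractMetrics(summary) {
  return {
    p95Ms: summary.metrics.http_req_duration['p(95)'],
    errorRate: summary.metrics.http_req_failed.value,
  };
}

// Compare current metrics with the baseline and list any regressions.
// p95 may grow by a relative ratio; error rate may grow by an absolute amount.
function compareToBaseline(baseline, current, tolerance = { p95Ratio: 0.10, errorRateAbs: 0.002 }) {
  const regressions = [];
  if (current.p95Ms > baseline.p95Ms * (1 + tolerance.p95Ratio)) {
    regressions.push(`p95 regressed: ${baseline.p95Ms}ms -> ${current.p95Ms}ms`);
  }
  if (current.errorRate > baseline.errorRate + tolerance.errorRateAbs) {
    regressions.push(`error rate regressed: ${baseline.errorRate} -> ${current.errorRate}`);
  }
  return regressions;
}

// Illustrative values only, not measurements.
const baselineMetrics = { p95Ms: 200, errorRate: 0.001 };
const currentMetrics = { p95Ms: 230, errorRate: 0.001 };
console.log(compareToBaseline(baselineMetrics, currentMetrics));
```

In a CI job you would load both files with `JSON.parse(fs.readFileSync(path, 'utf8'))`, pass them through `extractMetrics`, and set `process.exitCode = 1` when the returned list is non-empty so the pipeline fails.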

4) Release gate logic: make performance an “acceptance gate”

The release gate rule I recommend:

Even if the functional tests pass, if p95 or the error rate breaches the SLO or noticeably regresses against the baseline, the release drops into “guarded” mode (smaller canary, more aggressive rollback, shorter observation window).

This model promotes “performance testing” from a report into actual change control.
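Expressed as code, the gate is a small decision function. This is a sketch under stated assumptions: `decideReleaseMode` and its input flags are hypothetical names, and the `'blocked'` outcome for failed functional tests is my addition, since the rule above only describes what happens when they pass.

```javascript
// Decide the release mode from the test outcomes.
// ASSUMPTION: failed functional tests block the release outright.
function decideReleaseMode({ functionalPassed, sloBreached, regressedVsBaseline }) {
  if (!functionalPassed) return 'blocked';
  // Performance problems do not block, but they change how the rollout runs.
  if (sloBreached || regressedVsBaseline) return 'guarded';
  return 'normal';
}

console.log(decideReleaseMode({ functionalPassed: true, sloBreached: false, regressedVsBaseline: true }));
// prints "guarded"
```

Keeping the decision in one function makes the gate auditable: the pipeline log can record the three inputs next to the resulting mode.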

5) The most common field mistake: the test environment does not reflect reality

To avoid this trap:

Use production-like data (at minimum, similar data volume and index behavior)
Do cache warmup separately; measure only after warmup completes
Account for how test users interact with rate limiting and auth
Test downstream dependencies against controlled real instances rather than fakes
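For the warmup point, k6 scenarios offer one way to keep warmup traffic out of the SLO numbers: run a warmup scenario first, start the measured scenario afterwards, and attach the thresholds only to metrics tagged with the measured scenario. The sketch below shows just the options object. The scenario names and the `warmup` and `measure` function names are hypothetical, and the warmup size and duration are placeholders. In a real k6 script the object would be declared as `export const options` and the two functions would be exported alongside it.

```javascript
// Sketch of a k6 options object with separate warmup and measurement scenarios.
const options = {
  scenarios: {
    warmup: {
      executor: 'constant-vus',
      vus: 5,            // placeholder warmup load
      duration: '1m',
      exec: 'warmup',    // hypothetical exported function that primes caches
    },
    measure: {
      executor: 'ramping-vus',
      startTime: '1m',   // begin only after the warmup scenario has run
      startVUs: 0,
      stages: [
        { duration: '2m', target: 20 },
        { duration: '3m', target: 50 },
        { duration: '3m', target: 80 },
        { duration: '2m', target: 0 },
      ],
      exec: 'measure',   // hypothetical exported function with the real workload
    },
  },
  thresholds: {
    // Only requests tagged with the measure scenario count toward the SLO.
    'http_req_failed{scenario:measure}': ['rate<0.005'],
    'http_req_duration{scenario:measure}': ['p(95)<250'],
  },
};

console.log(Object.keys(options.thresholds));
```

Filtering thresholds by the scenario tag means cold-cache latency from the warmup phase cannot fail the run or flatter it.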

Conclusion: load testing must become the language of capacity management

SLO-driven load testing with k6 turns performance from a “weather forecast” into an input for operational decisions. When the baseline and the gate work together, you catch changes that slow a release down before they ship. That moves you from reacting to incidents toward controlled improvement.
