Health checks for internal services are usually framed as active probes. The model works, but it’s not always enough. Especially with stateful applications, services that perform expensive handshakes, or flows that only fail under real traffic, an active probe can produce a false sense of security. HAProxy’s passive health-check approach observes the real request flow and gives a more natural signal.

What does passive health checking mean?
Instead of sending a separate test request to the server, HAProxy treats a server as temporarily unhealthy based on the errors and failed connections it sees in real requests. In this model HAProxy uses signals like:
- connection errors,
- timeouts,
- specific error codes,
- consecutive failed responses.
This approach doesn’t have to replace active checking entirely, but it offers a strong supplementary signal, especially for internal traffic.
A simple configuration example
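    backend internal_api
        mode http
        balance roundrobin
        option redispatch
        # observe turns on passive checking; without it, on-error has no effect
        default-server inter 3s fall 3 rise 2 observe layer7 error-limit 10 on-error mark-down
        server api1 10.10.20.11:8080 check
        server api2 10.10.20.12:8080 check
Here observe layer7 enables passive checking: HAProxy inspects the responses to real requests, and once error-limit consecutive errors are seen (10 is the default), on-error mark-down takes the server out of rotation immediately instead of waiting for fall failed active probes. observe layer7 is valid only for HTTP backends, which is why mode http is set explicitly here; for plain TCP services, observe layer4 watches connection-level errors instead. The check keyword must stay on each server, because passive observation requires health checks to be enabled and active probes are what bring a server back after rise successes. Since a burst of bad responses can now remove a server, this setup needs to be paired with log and metric observation.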
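mark-down is the most aggressive of the reactions HAProxy offers for on-error; the others are fastinter, fail-check (the default), and sudden-death. As a hedged sketch of a gentler variant, the fragment below uses sudden-death, which the HAProxy documentation describes as simulating a pre-fatal failed check, so that one more failed check marks the server down. The backend name internal_api_gentle and the specific thresholds and intervals are illustrative assumptions, not recommendations.

```haproxy
# Illustrative variant: the backend name and all numeric values are assumptions.
backend internal_api_gentle
    mode http
    balance roundrobin
    option redispatch
    # observe layer7 enables passive checking on real HTTP responses.
    # After 5 consecutive errors, sudden-death leaves the server one failed
    # check away from DOWN and switches probing to the fastinter interval.
    default-server inter 3s fastinter 500ms fall 3 rise 2 observe layer7 error-limit 5 on-error sudden-death
    server api1 10.10.20.11:8080 check
    server api2 10.10.20.12:8080 check
```

The trade-off is detection speed against tolerance for short error bursts: sudden-death lets an active probe confirm the failure before the server leaves rotation, while mark-down acts on real-traffic errors alone.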
When is it especially useful?
Passive health checking shines in scenarios like:
- services where active probes trigger expensive operations,
- applications that fail only on specific header or auth flows,
- APIs carrying heavy east-west traffic on the internal network,
- backends that experience partial failure but aren’t fully down.
For these kinds of services, synthetic probes usually only see a small slice of the problem.
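For the first case in the list, one possible arrangement is to slow the active probe down and let passive observation provide the fast signal. The fragment below is a sketch under stated assumptions: the backend name expensive_handshake_api, the /healthz path, and every interval and threshold are hypothetical and would need tuning for a real service.

```haproxy
# Illustrative only: names, the /healthz path, and all values are assumptions.
backend expensive_handshake_api
    mode http
    balance roundrobin
    option redispatch
    # Hypothetical cheap endpoint; substitute whatever the service exposes.
    option httpchk GET /healthz
    # inter 30s:     probe rarely while the server is fully up
    # fastinter 2s:  probe faster while the server is in a transitional state
    # downinter 5s:  probe interval while the server is fully down
    # observe/on-error: real-traffic errors can mark the server down between probes
    default-server inter 30s fastinter 2s downinter 5s fall 2 rise 2 observe layer7 error-limit 5 on-error mark-down
    server api1 10.10.20.11:8080 check
    server api2 10.10.20.12:8080 check
```

Recovery still depends on the active probe: a server that was marked down returns only after rise consecutive successful checks, so the probe cannot be removed entirely.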
What to look at on the monitoring side
HAProxy stats and logs are useful in these areas:
- per-backend error rate,
- retry and redispatch counts,
- spikes in srv_abrt and econ,
- short-lived mark-down waves.
This data exposes not just load balancer health but also the behavioral quality of the application behind it.
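As a minimal sketch of how these counters might be exposed, the fragment below enables the runtime API socket and the built-in stats page. The socket path, bind address, port, and URI are assumptions chosen for illustration.

```haproxy
global
    # Runtime API socket; the path and permissions are assumptions.
    stats socket /var/run/haproxy.sock mode 660 level operator

# Built-in stats page, bound to localhost only in this sketch.
frontend stats
    mode http
    bind 127.0.0.1:8404
    stats enable
    stats uri /stats
    stats refresh 10s
```

With this in place, sending "show stat" to the socket (for example, echo "show stat" | socat stdio /var/run/haproxy.sock, assuming socat is installed) or requesting /stats;csv returns per-server CSV rows. The columns relevant here include econ (connection errors), eresp (response errors), wretr (retries), wredis (redispatches), srv_abrt (transfers aborted by the server), and chkdown (UP to DOWN transitions).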
Conclusion
Passive health checking with HAProxy for internal services is a powerful technique, especially for surfacing failures that only show up under real traffic. Active probing still has value, but treating it as the single source of truth usually leaves gaps. Once you can turn the result of a real request into a health signal, the load balancing layer starts making more honest decisions.