Mustafa Erbay
Tutorials · 13 min read

Service Discovery with Consul: Health Checks and the DNS Interface

A guide to building an operable service discovery layer with Consul through health-driven service registration and the DNS interface.

In enterprise infrastructures, service discovery often gets reduced to “add a DNS record.” Then the following problems show up:

  • When you move a service, DNS answers lag behind reality (TTL/caching).
  • When a node fails, DNS keeps returning it (no health awareness).
  • Teams answer “where was this service running again?” by digging through wikis, spreadsheets, and tickets.

A discovery layer like Consul isn’t just “another product” here; it’s a control plane that turns the variability of the infrastructure into something manageable. In this article I focus on building an operable, field-ready model around two pieces: health checks and the DNS interface.

1) Frame the problem correctly: discovery, or routing?

Discovery’s goal isn’t “moving traffic”; it’s finding the right target.

  • The routing/LB layer carries the traffic (L4/L7).
  • The discovery layer answers the question “which instance is healthy?”

If you don’t make this distinction, you end up loading discovery with unwarranted expectations and the design balloons.

2) Where should Consul sit? (minimum viable)

Core building blocks:

  • A Consul server cluster (odd count, e.g. 3/5)
  • Consul agents (on each node)
  • Service registration model (catalog)
  • Health checks (the critical piece)
  • DNS interface (for clients)
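
To make the registration model concrete, here is a minimal sketch of a request body for the local agent's HTTP API (PUT /v1/agent/service/register). The service name, ID, address, port, and the /health path are made-up values for illustration, and the intervals are assumptions to tune per environment rather than recommendations.

```python
import json


def build_registration(name, service_id, address, port, health_path="/health",
                       interval="10s", timeout="2s", deregister_after="10m"):
    """Build a service definition with one HTTP health check attached.

    All argument values are supplied by the caller; nothing here talks to Consul.
    """
    return {
        "Name": name,
        "ID": service_id,
        "Address": address,
        "Port": port,
        "Check": {
            "Name": f"{name} functional check",
            # The agent calls this URL on every interval; a 2xx response is passing.
            "HTTP": f"http://{address}:{port}{health_path}",
            "Interval": interval,
            "Timeout": timeout,
            # Remove the instance from the catalog if it stays critical this long.
            "DeregisterCriticalServiceAfter": deregister_after,
        },
    }


if __name__ == "__main__":
    # Hypothetical instance, used only to show the shape of the payload.
    print(json.dumps(build_registration("web", "web-1", "10.0.0.11", 8080), indent=2))
```

Registering the check together with the service keeps the two from drifting apart: the instance only shows up in answers while its own check is passing.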

3) Health-check design: not ping, but ability to do work

Split health checks into three classes:

  1. Process check: is the service process up?
  2. Port check: is the port listening?
  3. Functional check: can the service actually do work? (critical path)

Push the discovery decision toward class 3 wherever possible, because a false “healthy” verdict is the most expensive kind of failure: it returns errors to users and prolongs triage.
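
As a rough sketch of a class 3 check, the function below runs a list of probes along the critical path and maps the result to the exit codes Consul script checks use (0 passing, 1 warning, anything else critical). The probe names and the 0.5 second "slow" threshold are assumptions for illustration; real probes would exercise whatever the service actually depends on.

```python
import time

# Exit code convention for Consul script checks.
PASSING, WARNING, CRITICAL = 0, 1, 2


def evaluate(probes, slow_after=0.5):
    """Run (name, callable) probes in order and return (exit_code, message).

    A probe that returns a falsy value or raises is critical immediately.
    A probe that succeeds but takes longer than slow_after seconds
    downgrades the overall result to warning.
    """
    status = PASSING
    for name, probe in probes:
        started = time.monotonic()
        try:
            ok = bool(probe())
        except Exception:
            ok = False
        elapsed = time.monotonic() - started
        if not ok:
            return CRITICAL, f"{name} failed"
        if elapsed > slow_after:
            status = WARNING
    message = "all probes passed" if status == PASSING else "passed but slow"
    return status, message
```

A script check would call this with probes such as "can I read from the database" or "can I reach the queue" (hypothetical examples) and exit with the returned code. The warning state is useful as an early signal without pulling the instance out of answers.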

4) The DNS interface: making peace with TTL and caching

The DNS interface gives you broad and pragmatic client compatibility. But by the nature of DNS:

  • There is caching
  • TTL behavior is not the same everywhere

So for DNS-based discovery there are two pragmatic approaches:

  • Low TTL + observation: changes propagate quickly but load goes up
  • Medium TTL + stability: less load, slower propagation

The right approach I’ve seen in the field isn’t “the lowest TTL”; it’s an operable TTL. Also, for “high churn” services (pods/instances that change very often), client-side discovery (like a sidecar) may fit better than DNS.
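
The trade-off can be put into rough numbers. The sketch below uses a deliberately simplified model (one cache layer that honors the TTL, a check that turns critical on its first failure, every client re-resolving each name once per TTL); the fleet size and intervals are invented for the example. In Consul the service TTL is set through dns_config.service_ttl, and the default of 0 means answers are not meant to be cached at all.

```python
def worst_case_staleness(check_interval_s, check_timeout_s, ttl_s):
    """Upper bound in seconds before a client stops seeing a failed instance.

    Assumes the failure happens right after a passing check, the next check
    runs for its full timeout, and the client cached the old answer just
    before the check turned critical.
    """
    return check_interval_s + check_timeout_s + ttl_s


def steady_state_qps(clients, names_per_client, ttl_s):
    """Approximate DNS queries per second if each name is re-resolved once per TTL."""
    if ttl_s <= 0:
        raise ValueError("ttl_s must be positive; with TTL 0 every lookup is a query")
    return clients * names_per_client / ttl_s


if __name__ == "__main__":
    # Hypothetical fleet: 500 clients, each resolving 4 service names.
    for ttl in (5, 30):
        stale = worst_case_staleness(10, 2, ttl)
        qps = steady_state_qps(500, 4, ttl)
        print(f"TTL {ttl:>2}s: up to {stale}s stale, about {qps:.0f} queries/s")
```

An "operable TTL" in this sense is one where both numbers are acceptable: the staleness fits the error budget and the query load fits what the agents and servers can comfortably answer.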

5) Runbook: what to do during a “traffic is going to the wrong target” incident

Symptom: some requests get errors, others don’t; load is fluctuating.

  • Which instances are DNS answers returning? (sample it)
  • Is the health-check verdict actually correct?
  • Pull the problematic instance out of the catalog (temporarily) and find the root cause
  • Are stale answers still circulating because of TTL/cache?
  • Has Consul server/agent latency gone up? (raft, disk, network)
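
The first two checks can be sketched as a small sampling helper: resolve the name repeatedly, count which addresses come back, and compare them with the set of instances believed to be healthy. The resolver is passed in as a function so the same logic works with any lookup method; system_resolve below uses the operating system resolver and only helps if the .consul domain is forwarded to Consul, which is an assumption about the environment. The healthy set would come from a separate source, for example the /v1/health/service/<name>?passing endpoint.

```python
import socket
from collections import Counter


def system_resolve(name):
    """Resolve name through the OS resolver and return its IPv4 addresses."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def sample_answers(resolve, name, samples=20):
    """Call resolve(name) repeatedly and count how often each address appears."""
    counts = Counter()
    for _ in range(samples):
        counts.update(resolve(name))
    return counts


def find_suspects(counts, healthy):
    """Return addresses that DNS handed out but that are not in the healthy set."""
    return sorted(addr for addr in counts if addr not in healthy)
```

If find_suspects returns anything, the next question is whether the check verdict is wrong or whether a cache somewhere is still serving an old answer. Sampling through the system resolver can itself hit a local cache, so it is worth repeating the sample directly against the Consul DNS port as well.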

6) Security: discovery = inventory + target map

The information in the Consul catalog is also valuable to an attacker. So:

  • Move UI/API access into the management network
  • Use the ACL/policy model
  • Wire audit logs into the central log/SIEM pipeline

Don’t let discovery, “for the sake of convenience,” turn into an inventory leak surface.
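
As one possible shape for the ACL/policy model, the sketch below renders per-service rules: write access to the service's own registration and read access only to the services it is declared to depend on, instead of a blanket read on the whole catalog. The service names are hypothetical, and the node_prefix read rule is an assumption that clients need node data to resolve addresses; tighten it if your setup allows.

```python
def service_policy(own_service, depends_on=()):
    """Render Consul ACL rules (HCL text) for one service and its dependencies."""
    blocks = [
        f'service "{own_service}" {{\n  policy = "write"\n}}',
    ]
    for dep in sorted(set(depends_on)):
        # Read only the named dependency, not every service in the catalog.
        blocks.append(f'service "{dep}" {{\n  policy = "read"\n}}')
    # Assumption: node read is needed so discovery results include addresses.
    blocks.append('node_prefix "" {\n  policy = "read"\n}')
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical example: "web" registers itself and discovers "db".
    print(service_policy("web", ["db"]))
```

Generating policies from a declared dependency list also gives you a reviewable record of who is allowed to discover what.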

7) Closing

Service discovery with Consul is more than DNS record management: it produces a target list driven by health signals. The right design accepts the TTL/caching reality, pushes health checks closer to the ability to do real work, and operates the discovery layer as a critical service. Once that discipline is in place, “where is the service?” stops being a ticket and becomes an answer the system gives you.

Mustafa Erbay

System Architecture · Network Specialist · Infrastructure, Security, and Software

I have been working on system architecture, networking, server infrastructure, large-scale deployments, software, and system security since 2006. On this blog I share technical experience that holds up in the field.