Mustafa Erbay
Tutorials · 12 min read

Network Drift with NetBox + Nornir: An Approval-Driven Remediation…

Detect configuration drift, approve fixes through Git, and apply them under control: source of truth → report → PR → rollout.

On the network side, “configuration drift” is unavoidable: emergency fixes, vendor differences, on-site pressure… Instead of trying to outright “ban” drift, the sustainable answer is to detect it and fix it under control.

In this article I walk through a practical flow:

NetBox (source of truth) → Nornir (execute) → Git PR (approval) → rollout (rings).

Target architecture (minimum viable)

The leanest version of this flow runs on these components:

  • NetBox: device/interface/IP/VLAN/tenant inventory
  • Git repo: the “desired state” (templates + variables)
  • Nornir job:
    • pull the inventory from NetBox
    • fetch running-config from each device
    • render the “expected” output via templates
    • produce a diff (the report)
    • apply after approval (commit + tag)
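
As a rough illustration of the "render the expected output via templates" step, here is a minimal sketch. It uses Python's standard-library `string.Template` as a stand-in for whatever template engine the repo actually uses, and the template text, variable names, and addresses are all hypothetical.

```python
from string import Template

# Hypothetical baseline template; in the real flow the templates and
# variables would come from the Git repo that holds the desired state.
BASELINE_TEMPLATE = Template(
    "hostname $hostname\n"
    "ntp server $ntp_server\n"
    "logging host $syslog_host\n"
)


def render_expected(variables: dict) -> list[str]:
    """Render the desired-state config for one device as a list of lines."""
    # substitute() raises KeyError when a variable is missing, so a
    # half-rendered config never reaches the diff stage.
    return BASELINE_TEMPLATE.substitute(variables).splitlines()


expected = render_expected(
    {"hostname": "edge-01", "ntp_server": "192.0.2.10", "syslog_host": "192.0.2.20"}
)
print("\n".join(expected))
```

Returning a list of lines keeps the output directly comparable with a running-config that has been split the same way.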

Step 1 — Make the NetBox inventory “automation-friendly”

Practices that smooth out the drift flow on the NetBox side:

  • Assign roles to devices (core/edge/access)
  • Use site/region fields consistently
  • Align the VLAN/VRF model with what’s actually deployed
  • Add an “automation ring” custom field (canary/pilot/prod)

The goal: be able to slice the Nornir inventory by role, site, and ring instead of by hand-maintained host lists.
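
To make the slicing idea concrete, here is a small sketch that groups devices by the ring custom field. The records are simplified, hypothetical stand-ins for what the NetBox API returns, and the custom field name `automation_ring` is an assumption, not something NetBox ships with.

```python
from collections import defaultdict

RING_ORDER = ("canary", "pilot", "prod")

# Hypothetical, simplified device records. Only the keys this sketch
# needs are present; real NetBox objects carry far more detail.
DEVICES = [
    {"name": "edge-01", "role": "edge", "custom_fields": {"automation_ring": "canary"}},
    {"name": "acc-01", "role": "access", "custom_fields": {"automation_ring": "pilot"}},
    {"name": "core-01", "role": "core", "custom_fields": {"automation_ring": "prod"}},
    {"name": "lab-01", "role": "access", "custom_fields": {}},
]


def group_by_ring(devices: list[dict]) -> dict[str, list[str]]:
    """Group device names by ring; unknown or missing rings stay visible."""
    rings = defaultdict(list)
    for device in devices:
        ring = device.get("custom_fields", {}).get("automation_ring")
        # Devices without a valid ring land in "unassigned" rather than
        # being silently dropped from the rollout plan.
        key = ring if ring in RING_ORDER else "unassigned"
        rings[key].append(device["name"])
    return dict(rings)


print(group_by_ring(DEVICES))
```

The "unassigned" bucket is a design choice of this sketch: a device that was never given a ring is itself a finding worth reporting.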

Step 2 — Nornir inventory: NetBox as the source

Two important details on the Nornir side:

  1. Start with a read-only NetBox API token (for the report stage)
  2. Use a separate token/identity for the “apply after approval” stage

This separation splits the risk between “produce a report” and “apply changes.”
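
One way to enforce the token split is to let each pipeline stage resolve its own credential and fail if it is missing. The sketch below only builds the inventory settings as a plain dictionary; the environment variable names and the example URL are hypothetical, and the `NetBoxInventory2` plugin name with its `nb_url` / `nb_token` options is my assumption about the `nornir_netbox` plugin, so check it against the version you run.

```python
import os

# Hypothetical environment variable names, one credential per stage.
TOKEN_ENV = {"report": "NETBOX_TOKEN_REPORT", "apply": "NETBOX_TOKEN_APPLY"}


def inventory_config(stage: str, env=os.environ) -> dict:
    """Build the inventory settings for one pipeline stage."""
    if stage not in TOKEN_ENV:
        raise ValueError(f"unknown stage: {stage!r}")
    token = env.get(TOKEN_ENV[stage])
    if not token:
        # Fail closed: never fall back to the other stage's credential.
        raise RuntimeError(f"{TOKEN_ENV[stage]} is not set")
    return {
        "plugin": "NetBoxInventory2",  # assumed plugin name (nornir_netbox)
        "options": {
            "nb_url": env.get("NETBOX_URL", "https://netbox.example.com"),
            "nb_token": token,
        },
    }


# With nornir and nornir_netbox installed, the result would be used as:
#   from nornir import InitNornir
#   nr = InitNornir(inventory=inventory_config("report"))
demo_env = {"NETBOX_TOKEN_REPORT": "ro-token"}
print(inventory_config("report", demo_env)["options"]["nb_url"])
```

Because the apply stage has no fallback, a report job that is accidentally pointed at the apply path stops with an error instead of running with the wrong identity.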

Step 3 — Drift report: produce a per-device diff

The report stage aims to answer:

  • Which devices are drifting?
  • What class of drift is it? (ACL, routing, interface, NTP, SNMP, syslog…)
  • Is the drift “expected” (a planned change) or a surprise?
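
A minimal sketch of the first two questions, using only the standard library: `difflib.unified_diff` produces the per-device diff, and a keyword map assigns drift classes. The prefixes in the map are IOS-style and purely illustrative; a real classifier would be vendor-aware.

```python
import difflib

# Hypothetical keyword map from drift class to config line prefixes.
DRIFT_CLASSES = {
    "ACL": ("access-list", "ip access-list"),
    "routing": ("router ", "ip route"),
    "interface": ("interface ",),
    "NTP": ("ntp ",),
    "SNMP": ("snmp-server",),
    "syslog": ("logging ",),
}


def device_diff(expected: list[str], running: list[str], name: str) -> list[str]:
    """Unified diff from the expected config to the running config."""
    return list(
        difflib.unified_diff(
            expected, running,
            fromfile=f"{name}/expected", tofile=f"{name}/running", lineterm="",
        )
    )


def classify(diff_lines: list[str]) -> list[str]:
    """Return the sorted drift classes touched by added or removed lines."""
    classes = set()
    for line in diff_lines:
        # Skip file headers, hunk markers, and unchanged context lines.
        if line.startswith(("---", "+++")) or not line.startswith(("+", "-")):
            continue
        text = line[1:].strip()
        for drift_class, prefixes in DRIFT_CLASSES.items():
            if text.startswith(prefixes):
                classes.add(drift_class)
    return sorted(classes)


expected = ["hostname edge-01", "ntp server 192.0.2.10", "logging host 192.0.2.20"]
running = expected[:1] + ["ntp server 198.51.100.7", "logging host 192.0.2.20",
                          "snmp-server location rack-7"]
diff = device_diff(expected, running, "edge-01")
print(classify(diff))  # ['NTP', 'SNMP']
```

Prefix matching on single lines misses indented child lines (for example, statements under an `interface` block), so treat this as a starting point rather than a complete classifier.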

I prefer producing the report in two formats:

  • For humans: a Markdown summary plus the most critical diffs
  • For machines: JSON (CI gate / metrics)
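
The two formats can be produced from the same per-device results. In this sketch the result records are hypothetical, and "most critical" is approximated by the number of changed lines, which is an assumption of the example rather than a rule from the flow.

```python
import json

# Hypothetical per-device results from the diff stage.
RESULTS = [
    {"device": "core-01", "ring": "prod", "classes": [], "changed_lines": 0},
    {"device": "edge-01", "ring": "canary", "classes": ["NTP", "SNMP"], "changed_lines": 3},
    {"device": "acc-01", "ring": "pilot", "classes": ["ACL"], "changed_lines": 9},
]


def to_json(results: list[dict]) -> str:
    """Machine-readable report for a CI gate or a metrics exporter."""
    drifting = [r for r in results if r["changed_lines"] > 0]
    payload = {"total": len(results), "drifting": len(drifting), "devices": results}
    return json.dumps(payload, indent=2, sort_keys=True)


def to_markdown(results: list[dict]) -> str:
    """Human-readable summary table, largest diffs first."""
    lines = ["| Device | Ring | Drift classes | Changed lines |", "|---|---|---|---|"]
    for r in sorted(results, key=lambda r: r["changed_lines"], reverse=True):
        if r["changed_lines"] == 0:
            continue  # clean devices stay out of the human summary
        classes = ", ".join(r["classes"])
        lines.append(f"| {r['device']} | {r['ring']} | {classes} | {r['changed_lines']} |")
    return "\n".join(lines)


print(to_markdown(RESULTS))
print(json.loads(to_json(RESULTS))["drifting"])  # 2
```

A CI gate can then read the `drifting` count from the JSON and fail the job above a threshold, while the Markdown table goes into the PR description or a chat notification.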

Step 4 — PR workflow: “an approved drift remediation”

Standardize this information inside the PR:

  • The list of affected devices (by ring)
  • Type of change (routing/ACL/…)
  • Expected impact (risk)
  • Rollback command/plan
  • Change window (if any)

Step 5 — Rollout: ring by ring

For rollout discipline, I follow this sequence:

  1. Canary: 1–3 devices
  2. Pilot: a small site/tenant
  3. Prod: the remainder
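
The ring sequence can be sketched as a loop that applies one ring, checks health, and halts before the next ring if anything looks wrong. The `apply` and `healthy` callables are hypothetical placeholders for the real Nornir task and health check.

```python
RINGS = ("canary", "pilot", "prod")


def rollout(devices_by_ring: dict, apply, healthy) -> dict:
    """Apply ring by ring; stop before the next ring on any unhealthy device."""
    canary = devices_by_ring.get("canary", [])
    if not 1 <= len(canary) <= 3:
        raise ValueError("canary ring should hold 1 to 3 devices")
    applied = []
    for ring in RINGS:
        members = devices_by_ring.get(ring, [])
        for device in members:
            apply(device)
            applied.append(device)
        unhealthy = [d for d in members if not healthy(d)]
        if unhealthy:
            return {"status": "halted", "ring": ring,
                    "unhealthy": unhealthy, "applied": applied}
    return {"status": "complete", "ring": RINGS[-1],
            "unhealthy": [], "applied": applied}


plan = {"canary": ["edge-01"], "pilot": ["acc-01", "acc-02"], "prod": ["core-01"]}
touched = []
result = rollout(plan, apply=touched.append, healthy=lambda d: d != "acc-02")
print(result["status"], result["ring"], touched)
```

In this run the pilot ring reports an unhealthy device, so the prod ring is never touched.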

Measure at every stage:

  • Routing adjacency flap?
  • Packet loss / latency?
  • ACL hitcount anomaly?
  • CPU spike?
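
These checks become a usable gate once each one is a number compared before and after the change. The sketch below assumes hypothetical metric names and thresholds; how the values are collected (SNMP, streaming telemetry, CLI parsing) is out of scope here.

```python
# Hypothetical allowed increases per metric; tune per platform and link type.
THRESHOLDS = {
    "adjacency_flaps": 0,          # any new flap fails the gate
    "packet_loss_pct": 0.5,        # percentage points
    "latency_ms": 5.0,
    "acl_deny_hits_per_min": 100,
    "cpu_pct": 15.0,               # percentage points
}


def gate(before: dict, after: dict, thresholds: dict = THRESHOLDS) -> dict:
    """Return the metrics whose increase exceeds the allowed delta."""
    failures = {}
    for metric, allowed in thresholds.items():
        delta = after[metric] - before[metric]
        if delta > allowed:
            failures[metric] = round(delta, 2)
    return failures


before = {"adjacency_flaps": 0, "packet_loss_pct": 0.1, "latency_ms": 12.0,
          "acl_deny_hits_per_min": 40, "cpu_pct": 22.0}
after = {"adjacency_flaps": 0, "packet_loss_pct": 0.1, "latency_ms": 13.5,
         "acl_deny_hits_per_min": 400, "cpu_pct": 25.0}
print(gate(before, after))  # {'acl_deny_hits_per_min': 360}
```

An empty result means the ring passed; a non-empty result names the metric that should stop the rollout.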

Step 6 — Rollback has to be real

Rollback can’t be “in theory it exists” — it has to be runnable in practice:

  • Treat the applied config change like a “transaction”
  • If the vendor supports it, lean on commit/confirm features
  • Keep the changes small and atomic

Closing: make drift visible first, then reduce it

The first win from this flow is not “less drift,” it is making drift visible. What is visible becomes manageable; what is manageable can be standardized.

A natural next step is layering “risk scoring” and “automatic maintenance window selection” on top of this flow, as gates that adjust to the class of drift.

Mustafa Erbay

Systems Architecture · Network Specialist · Infrastructure, Security, and Software

Since 2006 I have worked across systems architecture, networking, server infrastructure, large-scale deployments, software, and systems security. On this blog I share technical experience that has held up in the field.
