Most organizations think of IPv6 migration as “assign the addresses and we’re done.” On the ground, IPv6 arrives bundled with DNS, firewall policy, load balancing, observability, and incident reflexes. If you don’t design for that, two protocols mean double the surface area for problems.
In this post I describe the approach that has paid off most reliably for me in practice: start under control with dual-stack, target IPv6-only, and make the transition measurable.
Why aim for IPv6-only?
If dual-stack persists for too long, it creates operational debt:
- Every service must be tested for both address families (IPv4/IPv6)
- Firewalls/ACLs are managed twice
- On the DNS side, A and AAAA records can fail independently, which gives you two distinct error classes
- Troubleshooting runbooks double
The road to IPv6-only means a leaner network and steadier operations in the long term. Dual-stack here is just the bridge.
Before you start: a real inventory
First clarify these:
- Places with IPv4 dependencies: legacy applications, license servers, device management interfaces
- L7 edge: how does the CDN/WAF/Ingress/LB tier handle IPv6?
- DNS behavior: resolver chain, split-horizon, cache policy
- Observability: are the IP fields in your logs IPv6-ready? (index, parsers, dashboards)
- Security: does the firewall platform manage IPv6 policy at parity?
Addressing and routing: start “small but rule-bound”
The model I like:
- A /48 (or /56 if needed) per site / DC / region
- A /64 per VLAN/segment
- Before enabling automatic addressing through Router Advertisements (RA), begin with static, controlled allocation (especially on server segments)
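The allocation model above can be sketched with Python’s standard `ipaddress` module. This is a minimal illustration, not a full IPAM: the site prefix comes from the documentation range `2001:db8::/32`, and the convention of using the VLAN ID as the 16-bit subnet ID is my assumption, as are the function names.

```python
import ipaddress

# Hypothetical site allocation taken from the documentation range (2001:db8::/32).
SITE_PREFIX = ipaddress.ip_network("2001:db8:abcd::/48")


def segment_prefix(site, vlan_id):
    """Return the /64 for a segment, using the VLAN ID as the 16-bit subnet ID."""
    if site.prefixlen != 48:
        raise ValueError("this sketch assumes a /48 per site")
    if not 0 <= vlan_id <= 0xFFFF:
        raise ValueError("VLAN ID must fit in 16 bits")
    base = int(site.network_address)
    # The subnet ID sits directly above the 64-bit interface identifier.
    return ipaddress.IPv6Network((base | (vlan_id << 64), 64))


def static_host(segment, host_id):
    """Return a statically assigned address inside a /64 segment."""
    return segment[host_id]


servers = segment_prefix(SITE_PREFIX, 0x10)
print(servers)                  # 2001:db8:abcd:10::/64
print(static_host(servers, 1))  # 2001:db8:abcd:10::1
```

Deriving the /64 from something you already track (here the VLAN ID) keeps the plan rule-bound: anyone can reconstruct a segment’s prefix without looking it up.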
On the routing side:
- Bring IPv6 up on the core/backbone first as a transport (IGP + BGP policy)
- Then move service segments to dual-stack
- Expand client/endpoint coverage last
DNS: the A/AAAA balance and “always measure”
The critical risk in dual-stack: client and resolver behavior.
- Happy Eyeballs (the IPv6/IPv4 race on the client) exists in most modern systems, but don’t assume it’s “everywhere.”
- Once you publish AAAA, some old stacks try IPv6 first and stall if that path doesn’t work.
My rule:
- Carry IPv6 on the network
- Bring up the service on IPv6 (back-end ready)
- Open AAAA in DNS gradually (canary)
- Watch latency and error rate at every step
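The “watch at every step” rule is easier to enforce when the go/no-go decision is written down. The sketch below compares canary IPv6 traffic against the IPv4 baseline; the `FamilyStats` type, the function name, and every threshold are hypothetical placeholders to tune for your own service.

```python
from dataclasses import dataclass


@dataclass
class FamilyStats:
    """Per-address-family counters, e.g. scraped from LB metrics."""
    requests: int
    errors: int
    p95_latency_ms: float

    @property
    def error_rate(self):
        return self.errors / self.requests if self.requests else 0.0


def canary_verdict(v4, v6, min_requests=1000,
                   max_error_delta=0.005, max_latency_ratio=1.2):
    """Return 'wait', 'rollback', or 'proceed' for the next AAAA step.

    All thresholds are illustrative assumptions, not recommendations.
    """
    if v6.requests < min_requests:
        return "wait"  # not enough IPv6 traffic to judge yet
    if v6.error_rate - v4.error_rate > max_error_delta:
        return "rollback"
    if v4.p95_latency_ms > 0 and v6.p95_latency_ms / v4.p95_latency_ms > max_latency_ratio:
        return "rollback"
    return "proceed"


baseline = FamilyStats(requests=200_000, errors=400, p95_latency_ms=180.0)
canary = FamilyStats(requests=2_000, errors=5, p95_latency_ms=190.0)
print(canary_verdict(baseline, canary))  # proceed
```

Comparing against the IPv4 baseline rather than a fixed target keeps the gate meaningful when the service itself is having a bad day.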
A simple way to measure, to start with:
- A/AAAA ratio in the resolver query logs
- “Client IP family” in application logs
- IPv6 connection count in LB/Ingress metrics
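As a starting point for the “client IP family” measurement, here is a small sketch that classifies address strings taken from application logs. It assumes the client IP has already been extracted from each log line; the function names are mine. It counts IPv4-mapped addresses (`::ffff:a.b.c.d`) as IPv4, because on a dual-stack listener those are IPv4 clients.

```python
import ipaddress
from collections import Counter


def client_family(raw):
    """Classify one logged client address as 'ipv4', 'ipv6', or 'invalid'."""
    # Tolerate bracketed literals ("[2001:db8::1]") and zone IDs ("fe80::1%eth0").
    text = raw.strip().strip("[]").split("%", 1)[0]
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return "invalid"
    if addr.version == 6 and addr.ipv4_mapped is None:
        return "ipv6"
    return "ipv4"  # native IPv4 or IPv4-mapped IPv6


def family_share(client_ips):
    """Return the fraction of log entries per address family."""
    counts = Counter(client_family(ip) for ip in client_ips)
    total = sum(counts.values())
    return {family: n / total for family, n in counts.items()} if total else {}


sample = ["192.0.2.10", "2001:db8::1", "::ffff:192.0.2.7",
          "[2001:db8::2]", "not-an-ip"]
print(family_share(sample))
```

A non-zero “invalid” share is itself a useful signal: it usually means a parser or log field somewhere is not IPv6-ready.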
Firewall and segmentation: IPv6 policy is not a “copy-paste”
In the IPv4 world NAT sometimes gives you a “bad but practical” feeling of isolation. IPv6 is designed to run without NAT (NAT66 and NPTv6 exist, but you shouldn’t lean on them for isolation). Therefore:
- The “we had NAT, nothing was reaching us anyway” stance falls apart
- The actual security model surfaces: stateful firewall + correct segmentation
Checklist:
- Are IPv6 inbound/outbound rules at parity with IPv4?
- Is “deny by default” really in place?
- Don’t “fully block” ICMPv6 (it’s required for PMTUD and neighbor discovery)
- Logging: do drops on IPv6 land in the same pipeline?
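Part of this checklist can be automated once rules are exported into a neutral form. The sketch below assumes a hypothetical `Rule` tuple that describes a rule without its addresses, so the IPv4 and IPv6 rule sets become directly comparable; real firewall platforms will need their own export step. The ICMPv6 type numbers (2, 135, 136) are the standard ones for Packet Too Big and Neighbor Solicitation/Advertisement.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Address-family-neutral view of a firewall rule (hypothetical export format)."""
    direction: str  # "in" or "out"
    proto: str      # "tcp", "udp", ...
    port: int
    action: str     # "allow" or "deny"


def parity_gaps(v4_rules, v6_rules):
    """Return rules that exist for one address family but not the other."""
    v4, v6 = set(v4_rules), set(v6_rules)
    return {"missing_in_v6": sorted(v4 - v6), "missing_in_v4": sorted(v6 - v4)}


# ICMPv6 types that PMTUD and neighbor discovery depend on.
REQUIRED_ICMPV6 = {
    2: "Packet Too Big",
    135: "Neighbor Solicitation",
    136: "Neighbor Advertisement",
}


def missing_icmpv6(allowed_types):
    """Return the names of required ICMPv6 types the policy does not allow."""
    return [name for t, name in sorted(REQUIRED_ICMPV6.items())
            if t not in allowed_types]


v4 = [Rule("in", "tcp", 443, "allow"), Rule("in", "tcp", 22, "allow")]
v6 = [Rule("in", "tcp", 443, "allow")]
print(parity_gaps(v4, v6))
print(missing_icmpv6({135, 136}))
```

Running a check like this in CI against the policy repository turns “parity” from a review item into a failing build.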
MTU and PMTUD: the most insidious incident class
Fragmentation works differently in IPv6: routers never fragment packets in transit, only the sender can, so Path MTU Discovery (PMTUD) has to work end to end.
Symptoms:
- The TCP connection establishes, but on large payloads “weird” timeouts appear
- The problem only shows up on certain paths
- Random 504/timeouts at the application layer
Operational approach:
- Verify that ICMPv6 “Packet Too Big” messages aren’t being dropped
- Plan MTU on tunnels (VPN, GRE, overlay)
- Observability: track PMTU-related error logs / kernel counters
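For the tunnel planning step, the arithmetic is simple enough to keep in a script. The sketch below uses standard header sizes (IPv6 40 bytes, TCP 20 bytes without options, basic GRE 4 bytes, UDP 8, VXLAN 8, inner Ethernet 14) and the IPv6 minimum link MTU of 1280. The encapsulation names are my own labels, and optional fields such as GRE keys or IPsec overhead are not modeled.

```python
IPV6_HEADER = 40     # fixed IPv6 header, bytes
TCP_HEADER = 20      # TCP header without options, bytes
IPV6_MIN_MTU = 1280  # every IPv6 link must carry at least this

# Overhead per encapsulation with an IPv6 outer header (labels are hypothetical).
ENCAP_OVERHEAD = {
    "none": 0,
    "gre_over_ipv6": 40 + 4,            # outer IPv6 + basic GRE header
    "vxlan_over_ipv6": 40 + 8 + 8 + 14  # outer IPv6 + UDP + VXLAN + inner Ethernet
}


def inner_mtu(link_mtu, encap):
    """MTU available to the payload packet inside the tunnel."""
    mtu = link_mtu - ENCAP_OVERHEAD[encap]
    if mtu < IPV6_MIN_MTU:
        raise ValueError(f"{encap} leaves {mtu} bytes, below the IPv6 minimum of {IPV6_MIN_MTU}")
    return mtu


def tcp_mss(mtu):
    """Largest TCP segment that fits one IPv6 packet at this MTU."""
    return mtu - IPV6_HEADER - TCP_HEADER


for encap in ENCAP_OVERHEAD:
    mtu = inner_mtu(1500, encap)
    print(f"{encap}: inner MTU {mtu}, TCP MSS {tcp_mss(mtu)}")
```

On a 1500-byte underlay this gives an inner MTU of 1456 for GRE and 1430 for VXLAN, which is the number to configure on the tunnel interface or to use for MSS clamping.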
Migration phases (the order I use in the field)
Phase 0 — Preparation (1–2 sprints)
- Inventory + an “IPv6 ready” checklist
- Monitoring/logging fields parse IPv6 correctly
- Firewall policy modeling (parity with IPv4)
Phase 1 — Backbone (transport network) on IPv6 (1 sprint)
- Carry IPv6 on core routing + edge uplinks
- BGP policy, route-map, and prefix-lists defined
- OOB management network (where possible) goes dual-stack first
Phase 2 — Server segments dual-stack (2–4 sprints)
- Controlled addressing first (static, centrally managed), not stateless autoconfiguration (SLAAC)
- Service discovery/DNS rolled out gradually
- Load balancer health-check and logging at parity
Phase 3 — Service front-door IPv6 (gradual)
- IPv6 listeners on Ingress/LB
- An AAAA canary (e.g. 1% of traffic)
- Watch error rate, latency, and “retry storm” signals
Phase 4 — IPv6-only islands (the most valuable phase)
The aim here: test IPv6-only against real traffic.
- Bring up a new internal service segment as IPv6-only
- If there are IPv4 dependencies, plan transitional bridges like NAT64/DNS64
- Runbook: “How do we resolve an incident in an IPv6-only segment?”
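When a NAT64/DNS64 bridge is in play, it helps during incidents to translate by hand between an IPv4 destination and the address the IPv6-only client actually sees. The sketch below assumes the well-known prefix `64:ff9b::/96` from RFC 6052; your deployment may use a network-specific prefix instead, and the function names are mine.

```python
import ipaddress

# Well-known NAT64 prefix (RFC 6052); replace it if you use a network-specific one.
NAT64_PREFIX = ipaddress.ip_network("64:ff9b::/96")


def synthesize(ipv4, prefix=NAT64_PREFIX):
    """Embed an IPv4 address in a /96 NAT64 prefix, as DNS64 does for AAAA answers."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def extract(ipv6, prefix=NAT64_PREFIX):
    """Recover the IPv4 destination from a synthesized NAT64 address."""
    addr = ipaddress.IPv6Address(ipv6)
    if addr not in prefix:
        raise ValueError(f"{addr} is not inside {prefix}")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


print(synthesize("192.0.2.33"))      # 64:ff9b::c000:221
print(extract("64:ff9b::c000:221"))  # 192.0.2.33
```

Seeing `64:ff9b::` destinations in flow logs from an IPv6-only island is also a cheap way to list the IPv4 dependencies that remain.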
Phase 5 — IPv6-only by default (the long-term target)
The right time to disable dual-stack isn’t “when everything is perfect”; it’s when measurements give you enough confidence.
Incident runbook: what do I keep ready in operations?
My standard set:
- A 5-minute triage flow to separate “DNS, routing, or firewall?”
- dig/drill examples comparing A/AAAA
- traceroute -6 / mtr -6 paths
- Firewall log queries: IPv6 drop analysis
- Quick steps to check for a PMTU suspicion
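For the first triage question (“is it DNS?”), a scripted A/AAAA comparison is handy next to the dig/drill commands. This sketch uses `socket.getaddrinfo`, so it reflects what the local system resolver returns (including hosts-file entries), not a raw DNS query. The function name and the injectable `resolver` parameter are my own additions, the latter so the logic can be exercised without network access.

```python
import socket


def resolve_by_family(host, port=443, resolver=socket.getaddrinfo):
    """Return the addresses the system resolver gives per record type.

    An empty list means the lookup for that address family failed.
    """
    result = {}
    for record, family in (("A", socket.AF_INET), ("AAAA", socket.AF_INET6)):
        try:
            infos = resolver(host, port, family, socket.SOCK_STREAM)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string for both families.
            result[record] = sorted({info[4][0] for info in infos})
        except socket.gaierror:
            result[record] = []
    return result


def dns_hint(report):
    """Turn the lookup result into a first triage hint."""
    if not report["A"] and not report["AAAA"]:
        return "no answers: suspect DNS or the resolver chain"
    if not report["AAAA"]:
        return "A only: clients will stay on IPv4"
    if not report["A"]:
        return "AAAA only: IPv4-only clients cannot reach this name"
    return "both families resolve: move on to routing and firewall checks"
```

Calling `dns_hint(resolve_by_family("www.example.com"))` from the affected host and from a known-good host gives a quick view of whether the two see the same answers.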
Conclusion
IPv6 migration isn’t a “network project”; it’s end-to-end operational design. Start with controlled dual-stack, hold observability and security at parity, then build real production muscle through IPv6-only islands. This approach reduces risk and keeps you out of the “infinite dual-stack” trap.