In multi-account, multi-VPC/VNet environments, talking about hub-spoke or Transit Gateway is easy; the hard part is designing it with secure defaults and in a way that can actually be operated. In real life the problem is rarely “I can’t connect”. The problem is:
- Unwanted east-west reachability via wrong route propagation
- Inspection (firewall/NVA) bypass
- Nobody can answer “which flow goes which way?” after a change
In this article I summarize a governance approach for Transit Gateway (or equivalent hub-spoke building blocks) that has worked for me in the field.
1) First, draw the boundaries: which traffic falls into which class?
Before designing the transit layer, classify the traffic:
- North-South: internet / SaaS / outside world
- East-West: service calls between VPCs/VNets
- Shared Services: DNS, logging, CI runners, artifacts, registry
- On-prem: data center, MPLS, legacy segments
Three questions per class:
- Is inspection mandatory? (FW, DLP, IDS, proxy)
- Where is identity enforced? (IAM, mTLS, proxy auth)
- Where is the observability signal? (flow log, firewall log, NDR)
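One way to keep this classification honest is to write it down as data and check that no class has a question left blank. The sketch below is only an illustration: the `TrafficClass` type, the `unanswered` helper, and the sample answers are all hypothetical and are not taken from any platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficClass:
    name: str
    inspection: str      # e.g. "FW + proxy", or "none" as an explicit decision
    identity: str        # where identity is enforced
    observability: str   # where the signal comes from


def unanswered(classes):
    """Return (class name, question) pairs that were left blank."""
    gaps = []
    for c in classes:
        for question in ("inspection", "identity", "observability"):
            if not getattr(c, question).strip():
                gaps.append((c.name, question))
    return gaps


# Illustrative answers only; the real ones depend on the environment.
CLASSES = [
    TrafficClass("north-south", "egress FW + proxy", "proxy auth", "firewall log"),
    TrafficClass("east-west", "FW", "mTLS", "flow log"),
    TrafficClass("shared-services", "FW", "IAM", "flow log"),
    TrafficClass("on-prem", "FW + IDS", "", "NDR"),
]

print(unanswered(CLASSES))  # [('on-prem', 'identity')]
```

An empty answer is treated as "not decided yet", which is different from deciding that a class needs no inspection.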
2) Secure default: “propagation off, association deliberate”
The most common mistake I’ve seen is a “hidden mesh” that grows through automatic propagation in transit route tables.
My practical rule:
- Route propagation: default off
- Association: every attachment (spoke) is deliberately bound to a table
- Egress / inspection: separate tables and separate “gate” points
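The rule can be checked mechanically against a declared configuration. The following is a minimal sketch, assuming the transit layer is described by plain Python sets and dicts; `lint_transit` and the attachment and table names are hypothetical, and no cloud API is called.

```python
def lint_transit(attachments, associations, propagations, allowed_propagations):
    """Return findings for a declared transit configuration.

    attachments: set of attachment names
    associations: dict of attachment -> route table it is bound to
    propagations: set of (attachment, route table) pairs that are switched on
    allowed_propagations: the subset that someone explicitly approved
    """
    findings = []
    for att in sorted(attachments):
        if att not in associations:
            findings.append(f"{att}: no explicit association")
    for att, table in sorted(propagations - allowed_propagations):
        findings.append(f"{att} -> {table}: propagation not on the allowlist")
    return findings


attachments = {"spoke-a", "spoke-b", "inspection"}
associations = {"spoke-a": "rt-spokes", "inspection": "rt-inspection"}
allowed = {("spoke-a", "rt-inspection"), ("spoke-b", "rt-inspection")}
# spoke-a propagating into the shared spoke table is how a hidden mesh starts.
propagations = allowed | {("spoke-a", "rt-spokes")}

for finding in lint_transit(attachments, associations, propagations, allowed):
    print(finding)
```

Run in CI against the IaC plan, a check like this turns "default off" from a convention into something a pull request can fail on.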
3) Reference pattern: 4-section architecture
A simple but strong starting pattern:
A) Spokes (Application VPC/VNet)
- Application subnets
- Minimal egress (controls apply even where there is no NAT)
- Spokes don’t see each other by default
B) Shared Services
- DNS forwarder/resolver
- Logging/telemetry collector
- Artifact/registry
- Management bastion/SSO proxy (a “no jump host” model if possible)
C) Inspection VPC/VNet (Security gate)
- Firewall/NVA/IDS
- Proxy / TLS inspection, if required, lives here
- Routes and guardrails to prevent “bypass”
D) Egress VPC/VNet (Internet egress)
- NAT, egress firewall, allowlist/denylist
- Centralized control for SaaS egress
This split makes the word “governance” practical: who exits from where, who talks to whom, where it’s measured.
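To make "who exits from where" concrete, the four sections can be modeled as a tiny next-hop table and walked hop by hop. This is a sketch under simplifying assumptions: the `ROUTES` layout and the segment names are hypothetical, one "default" entry stands in for 0.0.0.0/0, and firewall policy inside the inspection segment (which is what keeps spokes apart by default) is not modeled.

```python
# Hypothetical next-hop model: for each segment, where does traffic for a
# destination go next? "default" plays the role of a default route.
ROUTES = {
    "spoke-a": {"default": "inspection"},
    "spoke-b": {"default": "inspection"},
    "shared": {"default": "inspection"},
    "inspection": {
        "spoke-a": "spoke-a",
        "spoke-b": "spoke-b",
        "shared": "shared",
        "default": "egress",
    },
    "egress": {"default": "internet"},
}


def path(src, dst, max_hops=8):
    """Follow next hops from src until dst is reached."""
    hops = [src]
    while hops[-1] != dst:
        table = ROUTES.get(hops[-1], {})
        next_hop = table.get(dst, table.get("default"))
        if next_hop is None or len(hops) > max_hops:
            raise LookupError(f"no route from {src} to {dst} (got as far as {hops})")
        hops.append(next_hop)
    return hops


print(path("spoke-a", "spoke-b"))   # ['spoke-a', 'inspection', 'spoke-b']
print(path("spoke-a", "internet"))  # ['spoke-a', 'inspection', 'egress', 'internet']
```

Every path that starts in a spoke has the inspection segment as its second hop, which is the property the split is meant to guarantee.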
4) Guardrails: lower the cost of a wrong change
In the transit layer, a small mistake creates a big blast radius. So guardrails are not a “nice to have”.
My minimum set:
- An approval gate for route table changes (PR / IaC)
- Prefix-list / route-map style constraints (per platform)
- An automated check for “inspection bypass” (reachability analysis)
- Mandatory flow logs + retention policy
- Tag/label standards (attachment owner, env, criticality)
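The "inspection bypass" check in particular lends itself to automation: treat effective routes as a directed graph and ask whether the destination is still reachable when the gate nodes are removed. The sketch below assumes the graph has already been derived from route tables; `bypass_path` and the node names are hypothetical.

```python
from collections import deque


def bypass_path(edges, src, dst, gates):
    """Return a path from src to dst that avoids every gate, or None."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        route = queue.popleft()
        node = route[-1]
        if node == dst:
            return route
        for nxt in sorted(edges.get(node, ())):
            if nxt not in seen and nxt not in gates:
                seen.add(nxt)
                queue.append(route + [nxt])
    return None


good = {
    "spoke-a": {"inspection"},
    "spoke-b": {"inspection"},
    "inspection": {"spoke-a", "spoke-b", "egress"},
}
# Same graph, plus one leaked route straight from spoke-a to spoke-b.
leaky = {**good, "spoke-a": {"inspection", "spoke-b"}}

print(bypass_path(good, "spoke-a", "spoke-b", {"inspection"}))   # None
print(bypass_path(leaky, "spoke-a", "spoke-b", {"inspection"}))  # ['spoke-a', 'spoke-b']
```

A non-`None` result is a finding: it names the exact hops that let traffic skip the gate.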
5) Operations: a 5-minute answer to “which flow goes which way?”
The goal of a transit design is not just connectivity; it is fast diagnosis during an incident.
The approach that works for me in the field:
- A “golden path” flow list per critical spoke (e.g. app→db, app→shared-dns, app→internet)
- For these flows:
  - Flow log query templates
  - Firewall log correlation points
  - Route table snapshot and diff
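The snapshot-and-diff item is the easiest of these to automate. A minimal sketch, assuming each snapshot is a plain dict shaped as `{table: {prefix: next_hop}}`; the `diff_routes` helper, the table names, and the prefixes are illustrative.

```python
def diff_routes(before, after):
    """Compare two snapshots shaped as {table: {prefix: next_hop}}."""
    changes = []
    for table in sorted(set(before) | set(after)):
        old, new = before.get(table, {}), after.get(table, {})
        for prefix in sorted(set(old) | set(new)):
            if prefix not in old:
                changes.append(("added", table, prefix, new[prefix]))
            elif prefix not in new:
                changes.append(("removed", table, prefix, old[prefix]))
            elif old[prefix] != new[prefix]:
                changes.append(("changed", table, prefix, f"{old[prefix]} -> {new[prefix]}"))
    return changes


before = {"rt-spokes": {"0.0.0.0/0": "inspection"}}
after = {"rt-spokes": {"0.0.0.0/0": "inspection", "10.2.0.0/16": "spoke-b"}}

print(diff_routes(before, after))
# [('added', 'rt-spokes', '10.2.0.0/16', 'spoke-b')]
```

During an incident, a one-line diff like this is usually a faster starting point than reading the full tables.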
6) Change discipline: transit needs “small waves”
Transit route changes shouldn’t happen “20 times a day” like an application deploy.
A practical policy:
- Change window + pre-mortem (if critical)
- Canary spoke (start in non-prod or low risk)
- Rollback plan (return to the previous route set)
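The canary and rollback steps can be sketched as a small driver that applies a change wave by wave and undoes everything as soon as one spoke looks unhealthy. This is an illustration only: `roll_out` and its callbacks are hypothetical, and in practice `apply`, `healthy`, and `rollback` would wrap the IaC tooling and golden-path probes.

```python
def roll_out(waves, apply, healthy, rollback):
    """Apply a change wave by wave; undo everything on the first unhealthy spoke."""
    done = []
    for wave in waves:
        for spoke in wave:
            apply(spoke)
            done.append(spoke)
            if not healthy(spoke):
                # Roll back in reverse order of application.
                for changed in reversed(done):
                    rollback(changed)
                return False, done
    return True, done


log = []
ok, touched = roll_out(
    waves=[["dev-canary"], ["prod-a", "prod-b"]],
    apply=lambda s: log.append(f"apply {s}"),
    healthy=lambda s: s != "dev-canary",  # pretend the canary fails its probes
    rollback=lambda s: log.append(f"rollback {s}"),
)
print(ok, log)  # False ['apply dev-canary', 'rollback dev-canary']
```

Because the canary wave fails, the production spokes are never touched, which is the whole point of starting small.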
7) Runbook: “Spoke can’t reach another spoke”
Triage order:
- Spoke route table: is there a route for the destination prefix?
- Transit association: is it bound to the correct table?
- Propagation: did a route leak in from the wrong table, or did the expected route never arrive?
- Inspection: is the flow visible in the firewall log? (if not, it is either bypassing inspection or has no route)
- NACL/SG: even if transit is correct, is there an L4 block?
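The triage order can be encoded so that whoever is on call gets the first failing step instead of a checklist to re-read. A minimal sketch, assuming the observations have already been collected into a dict; the `TRIAGE` table, the `first_failure` helper, and the fact keys are hypothetical.

```python
# Checks in the same order as the runbook. Each one reads a dict of observed
# facts (hypothetical keys, filled in by hand or by tooling).
TRIAGE = [
    ("spoke route table", lambda f: f["spoke_has_route"]),
    ("transit association", lambda f: f["associated_table"] == f["expected_table"]),
    ("propagation", lambda f: f["route_in_transit_table"] and not f["unexpected_routes"]),
    ("inspection", lambda f: f["seen_in_firewall_log"]),
    ("NACL/SG", lambda f: not f["l4_blocked"]),
]


def first_failure(facts):
    """Return the name of the first runbook step that fails, or None."""
    for name, check in TRIAGE:
        if not check(facts):
            return name
    return None


facts = {
    "spoke_has_route": True,
    "associated_table": "rt-spokes",
    "expected_table": "rt-spokes",
    "route_in_transit_table": True,
    "unexpected_routes": [],
    "seen_in_firewall_log": False,
    "l4_blocked": False,
}
print(first_failure(facts))  # inspection
```

Keeping the order in data means the runbook and the tooling cannot silently drift apart.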
Closing
Transit Gateway/hub-spoke, when designed correctly, simplifies segmentation, centralizes inspection, and accelerates operational diagnosis. When designed wrong, it produces a “hidden mesh” and hides risk.
My goal is this: transit is not a box that provides connectivity; it is a control plane for security and operations.