In an enterprise, DNS is often framed as if it “just resolves names”; in the field, it is most often the best policy injection point, because malware, phishing, misrouted proxies, and “covert egress” scenarios mostly leave their first trace at the name resolution layer.
The RPZ (Response Policy Zone) approach lets you build an enterprise DNS firewall layer using the recursive resolver’s “modify or block the answer” capability. The key difference: you do this without breaking the client, without losing exception management, and while keeping operations sustainable.
What problem does RPZ solve?
With RPZ, the resolver can make decisions like:
- Return NXDOMAIN for specific domains (block)
- Resolve specific domains to a “sinkhole” IP (limited quarantine)
- Suppress certain record types (especially A/AAAA) via policy
As a result:
- The chance that users hit malware C2 drops
- Phishing domains get blocked at the first hop
- In a “secure egress” approach, DNS becomes part of the network policy
The right placement in the architecture: The recursive resolver layer
Plant RPZ at the recursive resolver:
- A clear client-side setting: “enterprise resolver” required via DHCP/AD/MDM
- At least 2 resolvers (HA per site/region)
- Resolver logs flow into a single pipeline (SIEM + metrics)
Bolting RPZ onto the authoritative DNS as a “side job” usually doesn’t work, because policy ownership belongs to the resolver/edge operation.
Policy source: feed + local exception
The sustainable model in the field has two layers:
- Upstream threat feed (commercial or open-source lists) → automatic updates
- Local exception/allowlist → ticket-based, time-bounded (TTL), and auditable
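The two-layer model can be sketched as a small merge step that runs before the policy zone is regenerated. This is a minimal sketch, not a definitive implementation: the `AllowEntry` shape, the `build_rpz_records` helper, and the ticket IDs are hypothetical, and it assumes exact-name matching only (an exception for a single host under a wildcard block is not handled). The `CNAME rpz-passthru.` action is the standard RPZ way to exempt a name from policy.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AllowEntry:
    """One local exception (hypothetical shape): a domain, an expiry date, a ticket."""
    domain: str
    expires: date
    ticket: str


def build_rpz_records(feed, allowlist, today):
    """Merge blocked domains from a feed with unexpired local exceptions.

    Active exceptions become passthru records and are removed from the
    block set; every remaining feed domain becomes an NXDOMAIN policy
    for the name itself and for its subdomains.
    """
    active = {e.domain.lower().rstrip(".") for e in allowlist if e.expires >= today}
    blocked = {d.lower().rstrip(".") for d in feed} - active

    lines = []
    for domain in sorted(active):
        lines.append(f"{domain} CNAME rpz-passthru.")
    for domain in sorted(blocked):
        lines.append(f"{domain} CNAME .")
        lines.append(f"*.{domain} CNAME .")
    return lines


# Example input: placeholder domains and a made-up ticket ID
feed = ["bad-example.com", "Vendor-Portal.example."]
allow = [AllowEntry("vendor-portal.example", date(2030, 1, 1), "CHG-1")]
records = build_rpz_records(feed, allow, date(2025, 1, 1))
```

Because expired entries simply drop out of the `active` set, the time-bounded part of the exception model is enforced each time the zone is rebuilt, without anyone having to remember to delete the entry.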
Example: RPZ on BIND (concept)
To illustrate the RPZ logic, a simple example (in prod, align it with your own standard):
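// named.conf (example)
// response-policy belongs inside your existing options { } block
options {
    // break-dnssec yes: apply the policy even to queries that request DNSSEC records
    response-policy { zone "rpz.local"; } break-dnssec yes;
};
zone "rpz.local" {
    type master;
    file "/etc/bind/db.rpz.local";
    // The policy zone itself should not be queryable by clients
    allow-query { none; };
};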
Sample RPZ zone file:
$TTL 60
@ IN SOA rpz.local. hostmaster.rpz.local. (1 3600 600 86400 60)
@ IN NS localhost. ; placeholder NS, required by zone syntax but not used by RPZ
; Block: phishing domain
bad-example.com CNAME .
*.bad-example.com CNAME .
; Sinkhole: controlled redirect for telemetry (to an internal HTTP 204 service)
telemetry-test.example A 10.60.10.10
Notes:
- The "CNAME ." approach behaves like a “policy NXDOMAIN” in most installations.
- If you use a sinkhole, resolve to an IP that you log and keep secured.
Operational runbook: How do you “roll out RPZ”?
In the field I move in this order:
- Pilot ring: A small OU/VLAN/site pinned to the resolver
- Observation: 48–72 hours of “what’s breaking” reporting (top NXDOMAIN, top blocked, top exceptions)
- Exception flow: ticket template + duration + owner + rollback
- Rollout: ring1 → ring2 → all users
- Audit: prune exceptions periodically (90-day rule)
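During the observation window it helps to tag each client with its rollout ring, so that the "what's breaking" reports can be split per ring. A minimal sketch using Python's standard `ipaddress` module follows; the ring names mirror the list above, while the subnets and the `ring_for` helper are placeholders and assumptions, not values from a real deployment.

```python
import ipaddress

# Hypothetical ring definitions, most specific first: the first match wins.
RINGS = [
    ("pilot", [ipaddress.ip_network("10.10.20.0/24")]),
    ("ring1", [ipaddress.ip_network("10.10.0.0/16")]),
    ("ring2", [ipaddress.ip_network("10.0.0.0/8")]),
]


def ring_for(client_ip, rings=RINGS):
    """Return the name of the first ring whose subnets contain client_ip, else None."""
    addr = ipaddress.ip_address(client_ip)
    for name, networks in rings:
        if any(addr in net for net in networks):
            return name
    return None
```

Ordering the rings from most to least specific keeps the pilot subnet from being swallowed by the broader ranges that contain it.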
Exception ticket template (minimum)
- Domain/FQDN (justification if a wildcard is needed)
- Affected application / business process
- Owner (team) + approver
- Duration (e.g. 7 days) + the follow-up action if the exception has to become permanent
- Alternative: is there a way to handle the app via proxy/allowlist instead?
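The template above can be made machine-checkable so that incomplete tickets are rejected before they reach the policy zone. The sketch below is one possible shape, not a prescribed schema: the `ExceptionTicket` class, its field names, and the `due_for_review` helper are all hypothetical, and the 90-day default simply reuses the audit interval mentioned in the runbook.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class ExceptionTicket:
    """Minimum exception ticket (hypothetical field names)."""
    fqdn: str
    application: str
    owner: str
    approver: str
    start: date
    duration_days: int = 7
    wildcard_justification: Optional[str] = None

    def problems(self):
        """Return the reasons this ticket should be rejected (empty list if none)."""
        issues = []
        if self.fqdn.startswith("*.") and not self.wildcard_justification:
            issues.append("wildcard requires a justification")
        if not self.owner or not self.approver:
            issues.append("owner and approver are required")
        if self.duration_days <= 0:
            issues.append("duration must be positive")
        return issues

    def expires(self):
        """Date on which the exception stops being valid."""
        return self.start + timedelta(days=self.duration_days)


def due_for_review(tickets, today, max_age_days=90):
    """Tickets that have already expired or were opened at least max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return [t for t in tickets if t.expires() < today or t.start <= cutoff]
```

A ticket for `*.cdn.example` without a justification would come back from `problems()` with one issue, and a long-lived exception shows up in `due_for_review` once it crosses the audit window even though it has not expired yet.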
Measurement: How do you tell whether RPZ “is working”?
Minimum metric set:
- blocked_qps (time series)
- top_blocked_domains (with cardinality control)
- Resolver SERVFAIL ratio (a wrong policy can break DNS)
- Exception count (if it’s growing over time, the model may be breaking)
- Resolver latency p95/p99
In the logs, these fields are gold:
- source IP / device identity (when possible)
- query name + query type
- policy hit (RPZ)
- response code
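Given log records that carry those fields, most of the minimum metric set can be derived with a few lines of aggregation. The sketch below assumes the resolver logs have already been parsed into dictionaries; the key names (`qname`, `qtype`, `rcode`, `policy_hit`) and the `summarize` helper are assumptions for illustration and do not correspond to any specific resolver's log format.

```python
from collections import Counter


def summarize(records, top_n=10):
    """Compute a minimal RPZ health summary from parsed resolver log records.

    Each record is a dict with hypothetical keys: 'qname', 'qtype',
    'rcode', and 'policy_hit' (True when an RPZ rule rewrote the answer).
    top_n caps the size of the top-blocked list (cardinality control).
    """
    total = len(records)
    blocked = [r for r in records if r.get("policy_hit")]
    servfail = sum(1 for r in records if r.get("rcode") == "SERVFAIL")
    # Normalize names so "Bad-Example.com." and "bad-example.com" count together
    top = Counter(r["qname"].lower().rstrip(".") for r in blocked).most_common(top_n)
    return {
        "total_queries": total,
        "blocked_queries": len(blocked),
        "servfail_ratio": servfail / total if total else 0.0,
        "top_blocked_domains": top,
    }
```

Dividing `blocked_queries` by the length of the log window in seconds gives a `blocked_qps` data point; emitting one summary per window yields the time series.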
The “don’t break things” discipline alongside security
While treating RPZ as a security layer, don’t forget two production realities:
- Some SaaS apps generate domains very aggressively (CDN, telemetry, A/B test)
- Some legacy applications mask DNS timeouts as “application errors”
So:
- Use wildcard blocks very carefully
- Manage RPZ changes like a “change” (small ring first)
- Write a rollback plan (one-liner: turn the policy zone off and reload)
Conclusion
Building an enterprise DNS firewall with RPZ moves network security beyond “perimeter device” alone and pushes it into the name resolution layer. The real win isn’t in the technical setting; it’s in the model of exception discipline, observability, and ownership. When those three settle in, RPZ becomes a control plane that lowers phishing/malware risk while also producing strong signals for the operations team.