What should organisations measure to know if managed DNS is working?

By NHI Mgmt Group Editorial Team. Updated June 23, 2026. Domain: Architecture & Implementation

Track resolution latency, failover behaviour, signed-zone coverage, and the share of critical services that depend on a single resolver path. If failover and spoofing resistance are not being tested, the control is present but not proven.

Why This Matters for Security Teams

Managed DNS is often treated as plumbing, but it is a control plane for availability, routing, and trust. If teams only measure whether a query returns an answer, they miss the more important question: whether DNS is reliably steering users and services to the right place under stress. That is why the NIST Cybersecurity Framework 2.0 emphasis on resilience matters here, alongside NHIMG guidance in the Top 10 NHI Issues.

The right measurements show whether managed DNS is reducing operational risk, not just serving responses. That includes latency, availability, propagation, failover correctness, signed-zone coverage, and how many critical workloads still depend on a single resolver path. For teams managing NHIs and service accounts, DNS also affects whether credentials, callbacks, and API routes remain reachable during failover and incident response. NHI Mgmt Group notes that 79% of organisations have experienced secrets leaks, with 77% of those incidents causing tangible damage, which is a reminder that weak control planes rarely stay theoretical for long.

In practice, many security teams discover DNS weaknesses only after a regional outage, a resolver misconfiguration, or a spoofing attempt has already disrupted production.

How It Works in Practice

Use metrics that reflect both correctness and resilience. Resolution latency should be measured from the client side and, where possible, by geography and network path. Success rates should distinguish between ordinary queries, high-volume zones, signed zones, and critical internal names. Failover should be tested as a real event, not assumed from vendor status pages. Treat DNS as a service dependency with explicit reliability targets, not as an invisible utility.

A practical measurement model usually includes:

Median and tail resolution latency for internal and external domains
Query success rate during steady state and during resolver failure tests
Signed-zone coverage and DNSSEC validation success where deployed
Percentage of critical services with a single resolver path or a single provider dependency
Propagation time for record changes, especially during incident response or migration
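
The first two measurements in this list can be sketched in a few lines of Python. This is a minimal illustration, not a monitoring product: the Sample type and the probe, percentile, and summarise helpers are hypothetical names introduced here, and probe times resolution through the host's configured resolver, so local caching will influence the numbers.

```python
import math
import socket
import statistics
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    name: str
    ok: bool
    latency_ms: Optional[float]  # None when resolution failed


def probe(name: str) -> Sample:
    """Resolve one name via the host's configured resolver and time it."""
    start = time.perf_counter()
    try:
        socket.getaddrinfo(name, None)
    except socket.gaierror:
        return Sample(name, False, None)
    return Sample(name, True, (time.perf_counter() - start) * 1000.0)


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def summarise(samples: List[Sample]) -> dict:
    """Success rate plus median and tail latency of successful lookups."""
    if not samples:
        return {"success_rate": None, "median_ms": None, "p95_ms": None}
    latencies = [s.latency_ms for s in samples if s.ok]
    return {
        "success_rate": len(latencies) / len(samples),
        "median_ms": statistics.median(latencies) if latencies else None,
        "p95_ms": percentile(latencies, 95) if latencies else None,
    }
```

Running summarise separately over steady-state samples and over samples collected during a resolver failure test keeps the two conditions from averaging each other out.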

Operational testing should include spoofing resistance, cache behaviour, and whether fallbacks preserve policy intent. The Ultimate Guide to NHIs is relevant here because DNS is often part of the dependency chain for service accounts, API callbacks, and automated workflows. If managed DNS is the basis for routing, then a working control must prove that it can still resolve correctly when a resolver, zone, or upstream path is degraded. That aligns with the resilience orientation in the NIST Cybersecurity Framework 2.0 and with lifecycle visibility in the NHI Lifecycle Management Guide.
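
One way to make "resolves correctly when degraded" checkable is to score each failover drill against an approved set of targets. The sketch below assumes a hypothetical DrillResult record that a team fills in during a drill; the field names and the three status labels are illustrative, not part of any framework cited here.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class DrillResult:
    name: str                 # critical name under test
    approved: FrozenSet[str]  # addresses policy allows, standby targets included
    observed: FrozenSet[str]  # answers seen while the primary path was disabled


def evaluate(result: DrillResult) -> str:
    """Classify one drill observation."""
    if not result.observed:
        return "unresolved"        # the name stopped resolving
    if result.observed <= result.approved:
        return "ok"                # every answer is an approved target
    return "policy_violation"      # resolved, but to something unapproved


def failed_names(results: List[DrillResult]) -> List[str]:
    """Names that did not resolve to approved targets during the drill."""
    return [r.name for r in results if evaluate(r) != "ok"]
```

Separating "unresolved" from "policy_violation" matters because a fallback that answers with an unapproved address can look healthy on a success-rate dashboard.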

These controls tend to break down when resolver paths are hidden inside cloud defaults, because teams lose visibility into which applications depend on a single upstream DNS chain.
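
The single-path share can be computed from a service inventory once resolver dependencies have been recorded. The following sketch assumes a hypothetical inventory format (a list of dictionaries with name, critical, and resolvers keys); services with no recorded resolver are counted as at risk, since an unknown path cannot be shown to be redundant.

```python
from typing import Dict, List, Tuple


def single_path_services(inventory: List[Dict]) -> Tuple[float, List[str]]:
    """Return (share, names) of critical services with at most one resolver path."""
    critical = [svc for svc in inventory if svc["critical"]]
    if not critical:
        return 0.0, []
    # Deduplicate resolver entries; zero or one distinct path counts as at risk.
    at_risk = [svc["name"] for svc in critical if len(set(svc["resolvers"])) <= 1]
    return len(at_risk) / len(critical), at_risk
```

The list of names is returned alongside the ratio so that the metric points directly at the services needing a second path.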

Common Variations and Edge Cases

Tighter DNS measurement often increases operational overhead, requiring organisations to balance visibility against instrumentation cost and test complexity. That tradeoff is real, especially in hybrid estates where some services use public resolvers, some use internal recursive DNS, and others inherit DNS from managed platforms. Best practice is evolving on how far to standardise these paths, so it is better to label gaps clearly than to assume uniform coverage.

One common edge case is split-horizon DNS, where internal and external answers differ by design. In that environment, a single success metric can be misleading unless it is segmented by client context. Another is disaster recovery, where failover may technically work but still violate application expectations because TTLs are too long or cached records linger after cutover. A third is DNSSEC, where signed-zone coverage may be high but validation still fails in downstream resolvers that are not configured correctly. The practical test is whether critical services still route correctly under failure, not whether the platform reports green.
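
The split-horizon and TTL cases above can both be checked with simple arithmetic. This is a sketch under stated assumptions: the (client_context, ok) tuple format, the function names, and the idea of a per-service cutover budget in seconds are introduced here for illustration. The TTL check also assumes resolvers honour the published TTL, which some caches do not.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def success_by_context(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Success rate per client context, e.g. 'internal' vs 'external'."""
    counts = defaultdict(lambda: [0, 0])  # context -> [successes, total]
    for context, ok in results:
        counts[context][1] += 1
        if ok:
            counts[context][0] += 1
    return {ctx: good / total for ctx, (good, total) in counts.items()}


def ttl_exceeds_budget(ttl_seconds: int, cutover_budget_seconds: int) -> bool:
    """True when a cached record could outlive the time allowed for cutover."""
    return ttl_seconds > cutover_budget_seconds


# Hypothetical probe results: all internal lookups succeed, most external fail.
results = [("internal", True)] * 4 + [("external", True)] + [("external", False)] * 3
blended = sum(ok for _, ok in results) / len(results)
segmented = success_by_context(results)
```

In this made-up data the blended rate is 0.625, which hides the fact that external clients succeed only a quarter of the time while internal clients are unaffected.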

NHIMG research shows only 5.7% of organisations have full visibility into their service accounts, which matters here because service discovery and automated access paths often depend on DNS naming consistency. Teams that need a broader governance lens should also review the Ultimate Guide to NHIs — Regulatory and Audit Perspectives and the Top 10 NHI Issues. A control can be deployed and still not be proven if no one has tested failover, spoofing resistance, or resolver diversity under realistic load.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	DE.CM	DNS health metrics support continuous monitoring of availability and anomalies.
OWASP Non-Human Identity Top 10	NHI-08	Managed DNS affects NHI-dependent service access and exposure of secrets paths.
NIST AI RMF		Resilience metrics help manage system reliability risks that affect AI-enabled services.

Use AI RMF to define reliability measures for identity and routing dependencies that support automated systems.

NHIMG Editorial Note
Reviewed and updated by the NHIMG editorial team on June 23, 2026.