TL;DR: A production crash was traced to a use-after-free bug in Envoy’s DNS resolver, c-ares, where a specific NXDOMAIN, search-domain retry, and connection-refused sequence could trigger a heap fault and remote denial of service, according to Pomerium. The case shows how dependency failures become application availability risks when identity-aware access relies on deep networking stacks.
NHIMG editorial — based on content published by Pomerium: It's always DNS part ∞, tracking down a use-after-free bug in Envoy's DNS resolver c-ares
By the numbers:
- Pomerium said c-ares versions <= 1.34.5 were affected by the crash condition.
- Fixes were published in Envoy 1.33.14, 1.34.12, 1.35.8, and 1.36.4.
- Pomerium reported the issue on 2025-12-10 after reproducing it with ASan.
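The version figures above can be turned into a quick triage check. The following is a minimal sketch, not an official tool: the function names are hypothetical, it assumes plain three-part version strings such as `1.34.5`, and it only knows the release lines quoted in this post. Anything outside those lines is reported as unknown rather than guessed.

```python
# Last c-ares release reported as affected (from the figures above).
CARES_LAST_AFFECTED = (1, 34, 5)

# First fixed Envoy patch release per minor line, as listed above.
ENVOY_FIXED_PATCH = {
    (1, 33): 14,
    (1, 34): 12,
    (1, 35): 8,
    (1, 36): 4,
}


def parse_version(text):
    """Turn '1.34.5' or 'v1.34.5' into a tuple of ints, e.g. (1, 34, 5)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def cares_is_affected(version):
    """True when the c-ares version is at or below the last affected release."""
    return parse_version(version) <= CARES_LAST_AFFECTED


def envoy_has_fix(version):
    """True/False for the release lines listed above, None for any other line."""
    major, minor, patch = parse_version(version)
    fixed_patch = ENVOY_FIXED_PATCH.get((major, minor))
    if fixed_patch is None:
        # Release line not mentioned in the source: do not guess.
        return None
    return patch >= fixed_patch
```

A `None` result from `envoy_has_fix` is a prompt to check the upstream advisory directly, since this sketch deliberately does not extrapolate beyond the four release lines named here.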
Questions worth separating out
Q: What breaks when DNS resolver bugs affect an identity-aware proxy?
A: When a DNS resolver bug affects an identity-aware proxy, the failure is often broader than a single lookup error.
Q: Why do shared libraries create identity risk in access infrastructure?
A: Shared libraries create identity risk because they sit underneath policy enforcement and can fail without warning.
Q: How do teams know if their access path is resilient enough?
A: Teams know the access path is resilient enough when they can replay malformed inputs, DNS failures, and retry storms without taking down the enforcement layer.
Practitioner guidance
- Inventory deep access dependencies: Document every library and resolver used by identity-aware proxies, sidecars, and policy enforcement components.
- Test resolver failure paths under load: Run replayable chaos and memory-safety tests against DNS retry, timeout, and connection-refused scenarios in clustered environments.
- Treat shared runtime libraries as patch-priority assets: Track c-ares, Envoy, and similar low-level dependencies with the same patch urgency you use for exposed access services.
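To make "replayable" concrete, here is a minimal sketch of a scripted fault replay. Everything in it is hypothetical: `ScriptedResolver`, the exception classes, and `resolve_with_search` are illustrative stand-ins, not c-ares or Envoy APIs. It only shows the shape of a deterministic test that replays an NXDOMAIN, a search-domain retry, and a connection-refused outcome in a fixed order. It cannot surface a memory-safety bug on its own; that requires running the real resolver under a tool such as ASan, as Pomerium did.

```python
from dataclasses import dataclass, field


class ResolverFault(Exception):
    """Base class for the simulated resolver failures."""


class NxDomain(ResolverFault):
    """Simulated 'name does not exist' answer."""


class ConnectionRefused(ResolverFault):
    """Simulated refusal from the upstream DNS server."""


@dataclass
class ScriptedResolver:
    """Hypothetical fake resolver that replays a fixed list of outcomes."""
    script: list
    calls: list = field(default_factory=list)

    def resolve(self, name):
        self.calls.append(name)
        outcome = self.script.pop(0)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome


def resolve_with_search(resolver, name, search_domains):
    """Try the bare name, then each search domain; None if every attempt fails."""
    candidates = [name] + [f"{name}.{domain}" for domain in search_domains]
    for candidate in candidates:
        try:
            return resolver.resolve(candidate)
        except ResolverFault:
            continue  # fall through to the next candidate instead of crashing
    return None


# Replay the sequence described in this post: NXDOMAIN, then a
# search-domain retry that hits a connection-refused error.
resolver = ScriptedResolver([NxDomain(), ConnectionRefused()])
result = resolve_with_search(resolver, "upstream", ["svc.cluster.local"])
```

Because the script is a plain list, the same failing sequence can be stored alongside a bug report and rerun unchanged against each candidate fix.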
What's in the full article
Pomerium's full blog post covers the operational detail this post intentionally leaves for the source:
- The exact ASan output and c-ares call path that isolated the use-after-free.
- The minimal unit test used to reproduce the NXDOMAIN plus connection-refused sequence.
- The interim patching workflow, customer hotfix rollout, and upstream disclosure timeline.
- The specific Envoy and c-ares versions included in the released fixes.
👉 Read Pomerium's analysis of the Envoy DNS use-after-free bug →
Envoy DNS resolver crash: what it means for identity-aware access
Explore further
Dependency resilience is part of identity governance, not a separate reliability concern. Identity-aware gateways and access proxies sit in the control path for human, NHI, and service traffic, so a crash in a shared library becomes an access outage even when IAM policy is unchanged. That makes resolver stability, runtime observability, and dependency hygiene part of the governance surface. Practitioners should treat the access path as a governed system, not just a policy decision.
A few things that frame the scale:
- 97% of NHIs carry excessive privileges, increasing unauthorised access and broadening the attack surface, according to the Ultimate Guide to NHIs.
- 71% of NHIs are not rotated within recommended time frames, which means control drift often persists long after the initial deployment decision.
A question worth separating out:
Q: Who should own failures in embedded access dependencies?
A: Ownership should sit with the team that ships the access path, even when the bug lives in a third-party dependency. If the proxy or control plane fails, the business impact is local to that service, so the owner must track patching, test coverage, and recovery expectations for the full dependency chain.
👉 Read our full editorial: Envoy DNS use-after-free shows the blast radius of dependencies