How should teams govern Cloudflare settings that sit outside Terraform?

Teams should treat unmanaged Cloudflare settings as governance gaps, not minor exceptions. Inventory them, assign ownership, and import the highest-risk resources into infrastructure as code so changes become reviewable and reversible. Where immediate import is not possible, isolate those settings, document the exception, and set a short remediation path.

Why This Matters for Security Teams

Cloudflare settings that live outside Terraform are not just configuration drift. They are control gaps that can bypass approval, rollback, and evidence collection. When edge security, routing, DNS, firewall rules, or access policies are changed manually, teams lose the audit trail needed for change management and incident response. That matters even more for identity-adjacent settings because unmanaged secrets, tokens, and access paths often become the easiest route to lateral movement. NIST’s Cybersecurity Framework 2.0 frames this as a governance problem, not a tooling inconvenience. NHIMG’s research on lifecycle discipline makes the same point: unmanaged identity assets age into blind spots unless they are inventoried and brought under control, as discussed in the Ultimate Guide to NHIs — Lifecycle Processes for Managing NHIs. The practical risk is that “temporary” manual settings become the permanent posture. In practice, many security teams discover these gaps only after an outage, a privilege review, or an incident reveals that no one can prove who changed what, when, or why.

How It Works in Practice

The governing pattern is straightforward: classify unmanaged Cloudflare settings, assign ownership, and decide whether each item should be imported, isolated, or retired. Start with a complete inventory of anything not represented in Terraform, including DNS records, WAF rules, access policies, page rules, rate limits, tunnels, and API tokens. Then rank each item by business impact, exposure level, and change frequency. High-risk items should be imported into infrastructure as code so they inherit review, testing, and version control. Lower-risk items can remain outside Terraform only if they are explicitly documented, time-boxed, and monitored.
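As a rough illustration of the classify-and-rank step, the sketch below scores each unmanaged item on the three factors named above and maps it to import, isolate, or retire. The 1-to-3 scales, the threshold of 7, the `in_use` flag, and the sample inventory are all assumptions made for illustration; they are not Cloudflare or Terraform concepts, and a real team would tune them.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    IMPORT = "import"    # bring under Terraform
    ISOLATE = "isolate"  # documented, time-boxed, monitored exception
    RETIRE = "retire"    # no longer needed


@dataclass(frozen=True)
class UnmanagedSetting:
    name: str
    kind: str              # e.g. "dns_record", "waf_rule", "api_token"
    business_impact: int   # assumed scale: 1 (low) to 3 (high)
    exposure: int          # assumed scale: 1 (internal) to 3 (public or auth-related)
    change_frequency: int  # assumed scale: 1 (rare) to 3 (frequent)
    in_use: bool = True    # hypothetical flag set during inventory

    @property
    def risk_score(self) -> int:
        # Simple additive score; the weighting is an assumption.
        return self.business_impact + self.exposure + self.change_frequency


def triage(setting: UnmanagedSetting, import_threshold: int = 7) -> Action:
    """Decide what to do with one unmanaged setting."""
    if not setting.in_use:
        return Action.RETIRE
    if setting.risk_score >= import_threshold:
        return Action.IMPORT
    return Action.ISOLATE


def rank(settings: list[UnmanagedSetting]) -> list[UnmanagedSetting]:
    """Order the inventory from highest to lowest risk score."""
    return sorted(settings, key=lambda s: s.risk_score, reverse=True)


# Hypothetical inventory entries, for illustration only.
inventory = [
    UnmanagedSetting("staging TXT record", "dns_record", 1, 1, 1),
    UnmanagedSetting("login WAF rule", "waf_rule", 3, 3, 2),
    UnmanagedSetting("old redirect", "page_rule", 2, 2, 1, in_use=False),
]

plan = {s.name: triage(s) for s in rank(inventory)}
```

The point of the sketch is only that the ranking criteria are written down and applied uniformly, so the import backlog reflects an explicit risk decision instead of whoever asks first.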

A useful operating model is:

  • Import the settings that affect public exposure, authentication, or traffic routing first.
  • Use ownership tags or metadata so every exception has a responsible team and a review date.
  • Restrict manual changes through Cloudflare permissions so only a small set of operators can touch unmanaged resources.
  • Reconcile drift on a fixed cadence by comparing the live Cloudflare state to the state declared in Terraform.
  • Track exceptions as security debt, not engineering convenience.
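The drift reconciliation step in the operating model above can be sketched as a comparison between two snapshots. This is a minimal sketch, assuming both the declared state and the live state have already been flattened into dictionaries keyed by a resource identifier; how those snapshots are produced (for example from Terraform's JSON state output and from Cloudflare API reads) is not shown, and the key format and sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DriftReport:
    unmanaged: list = field(default_factory=list)  # live only: outside Terraform
    missing: list = field(default_factory=list)    # declared only: deleted or never applied
    changed: dict = field(default_factory=dict)    # in both, but attributes differ


def diff_state(declared: dict, live: dict) -> DriftReport:
    """Compare declared (desired) state against live state, keyed by resource ID."""
    report = DriftReport()
    report.unmanaged = sorted(live.keys() - declared.keys())
    report.missing = sorted(declared.keys() - live.keys())
    for key in sorted(declared.keys() & live.keys()):
        want, have = declared[key], live[key]
        if want != have:
            # Record which attributes differ so a reviewer can see the drift.
            report.changed[key] = sorted(
                attr for attr in want.keys() | have.keys()
                if want.get(attr) != have.get(attr)
            )
    return report


# Hypothetical snapshots, for illustration only.
declared = {
    "dns:app.example.com": {"type": "CNAME", "proxied": True},
    "waf:block-bots": {"action": "block"},
}
live = {
    "dns:app.example.com": {"type": "CNAME", "proxied": False},
    "pagerule:legacy-redirect": {"status": "active"},
}

report = diff_state(declared, live)
```

Run on a schedule, a report like this gives each review cycle three concrete lists to act on: settings to import or retire, declared resources that have disappeared, and attributes that were changed by hand.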

This matters because unmanaged settings often hide in “known but tolerated” corners of the stack until a service change, incident, or audit forces a cleanup. NHIMG’s Top 10 NHI Issues highlights how unmanaged identity and access pathways create durable risk, especially when paired with weak lifecycle control. Where teams need a standards anchor, the operational discipline aligns with least privilege and continuous monitoring in NIST CSF 2.0, while Regulatory and Audit Perspectives reinforces that evidence quality depends on whether changes are reviewable and attributable. These controls tend to break down in fast-moving edge environments with frequent emergency changes because manual fixes accumulate faster than teams can import and reconcile them.

Common Variations and Edge Cases

Tighter IaC control often increases short-term operational overhead, so teams have to balance speed of remediation against change-risk reduction. Not every Cloudflare setting needs to be forced into Terraform on day one; a reasonable approach is to treat some items as controlled exceptions, each with an owner and a review date, while the import backlog is worked down. That is especially true for break-glass changes, vendor-managed integrations, or ephemeral testing environments where the cost of full automation may exceed the near-term risk.
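One way to keep controlled exceptions from becoming permanent is a small exception register with review dates. The sketch below is an assumption about how such a register might look, not a Cloudflare or Terraform feature; the field names, the `gates_production` flag, and the sample records are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ExceptionRecord:
    setting: str
    owner: str
    reason: str
    review_by: date
    gates_production: bool = False  # hypothetical flag: gates traffic or privileged access


def overdue(records: list[ExceptionRecord], today: date) -> list[ExceptionRecord]:
    """Exceptions whose review date has already passed."""
    return [r for r in records if r.review_by < today]


def escalation_order(records: list[ExceptionRecord], today: date) -> list[ExceptionRecord]:
    """Overdue first, then production-gating, then earliest review date."""
    return sorted(
        records,
        key=lambda r: (r.review_by >= today, not r.gates_production, r.review_by),
    )


# Hypothetical register entries, for illustration only.
register = [
    ExceptionRecord("break-glass WAF bypass", "edge-sre", "incident response",
                    date(2024, 3, 1), gates_production=True),
    ExceptionRecord("vendor-managed DNS record", "platform", "vendor integration",
                    date(2024, 6, 30)),
    ExceptionRecord("test zone rate limit", "qa", "ephemeral testing",
                    date(2024, 2, 15)),
]

today = date(2024, 4, 1)
late = [r.setting for r in overdue(register, today)]
queue = [r.setting for r in escalation_order(register, today)]
```

Passing `today` in explicitly keeps the check deterministic, which makes it easy to run the same logic in a scheduled job or a review meeting.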

The main edge cases are:

  • Legacy zones where state import is incomplete or brittle.
  • Settings managed by multiple teams, which create ownership ambiguity.
  • Emergency response changes that need immediate rollback paths before they are codified.
  • API tokens and service accounts used by automation, which should be reviewed as NHIs, not as generic admin access.
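For the last case, reviewing automation tokens as NHIs can start with a few mechanical checks. The sketch below assumes token metadata has been exported into local records; the field names, the 90-day rotation window, and the `BROAD_SCOPES` labels are illustrative assumptions and do not correspond to actual Cloudflare permission names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical scope labels treated as overly broad; not real Cloudflare permission names.
BROAD_SCOPES = frozenset({"all_zones:edit", "account:admin"})


@dataclass(frozen=True)
class TokenRecord:
    name: str
    owner: Optional[str]          # responsible team, if any
    scopes: tuple                 # scope labels attached to the token
    expires_on: Optional[date]    # None means the token never expires
    last_rotated: date


def review_findings(token: TokenRecord, today: date, max_age_days: int = 90) -> list:
    """Return lifecycle findings for one token; an empty list means no findings."""
    findings = []
    if token.owner is None:
        findings.append("no owner")
    if token.expires_on is None:
        findings.append("no expiry")
    if (today - token.last_rotated).days > max_age_days:
        findings.append("rotation overdue")
    if BROAD_SCOPES.intersection(token.scopes):
        findings.append("broad scope")
    return findings


# Hypothetical token records, for illustration only.
legacy = TokenRecord("legacy-deploy", None, ("all_zones:edit",), None, date(2023, 1, 1))
ci = TokenRecord("ci-dns", "platform", ("dns:edit",), date(2024, 6, 1), date(2024, 3, 1))

today = date(2024, 4, 1)
legacy_findings = review_findings(legacy, today)
ci_findings = review_findings(ci, today)
```

Checks like these do not replace a privilege review, but they make ownerless or never-expiring tokens visible as identities with a lifecycle instead of as generic admin access.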

For those cases, best practice is evolving, but the direction is clear: minimize the time a setting remains outside Terraform, and keep the exception visible until it is remediated. The broader pattern is consistent with NHIMG’s research on NHI lifecycle control and with the 230M AWS environment compromise, which shows how quickly weakly governed access paths can become systemic. Where a manual Cloudflare setting gates production traffic or privileged access, the exception should be treated as a temporary risk decision, not a stable operating model.