What breaks when security teams govern AI agents only through policy documents?

Policy documents cannot contain an agent that already has runtime access to tools, APIs, and production identities. The failure mode is governance without enforcement: teams can describe intended behaviour, but they cannot stop an overprivileged agent from acting outside scope when it is already authenticated.

Why This Matters for Security Teams

Policy documents are useful for defining intent, but they do not constrain an autonomous agent that already holds valid credentials, API tokens, or production access. The real problem is that agentic systems act at runtime, not on paper. Once an agent can chain tools, call services, and retry failed actions, a policy-only model becomes a reporting artifact rather than a control.

That gap is visible in current research. NHIMG’s AI Agents: The New Attack Surface report found that 80% of organisations say AI agents have already performed actions beyond their intended scope. Industry guidance is moving toward runtime governance for that reason, with both the OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework emphasising operational controls over written intent.

Security teams often assume that if legal, compliance, and architecture approve a policy, the control exists. In practice, many teams discover the gap only after an agent has already accessed sensitive data, taken an unintended action, or inherited privileges no reviewer expected.

How It Works in Practice

For AI agents, governance has to move from documents to enforcement. Static role-based access control is a poor fit because agents do not follow fixed human job functions. Their actions depend on prompts, tool choice, intermediate outputs, and environmental context. That is why current guidance suggests intent-based or context-aware authorization, where each request is evaluated at runtime instead of relying on a broad role assigned once at onboarding.
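As a rough illustration of per-request evaluation, the sketch below checks each agent request against a grant tied to a specific task, rather than against a broad role. The `TaskGrant` and `AgentRequest` shapes, the agent and task names, and the deny reasons are all hypothetical; a real deployment would typically delegate this decision to a policy engine.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class TaskGrant:
    """What one agent may do for one task (hypothetical shape)."""
    tools: FrozenSet[str]
    actions: FrozenSet[str]
    environments: FrozenSet[str]


@dataclass(frozen=True)
class AgentRequest:
    agent_id: str
    task_id: str
    tool: str
    action: str
    environment: str


def authorize(request: AgentRequest,
              grants: Dict[Tuple[str, str], TaskGrant]) -> Tuple[bool, str]:
    """Evaluate a single request against the grant for its agent and task."""
    grant = grants.get((request.agent_id, request.task_id))
    if grant is None:
        # Deny by default: no grant means no access, whatever the agent's role.
        return False, "no grant for this agent and task"
    if request.tool not in grant.tools:
        return False, "tool outside task scope"
    if request.action not in grant.actions:
        return False, "action outside task scope"
    if request.environment not in grant.environments:
        return False, "environment outside task scope"
    return True, "allowed"


# Illustrative data only: one agent, one read-only task.
grants = {
    ("agent-7", "summarise-tickets"): TaskGrant(
        tools=frozenset({"ticket_api"}),
        actions=frozenset({"read"}),
        environments=frozenset({"production"}),
    )
}

read_request = AgentRequest(
    "agent-7", "summarise-tickets", "ticket_api", "read", "production")
write_request = AgentRequest(
    "agent-7", "summarise-tickets", "ticket_api", "write", "production")

print(authorize(read_request, grants))
print(authorize(write_request, grants))
```

The point of the design is that the same authenticated agent gets a different answer depending on the task and action in front of it, which a role assigned once at onboarding cannot express.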

Operationally, this usually means combining workload identity, short-lived credentials, and policy-as-code. A strong design gives the agent a cryptographic workload identity, then issues just-in-time access for a specific task, scope, and TTL. Frameworks such as the NIST AI Risk Management Framework and the CSA MAESTRO agentic AI threat modeling framework point in the same direction: define risk, bind access to context, and evaluate decisions at the moment of action.

That approach is also consistent with NHIMG’s guidance in the OWASP NHI Top 10, which treats overprivilege, credential exposure, and uncontrolled delegation as recurring failure modes. In practice, teams should:

Issue ephemeral secrets per task instead of reusing long-lived tokens.
Bind each agent to a workload identity rather than a shared service account.
Evaluate policy at request time using full context, not pre-approved paper rules.
Revoke access automatically when the task ends or behaviour deviates.
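
The four practices above can be sketched as a small in-memory credential broker. This is a minimal illustration, not a production design: the `CredentialBroker` class, its method names, and the scope strings are assumptions made for the example, and a real system would rely on a secrets manager or identity provider rather than a Python dictionary.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable


@dataclass
class EphemeralCredential:
    token: str
    workload_id: str      # identity of the agent workload, not a shared account
    task_id: str
    scope: FrozenSet[str]
    expires_at: float
    revoked: bool = False


class CredentialBroker:
    """Hypothetical broker that issues one short-lived credential per task."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._issued: Dict[str, EphemeralCredential] = {}

    def issue(self, workload_id: str, task_id: str,
              scope: Iterable[str], ttl_seconds: float) -> EphemeralCredential:
        cred = EphemeralCredential(
            token=secrets.token_urlsafe(32),
            workload_id=workload_id,
            task_id=task_id,
            scope=frozenset(scope),
            expires_at=self._clock() + ttl_seconds,
        )
        self._issued[cred.token] = cred
        return cred

    def is_valid(self, token: str, required_scope: str) -> bool:
        """Checked on every request, so expiry and revocation take effect at once."""
        cred = self._issued.get(token)
        if cred is None or cred.revoked:
            return False
        if self._clock() >= cred.expires_at:
            return False
        return required_scope in cred.scope

    def revoke_task(self, task_id: str) -> int:
        """Revoke every live credential for a task; returns how many were revoked."""
        count = 0
        for cred in self._issued.values():
            if cred.task_id == task_id and not cred.revoked:
                cred.revoked = True
                count += 1
        return count
```

A caller would invoke `revoke_task` when the task completes or when monitoring flags behaviour outside the expected pattern, so the credential stops working even if its TTL has not yet elapsed.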

These controls tend to break down in legacy environments where agents must interact with shared admin accounts, static integration keys, or systems that cannot enforce per-request authorization.

Common Variations and Edge Cases

Tighter runtime controls often increase integration overhead, requiring organisations to balance safety against delivery speed. That tradeoff becomes sharper in multi-agent systems, where one agent may call another, inherit outputs, and trigger downstream actions across several trust boundaries.

There is no universal standard for this yet. Some organisations use OIDC-backed workload identity, others prefer SPIFFE/SPIRE-style identity, and some rely on proxy enforcement with policy engines such as OPA or Cedar. The right choice depends on whether the agent is read-only, whether it can write to production systems, and whether actions are reversible. For higher-risk use cases, policy documents should be treated as governance evidence, not as the control itself.
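
One way to make that decision explicit is to encode it as a small tiering function. The sketch below is illustrative only: the three tiers and the mapping from agent properties to tiers are assumptions chosen for the example, not a standard, and each organisation would set its own thresholds.

```python
from dataclasses import dataclass
from enum import Enum


class Enforcement(Enum):
    PER_REQUEST_POLICY = "per-request policy check"
    SHORT_TTL_AND_POLICY = "per-request policy check with short-lived credentials"
    HUMAN_APPROVAL = "per-request policy check plus human approval"


@dataclass(frozen=True)
class AgentProfile:
    read_only: bool
    writes_to_production: bool
    actions_reversible: bool


def required_enforcement(profile: AgentProfile) -> Enforcement:
    """Map the three questions from the text to a hypothetical enforcement tier."""
    # Check the strictest case first so an inconsistent profile fails safe.
    if profile.writes_to_production and not profile.actions_reversible:
        return Enforcement.HUMAN_APPROVAL
    if profile.read_only:
        return Enforcement.PER_REQUEST_POLICY
    return Enforcement.SHORT_TTL_AND_POLICY


reporting_agent = AgentProfile(
    read_only=True, writes_to_production=False, actions_reversible=True)
deployment_agent = AgentProfile(
    read_only=False, writes_to_production=True, actions_reversible=False)

print(required_enforcement(reporting_agent).value)
print(required_enforcement(deployment_agent).value)
```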

Edge cases also matter when agents operate across third-party SaaS, browser automation, or external APIs. NHIMG’s State of Non-Human Identity Security notes that 85% of organisations lack full visibility into third-party vendors connected via OAuth apps, which mirrors the same structural weakness: approval without runtime containment. In these environments, policy-only governance usually fails because the agent can still act through trusted integrations even when the written policy forbids it.

Best practice is evolving toward continuous control verification, not one-time policy approval. Where agents can chain tools or self-correct after failures, paper policy becomes especially fragile because the execution path is too dynamic to anticipate fully.
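
Continuous control verification can be approximated by regularly replaying known in-scope and out-of-scope requests against the live enforcement point and flagging any mismatch. The sketch below assumes a hypothetical `enforce` callable that returns `True` when a request is allowed; the probe names and request fields are invented for illustration.

```python
from typing import Callable, Dict, Iterable, List, NamedTuple


class Probe(NamedTuple):
    name: str
    request: Dict[str, str]
    expect_allowed: bool


def verify_controls(enforce: Callable[[Dict[str, str]], bool],
                    probes: Iterable[Probe]) -> List[str]:
    """Run each probe through the enforcement point and collect mismatches."""
    failures = []
    for probe in probes:
        actual = enforce(probe.request)
        if actual != probe.expect_allowed:
            failures.append(
                f"{probe.name}: expected allowed={probe.expect_allowed}, got {actual}"
            )
    return failures


def example_enforce(request: Dict[str, str]) -> bool:
    """Stand-in for a real enforcement point: only reads are allowed."""
    return request.get("action") == "read"


probes = [
    Probe("in-scope read", {"tool": "ticket_api", "action": "read"}, True),
    Probe("out-of-scope delete", {"tool": "ticket_api", "action": "delete"}, False),
]

# An empty list means every probe behaved as the written policy says it should.
print(verify_controls(example_enforce, probes))
```

Run on a schedule, a check like this turns the written policy into a set of expectations that are tested against what the enforcement layer actually does.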

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A2	Covers broken agent authorization and overprivileged autonomous actions.
CSA MAESTRO	M1	Maps agent risk to threat modeling and operational guardrails.
NIST AI RMF	GOVERN	Requires accountability and operational governance for AI system risk.

Assign ownership for agent behaviour and verify that controls actually prevent misuse rather than merely documenting it.
