TL;DR: Security teams cannot see what AI agents access, how they change behaviour, or which tools they chain together once deployed, according to Zenity. That visibility gap makes prompt filtering and other input-focused controls insufficient because the real risk sits in agent autonomy and runtime decision-making.
At a glance
What this is: This interview argues that AI agents create a governance gap because teams cannot reliably observe or control what the agents do after deployment.
Why it matters: IAM, PAM, and identity governance programmes now need to account for agent behaviour, not just access issuance, if they want to manage risk across NHI, autonomous, and human identities.
By the numbers:
- 80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%).
Context
AI agent governance is the discipline of controlling what an agent can access, do, and decide at runtime. This interview shows why traditional security playbooks fail once an agent can chain tools, switch context, and continue acting without a human approval gate.
The identity problem is broader than prompt safety. For security, IAM, and IGA teams, the question is whether current control models can handle non-human actors whose behaviour changes after deployment and whose access patterns are not stable enough to review in the normal way.
Key questions
Q: What breaks when security teams rely only on prompt filtering for AI agents?
A: Prompt filtering misses the actual security boundary because an AI agent can still chain tools, call APIs, and continue executing after a harmless-looking request. That leaves behaviour, context, and downstream access outside the control model. Security teams need runtime visibility and policy enforcement over actions, not just over input text.
Q: Why do AI agents complicate identity governance programmes?
A: AI agents complicate identity governance because their access is not just granted once and reviewed later. They can change context, choose tools, and complete risky actions within a single session, which breaks review cycles built for slower-moving identities. Governance has to measure what the agent can do at runtime, not only what it was allowed to do at provisioning.
Q: How do security teams know whether agent governance is actually working?
A: They know it is working when they can see the full agent session, including tool selection, API calls, and context switches, and when policy blocks actions outside the approved workflow. If investigations still depend on guessing from logs after the fact, then the governance model is not yet controlling behaviour effectively.
Q: Who is accountable when an AI agent leaks data or misuses access?
A: Accountability sits with the organisation that deployed the agent and assigned its permissions, because the agent is acting inside an owned workflow rather than on its own behalf. Security, IAM, and application owners need a clear division of responsibility for approval, telemetry, and remediation before the agent is put into production.
Technical breakdown
Why prompt filtering does not control agent behaviour
Prompt filters and DLP controls inspect input and output, but they do not govern the agent’s internal decision path. An AI agent can still choose tools, combine API calls, and persist through a workflow even when the prompt itself looks harmless. That is why the interview frames the real risk as logic, context, and action chaining rather than content injection alone. The control gap is structural: if the system can decide how to complete a task, then the security boundary has moved from text to behaviour.
Practical implication: security teams need telemetry and policy enforcement around actions, not just prompt content.
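As a minimal sketch of what action-level enforcement could look like, the guard below checks each tool call against a per-workflow allowlist and a chain-depth limit, instead of inspecting prompt text. The class names, tool names, and the depth limit are hypothetical and are not taken from the interview or from any product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WorkflowPolicy:
    """Hypothetical policy: which tools a workflow may call and how long a chain may grow."""
    allowed_tools: frozenset
    max_chain_depth: int = 5


@dataclass
class ActionGuard:
    """Authorizes individual actions at runtime and remembers the chain so far."""
    policy: WorkflowPolicy
    chain: list = field(default_factory=list)  # tools already invoked in this session

    def authorize(self, tool: str) -> bool:
        # Block tools outside the approved workflow, however harmless the prompt looked.
        if tool not in self.policy.allowed_tools:
            return False
        # Block further calls once the approved chain length is used up.
        if len(self.chain) >= self.policy.max_chain_depth:
            return False
        self.chain.append(tool)
        return True


# Illustrative tool names only.
policy = WorkflowPolicy(
    allowed_tools=frozenset({"search_suppliers", "read_invoice"}),
    max_chain_depth=3,
)
guard = ActionGuard(policy)
requested = ["search_suppliers", "read_invoice", "send_payment", "read_invoice", "read_invoice"]
results = [guard.authorize(tool) for tool in requested]
```

In this sketch the decision is made per action, so a request that passes a prompt filter can still be stopped at the step where it reaches for an unapproved tool.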
AI agent autonomy and runtime decision-making
An autonomous agent is different from ordinary automation because it makes runtime decisions about what to do next, which tools to use, and when to act. Once that behaviour is present, static approval models become weaker because the risky step may occur several calls deep inside a task chain. The interview’s emphasis on agent behaviour reflects a shift from access control as a provisioning event to access control as an ongoing runtime condition. This is why agent governance cannot be reduced to chatbot moderation.
Practical implication: classify agent workflows by decision authority before assigning them human-style approval workflows.
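One way to make that classification concrete is sketched below. The three tiers, the two classification questions, and the control mapping are assumptions made for illustration; the interview does not prescribe a specific taxonomy.

```python
from enum import Enum


class DecisionAuthority(Enum):
    SCRIPTED = "scripted"      # fixed steps, no runtime choice of tools
    SUPERVISED = "supervised"  # chooses tools, but a human approves side effects
    AUTONOMOUS = "autonomous"  # chooses tools and acts without approval


def classify(chooses_tools: bool, acts_without_approval: bool) -> DecisionAuthority:
    """Assign a tier from two yes/no questions about the workflow."""
    if not chooses_tools:
        return DecisionAuthority.SCRIPTED
    if acts_without_approval:
        return DecisionAuthority.AUTONOMOUS
    return DecisionAuthority.SUPERVISED


# Hypothetical mapping from tier to the minimum controls expected before deployment.
REQUIRED_CONTROLS = {
    DecisionAuthority.SCRIPTED: ["periodic access review"],
    DecisionAuthority.SUPERVISED: ["periodic access review", "human approval gate"],
    DecisionAuthority.AUTONOMOUS: ["runtime policy enforcement", "session telemetry"],
}

tier = classify(chooses_tools=True, acts_without_approval=True)
controls = REQUIRED_CONTROLS[tier]
```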
Workflow telemetry, context switching, and auditability
Agent governance depends on seeing the full task chain, including tool selection, context switches, and downstream API calls. Without that telemetry, teams cannot reconstruct what the agent accessed, why it acted, or how far the action propagated. The interview's procurement example shows how a vague prompt can trigger a long sequence of legitimate calls that still produce a harmful outcome. Visibility must therefore cover the whole workflow, not a single request-response pair.
Practical implication: instrument end-to-end workflow logs so investigations can trace decisions across the full agent session.
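A minimal sketch of session-level telemetry follows, assuming a simple in-memory log keyed by a session ID. The event kinds, field names, and example details are hypothetical; a real deployment would send these records to whatever logging pipeline the organisation already runs.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class AgentEvent:
    session_id: str
    step: int
    kind: str    # illustrative kinds: "tool_selected", "context_switch", "api_call"
    detail: str
    timestamp: float = field(default_factory=time.time)


class SessionLog:
    """Collects every event of one agent session in order."""

    def __init__(self):
        self.session_id = str(uuid.uuid4())
        self.events = []

    def record(self, kind, detail):
        event = AgentEvent(self.session_id, len(self.events) + 1, kind, detail)
        self.events.append(event)
        return event

    def to_jsonl(self):
        # One JSON object per line, so the session can be shipped to a log store.
        return "\n".join(json.dumps(asdict(e)) for e in self.events)

    def trace(self):
        # Compact ordered view for reconstructing the task chain during an investigation.
        return [f"{e.step}:{e.kind}:{e.detail}" for e in self.events]


log = SessionLog()
log.record("tool_selected", "supplier_lookup")
log.record("api_call", "GET /suppliers")
log.record("context_switch", "procurement -> finance")
```

Because every event carries the same session ID and a step number, an investigator can read the chain in order rather than inferring it from unrelated request logs.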
Threat narrative
Attacker objective: The attacker aims to turn a trusted agent workflow into a path for data exposure, fraud, or credential-led abuse.
- entry: a normal-looking prompt or email reaches an AI agent that already has valid workflow access.
- escalation: the agent chains API calls and tool use in response to the prompt, expanding the action beyond the original intent.
- impact: sensitive supplier or financial data is exposed or redirected through the agent’s downstream actions.
Breaches seen in the wild
- Moltbook AI agent keys breach: exposed 1.5M AI agent keys.
- AI LLM hijack breach: attackers used stolen AWS access keys to hijack Anthropic LLMs on Bedrock.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
AI agent governance fails when security still treats the prompt as the control surface. The interview correctly identifies that the real risk sits in behaviour, context, and tool chaining, not just in content filtering. That means the control problem is no longer whether a prompt is acceptable, but whether the agent’s runtime actions are observable and governable. Practitioners should treat agent logic as the security boundary.
Autonomous behaviour collapses the assumption that access can be reviewed after it is used. Access review was designed for conditions where privilege persists long enough to be observed, certified, and revoked. That assumption fails when the actor decides and executes within the same session, because the meaningful risk may be completed before the review cycle even starts. The implication is that governance must be rebuilt around runtime authority, not periodic certification.
Agent telemetry is becoming the missing identity control for AI systems. If teams cannot see tool selection, context switching, and downstream API activity, then they cannot distinguish intended behaviour from abuse. This is not a monitoring nicety but a prerequisite for identity governance in autonomous environments. Practitioners should treat observability as part of the control plane, not the reporting layer.
The category boundary between NHI and autonomous identity is already visible in this problem. The interview shows that AI agents behave more like non-human identities with dynamic decision authority than like fixed automation. Once that boundary is crossed, the old split between access management and security monitoring becomes too narrow to explain risk. Security leaders should align governance models to the actor’s runtime authority, not the product label.
“Identity blast radius” is the right concept for this problem. The damage no longer comes from a single compromised credential or misread prompt, but from how far an agent can travel through trusted systems once it begins chaining actions. That blast radius is defined by tool scope, context persistence, and downstream privileges. Practitioners should measure the reach of agent actions, not just the sensitivity of the initial input.
From our research:
- 85% of organisations lack full visibility into third-party vendors connected via OAuth apps, with 38% having no or low visibility and 47% having only partial visibility, according to The State of Non-Human Identity Security.
- That same research found only 1.5 out of 10 organisations are highly confident in their ability to secure NHIs, which shows the visibility gap is part of a broader governance deficit.
- For the wider identity context, see Top 10 NHI Issues for the control patterns most often missed when non-human access expands faster than oversight.
What this signals
Identity blast radius is now the most useful way to think about AI agent risk. When an agent can chain tools and continue acting after the initial prompt, the effective privilege boundary is no longer the permission set alone but the reachable set of downstream actions. Security teams should prepare to measure agent reach, not just access grants.
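The difference between a permission set and a reachable set can be illustrated with a small graph traversal. This is a sketch under stated assumptions: the reach graph and all node names below are hypothetical, and real reach would have to be derived from actual tool scopes and downstream privileges.

```python
from collections import deque

# Hypothetical reach graph: each node maps to what it can call or touch next.
REACH = {
    "procurement_agent": ["supplier_api", "email_tool"],
    "supplier_api": ["supplier_db"],
    "email_tool": ["external_recipients"],
    "supplier_db": [],
    "external_recipients": [],
    "payroll_db": [],  # exists, but nothing the agent reaches leads here
}


def blast_radius(graph, start):
    """Breadth-first search: everything reachable from start, excluding start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


direct = set(REACH["procurement_agent"])             # what the grant says
reachable = blast_radius(REACH, "procurement_agent")  # what chaining allows
```

Here the direct grant covers two tools, while the reachable set also includes the supplier database and external recipients, which is the gap an access review of the grant alone would miss.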
With 98% of organisations planning to deploy even more AI agents in the next 12 months and 80% already seeing out-of-scope behaviour, governance will be tested by scale before it is fully mature. That makes runtime telemetry and action-level policy control a programme priority, not an advanced capability.
Security teams that already struggle with third-party OAuth visibility should expect the same blind spot to appear in agent workflows unless they redesign identity oversight for runtime behaviour. The underlying problem is familiar, but the actor is now more dynamic, more autonomous, and harder to certify after the fact.
For practitioners
- Instrument agent workflows end to end: Log tool selection, context switches, API calls, and downstream side effects so investigations can reconstruct the full agent session instead of a single prompt exchange.
- Move policy enforcement to runtime behaviour: Define behavioural boundaries for each agent workflow and block actions that exceed the approved task chain, rather than relying on prompt filtering alone.
- Classify agent authority before deployment: Document which agents can act independently, which require human approval, and which should remain constrained to pre-approved workflows before they are put into production.
- Audit full workflow chains, not isolated prompts: Test the complete sequence from initial request through API execution and data return so security reviews capture the real impact path of the agent.
- Treat AI agents as non-human identities: Apply identity governance, access review, and lifecycle thinking to agents as runtime actors with behaviour that can change after deployment.
Key takeaways
- AI agents create an identity governance problem because they can change behaviour after deployment and act beyond the prompt that initiated them.
- The evidence is already broad, with organisations reporting out-of-scope agent behaviour, limited audit visibility, and growing deployment pressure.
- The control shift is clear: teams need runtime telemetry, behavioural policy, and lifecycle governance for non-human actors, not just prompt filtering.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A2 | The article centres on agent behaviour, tool chaining, and runtime autonomy. |
| NIST AI RMF | | AI governance needs accountability, monitoring, and lifecycle controls for autonomous behaviour. |
| OWASP Non-Human Identity Top 10 | NHI-03 | Agent access and behaviour resemble non-human identity lifecycle and privilege problems. |
Treat AI agents as NHIs and review their access, telemetry, and lifecycle boundaries regularly.
Key terms
- AI Agent Governance: AI agent governance is the set of controls used to define, monitor, and limit what an agent can do at runtime. It combines identity, access, policy, and telemetry so that behaviour stays inside an approved operating boundary even when the agent makes decisions independently.
- Runtime Telemetry: Runtime telemetry is the operational data that shows what an identity or system is doing while it is active. For AI agents, it includes tool selection, context switches, API calls, and decision traces that help security teams reconstruct behaviour and prove whether policy was followed.
- Identity Blast Radius: Identity blast radius is the amount of damage an identity can cause once it begins to act outside intended scope. In AI agent environments, it is shaped by tool reach, downstream privileges, and how far actions can propagate before human review intervenes.
- Behavioural Boundary: A behavioural boundary is the set of rules that defines which actions an identity may take, not just which resources it may access. For autonomous or agentic systems, it must be enforced at runtime because the actor can choose different actions as context changes.
This post draws on content published by Zenity: Most Security Teams Have No AI Agent Strategy. Here’s How to Build One. Read the original.
Published by the NHIMG editorial team on 2025-07-11.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org