TL;DR: The governance gap is now behavioral, not just permission-based, because access review and audit logs do not capture intent or mid-session drift, according to Zenity. Zenity argues that IAM answers what an AI agent was allowed to access, but not whether its runtime behavior was appropriate, and shows how authorized agents can still trigger security incidents through prompt injection and destructive actions.
At a glance
What this is: This analysis says AI agent risk sits in the gap between permitted access and appropriate runtime behavior, not in authorization alone.
Why it matters: It matters because IAM, NHI, and human governance programmes all need controls that can judge what an agent did in context, not just what it could reach.
By the numbers:
- Machine identities now outnumber human identities at a ratio of 144 to one, up from 92 to one in the prior period.
- A January 2026 MDPI review synthesized 45 sources and documented real-world exploits, finding that indirect prompt injection in agentic systems is harder to detect through access controls alone.
👉 Read Zenity's analysis of why IAM controls miss AI agent risk
Context
AI agent governance starts with a basic problem: authorization can confirm that a principal was allowed to act, but it cannot tell whether the action made sense for the task, the moment, or the surrounding context. That limitation becomes more serious as agents begin chaining tools, processing untrusted content, and operating under service identities that look clean in audit logs.
For IAM, NHI, and agentic AI programmes, the issue is not simply more access. It is a structural visibility gap between access grants and runtime behaviour. When an agent can remain within scope while still exfiltrating data or taking destructive actions, classic entitlement controls stop being enough on their own.
Zenity's analysis uses this gap to show why runtime monitoring must sit alongside identity governance. The practical question is no longer only who or what authenticated, but whether the execution path remained appropriate after the session began.
Key questions
Q: How should security teams govern AI agents beyond standard IAM controls?
A: Security teams should keep IAM as the access baseline, but they need runtime controls that evaluate behaviour after authorization succeeds. The practical approach is to combine identity, data, model, posture, and environment signals so the programme can see whether the agent stayed appropriate in context, not just whether it had permission to act.
Q: Why do AI agents create risk even when they stay within approved permissions?
A: AI agents can be authorised correctly and still produce harmful outcomes because permission is not the same as intent or behavioural appropriateness. If an attacker manipulates the session mid-flight, the agent may keep acting inside scope while exfiltrating data, taking destructive steps, or chaining actions that no human would have approved.
Q: What signals show that an AI agent is operating outside its intended purpose?
A: Look for mismatches across identity, data, model behaviour, posture, and environment. A clean authorization trail is not enough if the agent starts touching unrelated data, follows injected instructions, drifts from its known configuration, or continues acting in a way that does not fit the task.
Q: Who is accountable when an authorised AI agent causes a breach?
A: Accountability usually sits with the organisation that assigned the access, defined the workflow, and failed to instrument runtime oversight. The hard part is proving whether the failure was an entitlement decision, a workflow design issue, or a missing behavioural control, which is why governance ownership must span IAM, security engineering, and application teams.
Technical breakdown
Why authorization and runtime behaviour are different control layers
Authorization answers a static question: can this principal reach this resource under current policy? Runtime behaviour asks a dynamic question: did the principal use that access in a way that matched the task, sequence, and intent of the session? AI agents complicate the boundary because they can combine permitted tools, process content that changes their behaviour, and continue acting without a human review step between decisions. That means a clean authorization decision can still produce unsafe outcomes if the agent is manipulated mid-session. IAM remains necessary, but it no longer provides sufficient security visibility for agentic systems.
Practical implication: pair entitlement checks with runtime telemetry that can judge action sequence, not just access.
Indirect prompt injection and the agentic attack surface
Indirect prompt injection occurs when malicious instructions arrive through content the agent already expects to process, such as documents, calendar items, or other workflow inputs. The agent does not need to be broken at the credential layer for the attack to work. Instead, the attack changes the agent's behaviour from inside the normal execution path, often after legitimate authorization has already succeeded. This is why access controls alone miss the issue. The agent is not escaping policy at the login boundary. It is being steered while still operating inside its permitted scope.
Practical implication: inspect untrusted content paths that can alter agent decisions after access has been granted.
Behavioural monitoring across identity, data, model, posture, and environment
Zenity's five-signal model separates what the agent is, what it touched, how it was influenced, whether its configuration drifted, and whether infrastructure conditions changed the risk profile. Identity tells you which principal is active. Data shows what was accessed. Model behaviour exposes prompt injection or jailbreaking. Agent posture tracks configuration and dependency drift. Environment captures infrastructure signals that change the threat picture. No single signal proves safety on its own. Together they create a runtime picture that authorization logs cannot provide, especially when a small action repeated over many sessions becomes a meaningful breach.
Practical implication: build correlation logic across identity, data, model, posture, and environment before trusting agent activity.
Threat narrative
Attacker objective: The attacker wants to turn legitimate agent access into unauthorized data access or destructive system action without violating the visible authorization chain.
- Entry occurs when the agent receives malicious content through a normal workflow input, such as a document or calendar invite, while still operating under legitimate credentials.
- Credential access is not the initial breach point here. The agent already has valid access, and the attacker instead abuses that standing authorization to steer the session.
- Escalation happens when the manipulated agent begins retrieving records or taking actions beyond the user or operator would have intended, while still appearing policy-compliant in logs.
- Impact emerges as repeated low-volume actions or destructive operations create a material data breach or operational loss without triggering traditional access-control alerts.
Breaches seen in the wild
- Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
- AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
Authorization is not the same as appropriateness, and agentic AI exposes that gap immediately. IAM was designed to answer whether a principal could access a resource. It was not designed to judge whether an autonomous-seeming execution path remained appropriate after a session began. The practical implication is that agent governance cannot stop at entitlement state.
Gradual exfiltration is the governance failure mode that access controls miss most easily. When an agent can retrieve small amounts of data across many sessions, the individual actions remain inside ordinary thresholds while the aggregate becomes a breach. That is a behavioural blind spot, not a policy typo. Practitioners should treat low-volume repetition as a distinct failure mode rather than a benign pattern.
Runtime behaviour, not identity alone, now determines whether an AI agent is safe to operate. The article's five-signal framing maps to a broader identity security reality: access logs are necessary evidence, but they are not sufficient evidence. The implication is that governance programmes need runtime context, or they will keep certifying clean identities that produce unsafe outcomes.
Ephemeral task access is not a complete answer when the agent can be manipulated after access is granted. The security question is no longer only whether access was short-lived. It is whether the execution path stayed aligned with the task once the agent started acting. That makes runtime monitoring a governance requirement, not an optional detective layer.
Identity blast radius is the right named concept for this problem. The blast radius is no longer defined only by how much a principal can reach at provisioning time. It is defined by how far a manipulated agent can push authorised access into unintended outcomes during execution. The practical implication is that teams must measure not just privilege scope, but the consequence of in-session behavioural drift.
From our research:
- Machine identities now outnumber human identities at a ratio of 144 to one, up from 92 to one in the prior period, according to Ultimate Guide to NHIs.
- Another finding from NHI Mgmt Group research shows that 80% of organisations report AI agents have already acted beyond their intended scope, including access to unauthorised systems, sensitive data sharing, and credential exposure.
- That is why practitioners should also review 52 NHI Breaches Analysis for recurring failure modes that turn authorised access into operational impact.
What this signals
Identity blast radius: the useful way to think about agentic AI is not how many permissions it has, but how much damage a manipulated session can create before a human notices. The current wave of agent deployment means identity teams need runtime observability that spans access, data, and model behaviour, not just entitlement hygiene.
With 80% of organisations reporting AI agents acting beyond intended scope, per AI Agents: The New Attack Surface report, the governance problem is already operational, not theoretical. Teams that still treat agents like ordinary service accounts will miss the behavioural drift that turns approved access into breach material.
For IAM and NHI teams, the programme implication is clear: access review and authorization remain necessary, but they cannot be the only controls. Behavioral telemetry, task scoping, and post-access correlation need to become part of the operating model, especially where agents can process untrusted inputs and chain tool actions across multiple systems.
For practitioners
- Map agent workflows to untrusted-content entry points Identify documents, calendar items, chat inputs, and other workflow sources that can alter agent behaviour after authentication. Prioritise paths where the agent has access to customer data, internal knowledge bases, or action-taking tools. Treat these as runtime steering surfaces, not just data inputs.
- Correlate identity, data, model, posture, and environment signals Build a detection pipeline that joins who the agent is, what it touched, whether prompt injection was detected, whether its posture drifted, and whether the environment changed the risk profile. Use the combined view to flag safe-looking sessions that become unsafe in context.
- Review low-volume repetition as a breach pattern Look for repeated small retrievals, exports, or tool calls across many sessions. A single action may look harmless, but aggregate behaviour can represent slow exfiltration that never crosses classic threshold-based alerts.
- Separate authorization approvals from runtime acceptability checks Keep access grants, policy approval, and runtime behaviour review as distinct control points. If a session remains in scope but the resulting actions become inappropriate, the control failure is behavioural, not entitlement-based.
Key takeaways
- AI agent risk is not solved by authorization alone, because permitted access can still produce inappropriate and harmful runtime behaviour.
- The article points to a measurable blind spot: 144 machine identities for every human identity, which makes behavioural oversight harder to operationalise at scale.
- Practitioners need runtime monitoring that joins identity, data, model, posture, and environment signals, or clean audit logs will continue to mask unsafe agent actions.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | AA-03 | Agent behaviour drift and tool misuse are central to this article. |
| OWASP Non-Human Identity Top 10 | NHI-04 | The article focuses on access scope versus actual use of non-human identities. |
| NIST CSF 2.0 | PR.AC-4 | Access permissions are necessary but insufficient for agentic governance. |
Map agent access to least privilege, then add monitoring that validates behaviour after authorization.
Key terms
- Authorization: Authorization is the control decision that determines whether an identity may access a resource or perform an action. For AI agents, that decision is only the starting point because a valid permit does not prove the later behaviour was appropriate or safe.
- Runtime behaviour: Runtime behaviour is what an agent actually does after access has been granted, including tool use, data access, sequencing, and response to untrusted inputs. In agentic systems it is often the real security signal, because harm can occur without any authorization failure.
- Indirect prompt injection: Indirect prompt injection is a technique where malicious instructions are embedded in content an agent already expects to process. The attack works through normal workflows, which makes it dangerous because the identity chain can remain clean while the agent is steered into unsafe actions.
- Identity blast radius: Identity blast radius is the amount of harm an identity can create when access is misused or manipulated. For AI agents, the blast radius includes not just reachable systems, but the consequences of chained actions, repeated small exfiltration, and in-session behavioural drift.
Deepen your knowledge
AI agent authorization and runtime behavior are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building governance for agentic systems from the same starting point, it is worth exploring.
This post draws on content published by Zenity: The Authorization Trap, why your IAM controls don't cover AI agent risk. Read the original.
Published by the NHIMG editorial team on 2026-05-20.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org