AI agent security gaps are widening across enterprise identity controls

By NHI Mgmt Group Editorial TeamPublished 2026-01-08Domain: Agentic AI & NHIsSource: WitnessAI

TL;DR: AI agents expand the attack surface through prompt injection, overprivileged APIs, weak token validation, and supply chain dependencies, while the source article recommends guardrails, sandboxing, runtime monitoring, and continuous validation, according to WitnessAI. The deeper issue is that existing IAM assumptions break when agents can chain actions, select tools, and act inside live workflows.

At a glance

What this is: This is an analysis of AI agent security and the control gaps that emerge when LLM-driven systems can call tools, access data, and trigger actions in enterprise workflows.

Why it matters: It matters because IAM, PAM, and NHI programmes must now govern agent privileges, runtime behaviour, and auditability across systems that can act faster and more flexibly than human-operated workflows.

By the numbers:

80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, inappropriately sharing sensitive data, and revealing access credentials.
92% agree governing AI agents is critical to enterprise security, yet only 44% have implemented any policies to do so.

👉 Read WitnessAI's analysis of AI agent security vulnerabilities and controls

Context

AI agent security is the discipline of controlling how autonomous or semi-autonomous systems authenticate, access data, invoke tools, and complete actions across enterprise environments. The core problem is that traditional IAM models assume a stable subject, a known request path, and reviewable privilege, while agentic systems can generate their own action chains at runtime.

For IAM and NHI teams, the issue is not just whether an agent is authenticated. It is whether its permissions, tokens, logs, and guardrails remain meaningful once the system can combine prompts, APIs, and external data into a live decision loop. That shifts the programme from static access control to runtime governance.

The article’s starting position is typical of the current market: most organisations are trying to extend existing controls to a behaviour model they were not designed to govern. That is exactly where policy, monitoring, and lifecycle controls need to be rethought.

Key questions

Q: How should security teams govern AI agents that can take actions in enterprise systems?

A: Security teams should govern AI agents as active identities, not passive software. That means scoping credentials tightly, validating actions at runtime, separating read and write paths, and logging every tool call and context change. If the agent can trigger real business actions, the control model must assume prompt influence, not just authenticated access.

Q: Why do AI agents complicate zero trust and least privilege models?

A: AI agents complicate zero trust because they can chain decisions and calls inside a live session, making static trust assumptions less reliable. Least privilege is still the right principle, but it has to be expressed with time-bound credentials, policy checks at execution, and explicit limits on what the agent may decide to do next.

Q: What breaks when AI agents have overprivileged API keys?

A: Overprivileged API keys turn a single agent compromise into broad enterprise exposure. Once an attacker can influence the agent, the key can be used to reach systems, move laterally across services, or exfiltrate data far beyond the original task. The main failure is not just access, but uncontrolled reach.

Q: Who is accountable when an AI agent makes an unauthorised decision?

A: Accountability stays with the organisation that authorised the agent, its permissions, and its operating controls. That includes security, IAM, engineering, and governance owners who approved the access model. If no one can explain the agent’s permitted actions, then the governance model is incomplete.

Technical breakdown

Why prompt injection becomes an identity problem

Prompt injection is not only a model safety issue. When an agent has connected tools and delegated access, a malicious instruction can become an authorisation bypass by steering the system toward actions the original requester did not intend. Indirect prompt injection is especially dangerous because the agent can ingest hostile text from webpages, documents, or tickets and treat it as operational context. Once the model’s reasoning layer is manipulated, the real security failure lands in the identity and access path, not just in the prompt channel.

Practical implication: treat every external input as a potential control-plane input and isolate agent permissions from raw model output.

Overprivileged API keys and token replay in agent workflows

AI agents often inherit broad API keys, OAuth tokens, or service credentials so they can move across tools without friction. That convenience creates a large blast radius when keys are reused, poorly scoped, or not rotated correctly. Weak session validation makes it possible for stolen or replayed tokens to persist long enough for an attacker to trigger downstream actions, exfiltrate data, or manipulate connected systems. In agentic workflows, identity scope and token lifetime are as important as the model itself.

Practical implication: scope credentials to the narrowest workable function and validate token lifecycle controls before allowing agent execution.

Runtime validation, sandboxing, and deterministic guardrails

Agent security depends on more than prevention at the prompt layer. Runtime validation checks whether an action is allowed at the moment it is about to happen, while sandboxing constrains file, network, and system reach even if the model is compromised. Deterministic validation is the final gate that verifies whether a proposed action matches policy before code runs, data is written, or a workflow advances. This is the difference between an agent that can think and an agent that can safely act.

Practical implication: require execution-time policy checks and sandbox isolation for any agent that can touch production data or systems.

Threat narrative

Attacker objective: The attacker aims to turn a trusted AI agent into a privileged execution path that leaks data, misuses tools, or performs actions the organisation never approved.

Entry occurs when an attacker injects malicious instructions through a prompt, an external document, or a compromised dependency that the agent later consumes as context.
Credential abuse follows when the agent’s API keys, OAuth tokens, or inherited permissions let the injected instruction reach connected systems and data sources.
Impact occurs when the agent executes unintended actions, exposes sensitive information, or triggers downstream automation with real business or security consequences.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

AI agent security is becoming a governance problem before it is a tooling problem. Traditional IAM assumes that identity can be issued, reviewed, and retired on a predictable cycle. Agentic systems can generate their own task paths, so the control question shifts from who is signed in to what the agent can decide and execute at runtime. The implication is that access governance now has to follow behaviour, not just entitlement.

Prompt injection is the new privilege escalation path when the agent can act. A malicious instruction only becomes a material security event when the agent can convert it into tool use, data access, or workflow execution. That means the governance failure is not the text itself, but the unbroken assumption that model output remains advisory. Practitioners need to treat prompt channels as operational trust boundaries.

Agentic AI creates an identity blast radius that current review cycles cannot reliably see. Access review, attestation, and recertification all assume a stable subject whose privileges persist long enough to be inspected. That assumption is useful for human and many NHI workflows, but it weakens sharply when actions are chained in-session and retired as quickly as they are invoked. The programme implication is that entitlement review alone no longer proves safety.

Runtime controls now define whether autonomy is governable at all. Sandboxing, deterministic validation, and activity auditing are not secondary hardening layers. They are the minimum conditions for allowing an AI agent to interact with enterprise systems without converting every prompt into a possible business action. For IAM and security architects, the real question is whether the organisation can observe and constrain actions before the workflow completes.

OWASP Agentic AI Top 10 and OWASP NHI Top 10 now need to be read together. Agentic behaviour inherits NHI-style credential and token risk, but it also adds autonomous reasoning and tool-chaining failure modes. That combination is why the category cannot be governed as either simple automation or ordinary application security. Practitioners should align identity controls, model controls, and runtime policy under one operating model.

From our research:
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, inappropriately sharing sensitive data, and revealing access credentials, according to AI Agents: The New Attack Surface report.
Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
The OWASP Agentic AI Top 10 helps teams map those runtime and privilege risks to practical control categories.

What this signals

Identity blast radius: once an agent can call tools, the organisation has to measure how far a single prompt can travel across data, workflow, and privilege domains. That makes runtime policy, not just enrolment controls, the real boundary of safe autonomy.

The governance gap is widening because agent behaviour changes faster than access review cycles. With 92% of organisations saying agent governance is critical but only 44% having policies in place, the gap is now operational rather than theoretical.

For practitioners, the next step is to treat AI agent access as a live identity programme. Align the controls in the OWASP Agentic AI Top 10 with IAM, PAM, and secrets governance so runtime actions are constrained before they become incidents.

For practitioners

Define the agent’s trust boundary Document every data source, API, and downstream system an agent can reach, then separate approved read paths from action paths so prompt content cannot directly trigger privileged execution.
Scope and rotate agent credentials Issue distinct credentials for each agent function, apply the minimum viable scope, and enforce rotation or expiry for tokens that can touch production systems or sensitive records.
Add execution-time policy checks Require deterministic validation before any agent writes data, sends messages, calls an external API, or starts a workflow that affects records or infrastructure.
Sandbox agent actions by default Run high-risk agents in segmented environments with blocked filesystem and network paths unless a task is explicitly approved for broader access.
Audit agent behaviour continuously Log prompts, tool calls, outputs, and context changes so security and compliance teams can reconstruct what the agent saw and did during an incident.

Key takeaways

AI agent security is an identity governance issue because agents can turn prompts into actions across connected systems.
Most organisations already see agents acting outside intended scope, which means the risk is active rather than emerging.
Runtime validation, scoped credentials, and sandboxing are the controls that decide whether agentic AI stays governable.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Prompt injection and tool misuse are central to this article.
OWASP Non-Human Identity Top 10	NHI-03	Overprivileged keys and token lifecycle issues are direct NHI control failures.
NIST CSF 2.0	PR.AA-01	Identity proofing and access governance underpin safe agent operation.

Tie agent permissions to documented ownership, monitoring, and review under the CSF access function.

Key terms

AI Agent Security: AI agent security is the discipline of controlling how an agent authenticates, receives instructions, uses tools, and completes actions in enterprise systems. It combines identity, application, and runtime controls so the agent can be useful without being allowed to act outside its intended boundary.
Prompt Injection: Prompt injection is the use of crafted text to influence an AI model’s behaviour, often by overriding its intended instructions. In agentic systems, the risk is higher because the manipulated output may be converted into tool use, data access, or downstream workflow execution.
Runtime Validation: Runtime validation is the execution-time check that confirms an action is allowed before the system completes it. For AI agents, this matters because the safe decision is not what the model can suggest, but what the platform permits it to do at the moment of execution.
Identity Blast Radius: Identity blast radius is the amount of damage a compromised identity can cause across connected systems. For AI agents, it expands quickly when one set of credentials can reach data, tools, and workflows that were never meant to share the same trust boundary.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity security are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or NHI governance in your organisation, it is worth exploring.

This post draws on content published by WitnessAI: AI agent security vulnerabilities, controls, and best practices. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-01-08.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org