LLM security risks expose the limits of traditional IAM controls

By NHI Mgmt Group Editorial Team | Published 2026-04-11 | Domain: Agentic AI & NHIs | Source: WitnessAI

TL;DR: LLMs create runtime security gaps that traditional tools were never built to handle, with 29% of cybersecurity leaders reporting an attack on enterprise GenAI infrastructure, according to WitnessAI. The practical issue is not model quality alone but control-plane failure: access, output, and action governance must move outside the model and into runtime enforcement.

At a glance

What this is: This article breaks down the OWASP Top 10 risks for LLMs and shows why conventional security controls miss the runtime, context, and action risks created by AI systems.

Why it matters: IAM, PAM, and security teams now have to govern what models can see, say, and do across NHI, autonomous, and human workflows, not just protect credentials and apps.

By the numbers:

29% of cybersecurity leaders said their organizations had experienced an attack on enterprise GenAI infrastructure.
72% of organisations have experienced or suspect they have experienced a breach of non-human identities, 46% confirmed and 26% suspected.

Context

The core issue in LLM security is simple: these systems accept natural language, retrieve context, and trigger downstream actions in ways that conventional application security was never designed to govern. For IAM teams, the practical question is not whether the model is clever enough, but whether identity controls can keep up with runtime behaviour, prompt handling, and agentic action.

That gap shows up across non-human identity, agentic AI, and human usage. Service accounts, tokens, and tool-connected agents all become part of the control plane once the model can query systems or execute workflows, which is why runtime policy and least privilege matter more than static perimeter controls.

The source article is typical of the current market conversation: LLM risk is no longer being described as an abstract research concern, but as a governance problem that touches discovery, inspection, authorisation, and action control at the same time.

Key questions

Q: How should security teams govern LLMs that can call tools and APIs?

A: Treat LLMs with tool access as privileged runtime identities. Map every connected credential, require independent authorisation for sensitive actions, and inspect both prompts and outputs before downstream systems act. The model should recommend, but it should not be the final trust decision for data access, record changes, or external communications.
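One way to picture output inspection is a validator that sits between the model and the tool layer and rejects any proposed call that falls outside an approved schema. The following is a minimal sketch, assuming the model emits its proposed tool call as JSON; the action names and the APPROVED_ACTIONS table are hypothetical and stand in for whatever a real deployment defines.

```python
import json

# Hypothetical approved schema: action name -> required argument names and types.
APPROVED_ACTIONS = {
    "lookup_order": {"order_id": str},
    "send_summary": {"recipient": str, "body": str},
}


def validate_model_output(raw_output: str) -> dict:
    """Parse a model's proposed tool call and reject anything off-schema."""
    try:
        proposal = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError("output is not valid JSON") from exc
    if not isinstance(proposal, dict):
        raise ValueError("output must be a JSON object")

    action = proposal.get("action")
    if not isinstance(action, str) or action not in APPROVED_ACTIONS:
        raise ValueError(f"action {action!r} is not approved")

    expected = APPROVED_ACTIONS[action]
    args = proposal.get("args")
    # Require exactly the approved argument names: no extras, none missing.
    if not isinstance(args, dict) or set(args) != set(expected):
        raise ValueError("arguments do not match the approved schema")
    for name, expected_type in expected.items():
        if not isinstance(args[name], expected_type):
            raise ValueError(f"argument {name!r} has the wrong type")
    return proposal
```

Passing validation only means the output is well-formed. Whether the action is permitted for this identity is a separate authorisation decision made outside the model.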

Q: Why do LLMs create more risk than ordinary application workloads?

A: LLMs do not just process input; they interpret natural language, retrieve context, and may trigger actions based on that interpretation. That means a malicious prompt can become an execution path, especially when the model has access to data stores, tools, or business systems. Identity scope and runtime policy therefore matter more than static perimeter controls.

Q: What breaks when prompt injection meets excessive agency?

A: The model can turn attacker-controlled text into privileged action. If the system already has permissions to write, send, delete, or query, then the injected instruction can flow through legitimate access and create real-world impact without a separate approval step. The failure is not only in the prompt, but in the over-scoped execution path behind it.

Q: How do teams decide whether an AI agent needs human approval?

A: Use the sensitivity of the action, not the cleverness of the model, as the decision point. If the agent can change records, move funds, send external messages, or access regulated data, human approval or an independent policy engine should remain in the path. The more irreversible the action, the less autonomy the agent should have.
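The decision rule in this answer can be sketched as a small function that looks only at properties of the action, never at the model's confidence. This is an illustrative sketch under stated assumptions: the three oversight tiers and the ProposedAction fields are hypothetical names, and a real programme would derive them from its own risk classification.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO = "auto"              # agent may proceed on its own
    POLICY_CHECK = "policy"    # an independent policy engine must approve
    HUMAN_APPROVAL = "human"   # a person must approve before execution


@dataclass(frozen=True)
class ProposedAction:
    name: str
    changes_state: bool    # writes records, sends messages, moves funds
    reversible: bool       # can the effect be undone afterwards?
    regulated_data: bool   # touches regulated or sensitive data


def required_oversight(action: ProposedAction) -> Decision:
    """Pick the oversight tier from the action's sensitivity alone."""
    # Irreversible state changes get the least autonomy.
    if action.changes_state and not action.reversible:
        return Decision.HUMAN_APPROVAL
    # Reversible changes and regulated reads still need an independent check.
    if action.changes_state or action.regulated_data:
        return Decision.POLICY_CHECK
    return Decision.AUTO
```

For example, a hypothetical wire_transfer action (state-changing, irreversible) maps to HUMAN_APPROVAL, while reading public documentation maps to AUTO.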

Technical breakdown

Prompt injection and the instruction-data boundary

Prompt injection works because LLMs process instructions and content in the same conversational channel. A malicious prompt can be hidden in a document, web page, email, or direct user input, and the model may treat it as a higher-priority instruction. This differs from classic injection flaws because the attacker is not always breaking syntax; they are exploiting how the model interprets context. Indirect injection is especially hard to spot because the malicious instruction often arrives through normal business content rather than the user-facing prompt.

Practical implication: inspect inputs at the interaction layer, not only at the application edge, because the malicious instruction may arrive inside trusted content.
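As a rough illustration of interaction-layer inspection, the sketch below scans retrieved content for instruction-like phrases before it is added to the model's context. The patterns are hypothetical examples, not a vetted ruleset. Pattern matching alone will miss many injections, so this is best read as one signal feeding a broader runtime policy rather than a defence in itself.

```python
import re
from typing import List

# Illustrative patterns only; real injections vary widely and evade simple rules.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"disregard (the )?(system|developer) prompt", re.I),
    re.compile(r"you are now\b", re.I),
]


def flag_suspicious(content: str) -> List[str]:
    """Return the instruction-like phrases found in a piece of content."""
    hits = []
    for pattern in SUSPICIOUS_PATTERNS:
        match = pattern.search(content)
        if match:
            hits.append(match.group(0))
    return hits


def should_quarantine(content: str) -> bool:
    """Hold content back for review instead of passing it to the model."""
    return bool(flag_suspicious(content))
```

The same check applies to documents, emails, and web pages pulled in by retrieval, since indirect injection arrives through that content rather than the user-facing prompt.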

Excessive agency and downstream action risk

Excessive agency occurs when an LLM or agent can do more than the task requires. OWASP breaks this into excessive functionality, excessive permissions, and excessive autonomy over high-impact actions. Once an agent can write to a repository, send messages, call APIs, or modify records, a manipulated output can become an irrevocable action at machine speed. The core technical issue is not just model output quality, but whether the connected systems trust that output as if it were an approved instruction.

Practical implication: put independent authorisation checks in downstream systems so the model cannot execute sensitive actions on its own output alone.
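A downstream authorisation check can be sketched as a gate that owns the entitlement decision, so the model's output is only ever a request. The ActionGate class, the agent names, and the action names below are hypothetical; the sketch assumes approvals are granted out of band by a human reviewer or policy engine and are single use.

```python
class ActionGate:
    """Sits in the downstream system; the model never calls a tool directly."""

    def __init__(self, scopes, sensitive_actions):
        self._scopes = scopes                # agent id -> set of entitled actions
        self._sensitive = sensitive_actions  # actions needing separate approval
        self._approvals = set()              # (agent id, action) pairs approved out of band

    def approve(self, agent_id, action):
        """Called by a human reviewer or policy engine, never by the model."""
        self._approvals.add((agent_id, action))

    def execute(self, agent_id, action, handler):
        """Run handler only if entitlement and approval checks both pass."""
        if action not in self._scopes.get(agent_id, set()):
            raise PermissionError(f"{agent_id} is not entitled to {action}")
        if action in self._sensitive:
            if (agent_id, action) not in self._approvals:
                raise PermissionError(f"{action} needs independent approval")
            self._approvals.discard((agent_id, action))  # approvals are single use
        return handler()
```

In this design an injected instruction can still make the agent request a refund, but the request fails unless something outside the model has approved it.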

Vector store leakage and retrieval-layer weakness

RAG systems add a retrieval layer that can be poisoned or misconfigured. Corpus poisoning inserts malicious material into the vector store, embedding manipulation distorts similarity results, and multi-tenant leakage exposes one tenant’s data to another through weak access controls. These failures are architectural, not prompt-level, because the model can retrieve compromised context before generation even begins. That makes access control on embeddings and indexes as important as access control on the model itself.

Practical implication: treat embedding indexes and vector stores as protected identity-bearing assets with the same access discipline as sensitive data stores.
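To make the multi-tenant point concrete, the sketch below filters by tenant before ranking by similarity, so another tenant's chunks are never candidates for retrieval. It uses an in-memory list and plain cosine similarity purely for illustration; the Chunk type and retrieve function are hypothetical, and a production vector store would enforce the same filter through its own access controls.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    text: str
    embedding: Tuple[float, ...]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(index: List[Chunk], query_embedding: Sequence[float],
             caller_tenant: str, k: int = 3) -> List[Chunk]:
    """Apply the tenant filter first, then rank only what the caller may see."""
    candidates = [c for c in index if c.tenant_id == caller_tenant]
    ranked = sorted(candidates,
                    key=lambda c: cosine(c.embedding, query_embedding),
                    reverse=True)
    return ranked[:k]
```

Filtering after ranking would also hide other tenants' results, but filtering first means a scoring bug or an oversized k cannot surface them.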

Threat narrative

Attacker objective: The attacker aims to turn ordinary model interaction into unauthorised data access, harmful system actions, or policy bypass through trusted AI pathways.

Entry begins with prompt injection, where an attacker hides malicious instructions in content the model will later consume as normal workflow input.
Credential or control access is achieved when the model or agent accepts that instruction and uses its legitimate permissions to reach tools, databases, or communication systems.
Escalation occurs when excessive agency lets the manipulated model perform high-impact actions without a separate approval gate, making the output itself operationally dangerous.
Impact follows through data exposure, unauthorised actions, or operational disruption when downstream systems trust model-generated output as if it were already authorised.

Moltbook AI agent keys breach: the Moltbook breach exposed 1.5M AI agent keys.
AI LLM hijack breach: attackers used stolen AWS access keys to hijack Anthropic LLMs on Bedrock.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Runtime AI security is now an identity problem, not just an application problem. Once an LLM can read files, call tools, query databases, or trigger workflows, its security posture depends on the same governance questions IAM teams already ask of service accounts and privileged workloads. The difference is that the model can combine those permissions dynamically at runtime, which means static controls alone are not enough. Practitioners should treat AI runtime as part of the identity plane, not as a separate security island.

LLM security exposes an identity blast radius problem that traditional control stacks understate. Prompt injection does not need to defeat the model if the model already has broad downstream permissions. The practical failure mode is over-scoped access paired with untrusted output, which creates a direct path from text manipulation to business action. IAM and PAM teams should read this as a warning that access scope, not just model quality, determines real-world exposure.

Policy-driven governance is the concept this article makes unavoidable. LLM deployments fail when teams assume perimeter tools can interpret conversational intent, enforce action boundaries, and stop risky outputs in the same way they stop known malicious traffic. They cannot, because the control decision must happen at runtime, inside the context of the model interaction. Practitioners should stop treating AI security as a bolt-on filter problem and start treating it as governed execution.

OWASP's LLM risk model is effectively an identity lifecycle model for AI systems. The risks span discovery, exposure, provisioning of capabilities, misuse at runtime, and unbounded consumption, which mirrors how mature identity programmes think about joiner, mover, and leaver states. The difference is that LLMs can shift from safe to unsafe inside a single interaction. Security leaders should use that framing to align AI governance with existing identity processes rather than inventing a separate language for the same control problem.

Agentic systems collapse the line between recommendation and execution. Excessive agency turns model output into action, which means the system is no longer just advising a human operator. That matters for accountability, because the risk now sits in the delegation chain between human intent and machine execution. Teams should treat any AI-connected action path as a privileged workflow until proven otherwise.

From our research:
The average organisation believes more than 1 in 5 of its non-human identities are insufficiently secured, according to The 2024 ESG Report: Managing Non-Human Identities.
72% of organisations have experienced or suspect they have experienced a breach of non-human identities, 46% confirmed and 26% suspected.
That same research shows enterprises that experienced a compromised NHI averaged 2.7 separate incidents in the past 12 months, which reinforces why runtime identity controls matter.

What this signals

Policy-based AI governance is becoming the default operating model for enterprise security teams. The next control conversation is not whether to allow AI, but how to constrain what it can see, say, and do in the moment of use. That points directly to the OWASP NHI Top 10 and similar runtime frameworks, because the control plane has moved closer to the interaction itself.

Identity programmes should expect AI systems to inherit the same governance failures that service accounts already show at scale. When organisations leave permissions broad, ownership unclear, or retrieval layers exposed, model behaviour becomes just another path to overreach. The practical response is to connect AI governance to the same entitlement, review, and offboarding disciplines used for NHI lifecycles.

More than 1 in 5 non-human identities are already seen as insufficiently secured, according to our 2024 ESG Report: Managing Non-Human Identities. That figure is a warning sign for AI programmes, because model-connected tokens and agent permissions will only expand the attack surface if they are managed as ordinary app configuration. Teams should assume their first AI control gap will look like an identity gap.

For practitioners

Map every AI-connected identity and token: Inventory service accounts, API keys, and delegated tokens used by LLM apps and agents, then classify which systems can read data, write data, or trigger workflows. Tie each credential to an owner and a business purpose, and remove any access that is not explicitly needed for the model’s current task.
Enforce bidirectional inspection at runtime: Inspect both prompts and responses before the model or downstream system can act on them. Filter malicious instructions, redact sensitive data, and block outputs that do not match approved schemas or policy boundaries.
Separate model output from authorisation: Require independent approval checks in any system that receives LLM-generated commands, especially where the model can write records, send messages, or invoke APIs. Treat the model as an untrusted recommender, not an implicit decision authority.
Harden retrieval infrastructure as protected data: Apply strict access control to embedding indexes, vector stores, and RAG knowledge bases, then monitor them for poisoning and cross-tenant leakage. If the retrieval layer is compromised, the model can be misled before generation even starts.
Review AI autonomy before expanding permissions: Before giving an agent more tools, check whether the task truly requires autonomy or only assistance. If human review is still the control point, keep high-impact actions outside the agent’s direct execution path.
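The first item in the list above, mapping AI-connected identities, lends itself to a simple review pass. The sketch below is one possible shape, assuming each credential is recorded with an owner, a business purpose, and a set of capabilities drawn from read, write, and trigger; the Credential type and the example names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Credential:
    name: str
    owner: Optional[str]
    purpose: Optional[str]
    capabilities: Set[str] = field(default_factory=set)  # "read", "write", "trigger"


def review_findings(cred: Credential, needed: Set[str]) -> List[str]:
    """List governance gaps for one credential against what its task needs."""
    findings = []
    if not cred.owner:
        findings.append("no owner")
    if not cred.purpose:
        findings.append("no business purpose")
    # Anything beyond what the current task needs is a removal candidate.
    excess = cred.capabilities - needed
    if excess:
        findings.append("excess capabilities: " + ", ".join(sorted(excess)))
    return findings
```

Run against a hypothetical token that can read, write, and trigger workflows but only needs read access, the function reports the missing owner and the two excess capabilities, which gives a review queue rather than an automatic revocation.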

Key takeaways

LLM risk is an identity and runtime governance problem, not only a model safety problem.
The biggest failures come from prompt injection, excessive agency, and weak retrieval-layer controls that let text become action.
Practitioners should move AI control points outside the model and apply least privilege, inspection, and independent authorisation at runtime.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	LLM01	Prompt injection and excessive agency are core agentic AI threats.
OWASP Non-Human Identity Top 10	NHI-01	Model-connected tokens and service identities behave like NHIs.
NIST CSF 2.0	PR.AA-05	Access permissions must constrain AI-connected actions and data access.

Place runtime inspection and authorisation between model output and any executable action.

Key terms

Prompt Injection: Prompt injection is a technique that embeds malicious instructions into text, documents, or other content that an LLM later processes as if it were legitimate input. The control failure is not syntax alone. It is the model’s inability to reliably separate data from instruction in a shared natural-language channel.
Excessive Agency: Excessive agency is the condition where an AI system can do more than the assigned task requires, especially when it can call tools or trigger actions with broad permissions. In practice, it turns model output into operational risk because the system can act beyond the intended scope of human oversight.
Retrieval-Layer Weakness: Retrieval-layer weakness is the exposure created by RAG indexes, vector stores, and embedding systems when they can be poisoned, misconfigured, or accessed across tenants. It matters because the model may retrieve compromised context before it generates an answer, making the control issue part of the data plane, not just the prompt plane.
Runtime Policy Enforcement: Runtime policy enforcement is the practice of checking AI interactions as they happen, before prompts, outputs, or tool calls are allowed to proceed. It is different from static configuration because it evaluates context, intent, and action risk at the moment of execution, when the real decision is being made.

Deepen your knowledge

LLM runtime governance and agentic access control are covered in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are extending IAM into AI-connected workflows, it is a practical place to start.

This post draws on content published by WitnessAI: the OWASP Top 10 risks for LLMs and how to defend against them. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-04-11.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org