Why traditional access controls fail in LLM deployments

By NHI Mgmt Group Editorial Team | Published 2025-08-12 | Domain: Agentic AI & NHIs | Source: Pomerium

TL;DR: Prompt-driven LLM apps outgrow static API keys and coarse IAM roles because a single session can reach model endpoints, vector stores, and downstream tools, according to Pomerium. The security problem is not the model alone, but the lack of continuous, identity-aware policy at the prompt boundary.

At a glance

What this is: An analysis of why traditional access controls fail in LLM deployments and why prompt-aware, identity-based enforcement is needed.

Why it matters: IAM, NHI, and autonomous-system programmes all need to control how prompts, tokens, and tool calls inherit identity and policy across every request.

By the numbers:

80% of identity breaches involved compromised non-human identities such as service accounts and API keys.
97% of NHIs carry excessive privileges, increasing unauthorised access and broadening the attack surface.
Only 5.7% of organisations have full visibility into their service accounts.
96% of organisations store secrets outside of secrets managers in vulnerable locations including code, config files, and CI/CD tools.

Context

Prompt-driven applications create an identity problem that older access models were not designed to solve. When a single prompt can drive multiple tool calls, access to the model is only one part of the control plane. In LLM deployments, the governance gap is the lack of identity-aware policy between the user request and the backend systems it can reach.

The issue is familiar to IAM and NHI teams even when the technology is new: static credentials and coarse roles assume the request path is stable and predictable. In LLM deployments, that assumption breaks because the prompt can be rephrased, routed, or chained into downstream actions before traditional filters or coarse authorisation checks see the full context.

Key questions

Q: How should security teams enforce access control in LLM deployments?

A: Security teams should enforce access control before the prompt reaches the model or any downstream tool. The control should bind user identity, session claims, and policy to each request, so the system can decide whether retrieval, execution, or data access is allowed in context. Static API keys are too broad for that job.

Q: Why do static API keys create risk in prompt-driven applications?

A: Static API keys create risk because they usually grant broad, persistent reach across multiple backend services. In prompt-driven applications, that means one credential can unlock the model, the vector store, and tool APIs. If the prompt is misused, the key turns an input problem into unauthorised access and execution.

Q: What breaks when prompt output is trusted without validation?

A: When prompt output is trusted without validation, downstream systems can execute commands that were never separately authorised. That breaks the boundary between model generation and operational action. The failure is especially dangerous when the model can trigger database queries, tool calls, or workflow steps that carry sensitive access.

Q: How do organisations know whether LLM access controls are actually working?

A: They should verify that every request is evaluated with identity context, that tool access is logged, and that rephrased prompts cannot reach data outside the caller's scope. If a user can change phrasing and still cross an access boundary, the control is not working as intended.
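
The rephrasing check described above can be sketched as a small test harness. This is a minimal illustration, not a description of any product: the document set, the labels, the keyword-matching `retrieve` function, and the `leaks_outside_scope` helper are all hypothetical names invented for the example. The idea is that the scope filter runs on the caller's identity, so no choice of wording should change what is reachable.

```python
from typing import Callable, Dict, List, Set

# Hypothetical corpus; a real system would query a vector store instead.
DOCUMENTS: List[Dict[str, str]] = [
    {"id": "handbook", "label": "public", "text": "holiday policy and office hours"},
    {"id": "payroll", "label": "restricted", "text": "salary bands and payroll records"},
]


def retrieve(prompt: str, caller_labels: Set[str]) -> List[str]:
    """Return ids of matching documents that the caller is allowed to read."""
    words = set(prompt.lower().split())
    hits = []
    for doc in DOCUMENTS:
        if doc["label"] not in caller_labels:
            continue  # scope filter is applied regardless of prompt phrasing
        if words & set(doc["text"].split()):
            hits.append(doc["id"])
    return hits


def leaks_outside_scope(
    retrieve_fn: Callable[[str, Set[str]], List[str]],
    paraphrases: List[str],
    caller_labels: Set[str],
) -> List[str]:
    """Return every paraphrase that surfaced a document outside the caller's scope."""
    allowed_ids = {d["id"] for d in DOCUMENTS if d["label"] in caller_labels}
    return [p for p in paraphrases if set(retrieve_fn(p, caller_labels)) - allowed_ids]
```

Running `leaks_outside_scope` with a list of rewordings of the same restricted request gives an empty list when the filter holds, and returns the offending prompts when a retrieval path ignores the caller's labels.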

Technical breakdown

Why static API keys fail in LLM deployments

A bearer token gives the caller whatever the backend decides that token can reach, which is too broad for prompt-driven systems. In LLM deployments, the same credential may expose the model endpoint, the retrieval layer, and tool APIs. That means a single compromised or overbroad token becomes an authorization bridge across multiple assets. The problem is not just secret leakage, but the mismatch between coarse credential scope and the fine-grained decisions required at runtime. Identity-aware policy must therefore sit closer to the request path than the model itself.

Practical implication: replace long-lived bearer access with request-time identity checks before prompts can reach model or tool endpoints.
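
One way to picture a request-time identity check is a short-lived token bound to a single audience and re-verified on every call. The sketch below uses only the Python standard library. The signing key, the claim names (`sub`, `aud`, `exp`), and both function names are assumptions made for illustration, and a production system would more likely rely on an established token format and managed keys.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional, Tuple

SIGNING_KEY = b"demo-only-key"  # hypothetical; never hard-code real keys


def issue_token(user: str, audience: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Mint a short-lived token that is valid for exactly one audience."""
    issued_at = time.time() if now is None else now
    claims = {"sub": user, "aud": audience, "exp": issued_at + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def check_request(token: str, audience: str,
                  now: Optional[float] = None) -> Tuple[bool, str]:
    """Re-verify signature, audience, and expiry on each request."""
    current = time.time() if now is None else now
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return False, "malformed token"
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False, "bad signature"
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["aud"] != audience:
        return False, "wrong audience"
    if current >= claims["exp"]:
        return False, "expired"
    return True, claims["sub"]
```

Because the audience is part of the signed claims, a token minted for the vector store is rejected at a tool endpoint, which is the opposite of a bearer key that implicitly covers every backend.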

How prompt injection turns access control into an execution problem

Prompt injection works because the model can be induced to produce or trigger actions that were never intended by the operator. If downstream systems trust model output as if it were a vetted command, access control has effectively been moved from policy to inference. OWASP’s LLM risk categories highlight this failure mode in both input and output handling. The core issue is that the system no longer knows whether the action came from a legitimate user intent or from content embedded in the prompt.

Practical implication: validate model outputs as untrusted input and enforce policy before any downstream execution.
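
As a rough sketch of treating model output as untrusted input, the function below parses a proposed tool call and applies a policy check before anything executes. The role names, tool names, argument schema, and the JSON shape of the tool call are hypothetical and chosen only for the example.

```python
import json
from typing import Dict, Set, Tuple

# Hypothetical policy: which tools each role may invoke.
ALLOWED_TOOLS: Dict[str, Set[str]] = {
    "analyst": {"search_docs"},
    "admin": {"search_docs", "run_query"},
}
# Hypothetical schema: the exact argument names each tool accepts.
REQUIRED_ARGS: Dict[str, Set[str]] = {
    "search_docs": {"query"},
    "run_query": {"sql"},
}


def authorize_tool_call(role: str, model_output: str) -> Tuple[bool, str]:
    """Parse model output as untrusted input and apply policy before execution."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return False, "output is not valid JSON"
    if not isinstance(call, dict):
        return False, "output is not a tool call object"
    tool = call.get("tool")
    args = call.get("args")
    if not isinstance(tool, str) or tool not in REQUIRED_ARGS:
        return False, "unknown tool"
    if not isinstance(args, dict) or set(args) != REQUIRED_ARGS[tool]:
        return False, "unexpected arguments"
    if tool not in ALLOWED_TOOLS.get(role, set()):
        return False, "tool not permitted for role"
    return True, tool
```

The decision depends on the caller's role and the parsed structure, never on what the model's text claims about itself, so an injected instruction that asks for `run_query` still fails for a caller who only holds `search_docs`.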

Why identity must travel with every LLM request

LLM traffic is stateful even when the underlying APIs look stateless. A single session may traverse chat, retrieval, database queries, and external tools, and each step can carry different risk. If identity and session context are stripped at the first hop, every later decision becomes blind. That creates a shallow gateway problem: the system permits a request once and then loses the ability to distinguish safe from unsafe follow-on actions. Continuous verification is the only way to preserve authorisation context across the chain.

Practical implication: bind user and session claims to every request, and log each decision for audit and incident response.
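
A minimal sketch of carrying identity across hops, assuming an in-memory audit log and invented scope names: the same immutable context object is passed to every step, each step is authorised on its own, and every decision is recorded. `RequestContext`, `authorize_hop`, and `handle_prompt` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class RequestContext:
    """Identity and session claims that travel with the request."""
    user: str
    session_id: str
    scopes: FrozenSet[str]


# Hypothetical in-memory log; a real system would ship entries to durable storage.
AUDIT_LOG: List[Dict[str, str]] = []


def authorize_hop(ctx: RequestContext, hop: str, required_scope: str) -> bool:
    """Decide one hop and record the decision with the caller's identity."""
    allowed = required_scope in ctx.scopes
    AUDIT_LOG.append({
        "user": ctx.user,
        "session": ctx.session_id,
        "hop": hop,
        "scope": required_scope,
        "decision": "allow" if allowed else "deny",
    })
    return allowed


def handle_prompt(ctx: RequestContext, planned_hops: List[Tuple[str, str]]) -> List[str]:
    """Run hops in order and stop at the first one the caller is not scoped for."""
    completed: List[str] = []
    for hop, scope in planned_hops:
        if not authorize_hop(ctx, hop, scope):
            break
        completed.append(hop)
    return completed
```

A session scoped for chat and retrieval would pass the model and vector-store hops and be stopped at a database write, with all three decisions attributable to the same user and session in the log.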

Threat narrative

Attacker objective: The objective is to turn a normal prompt interaction into unauthorised access to internal data or tool execution without tripping the intended policy boundary.

Entry occurs when an attacker or untrusted prompt reaches an LLM workflow through a broadly scoped bearer token or unauthenticated tool path.
Escalation happens when the model is induced to call retrieval systems or downstream tools that were not meant to be reachable from that prompt context.
Impact follows when the workflow exposes sensitive internal data or executes an unintended database action, creating data leakage or unauthorised command execution.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Traditional IAM assumes an access decision is made once and can then be safely reused. That assumption was designed for bounded API traffic, not prompt-driven sessions that fan out into multiple tools and data sources. In LLM deployments, the same identity can be reinterpreted through retrieval, tool invocation, and output handling in ways coarse roles do not model. The implication is that the access decision must move from static entitlement thinking to per-request identity enforcement.

Prompt-driven systems create an identity blast radius that is larger than the model endpoint. The article shows that the real control problem is not whether a user can call the model, but whether that call can reach a vector store, database, or tool chain with inherited privilege. That makes the prompt boundary a governance boundary. Practitioners need to treat every downstream hop as part of the same authorization problem, not as separate technical layers.

Identity-aware policy is the missing control plane for LLM deployments. Prompt filters can inspect content, but they cannot by themselves tell whether a request should be allowed to query PHI-tagged documents, call an MCP-style tool, or execute a downstream action. NHI Mgmt Group sees this as a policy placement issue, not a model safety issue. The practitioner conclusion is that access control must be evaluated before the prompt reaches any model or backend service.

OWASP LLM risks and NHI governance now intersect at the same failure point. Prompt injection, insecure output handling, and overprivileged service access all collapse into one problem when credentials are too broad and policy is too late. This is where identity governance stops being an admin function and becomes runtime risk management. The practitioner conclusion is to align authorization, logging, and tool access around the request path, not around the application label.

Prompt workflows expose a new named concept: identity-aware prompt routing. This is the discipline of binding user identity, session claims, and policy enforcement to each LLM request before the model or tools act on it. It matters because the route, not just the model, now determines what data and actions are reachable. The practitioner conclusion is that routing and authorization must be designed together.

From our research:
80% of identity breaches involved compromised non-human identities such as service accounts and API keys, according to the Ultimate Guide to NHIs.
Only 5.7% of organisations have full visibility into their service accounts, which makes prompt-adjacent credential sprawl harder to govern at scale.
For a broader control baseline, OWASP Non-Human Identity Top 10 is the natural next reference for secret scope and overprivilege.

What this signals

Identity-aware prompt routing: LLM programmes now need a control plane that evaluates identity before a prompt is allowed to reach retrieval or execution. The operational signal to watch is whether a user can rephrase a request and still cross the same boundary, because that is where coarse IAM starts to fail. For the underlying access model, align the design with OWASP Non-Human Identity Top 10 and keep the policy decision at the edge.

The governance signal is broader than GenAI alone. With 97% of NHIs carrying excessive privileges in our research base, the pattern here is familiar: broad standing access invites unintended reach once a new workload style appears. That means LLM deployments should be reviewed alongside service account scope, tool permissions, and secret handling, not as a separate innovation project.

Practitioners should expect identity review cycles to move closer to runtime for AI-enabled workflows. If the same prompt can trigger multiple backend actions, then access review evidence has to include request logs, identity claims, and downstream tool decisions. That is a stronger operating model than perimeter filtering because it preserves attribution across the whole request path.

For practitioners

Move authorization to the edge of the prompt path: Evaluate identity and policy before the request reaches the model, retrieval layer, or tool endpoint. Use signed claims, route-level rules, and audit logging so the backend never sees an unqualified prompt.
Separate model access from data access: Do not let one token implicitly cover the model, vector store, and downstream tools. Split privilege so retrieval and execution require their own checks and cannot be inherited from a single bearer credential.
Treat model output as untrusted input: Validate commands and structured outputs before any downstream system acts on them. If the model can emit executable instructions, those instructions need policy validation just like external user input.
Log every request decision for auditability: Record the user identity, session attributes, policy result, and tool access outcome for each LLM call. That creates the evidence trail needed to investigate prompt abuse, data exposure, and policy drift.

Key takeaways

LLM deployments fail when static credentials and coarse roles are asked to govern prompt-driven workflows that can fan out across multiple backend systems.
The control gap is structural, not cosmetic, because identity context must survive every hop from prompt to retrieval to tool execution.
Practitioners should move authorization to the edge, validate model output, and log every access decision so prompt abuse cannot become implicit privilege.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Non-Human Identity Top 10 and OWASP Agentic AI Top 10 address the attack and risk surface, while NIST Zero Trust (SP 800-207) sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-03	Static credentials and overbroad access are central to the article's risk model.
NIST Zero Trust (SP 800-207)	PR.AC-4 (a NIST Cybersecurity Framework subcategory identifier)	The article centers on continuous identity-aware decisions at the request edge.
OWASP Agentic AI Top 10	LLM01 (Prompt Injection, as numbered in the OWASP Top 10 for LLM Applications)	Prompt injection is explicitly cited as a core failure mode in the article.

Reduce standing access for LLM workflows and scope credentials to the minimum callable resource.

Key terms

Identity-aware prompt routing: A control pattern that ties each LLM request to verified identity and policy before the prompt reaches a model or tool. It prevents access decisions from being made too late in the workflow and keeps retrieval, execution, and logging tied to the original caller context.
Prompt injection: A manipulation technique that uses crafted input to alter what an LLM says or does. In operational systems, the risk is not just bad output but unintended access or execution when downstream services trust the model's response without a separate policy check.
Standing privilege: Persistent access that remains available without needing a fresh, task-scoped approval. In LLM deployments, standing privilege becomes more dangerous because a single prompt session can reach multiple assets, turning broad entitlement into a larger and harder-to-see blast radius.
Bearer token: A credential that grants access to whoever presents it, without rechecking who is using it at each step. In prompt-driven systems, bearer tokens are risky when they implicitly cover the model, retrieval layer, and tools, because they collapse multiple access decisions into one.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity security are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or access governance in your organisation, it is worth exploring.

This post draws on content published by Pomerium: Why traditional access controls fail in LLM deployments. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-08-12.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org