LLM security hinges on runtime controls, not static policy

By NHI Mgmt Group Editorial TeamPublished 2025-09-30Domain: Agentic AI & NHIsSource: Cyera

TL;DR: Governance and runtime protection are paired with a 2025 State of AI Data Security Report, highlighting prompt injection, sensitive data disclosure, excessive agency, and unbounded consumption as the main enterprise risks, according to Cyera. The security model is shifting from policy intent to continuous enforcement because static controls do not reliably contain LLM behaviour in production.

At a glance

What this is: This is an analysis of OWASP’s 2025 LLM Top Ten and the claim that LLM security now depends on continuous runtime control across data, prompts, outputs, and agent actions.

Why it matters: It matters because IAM, NHI, and AI governance teams must decide where policy ends and runtime enforcement begins when AI systems can access data, initiate actions, and expose sensitive information.

By the numbers:

Only 14 percent have implemented automated blocking for autonomous agents.
86 percent of respondents say they’re concerned about AI leaking sensitive data.

👉 Read Cyera's analysis of the 2025 OWASP LLM Top Ten and AI risk controls

Context

LLM security is no longer just a model-quality problem. Once an AI system can read prompts, touch external data, and trigger downstream actions, the governance issue becomes identity, access, and control across the full request path. In practice, the risk sits in the gap between what policy says the system should do and what the model can still do at runtime.

That gap is why OWASP’s 2025 LLM Top Ten keeps recurring around the same failure pattern: untrusted inputs, exposed data, weak output handling, excessive agency, and missing runtime enforcement. For IAM and NHI teams, the question is not whether the AI stack is modern. It is whether the organisation can still define and constrain trust once LLM behaviour is dynamic.

Cyera’s article uses its AI Guardian framing to argue that policy alone is not enough. The operational challenge is to track where data flows, what the model can access, and which actions can be blocked or reviewed before the system turns a prompt into an event.

Key questions

Q: How should security teams implement LLM governance without slowing adoption?

A: Start by governing the model’s access path, not just the application wrapper. Limit what the LLM can read, what it can output, and what actions it can initiate. Then add runtime controls for prompt filtering, data loss prevention, and approval gates so adoption can continue without turning every session into an uncontrolled trust decision.

Q: Why do LLMs create more identity risk than traditional automation?

A: LLMs create more identity risk because they can decide which instructions to follow, what context to use, and which tools to call in real time. Traditional automation follows predefined logic, but an LLM can vary its behaviour from one session to the next, which makes access governance and audit trails harder to predict.

Q: What do security teams get wrong about prompt injection?

A: They often treat prompt injection as a content-filtering problem when it is really an instruction-boundary problem. The weak point is allowing untrusted text to influence control text inside the same execution flow. Strong governance separates sources, labels trust levels, and blocks suspicious content before the model can act on it.

Q: What should organisations do when an LLM can trigger downstream actions?

A: Require explicit policy checks before any action is executed, especially if the action can move data, modify systems, or invoke another tool. Organisations should define allowlists, approval thresholds, and termination conditions so the model cannot turn a simple request into a chain of uncontrolled operations.

Technical breakdown

Prompt injection and instruction isolation

Prompt injection is an input-confusion problem. A malicious user prompt, or hidden instructions inside retrieved content, can compete with the system prompt and steer model behaviour in ways the operator did not intend. The core weakness is that LLMs do not naturally distinguish trusted control text from untrusted content unless the surrounding architecture creates hard separation. That is why runtime filtering, content labelling, and prompt boundary enforcement matter more than simple keyword scanning. The attack surface expands further when external content is passed into the model without provenance checks or source-level trust marking.

Practical implication: separate system instructions from user and retrieved content, and block untrusted inputs before they reach model execution.

Sensitive information disclosure in LLM pipelines

Sensitive information disclosure occurs when training data, context windows, logs, or generated outputs reveal data that should have remained confined. The problem is not only exfiltration. It is also over-broad access, where the model can see more data than it needs to complete the task. In LLM systems, access scope and output handling are inseparable because the model can leak what it can read. DLP, access scoping, and output redaction each address a different point in the path, but none works alone if the model is fed unrestricted context from upstream systems.

Practical implication: reduce model-readable data to the smallest workable set and inspect both prompts and outputs for disclosure risk.

Excessive agency and unbounded consumption

Excessive agency means the model is allowed to initiate actions that should remain gated, such as modifying systems, sending data, or chaining tool calls. Unbounded consumption is the related control failure where a model can run too long, too often, or too expensively without hard ceilings. Together, they show that LLM security is partly an authorisation problem and partly a resource-control problem. Once the model can act, security teams need explicit approval logic, action limits, and termination conditions. Without those, the system can create business impact even when no traditional compromise has occurred.

Practical implication: define action ceilings, approval gates, and stop conditions before enabling any LLM to trigger downstream systems.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

LLM security is now an identity governance problem as much as a model safety problem. Once an LLM can touch data, tools, and downstream workflows, the question becomes who or what is authorised to see, transform, and release information. That shifts the centre of gravity from static configuration to runtime control, because the model’s effective permissions are revealed only when it executes. Practitioners should treat LLM access as a governed identity path, not a feature toggle.

Prompt injection is a boundary failure, not merely a content-filtering issue. The article’s threat model shows that the attack works when untrusted text is allowed to compete with trusted instructions inside the same execution path. That means the real control gap is weak separation between instruction layers, retrieval sources, and output handling. The implication is that teams must review where the system still assumes text can be trusted because it was received inside the session.

Excessive agency is the clearest example of AI identity behaving like an NHI with unstable privilege. The article shows that when models can schedule, execute, or modify systems, privilege no longer behaves like a fixed entitlement. That makes least privilege harder to define at provisioning time and harder to audit after the fact, because the relevant decision happens at runtime. Practitioners should re-think whether the actor is being governed as a passive workload or an action-capable identity.

Runtime visibility is becoming the primary control plane for LLM governance. The report’s emphasis on alerts, quarantines, and policy enforcement reflects a broader market shift: organisations need to observe prompts, outputs, and tool use in motion, not just approve the application in advance. That aligns with OWASP-NHI and zero trust thinking, but the important change is operational. Security teams must measure what the model actually did, not what the configuration said it could do.

LLM governance needs a named concept for the gap between declared policy and executed behaviour: runtime enforcement debt. The article shows that organisations can define policies for data access, outputs, and agency while still leaving the model free to bypass them in practice. That debt accumulates when monitoring is weak, approvals are absent, and control logic sits outside the execution path. Practitioners should treat that gap as a first-class governance risk, not a tuning problem.

From our research:
Only 14 percent have implemented automated blocking for autonomous agents, according to AI Agents: The New Attack Surface report.
Another finding from the same research shows that 80 percent of organisations report their AI agents have already performed actions beyond their intended scope.
For a broader governance lens, the OWASP NHI Top 10 helps teams connect runtime behaviour to control design across AI and non-human identities.

What this signals

The practical signal for identity programmes is that LLM controls need to shift from policy publishing to runtime enforcement. If the model can read sensitive data, generate output, and invoke tools in one session, governance has to follow the session, not just the approval record. The useful concept here is runtime enforcement debt: the gap between what policy says and what the execution path actually prevents.

With 80 percent of organisations already reporting AI agents acting beyond intended scope in the linked research, the direction of travel is clear: AI and NHI controls are converging around continuous verification, action gating, and evidence capture. Teams that already use NIST Cybersecurity Framework 2.0 can map these capabilities into govern, protect, detect, and respond without treating AI as a separate island.

For practitioners building out AI governance, the next step is to connect model observability to identity and data controls in the same workflow. That means tying prompt handling, access scoping, DLP, and approval gates to one operating model rather than four disconnected tools. The organisations that do this will be able to scale LLM use without expanding their breach surface in parallel.

For practitioners

Map model-readable data to minimum necessary access Inventory which datasets, documents, and memory stores the LLM can see, then cut access until the model can complete the task with the smallest workable context. Review retrieval paths, logging, and training inputs together, because disclosure can occur at any of those points.
Separate instructions from untrusted content Design prompt pipelines so system instructions, user prompts, and retrieved text remain logically isolated. Add source labelling, input validation, and quarantine rules for suspicious content before it reaches the model.
Gate model-initiated actions before execution Require explicit policy checks for any action that can modify data, send messages, call tools, or trigger workflows. Use approval paths, action allowlists, and hard stop conditions for higher-risk operations.
Instrument outputs for leakage and misuse Inspect generated text for hidden payloads, secret disclosure, and schema violations, then block or redact responses that exceed policy. Pair this with logging that preserves enough evidence for audit without overexposing sensitive content.

Key takeaways

LLM risk is now an identity and access problem because the model can read, generate, and act within the same execution path.
The strongest evidence in the article is not theoretical vulnerability but operational exposure, including broad concern about data leakage and weak automated blocking.
Teams need runtime controls, not just policy statements, if they want LLM adoption without turning every request into an uncontrolled action chain.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10		Covers prompt injection, tool misuse, and excessive agency in LLM systems.
OWASP Non-Human Identity Top 10	NHI-06	LLMs acting with tool access behave like governed non-human identities.
NIST CSF 2.0	PR.AC-4	Access control and governance map directly to model data and action permissions.

Treat model permissions as NHI entitlements and review them for least privilege and runtime abuse.

Key terms

Prompt Injection: Prompt injection is the manipulation of an LLM by hostile or misleading text so it follows attacker intent instead of operator intent. In practice, it exploits weak separation between trusted instructions and untrusted content, especially when the model can act on retrieved or user-supplied inputs.
Excessive Agency: Excessive agency is the condition where an AI system can initiate actions beyond what the business case actually requires. It becomes an identity and governance issue when the model can schedule, call tools, or modify systems without enough human review or policy gating.
Runtime Protection: Runtime protection is the control layer that watches an AI system while it is actually processing prompts, data, and outputs. It matters because many LLM failures are invisible at design time and only become governable when the session is observed, constrained, and, if needed, blocked in motion.

Deepen your knowledge

LLM governance, prompt injection, and runtime control are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building controls for AI systems that can read data and trigger actions, this is the right starting point.

This post draws on content published by Cyera: Securing LLMs: Cyera's AI Guardian and the OWASP Top Ten 2025. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-09-30.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org