LLM app security is shifting from leakage to manipulation risk

By NHI Mgmt Group Editorial TeamPublished 2026-01-05Domain: Agentic AI & NHIsSource: Lasso Security

TL;DR: LLM applications introduce prompt injection, model drifting, and resource abuse risks that traditional deterministic AppSec was not built to handle, according to Lasso Security. The governance gap is not just data leakage but application manipulation, where system prompts become a security boundary that must be treated as part of the control plane.

At a glance

What this is: This is an analysis of why LLM-based applications create a new AppSec problem, with system prompts and dynamic behavior emerging as core attack surfaces.

Why it matters: It matters to IAM and security teams because AI applications now behave like governed identities in production, and the controls around access, scope, and oversight must adapt accordingly.

👉 Read Lasso Security's analysis of LLM app security and system prompt risk

Context

LLM-based applications do not behave like traditional deterministic software. Their outputs, tool use, and control flow can shift based on human-readable prompts and external inputs, which means the application boundary is no longer defined only by code. For identity and access teams, that changes how application trust, privilege, and oversight need to be modeled across AI systems.

The practical issue is not only whether sensitive data can leak. It is whether an attacker can steer an AI application into revealing configuration, changing behavior, or consuming resources in ways that create operational and governance exposure. That makes prompt handling, runtime control, and monitoring part of the identity security conversation, not just an AppSec concern.

Key questions

Q: How should security teams secure LLM system prompts in production applications?

A: Security teams should treat system prompts as governed runtime assets, not informal configuration. That means version control, change approval, least-privilege editing, and security review before release. Prompts should be tested for injection, leakage, and unsafe instruction handling, because prompt text can change behavior without changing code. The control objective is to keep model behavior within approved boundaries.

Q: Why do LLM applications create risk beyond data leakage?

A: They create risk because attackers can manipulate the model’s behavior, not just try to extract information. A compromised prompt or untrusted input can alter decisions, trigger unsafe outputs, or force expensive execution paths. That means confidentiality is only one part of the problem. Security teams also need integrity controls for instructions, outputs, and runtime actions.

Q: What breaks when prompt injection is not tested in AI applications?

A: Without prompt-injection testing, applications can accept hostile instructions as if they were legitimate input. That can override guardrails, expose configuration, or redirect the model into unsafe behavior. The failure is usually not obvious in normal testing because the application appears functional. Real risk emerges when attackers control the input context and the model follows it.

Q: How can organisations tell whether AI application controls are actually working?

A: Look for reductions in unsafe output, fewer unexpected tool invocations, lower variation in behaviour across similar prompts, and cleaner incident traces when the model is challenged. If prompts are well governed, the system should fail safely, not improvise. Monitoring should show whether the application stays inside its intended operating envelope under stress.

Technical breakdown

Why system prompts act like security controls

System prompts define how an LLM application behaves, what it prioritises, and which boundaries it is supposed to respect. Because they are editable, human-readable instructions rather than compiled logic, they can be influenced by prompt injection or malformed inputs. That makes the prompt layer part of the application’s control surface. In practice, a weak prompt can expose configuration, change response behavior, or route the model into actions that were never intended by the operator. Security teams should treat prompts as governed runtime assets, not as harmless configuration text.

Practical implication: inventory system prompts as controlled assets and review them with the same discipline used for sensitive application logic.

Prompt injection and model manipulation

Prompt injection is the act of supplying instructions that override or redirect the model’s intended behavior. In an LLM application, that can mean extracting prohibited information, altering decision-making, or bypassing safety logic through untrusted user input. The article’s key point is that the more serious risk is manipulation, not just leakage. Once a model can be steered mid-session, the application becomes non-deterministic in a security sense, because the attacker is no longer only requesting output but influencing the rules by which output is generated.

Practical implication: build tests for instruction override, output steering, and safety bypass conditions into application security validation.

Denial-of-service and wallet abuse in AI applications

LLM applications can be forced into expensive loops, repeated inference calls, or resource-heavy chains of reasoning that drive up latency and cost. These are not classic volumetric attacks. They are workload abuse patterns tied to how the model processes prompts, tools, and responses. If prompts lack guardrails or usage bounds, the application can be pushed into unintended computational paths. That creates an operational risk that sits between AppSec, platform reliability, and identity governance, especially where AI agents can act on behalf of users or systems.

Practical implication: set usage limits, cost thresholds, and runtime guardrails for every AI application that can trigger downstream execution.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

LLM application security is now an identity problem as much as an AppSec problem. Once prompts and model behavior can be influenced at runtime, the question is not only whether code is secure but whether the application can still be trusted to act within its intended scope. That shifts control ownership toward governance of inputs, execution paths, and delegated behavior. Security teams should treat AI application identity as part of the control model.

System prompts are becoming the new policy boundary for generative applications. In traditional software, policy lives in code, access rules, and architecture. In LLM systems, the prompt can shape the operational blueprint of the application, which means a compromised prompt can alter business logic without changing source code. Practitioners should recognise that prompt integrity is now a governance concern, not just a development concern.

Prompt injection exposes a runtime trust gap that conventional AppSec testing often misses. The attack succeeds because the application accepts untrusted language as if it were instruction. That breaks the assumption that inputs can be safely classified ahead of time. The implication is that security validation must move from static review to adversarial runtime testing of instructions, tool use, and model responses.

Denial-of-service and wallet attacks show that AI abuse is also cost abuse. Generative systems can be manipulated into expensive inference patterns that exhaust resources without looking like traditional exploitation. That changes how teams should think about abuse prevention across AI platforms. Practitioners should align security controls with operational spend, not just confidentiality and integrity.

The emergence of non-deterministic application behavior means AI security programmes must converge AppSec, IAM, and runtime governance. The source article is right to move beyond data leakage, because the real control question is whether the system can be made to behave predictably enough to govern. Teams that keep treating prompts as a narrow engineering concern will miss the broader identity and access implications. Practitioners should build cross-functional ownership now.

From our research:
92% agree governing AI agents is critical to enterprise security, yet only 44% have implemented any policies to do so, according to AI Agents: The New Attack Surface report.
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, sharing sensitive data, and revealing credentials.
That gap between concern and control makes the next step clear, so review OWASP Agentic AI Top 10 alongside your prompt security programme.

What this signals

Prompt governance is becoming a baseline control for AI application programmes. With 92% of organisations saying AI agents must be governed but only 44% having policies in place, the field is moving faster than the control stack. Teams should expect prompt review, runtime monitoring, and model behaviour testing to become standard procurement and audit questions, not specialist extras.

LLM-specific abuse will increasingly be measured as operational and financial exposure. Prompt injection, model drift, and cost-amplifying loops turn AI security failures into reliability and spend problems as well as governance problems. That means security leaders should align application telemetry with usage controls, incident response, and ownership models that span engineering and IAM.

System prompt integrity is the near-term named concept practitioners should track. It describes the idea that the prompt itself has become a policy boundary, and once that boundary is mutable, traditional application assumptions no longer hold. Practitioners should use that lens when deciding where security review ends and AI governance begins, and map it against OWASP Agentic AI Top 10.

For practitioners

Classify prompts as governed application assets Assign ownership for system prompts, version them, and subject changes to review because they define behavior, safety boundaries, and operational scope.
Test for instruction override conditions Run adversarial validation for prompt injection, hidden instruction leakage, and user input that can redirect model outputs or tool calls.
Set cost and execution guardrails Apply token, loop, tool-use, and spend limits so AI applications cannot be driven into excessive computation or unbounded downstream actions.
Bring security into prompt design Require security, AI engineering, and product teams to co-review prompts before release so safety assumptions are not left to non-security authors.
Monitor for behavior drift in production Track changes in output patterns, refusal rates, and tool invocation paths so deviations from intended behavior are visible before they become incidents.

Key takeaways

LLM applications expand AppSec into governance of prompts, runtime behavior, and model-led execution paths.
The core risk is not only leakage but manipulation, where attackers steer models into unsafe decisions or costly loops.
Security teams should treat prompts as controlled assets, test them adversarially, and monitor behaviour drift in production.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 address the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A3	Prompt injection and tool steering are core agentic application threats.
NIST AI RMF		AI RMF addresses governance for dynamic AI system behaviour and accountability.
NIST CSF 2.0	PR.DS-1	Prompt and model integrity are part of protecting sensitive system assets.

Assign ownership for AI behaviour, validate controls, and monitor for drift in production.

Key terms

System Prompt: A system prompt is the instruction layer that shapes how an LLM application behaves before user input is processed. In security terms, it functions like policy text for the model, which means changes to it can alter behaviour, safety boundaries, and downstream actions without changing source code.
Prompt Injection: Prompt injection is an attack method that uses crafted input to override, redirect, or confuse an AI system’s intended instructions. In practice, it can make the model reveal hidden content, ignore safeguards, or follow attacker-controlled logic instead of the application’s approved rules.
Model Drift: Model drift is the gradual or sudden change in an AI system’s outputs, behavior, or decision patterns over time. For practitioners, it matters because a system that once behaved safely can become inconsistent, harder to validate, and more difficult to govern without continuous monitoring.
Runtime Guardrail: A runtime guardrail is a control that constrains what an AI application can do while it is operating, such as limiting actions, output types, tool calls, or compute usage. It is used to keep model behavior inside an approved operating envelope even when inputs are unpredictable.

Deepen your knowledge

LLM application security and prompt governance are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building controls for AI-driven applications in a similar starting point, it is worth exploring.

This post draws on content published by Lasso Security: New Challenges for AppSec: Securing LLM-based Applications and System Prompts. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-01-05.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org