TL;DR: Prompt injection is the top OWASP LLM vulnerability because attackers can override model behavior with plain language, and indirect injections in documents or retrieved data can redirect chatbots and agents without code exploits, according to WitnessAI. The real failure is assuming natural-language systems can be governed like structured applications when their input and instruction boundaries are not technically separable.
NHIMG editorial — based on content published by WitnessAI: Prompt injection as the number one vulnerability in LLM applications
Questions worth separating out
Q: How should security teams reduce prompt injection risk in AI assistants?
A: Security teams should place controls outside the model, not just inside the prompt.
Q: Why do prompt injections create more risk for AI agents than for chatbots?
A: AI agents can turn a malicious instruction into a real action, such as a query, file transfer, or API call.
Q: What do teams get wrong about indirect prompt injection?
A: Teams often treat retrieved documents as safe because they are business content, not code.
Practitioner guidance
- Inspect prompts before model execution Place a runtime inspection layer in front of every chatbot, copilot, or agent so untrusted instructions are screened before they reach the model or downstream action handler.
- Separate retrieved content from authority Tag documents, emails, and knowledge-base records as untrusted inputs during retrieval and summarisation so hidden instructions cannot inherit system-level authority.
- Constrain agent tool scope tightly Limit each agent to the smallest action set it needs, and block database queries, exports, or endpoint calls that are outside the current business task.
What's in the full article
WitnessAI's full article covers the operational detail this post intentionally leaves for the source:
- Step-by-step examples of direct and indirect prompt injection across customer chat, internal copilots, and agent workflows.
- Detailed runtime defence mechanics for scanning incoming prompts and filtering outgoing responses before action triggers.
- Specific control design patterns for limiting tool calls, data exposure, and high-risk workflow execution.
- The article's own examples of prompt engineering, input validation, and red-teaming approaches for AI systems.
👉 Read WitnessAI's analysis of prompt injection and runtime AI defence →
Prompt injection risk: are your AI controls keeping up?
Explore further
Prompt injection is a governance failure, not just an application flaw. The attack succeeds because enterprises often treat LLM input as if it were ordinary text rather than a live control surface. That is a structural problem for OWASP-NHI and zero-trust governance, because the same session can contain instructions, data, and actions with no trustworthy separation. The implication is that AI governance has to assume hostile language at runtime, not just malicious users at login.
A few things that frame the scale:
- Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation, according to AI Agents: The New Attack Surface report.
- 80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems and revealing access credentials.
A question worth separating out:
Q: Should organisations rely on model safety features alone to stop prompt injection?
A: No. Model-level guardrails reduce risk, but they do not define enterprise context, data boundaries, or action permissions. Organisations need their own enforcement layer for prompts, responses, and tool calls because the provider cannot know which business process is safe, which data is sensitive, or which action is out of bounds.
👉 Read our full editorial: Prompt injection exposes the shared-responsibility gap in AI security