TL;DR: Prompt injection has become a top-tier LLM risk because models cannot reliably separate trusted instructions from untrusted text, and attacks now span direct, indirect, multimodal, and agentic tool abuse, according to WorkOS and the OWASP Top 10 for LLM Applications. The deeper issue is not just filtering malicious prompts, but governing systems that treat language itself as executable context.
NHIMG editorial — based on content published by WorkOS: Prompt injection attacks and how to defend against them
By the numbers:
- Prompt injection has been ranked the number one vulnerability on the OWASP Top 10 for LLM Applications since 2025.
- Research has shown that as few as five strategically poisoned documents in a RAG knowledge base can manipulate AI responses 90% of the time.
- Joint research from OpenAI, Anthropic, and Google DeepMind found that sophisticated attackers can bypass published defenses with over 90% success rates when given enough attempts.
Questions worth separating out
Q: How should security teams handle prompt injection in LLM applications?
A: Security teams should treat prompt injection as an application and identity boundary problem, not just a content filtering problem.
Q: Why do prompt injection attacks create so much risk for AI agents?
A: Prompt injection is risky for AI agents because the model can be steered into using tools and credentials that already exist in the workflow.
Q: What do organisations get wrong about defending against prompt injection?
A: The common mistake is relying on input filtering alone.
Practitioner guidance
- Separate trusted instructions from untrusted content Use structured delimiters, server-side system prompts, and explicit content tagging so retrieved documents, email bodies, and tool outputs are never treated as instructions.
- Minimise agent permissions to the narrowest action set Issue short-lived credentials with only the API scopes and resource relationships the workflow actually needs.
- Gate destructive actions behind deterministic approval Require explicit human confirmation before deleting data, modifying configurations, sending messages, or initiating external transfers.
What's in the full article
WorkOS's full article covers the operational detail this post intentionally leaves for the source:
- Concrete code examples for structured system prompts and delimiter patterns that reduce instruction confusion.
- Step-by-step guidance on input scanning, output validation, and model-critic workflows for production pipelines.
- Implementation detail for least-privilege tool design, including approval gates and scoped credentials.
- Discussion of regulatory mapping to the EU AI Act, NIST AI RMF, and OWASP guidance in applied deployments.
👉 Read WorkOS's guide to prompt injection attacks and defences →
Prompt injection in LLM apps: are your controls keeping up?
Explore further
Prompt injection is not just an LLM vulnerability, it is an identity boundary failure. The article shows that mixed-trust text becomes executable context once an LLM can act on it. That means the security problem is not only content safety, but whether the system can distinguish instruction sources before a tool call is made. Practitioners should stop treating the model as a neutral processor and start treating it as a policy-bearing execution point.
A few things that frame the scale:
- 91.6% of secrets remain valid five days after the targeted organisation is notified, showing a critical gap in remediation procedures, according to the Ultimate Guide to NHIs.
- Only 5.7% of organisations have full visibility into their service accounts, which means most identity teams cannot reliably see every machine credential in circulation.
A question worth separating out:
Q: How do you know if an LLM workflow is too privileged?
A: An LLM workflow is too privileged when a successful injection could reach systems or actions the user should not control in the first place. If the assistant can modify settings, access broad data, or trigger destructive API calls without explicit approval, the permission scope is too wide for safe operation.
👉 Read our full editorial: Prompt injection exposes the trust model behind LLM applications