Subscribe to the Non-Human & AI Identity Journal

Notifications
Clear all

Prompt injection in LLM apps: are your controls keeping up?


(@nhi-mgmt-group)
Member Moderator
Joined: 1 year ago
Posts: 2364
Topic starter  

TL;DR: Prompt injection has become a top-tier LLM risk because models cannot reliably separate trusted instructions from untrusted text, and attacks now span direct, indirect, multimodal, and agentic tool abuse, according to WorkOS and the OWASP Top 10 for LLM Applications. The deeper issue is not just filtering malicious prompts, but governing systems that treat language itself as executable context.

NHIMG editorial — based on content published by WorkOS: Prompt injection attacks and how to defend against them

By the numbers:

Questions worth separating out

Q: How should security teams handle prompt injection in LLM applications?

A: Security teams should treat prompt injection as an application and identity boundary problem, not just a content filtering problem.

Q: Why do prompt injection attacks create so much risk for AI agents?

A: Prompt injection is risky for AI agents because the model can be steered into using tools and credentials that already exist in the workflow.

Q: What do organisations get wrong about defending against prompt injection?

A: The common mistake is relying on input filtering alone.

Practitioner guidance

  • Separate trusted instructions from untrusted content Use structured delimiters, server-side system prompts, and explicit content tagging so retrieved documents, email bodies, and tool outputs are never treated as instructions.
  • Minimise agent permissions to the narrowest action set Issue short-lived credentials with only the API scopes and resource relationships the workflow actually needs.
  • Gate destructive actions behind deterministic approval Require explicit human confirmation before deleting data, modifying configurations, sending messages, or initiating external transfers.

What's in the full article

WorkOS's full article covers the operational detail this post intentionally leaves for the source:

  • Concrete code examples for structured system prompts and delimiter patterns that reduce instruction confusion.
  • Step-by-step guidance on input scanning, output validation, and model-critic workflows for production pipelines.
  • Implementation detail for least-privilege tool design, including approval gates and scoped credentials.
  • Discussion of regulatory mapping to the EU AI Act, NIST AI RMF, and OWASP guidance in applied deployments.

👉 Read WorkOS's guide to prompt injection attacks and defences →

Prompt injection in LLM apps: are your controls keeping up?

Explore further

View Full Forum →  |  NHI Foundation Course →



   
Quote
(@mr-nhi)
Member Moderator
Joined: 4 weeks ago
Posts: 924
 

Prompt injection is not just an LLM vulnerability, it is an identity boundary failure. The article shows that mixed-trust text becomes executable context once an LLM can act on it. That means the security problem is not only content safety, but whether the system can distinguish instruction sources before a tool call is made. Practitioners should stop treating the model as a neutral processor and start treating it as a policy-bearing execution point.

A few things that frame the scale:

  • 91.6% of secrets remain valid five days after the targeted organisation is notified, showing a critical gap in remediation procedures, according to the Ultimate Guide to NHIs.
  • Only 5.7% of organisations have full visibility into their service accounts, which means most identity teams cannot reliably see every machine credential in circulation.

A question worth separating out:

Q: How do you know if an LLM workflow is too privileged?

A: An LLM workflow is too privileged when a successful injection could reach systems or actions the user should not control in the first place. If the assistant can modify settings, access broad data, or trigger destructive API calls without explicit approval, the permission scope is too wide for safe operation.

👉 Read our full editorial: Prompt injection exposes the trust model behind LLM applications



   
ReplyQuote
Share: