TL;DR: Prompt injection in agentic systems is an action problem, not just an output problem. Attackers can steer AI agents to query internal data, perform unauthorized actions, or propagate malicious instructions to other agents. According to WorkOS, a January 2026 meta-analysis found that adaptive attacks succeed against state-of-the-art defenses more than 85% of the time. The governing question has shifted from “can the model be tricked?” to “what can the agent do if it is tricked?”
NHIMG editorial — based on content published by WorkOS: Securing agentic apps, with a focus on containing AI agent prompt injection
By the numbers:
- A meta-analysis of 78 studies published in January 2026 found that adaptive attack success rates against state-of-the-art defenses exceed 85%.
- In April 2026, researchers at Pillar Security demonstrated that a prompt injection in Google's Antigravity, an AI developer tool for filesystem operations, could be combined with the tool's permitted file-creation capability to achieve remote code execution.
- In April 2026, a Cursor AI coding agent running Claude deleted a startup's entire production database and backups in a single API call, nine seconds after receiving an instruction the agent interpreted as legitimate.
Questions worth separating out
Q: How should security teams contain prompt injection in agentic systems?
A: Containment should start with delegated identity, not prompt wording.
Q: Why do agentic apps make prompt injection more dangerous than chatbots?
A: Agentic apps can turn manipulated text into real action.
Q: What breaks when prompt injection reaches a tool-using AI agent?
A: What breaks is the assumption that the model's output is low impact.
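One way to picture containment through delegated identity is a credential that carries explicit scopes, checked before any tool runs. The sketch below is only an illustration under assumptions: `AgentCredential`, `authorize`, the tool names, and the scope strings are all hypothetical and are not taken from the WorkOS article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCredential:
    """Hypothetical delegated credential: the agent acts for one user with a narrow scope set."""
    agent_id: str
    on_behalf_of: str
    scopes: frozenset


# Hypothetical mapping from tool name to the single scope it requires.
REQUIRED_SCOPE = {
    "read_ticket": "tickets:read",     # read-only
    "close_ticket": "tickets:write",   # state-changing
    "drop_database": "db:admin",       # destructive
}


def authorize(cred: AgentCredential, tool: str) -> bool:
    """Allow a call only if the credential holds the tool's scope; unknown tools are denied."""
    required = REQUIRED_SCOPE.get(tool)
    return required is not None and required in cred.scopes


# A support agent that was only ever granted read access to tickets.
support_agent = AgentCredential(
    agent_id="support-bot",
    on_behalf_of="user-123",
    scopes=frozenset({"tickets:read"}),
)
```

With this shape, a hijacked prompt that asks the support agent to call `drop_database` fails the check regardless of how persuasive the injected text is, because the decision depends on the credential rather than on the model's output.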
Practitioner guidance
- Scope agent credentials to the minimum actionable set: Give each agent only the permissions required for its narrow workflow, and separate read-only from state-changing entitlements so a hijacked prompt cannot expand into unrelated systems.
- Treat untrusted content as adversarial input: Tag emails, documents, web pages, and tool outputs by source before they enter the context window.
- Enforce invocation policy at the tool boundary: Validate arguments, inspect call sequences, and block dangerous combinations such as read-then-send exfiltration or filesystem writes outside the workspace.
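The tagging and tool-boundary guidance above can be sketched roughly as follows. This is a minimal illustration, not the approach from the WorkOS article: `TaggedContent`, `ToolGate`, `inside_workspace`, the `/workspace` root, and every tool name are assumptions made up for the example.

```python
import posixpath
from dataclasses import dataclass, field

WORKSPACE = "/workspace"                          # assumed sandbox root
SENSITIVE_READS = {"read_secrets", "query_crm"}   # hypothetical tool names
OUTBOUND_SENDS = {"send_email", "http_post"}      # hypothetical tool names


@dataclass(frozen=True)
class TaggedContent:
    """Untrusted text labeled with its source before it enters the context window."""
    source: str   # e.g. "email", "web", "tool_output"
    text: str

    def render(self) -> str:
        # The wrapper is a label for the model and for logs, not a security boundary.
        return f'<untrusted source="{self.source}">\n{self.text}\n</untrusted>'


def inside_workspace(path: str) -> bool:
    """Lexically normalize the path and check it stays under WORKSPACE (symlinks not handled)."""
    resolved = posixpath.normpath(posixpath.join(WORKSPACE, path))
    return resolved == WORKSPACE or resolved.startswith(WORKSPACE + "/")


@dataclass
class ToolGate:
    """Checks each tool call against argument and call-sequence rules before it runs."""
    history: list = field(default_factory=list)

    def check(self, tool: str, args: dict) -> tuple:
        # Argument validation: file writes must target a path inside the workspace.
        if tool == "write_file":
            path = args.get("path")
            if not isinstance(path, str) or not path or not inside_workspace(path):
                return (False, "write outside workspace")
        # Sequence inspection: block an outbound send after any sensitive read.
        if tool in OUTBOUND_SENDS and any(t in SENSITIVE_READS for t in self.history):
            return (False, "read-then-send sequence")
        self.history.append(tool)  # only allowed calls are recorded
        return (True, "ok")
```

A usage example: after `gate = ToolGate()` and an allowed `gate.check("query_crm", {})`, a later `gate.check("send_email", {"to": "x"})` is rejected as a read-then-send sequence, while the same send on a fresh gate is allowed. Keeping the gate outside the model means the policy holds even when the prompt itself has been compromised.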
What's in the full article
WorkOS' full article covers the operational detail this post intentionally leaves for the source:
- Detailed examples of argument validation, chain analysis, and circuit breaker patterns for agent tool calls
- Code samples for validating generated code before execution in filesystem and deployment workflows
- Practical prompt structure guidance for separating system instructions, retrieved content, and user input
- A closer walkthrough of how scoped credentials and RBAC bound the blast radius of a hijacked agent
Read WorkOS' analysis of securing agentic apps against prompt injection.