Passive prompt injection is when malicious instructions are embedded in content that an AI system will later process automatically. The attacker does not need direct chat access. The risk comes from treating untrusted text as operational guidance inside a workflow that can trigger tool use, file access, or network requests.
Expanded Definition
Passive prompt injection is a form of indirect instruction hijacking in which malicious guidance is hidden inside content an AI agent will later read, summarize, classify, or act on. Unlike direct prompt injection, the attacker does not need to converse with the model; the payload can live in an email, document, webpage, ticket, or retrieved record.
In NHI and agentic AI workflows, the risk appears when untrusted text is allowed to influence execution logic, especially when the system can call tools, open files, send messages, or fetch additional data. This makes passive prompt injection different from ordinary content moderation problems, because the concern is not only what the model says, but what the model does after interpreting the content. Guidance across vendors is still evolving, but the defensive principle is consistent: treat externally sourced text as data, not authority, and separate retrieval from execution. The OWASP Agentic AI Top 10 frames this as a core agentic application risk.
The most common misapplication is allowing model-generated interpretations from untrusted content to trigger downstream actions without an authorization checkpoint, which occurs when retrieval and tool use share the same trust boundary.
Examples and Use Cases
Implementing defenses against passive prompt injection rigorously often introduces extra filtering, review, and permission checks, requiring organisations to weigh automation speed against the risk of delegated action on hostile content.
- An AI email assistant reads an inbound message that contains hidden instructions to expose calendar details or draft a fraudulent reply.
- A support copilot summarizes a ticket with embedded adversarial text and then proposes a tool action that changes account settings.
- A retrieval-augmented agent ingests a webpage or knowledge base page that includes instructions designed to steer file access or network requests.
- A document-processing workflow passes contract text into an AI reviewer, and the embedded payload attempts to redirect the model toward an unsafe approval path.
- A procurement bot ingests vendor-submitted content and is tricked into requesting internal records or generating a misleading risk classification.
These scenarios align closely with the OWASP Agentic Applications Top 10, which NHI Management Group uses to map agent exposure across workflows where untrusted content crosses into action. They also reflect the broader concern described in the OWASP Agentic AI Top 10, where instruction hierarchy and tool permissions must be treated separately.
Why It Matters in NHI Security
Passive prompt injection matters because NHI-driven systems often hold the exact privileges attackers want: API keys, service account access, delegated tokens, and tool permissions. If a model is allowed to reason over untrusted content and then act on it, the impact can extend beyond bad output into credential exposure, unauthorized changes, and lateral movement. NHI Management Group research shows that 90% of IT leaders say properly managing NHIs is essential for a successful zero-trust implementation, which is relevant here because the same zero-trust logic should govern agentic decision paths.
This is also where secret hygiene and privilege boundaries intersect with prompt safety. If an injected instruction can cause an agent to read stored secrets, call a sensitive API, or retrieve more context than necessary, the failure is no longer only linguistic. It becomes an access control problem. Organisations that ignore this distinction often discover it after an unexpected action, a suspicious API call, or a leaked artifact reveals that the agent followed hostile instructions.
Practitioners typically encounter passive prompt injection only after an agent has already executed an unsafe tool action or exfiltrated sensitive context, at which point the term becomes operationally unavoidable to address.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A1 | Covers indirect prompt injection and instruction hierarchy failures in agentic systems. |
| OWASP Non-Human Identity Top 10 | NHI-05 | Agent abuse often exploits NHI permissions, secrets, and over-privileged workflows. |
| NIST CSF 2.0 | PR.AC-4 | Least-privilege access is central when agents process untrusted content and invoke tools. |
Isolate untrusted content from tool execution and require policy checks before any agent action.
Related resources from NHI Mgmt Group
- What is the difference between prompt injection risk and identity abuse in agents?
- What is the difference between prompt injection and credential theft for agents
- What is the difference between prompt injection and tool poisoning?
- How should security teams reduce indirect prompt injection risk in AI systems?
Deepen Your Knowledge
Reviewed and updated by the NHIMG editorial team on June 9, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org