What Is Passive Prompt Injection? Definition & Examples

Expanded Definition

Passive prompt injection is a form of indirect instruction hijacking in which malicious guidance is hidden inside content an AI agent will later read, summarize, classify, or act on. Unlike direct prompt injection, the attacker does not need to converse with the model; the payload can live in an email, document, webpage, ticket, or retrieved record.

In NHI and agentic AI workflows, the risk appears when untrusted text is allowed to influence execution logic, especially when the system can call tools, open files, send messages, or fetch additional data. This makes passive prompt injection different from ordinary content moderation problems, because the concern is not only what the model says, but what the model does after interpreting the content. Guidance across vendors is still evolving, but the defensive principle is consistent: treat externally sourced text as data, not authority, and separate retrieval from execution. The OWASP Agentic AI Top 10 frames this as a core agentic application risk.

The most common misapplication is allowing model-generated interpretations from untrusted content to trigger downstream actions without an authorization checkpoint, which occurs when retrieval and tool use share the same trust boundary.

Examples and Use Cases

Implementing defenses against passive prompt injection rigorously often introduces extra filtering, review, and permission checks, requiring organisations to weigh automation speed against the risk of delegated action on hostile content.

An AI email assistant reads an inbound message that contains hidden instructions to expose calendar details or draft a fraudulent reply.

A support copilot summarizes a ticket with embedded adversarial text and then proposes a tool action that changes account settings.

A retrieval-augmented agent ingests a webpage or knowledge base page that includes instructions designed to steer file access or network requests.

A document-processing workflow passes contract text into an AI reviewer, and the embedded payload attempts to redirect the model toward an unsafe approval path.

A procurement bot ingests vendor-submitted content and is tricked into requesting internal records or generating a misleading risk classification.

These scenarios align closely with the OWASP Agentic Applications Top 10, which NHI Management Group uses to map agent exposure across workflows where untrusted content crosses into action. They also reflect the broader concern described in the OWASP Agentic AI Top 10, where instruction hierarchy and tool permissions must be treated separately.

Why It Matters in NHI Security

Passive prompt injection matters because NHI-driven systems often hold the exact privileges attackers want: API keys, service account access, delegated tokens, and tool permissions. If a model is allowed to reason over untrusted content and then act on it, the impact can extend beyond bad output into credential exposure, unauthorized changes, and lateral movement. NHI Management Group research shows that 90% of IT leaders say properly managing NHIs is essential for a successful zero-trust implementation, which is relevant here because the same zero-trust logic should govern agentic decision paths.

This is also where secret hygiene and privilege boundaries intersect with prompt safety. If an injected instruction can cause an agent to read stored secrets, call a sensitive API, or retrieve more context than necessary, the failure is no longer only linguistic. It becomes an access control problem. Organisations that ignore this distinction often discover it after an unexpected action, a suspicious API call, or a leaked artifact reveals that the agent followed hostile instructions.

Practitioners typically encounter passive prompt injection only after an agent has already executed an unsafe tool action or exfiltrated sensitive context, at which point the term becomes operationally unavoidable to address.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Covers indirect prompt injection and instruction hierarchy failures in agentic systems.
OWASP Non-Human Identity Top 10	NHI-05	Agent abuse often exploits NHI permissions, secrets, and over-privileged workflows.
NIST CSF 2.0	PR.AC-4	Least-privilege access is central when agents process untrusted content and invoke tools.

Isolate untrusted content from tool execution and require policy checks before any agent action.

#1 Authority in NHI Education, Research and Advisory, empowering organizations to tackle the critical risks posed by Non-Human Identities (NHIs), including AI Agents.

Passive Prompt Injection

Expanded Definition

Examples and Use Cases

Why It Matters in NHI Security

Standards & Framework Alignment

Related resources from NHI Mgmt Group