What Is Indirect Instruction? Definition & Examples

Expanded Definition

Indirect instruction occurs when an AI agent takes action based on context it did not receive from the immediate operator, such as email text, shared documents, tickets, chat threads, or workflow metadata. In NHI security, that matters because the agent often has execution authority and can turn ambient content into real-world changes.

This pattern is different from a direct prompt injection because the influence is mediated through surrounding material rather than an obvious malicious command. Definitions vary across vendors, but the security concern is consistent: the agent may treat untrusted third-party text as if it were policy, instructions, or task context. That creates a governance gap between what a person intended and what the system actually executes. For a broader NHI governance lens, the Ultimate Guide to NHIs frames how automation, identity, and privileged access must be controlled across the full lifecycle, while NIST Cybersecurity Framework 2.0 provides a practical language for identifying and reducing such operational risk.

The most common mistake is to treat all retrieved or shared content as trustworthy task input. This happens when an agent is allowed to read unvalidated third-party text without strict separation between that text and its actual instructions.
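One way to picture context separation is to label every piece of context with its provenance before it reaches the agent. The following is a minimal sketch, not a reference implementation: the `Source`, `ContextItem`, and `build_prompt` names are hypothetical, and it assumes the agent runtime can tell operator input apart from third-party content. Separation of this kind does not by itself stop a model from following embedded text, but it gives downstream controls something to act on.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    OPERATOR = "operator"        # the person or system that issued the task
    THIRD_PARTY = "third_party"  # email bodies, tickets, shared files, metadata


@dataclass(frozen=True)
class ContextItem:
    source: Source
    text: str


def build_prompt(items: list[ContextItem]) -> dict[str, list[str]]:
    """Split context into instructions and reference data by provenance."""
    prompt: dict[str, list[str]] = {"instructions": [], "untrusted_data": []}
    for item in items:
        if item.source is Source.OPERATOR:
            prompt["instructions"].append(item.text)
        else:
            # Third-party text is carried as data only, never as instructions.
            prompt["untrusted_data"].append(item.text)
    return prompt
```

In this sketch, a vendor email containing "ignore prior instructions" would land in `untrusted_data`, so later stages can treat it as material to summarise rather than as a command.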

Examples and Use Cases

Implementing protections against indirect instruction often introduces friction, because the system must distinguish useful context from adversarial context without blocking legitimate work.

An agent summarises an inbox thread and then follows a hidden instruction embedded in a vendor message, causing it to draft an approval or expose data outside the intended workflow.

A support agent reads a ticket containing attacker-controlled text and uses that text to trigger a reset, export, or escalation action through an attached NHI.

A document-processing agent ingests a shared file and interprets commentary or metadata as operational guidance, even though the file was never meant to steer execution.

A code assistant connected to repositories and CI context treats comments or README content as authority, then proposes unsafe secret handling or deployment changes.

A workflow agent in a procurement or finance process follows instructions hidden in a third-party attachment, causing unauthorised data movement or approval routing.

These scenarios align with the broader NHI exposure patterns described in the Ultimate Guide to NHIs, where third-party exposure and excessive privilege amplify the blast radius. The same control problem is why implementation guidance in the NIST Cybersecurity Framework 2.0 emphasizes asset awareness, access governance, and protective safeguards.

Why It Matters in NHI Security

Indirect instruction is dangerous because it exploits the gap between identity and intent. An NHI can authenticate correctly and still behave unsafely if its context channel is compromised. That means the issue is not only access control, but also instruction provenance, context validation, and execution boundaries. When agents are connected to messaging, repositories, tickets, or document stores, the attack surface expands beyond secrets and credentials into the content the agent consumes.
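An execution boundary of the kind described above can be sketched as a gate in front of privileged tools. This is an illustrative example under stated assumptions: the action names in `PRIVILEGED_ACTIONS` are hypothetical, and the `derived_from_untrusted` flag assumes the runtime records whether third-party content was in context when the agent proposed the call.

```python
from dataclasses import dataclass

# Hypothetical set of actions that change state or move data.
PRIVILEGED_ACTIONS = frozenset({"reset_password", "export_data", "approve_payment"})


@dataclass(frozen=True)
class ToolCall:
    action: str
    # Assumption: the runtime tracks whether third-party content
    # influenced the context in which this call was proposed.
    derived_from_untrusted: bool


def decide(call: ToolCall) -> str:
    """Return 'allow' or 'require_approval' for a proposed tool call."""
    if call.action not in PRIVILEGED_ACTIONS:
        return "allow"
    if call.derived_from_untrusted:
        # Valid credentials are not enough: provenance also has to check out.
        return "require_approval"
    return "allow"
```

The design choice here is that authentication and provenance are evaluated separately, so an NHI that authenticates correctly still cannot run a privileged action on the strength of third-party text alone.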

This is especially relevant in environments where NHIs already carry excessive privilege. NHI Mgmt Group's Ultimate Guide to NHIs reports that 97% of NHIs carry excessive privileges, which can turn a single deceptive instruction into a much larger operational event. Controls from NIST Cybersecurity Framework 2.0 help organisations map the problem to governance, protection, detection, and response, but no single standard yet fully resolves indirect instruction risk for agentic systems.

Organisations typically discover the problem only after an agent has already forwarded data, changed records, or invoked a privileged tool on the basis of hostile surrounding content. By that point, addressing indirect instruction is no longer optional.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	LLM-02	Indirect instruction is a context-injection risk for autonomous agents.
OWASP Non-Human Identity Top 10	NHI-02	Agents acting on hostile context often fail because secrets and execution paths are loosely governed.
NIST CSF 2.0	PR.AA-05	Least privilege reduces the blast radius when agents misread third-party content as instructions.

Restrict NHI permissions so a mistaken context interpretation cannot trigger broad access or changes.
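A deny-by-default scope check is one way to express this restriction. The sketch below is illustrative only: the task names, scope strings, and the `is_permitted` helper are hypothetical and do not correspond to any particular product or standard.

```python
# Hypothetical mapping from task to the only scopes that task may use.
TASK_SCOPES: dict[str, frozenset[str]] = {
    "summarise_inbox": frozenset({"mail:read"}),
    "triage_ticket": frozenset({"ticket:read", "ticket:comment"}),
}


def is_permitted(task: str, requested_scope: str) -> bool:
    """Deny by default: unknown tasks and unlisted scopes are refused."""
    return requested_scope in TASK_SCOPES.get(task, frozenset())
```

With a mapping like this, an inbox-summarising agent that is tricked into attempting to send mail would be refused, because `mail:send` was never granted to that task.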
