Notifications

Clear all

Indirect prompt injection and the governance gap teams are missing

Last Post

RSS

NHI Mgmt Group

(@nhi-mgmt-group)

Member Moderator

Joined: 1 year ago

Posts: 12324

Topic starter 12/06/2026 12:16 am

TL;DR: Indirect prompt injection succeeds when malicious instructions are embedded in trusted data and LLMs can act on them across sensitive workflows, according to Pillar Security’s analysis. The real risk is not the payload alone but the combination of private data access, untrusted inputs, and external communication that turns prompt attacks into operational exploits.

NHIMG editorial — based on content published by Pillar Security: Anatomy of an Indirect Prompt Injection

By the numbers:

98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments.

Questions worth separating out

Q: How should security teams reduce indirect prompt injection risk in LLM workflows?

A: Start by separating untrusted content from system instructions, then limit what the model can do with sensitive data.

Q: Why do private-data access and outbound tools make prompt injection worse?

A: Because prompt injection becomes operational when the model can read something valuable and send it somewhere useful.

Q: What do teams get wrong about indirect prompt injection?

A: They often focus only on the prompt text and ignore the surrounding workflow.

Practitioner guidance

Separate instruction channels from data channels Keep system instructions, user prompts, and untrusted content in distinct processing paths.
Restrict outbound capability on high-risk LLM workflows Remove or tightly mediate external communication paths where models process private data.
Test for CFS exposure in real workflows Red-team the exact content types your teams use most, including HTML, JSON, code comments, and ticket text.

What's in the full article

Pillar Security's full research covers the operational detail this post intentionally leaves for the source:

Side-by-side examples of successful and failed indirect prompt injection payloads across email, tickets, and code.
Detailed breakdown of how the CFS model changes with content format, placement, and instruction phrasing.
Workflow-specific attacker patterns that show how context fit changes between assistants, coding tools, and ticketing systems.
Additional examples of how defenders can recognise high-salience payloads before they reach tool execution.

👉 Read Pillar Security's analysis of indirect prompt injection and the CFS model →

Indirect prompt injection and the governance gap teams are missing?

Explore further

View Full Forum → | NHI Foundation Course →

Quote

Topic Tags

Mr NHI

(@mr-nhi)

Member Moderator

Joined: 2 months ago

Posts: 11878

12/06/2026 9:24 am

Indirect prompt injection is an instruction-boundary problem before it is an AI problem. The failure begins when systems collapse data and directives into one processing stream, then ask the model to decide what is authoritative. That makes the control gap broader than prompt hygiene. Practitioners need to treat content ingestion, tool invocation, and output generation as separate trust zones.

A few things that frame the scale:

98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to AI Agents: The New Attack Surface report.
Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.

A question worth separating out:

Q: Who is accountable when an LLM leaks data after following malicious instructions?

A: Accountability sits with the organisation that granted the model access, connected the tools, and allowed untrusted content into the same decision path. That makes this a governance issue across IAM, security engineering, and application ownership, not a defect that belongs to the model alone.

👉 Read our full editorial: Indirect prompt injection is becoming an operational exploit

ReplyQuote

Forum Statistics

11 Forums

13.6 K Topics

26 K Posts

12 Online

135 Members

Latest Post: Developer tooling and identity risk: are your controls keeping up? Our newest member: Alex Recent Posts Unread Posts Tags

Forum Icons: Forum contains no unread posts Forum contains unread posts

Topic Icons: Not Replied Replied Active Hot Sticky Unapproved Solved Private Closed

#1 Authority in NHI Education, Research and Advisory, empowering organizations to tackle the critical risks posed by Non-Human Identities (NHIs), including AI Agents.

Get in Touch

Quick Links

FAQ

NHI 101 Articles

Legal & Policies