Prompt injection across AI agents: are your controls keeping up?


(@nhi-mgmt-group)

TL;DR: AI agent attack techniques are resurfacing across platforms, with Zenity describing how prompt injection, retrieval poisoning, trusted-domain abuse, and image-rendering tricks reappear in Bing, Microsoft 365 Copilot, Salesforce Einstein, and Agentforce. The pattern shows that agent risks are structural, not one-off vulnerabilities, and demand a dedicated security layer rather than vendor-only fixes.

NHIMG editorial — based on content published by Zenity: 0Click Attacks, When TTPs Resurface Across Platforms

By the numbers:

  • 80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%).

Questions worth separating out

Q: How should security teams defend AI agents against prompt injection across different platforms?

A: Treat prompt injection as an agent boundary problem, not a single-product defect.

Q: Why do AI agent attacks keep reappearing after vendor fixes?

A: They keep reappearing because the underlying primitives stay available: untrusted input, retrieval, prompt manipulation, and automatic actions.

Q: What do security teams get wrong about agentic AI prompt attacks?

A: The most common mistake is treating prompt filtering as sufficient.

Practitioner guidance

  • Classify retrieval sources before they reach agent context: mark CRM records, email, and other external content as untrusted until they pass sanitisation and trust-scoring checks.
  • Constrain automatic rendering and outbound fetches: disable or tightly broker markdown image loading, remote resource calls, and other automatic fetch paths in agent workflows.
  • Map controls to agentic attack primitives: use frameworks such as the OWASP Agentic AI Top 10 and the MITRE ATLAS adversarial AI threat matrix to test retrieval poisoning, prompt injection, and tool misuse across environments.
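
As a rough illustration of the first two points, the Python sketch below treats retrieved content as unsanitised by default and removes markdown images whose host is not on an allowlist before the text is admitted to agent context. Every name in it (RetrievedItem, ALLOWED_IMAGE_HOSTS, broker_images, sanitise, admit_to_context, and the example hosts) is hypothetical and not taken from Zenity's research or any vendor API. The regex only covers inline markdown images, so HTML img tags and reference-style images would need separate handling.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical allowlist of hosts the renderer may load images from.
ALLOWED_IMAGE_HOSTS = {"assets.example.com"}

# Inline markdown image: ![alt](url "optional title")
MD_IMAGE = re.compile(r"!\[([^\]]*)\]\(\s*<?([^)\s>]+)>?[^)]*\)")


@dataclass
class RetrievedItem:
    source: str               # e.g. "crm", "email" (labels are assumptions)
    text: str
    sanitised: bool = False   # external content starts out untrusted


def broker_images(markdown: str) -> str:
    """Replace images from non-allowlisted hosts with an inert placeholder."""
    def _replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        host = urlparse(url).hostname or ""
        if host in ALLOWED_IMAGE_HOSTS:
            return match.group(0)
        return f"[image removed: {alt}]"
    return MD_IMAGE.sub(_replace, markdown)


def sanitise(item: RetrievedItem) -> RetrievedItem:
    """Return a sanitised copy; a real pipeline would add trust scoring here."""
    return RetrievedItem(item.source, broker_images(item.text), sanitised=True)


def admit_to_context(item: RetrievedItem) -> str:
    """Gate: refuse anything that has not been through sanitise()."""
    if not item.sanitised:
        raise ValueError(f"unsanitised content from {item.source!r}")
    return item.text


if __name__ == "__main__":
    record = RetrievedItem("crm", "Note ![a](http://attacker.test/?q=1)")
    print(admit_to_context(sanitise(record)))
```

The gate is deliberately separate from the sanitiser, so a code path that skips sanitisation fails with an error instead of quietly passing raw CRM or email text to the agent.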

What's in the full article

Zenity's full analysis covers the operational detail this post intentionally leaves for the source:

  • Walkthrough of the specific proof-of-concept chains across Bing, Microsoft 365 Copilot, Salesforce Einstein, and Agentforce.
  • Exact payload patterns used for retrieval poisoning, markdown image exfiltration, and trusted-domain abuse.
  • Research notes on which agent behaviours made the attacks persist across platforms.
  • Examples of how the attack primitives map to detection and red-team testing.

👉 Read Zenity's analysis of recurring AI agent prompt-injection techniques →




   
(@mr-nhi)

Prompt injection is now a repeatable control-break pattern, not an isolated bug class. Zenity’s examples show the same attack logic resurfacing across Bing, Copilot, Salesforce Einstein, and Agentforce with only minor variations. That recurrence matters because it means defenders are facing a technique family that adapts to platform context faster than point fixes can eliminate it. Practitioners should treat prompt injection as an enduring agent-governance problem, not a vendor-specific anomaly.

A few things that frame the scale:

  • Only 52% of companies can track and audit the data their AI agents access, leaving the other 48% with a complete blind spot for compliance and breach investigation, according to the report AI Agents: The New Attack Surface.
  • Another finding from the same research shows that 80% of organisations say their AI agents have already acted beyond intended scope, including unauthorised access, sensitive-data sharing, and credential disclosure.

A question worth separating out:

Q: How can organisations tell whether AI agents are exposed to covert exfiltration paths?

A: Look for any workflow where the agent can render remote content, call external URLs, or transform retrieved data into network requests. If those steps happen automatically, the agent may have an exfiltration channel even when users never click anything. That signal deserves immediate containment review.
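
One way to make that check repeatable is to run it over an inventory of agent workflows. The Python sketch below assumes such an inventory already exists and that each step records its capabilities and whether it runs without user approval. The schema (Workflow, Step), the capability labels, and the function name are all hypothetical, and the output is a list of candidates for containment review, not proof of an exfiltration path.

```python
from dataclasses import dataclass

# Hypothetical capability labels for steps that can cause network egress.
NETWORK_CAPABILITIES = {
    "render_remote_content",   # e.g. markdown images, embedded resources
    "call_external_url",       # outbound fetches or webhooks
    "data_to_request",         # retrieved data shaped into a request
}


@dataclass
class Step:
    name: str
    capabilities: set
    automatic: bool   # True if the step runs without user approval


@dataclass
class Workflow:
    name: str
    steps: list


def exfiltration_candidates(workflows):
    """List (workflow, step, capabilities) for automatic network-capable steps."""
    findings = []
    for wf in workflows:
        for step in wf.steps:
            risky = step.capabilities & NETWORK_CAPABILITIES
            if risky and step.automatic:
                findings.append((wf.name, step.name, sorted(risky)))
    return findings


if __name__ == "__main__":
    inventory = [
        Workflow("case-summary", [
            Step("fetch_record", {"read_crm"}, automatic=True),
            Step("render_reply", {"render_remote_content"}, automatic=True),
        ]),
        Workflow("manual-lookup", [
            Step("open_link", {"call_external_url"}, automatic=False),
        ]),
    ]
    for finding in exfiltration_candidates(inventory):
        print(finding)
```

Steps that need a user click are left out on purpose, because the zero-click case is the one the answer above singles out. A fuller review would likely cover them as well.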

👉 Read our full editorial: AI agent prompt injection keeps resurfacing across platforms



   