AI agent traps and identity-aware access: what changes now?

NHI Mgmt Group (@nhi-mgmt-group), Topic starter, 10/06/2026 12:07 am

TL;DR: DeepMind’s AI Agent Traps taxonomy shows how perception, reasoning, memory, action, multi-agent, and overseer surfaces can be manipulated so agents act on hostile content, according to Pomerium’s analysis. The governance problem is that once an agent can reach tools, APIs, or data, prompt-level deception becomes an access-control problem, not just a model-safety issue.

NHIMG editorial — based on content published by Pomerium: When the web becomes the attacker, AI agent traps and the case for identity-aware access

Questions worth separating out

Q: How should security teams govern AI agents that can read web content and call tools?

A: Security teams should treat AI agents as first-class identities and authorize the action, not just the prompt.

Q: Why do AI agent traps create more risk than ordinary prompt injection?

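A: They create more risk because the content is only the starting point. Once the agent can reach tools, APIs, or data, hostile content it has read can be turned into actions taken with the agent's own access.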

Q: What do security teams get wrong about protecting AI agents from the web?

A: They often overfocus on blocking malicious text and underfocus on what the agent is allowed to do after it reads it.

Practitioner guidance

Enforce per-request authorization for agent tools: Require every AI agent request to be evaluated against identity, route, destination, and tool scope before the action executes.
Separate read access from act access: Allow an agent to observe public or retrieved content without granting it unrestricted write paths, data export paths, or internal API reach.
Scope MCP and API credentials to the smallest useful route: Issue short-lived, route-bound credentials for agent workflows and avoid long-lived bearer tokens that survive beyond the immediate task.
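
As a rough illustration of the first two points, the sketch below evaluates each agent request against identity, route, destination, tool, and read/act mode, and denies by default. It is not Pomerium's policy language or API. The class names, the agent ID, the routes, and the hostnames are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass

# Hypothetical request and grant shapes, invented for illustration only.
@dataclass(frozen=True)
class AgentRequest:
    agent_id: str     # identity of the calling agent
    route: str        # proxy route the request arrives on
    destination: str  # upstream host the tool call targets
    tool: str         # tool being invoked
    mode: str         # "read" or "act"

@dataclass(frozen=True)
class Grant:
    agent_id: str
    route: str
    destination: str
    tool: str
    modes: frozenset  # which of "read" / "act" this grant permits

# Example grants: search is read-only, ticketing allows read and act.
GRANTS = [
    Grant("research-agent", "/mcp/search", "search.internal.example",
          "web_search", frozenset({"read"})),
    Grant("research-agent", "/mcp/tickets", "tickets.internal.example",
          "create_ticket", frozenset({"read", "act"})),
]

def authorize(req: AgentRequest, grants=GRANTS) -> bool:
    """Default deny: allow only if a single grant matches every attribute."""
    return any(
        g.agent_id == req.agent_id
        and g.route == req.route
        and g.destination == req.destination
        and g.tool == req.tool
        and req.mode in g.modes
        for g in grants
    )
```

With these example grants, a read on the search route is allowed, but the same agent asking to act on that route, or to send the call to a different destination, is denied regardless of what the prompt said.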

What's in the full article

Pomerium's full blog post covers the operational detail this post intentionally leaves for the source:

A category-by-category walkthrough of the six AI Agent Traps families and how each one maps to a different failure mode.
Concrete policy examples for MCP routes, including how request conditions can bind identity to tool access.
Step-by-step analysis of the M365 Copilot exfiltration pattern and how a proxy-based control layer contains it.
A deeper explanation of how Pomerium issues scoped JWTs and logs every allow-or-deny decision for incident response.

👉 Read Pomerium's analysis of AI agent traps and identity-aware access →


Mr NHI (@mr-nhi), 11/06/2026 1:46 am

AI agent traps expose an identity failure, not just a model-safety failure: the decisive boundary is whether the agent can turn manipulated input into a privileged action. Once the agent can call tools, reach APIs, or move data, the question is no longer only whether the model was fooled. The question becomes whether the access layer still enforces identity, route, and scope at execution time. Practitioners should treat agent action control as the primary control plane.

A few things that frame the scale:

92% of organisations expose NHIs to third parties, raising concerns about supply chain security, according to the Ultimate Guide to NHIs.
Only 5.7% of organisations have full visibility into their service accounts, which means many teams cannot reliably trace which non-human identity can reach which downstream tool or system.

A question worth separating out:

Q: What is the difference between model safety and identity-aware access for AI agents?

A: Model safety tries to keep the system from following harmful instructions, while identity-aware access constrains what happens if instructions are followed anyway. The first is about influence. The second is about enforcement. For enterprise use, the access layer is the last reliable boundary when the model is already compromised by deceptive content.
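
To make the enforcement side concrete, here is a minimal sketch of a short-lived, route-bound credential checked at execution time, in the spirit of the practitioner guidance above. It uses a simplified HMAC-signed token built from the Python standard library as a stand-in. It is not a JWT and not how any particular product issues credentials. The secret, the function names, and the routes are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumption: a demo-only shared secret; real systems would use managed keys.
SECRET = b"demo-only-secret"

def issue_token(agent_id, route, ttl_seconds=60, now=None):
    """Issue a token bound to one agent and one route, expiring after ttl_seconds."""
    now = time.time() if now is None else now
    payload = json.dumps(
        {"sub": agent_id, "route": route, "exp": now + ttl_seconds},
        sort_keys=True,
    ).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(sig).decode())

def verify_token(token, route, now=None):
    """Accept only an untampered, unexpired token presented on its own route."""
    now = time.time() if now is None else now
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return False  # signature mismatch
    claims = json.loads(payload)
    return claims["route"] == route and now < claims["exp"]
```

If deceptive content persuades the agent to reuse a search-route token against an export route, or to replay it after the task window has closed, the check fails even though the model itself was fooled.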

👉 Read our full editorial: AI agent traps expose identity-aware access gaps in web workflows
