Anthropic’s zero trust guide shows AI agent security baselines rising

By NHI Mgmt Group Editorial Team | Published 2026-06-04 | Domain: Agentic AI & NHIs | Source: Pomerium

TL;DR: Anthropic’s zero trust guide for AI agents argues that identity, least agency, observability, and governance now need to move from static controls to continuously verified ones, with short-lived tokens, cryptographic identity, and machine-speed response as the new baseline, according to Pomerium’s analysis. Static API keys and review-cadence controls no longer match agent behaviour, because access can be exercised faster than human governance cycles can observe it.

At a glance

What this is: Pomerium’s analysis of Anthropic’s AI agent security guide says the baseline for agent governance has moved to cryptographic identity, short-lived access, and continuous authorization.

Why it matters: IAM teams now need controls that work for non-human and autonomous behaviours as well as human users, because agentic systems can bypass access review assumptions and compress the time available to detect misuse.

By the numbers:

92% agree governing AI agents is critical to enterprise security, yet only 44% have implemented any policies to do so.
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%).


Context

AI agent security is now an identity problem, not just a prompt-injection problem. When an agent can hold credentials, call tools, and reach production systems, existing IAM assumptions about stable privilege, human-paced review, and auditable intent start to break down. Anthropic’s guide is useful because it turns that shift into implementation guidance instead of leaving it at the threat headline level.

The key governance issue is that agents behave like non-human identities with runtime discretion, so access must be tied to verifiable identity, narrow scope, and continuous decisioning. That is why the article’s focus on cryptographic identity, least agency, and machine-speed response matters to both NHI and autonomous governance programmes. It also maps to a broader category problem: controls built for periodic certification do not naturally fit actors that change state inside a session.

Key questions

Q: What breaks when AI agents rely on static API keys?

A: Static API keys turn agent access into a reusable exposure window. Once the secret leaks, the attacker inherits the same permissions the agent had, often with no additional challenge or context check. That defeats the purpose of least privilege and makes revocation, attribution, and containment much harder than in short-lived, identity-bound access models.

Q: Why do AI agents complicate zero trust architecture?

A: AI agents complicate zero trust because they can change tools, actions, and data scope during runtime. Zero trust assumes every access decision can be evaluated continuously, but agents can move faster than human review cycles if they are not bound to cryptographic identity and short-lived authorisation. The control problem is therefore both identity and timing.

Q: How do security teams know whether agent governance is actually working?

A: Look for evidence that access is short-lived, task-scoped, and auditable at the action level. If investigators can reconstruct which tool the agent used, why it used it, and who approved the sensitive step, governance is functioning. If teams only have broad logs and periodic reviews, they still have an attribution gap.

Q: Who is accountable when an AI agent exceeds its intended scope?

A: Accountability sits with the organisation that approved the agent’s identity, permissions, and operating policy, not with the model itself. If an agent can access data or trigger actions outside its intended scope, the governance failure is usually in provisioning, approval boundaries, or monitoring. That is why ownership must span IAM, security, and application teams.

Technical breakdown

Cryptographic agent identity and service authentication

Anthropic’s guidance treats identity as the prerequisite for every other control because access control, attribution, and audit all fail when the actor cannot be uniquely and reliably identified. A label is not enough if it can be forged, so the guide pushes toward cryptographically rooted identity, short-lived tokens, and mutually authenticated service access. That closes the gap created by shared passwords and static API keys, which remain easy to reuse once exposed. For identity teams, this is the point where agent governance stops resembling user onboarding and starts resembling workload identity.

Practical implication: Replace static secrets with verifiable, short-lived credentials and require agent identity to be bound to the receiving service, not just the caller.
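
As an illustration of that implication, the following is a minimal sketch of a short-lived credential that is bound to one receiving service. It is not taken from Anthropic's guide or Pomerium's analysis. The names issue_token, verify_token, and SIGNING_KEY are hypothetical, and the shared HMAC key is a simplifying assumption so the example runs on the Python standard library alone; a production design would more likely root identity in asymmetric keys or mutual TLS.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical demo key. Assumption: a real deployment would use asymmetric
# signatures or mTLS from a workload identity system, not a shared secret.
SIGNING_KEY = b"demo-signing-key-not-for-production"


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def _sign(payload: str) -> str:
    return _b64(hmac.new(SIGNING_KEY, payload.encode("ascii"), hashlib.sha256).digest())


def issue_token(agent_id: str, audience: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Issue a signed token naming the agent, the one service it is for, and an expiry."""
    issued_at = time.time() if now is None else now
    claims = {"sub": agent_id, "aud": audience, "exp": issued_at + ttl_seconds}
    payload = _b64(json.dumps(claims, sort_keys=True).encode("utf-8"))
    return f"{payload}.{_sign(payload)}"


def verify_token(token: str, expected_audience: str,
                 now: Optional[float] = None) -> Optional[dict]:
    """Return the claims only if the token is authentic, unexpired, and for this service."""
    current = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
    except ValueError:
        return None
    # Constant-time comparison so the check does not leak signature bytes.
    if not hmac.compare_digest(signature, _sign(payload)):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload.encode("ascii")))
    if claims["aud"] != expected_audience or current >= claims["exp"]:
        return None
    return claims
```

A token issued for one service is rejected by every other service and stops working after its time-to-live, so a leaked credential has a much smaller exposure window than a static API key.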

Least agency, continuous authorisation, and just-in-time privilege

The guide’s least agency model extends least privilege by making access moment-specific rather than role-stable. Role-based access can be a starting point, but the higher tiers move toward attribute-based decisions, continuous authorisation, and just-in-time elevation that expires when the task ends. That matters because an agent’s intent is not fixed at provisioning time. What looks like a safe permission set at 9 a.m. can become excessive by 9:05 a.m. if the agent changes tasks, tools, or data scope. The practical shift is from entitlement review to runtime constraint enforcement.

Practical implication: Use continuous checks for sensitive actions and make elevation task-scoped, reversible, and automatically withdrawn after completion.
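
One way to picture task-scoped, automatically withdrawn elevation is the sketch below. It is an illustration under stated assumptions rather than the guide's design: TaskGrant and Authorizer are hypothetical names, the grant store is an in-memory dictionary, and time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class TaskGrant:
    """A just-in-time grant limited to one task, a set of actions, and a time window."""
    agent_id: str
    task_id: str
    allowed_actions: frozenset
    expires_at: float
    revoked: bool = False

    def permits(self, action: str, now: float) -> bool:
        return (not self.revoked) and now < self.expires_at and action in self.allowed_actions


class Authorizer:
    """Hypothetical in-memory policy point, re-evaluated before every sensitive action."""

    def __init__(self):
        self._grants = {}

    def elevate(self, agent_id, task_id, actions, ttl_seconds, now):
        # Elevation is tied to a single task and carries its own expiry.
        grant = TaskGrant(agent_id, task_id, frozenset(actions), now + ttl_seconds)
        self._grants[(agent_id, task_id)] = grant
        return grant

    def check(self, agent_id, task_id, action, now) -> bool:
        # Called per action, not once per session, so a stale grant cannot be reused.
        grant = self._grants.get((agent_id, task_id))
        return grant is not None and grant.permits(action, now)

    def complete_task(self, agent_id, task_id):
        # Finishing the task withdraws the elevation even if time remains.
        grant = self._grants.pop((agent_id, task_id), None)
        if grant is not None:
            grant.revoked = True
```

With a 300-second time-to-live, a grant issued at 9 a.m. no longer permits anything shortly after 9:05 a.m., and completing the task earlier withdraws it immediately.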

Observability, traceability, and machine-speed response

The guide separates ordinary logging from traceability, which is the difference between recording that something happened and reconstructing what the agent actually did internally. That includes tool calls, sub-agent activity, and the chain of decisions that led to action. The article also makes a clear operational point: automate the bookkeeping around incidents, not the containment decision itself. In agent environments, machine speed changes the value of telemetry. Dwell time and investigation coverage become more important than traditional alert volume because the window for human intervention is smaller.

Practical implication: Instrument agent actions for replayable provenance, then keep containment and disclosure decisions under human control.
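
The difference between logging and replayable provenance can be sketched with a hash-chained trace, shown below as a minimal example. The ProvenanceLog class, its field names, and the choice of SHA-256 chaining are assumptions made for illustration; the source does not prescribe a format.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def _digest(body: dict) -> str:
    # Canonical JSON so the same entry always hashes to the same value.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class ProvenanceLog:
    """Hypothetical append-only trace; each entry commits to the one before it."""

    def __init__(self):
        self.entries = []

    def record(self, agent_id, tool, arguments, reason, parent_agent=None):
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        body = {
            "seq": len(self.entries),
            "agent_id": agent_id,
            "parent_agent": parent_agent,  # set when a sub-agent acts on delegation
            "tool": tool,
            "arguments": arguments,
            "reason": reason,              # the agent's stated rationale for the call
            "prev_hash": prev_hash,
        }
        entry = dict(body, hash=_digest(body))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Detect edited, reordered, or removed entries by re-walking the chain."""
        prev_hash = GENESIS_HASH
        for index, entry in enumerate(self.entries):
            body = {key: value for key, value in entry.items() if key != "hash"}
            if body["prev_hash"] != prev_hash or body["seq"] != index:
                return False
            if _digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True

    def replay(self):
        """Return the ordered action sequence an investigator would reconstruct."""
        return [(e["agent_id"], e["tool"], e["reason"]) for e in self.entries]
```

Because every entry includes the hash of its predecessor, an investigator can both replay the sequence of tool calls and detect whether any step was altered after the fact.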

Threat narrative

Attacker objective: The attacker wants to turn an agent’s legitimate runtime access into unaudited execution against sensitive tools, data, or production workflows.

Entry begins when an attacker finds a hardcoded API key, shared service-account password, or similarly weak agent credential that can be reused to reach production-connected tools.
Credential access then becomes service access, because the compromised identity is already authorised to call APIs, query data, or trigger downstream systems without additional challenge.
Impact follows when the attacker uses that standing access to exfiltrate data, alter outputs, or abuse connected systems before human review catches the activity.

Moltbook AI agent keys breach: the Moltbook breach exposed 1.5M AI agent keys.
MongoBleed breach: MongoBleed exposed secrets across 87K MongoDB servers.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

AI agent governance is now an identity discipline, not a model-safety side topic. The article’s most useful contribution is that it treats agent security as a stack of identity controls rather than a debate about AI novelty. That aligns with OWASP-NHI and zero trust thinking, where the actor’s access path matters more than the label on the software. Practitioners should read agent security as workload identity plus continuous authorisation, not as an isolated AI control plane problem.

Static API keys are no longer a defensible baseline for agent access. Anthropic’s guidance makes the same structural point NHI teams have been making for years: once a secret is reusable, the blast radius outlives the original intent. In agent deployments, that problem gets worse because the actor can discover, combine, and spend access at runtime. The practical conclusion is that static credential patterns are an exposure window, not a governance model.

Least privilege becomes least agency when the actor can decide what to do next. The guide’s tiering shows why permission models must move from provisioning-time assumptions toward action-time constraint. The implication is that identity governance has to evaluate runtime discretion, not just entitlements.

Least privilege at provisioning time was designed for access patterns with stable intent. That assumption fails when the actor can choose tools, sequence actions, and continue executing without human approval. The implication is not merely tighter controls, but a rethink of what it means to certify privilege for an actor that can change its own path between review cycles.

Observability without replayable provenance will not support agent incident response. Logging alone records outcomes, but agent governance needs to explain intermediate tool use, sub-agent delegation, and internal decision paths. That is a different standard from traditional application logging and it is closer to workload forensics than user audit. Practitioners should expect traceability demands to rise as agent deployments move from pilots into production.

From our research:
92% agree governing AI agents is critical to enterprise security, yet only 44% have implemented any policies to do so, according to AI Agents: The New Attack Surface report.
Another finding in the same research says 80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, sharing sensitive data, and revealing credentials.
For the broader control model, see OWASP Agentic AI Top 10 for a structured view of agent misuse and identity abuse risks.

What this signals

Least agency: the practical boundary for AI agent governance is moving from who can log in to what the actor can do at runtime. Teams that still rely on access reviews, static entitlements, and broad service accounts will find that agents create more exceptions than those processes can absorb. The result is a governance gap that looks small in pilot stages and becomes structural once agents touch production. For implementation context, pair this with Ultimate Guide to NHIs, Standards.

With 80% of organisations already reporting AI agents acting beyond intended scope, security programmes should treat agent misuse as a present control condition, not a future scenario. The security task is to make privileged actions observable and reversible before the agent completes the workflow. That means aligning detection, authorisation, and incident response around the agent session rather than the user account. OWASP Agentic AI Top 10 is the right external lens for that work.

For practitioners

Replace reusable agent secrets with verifiable identity: Bind each agent to a cryptographic identity and use short-lived credentials for every production-connected tool. Eliminate shared passwords and static API keys for any agent that can reach sensitive data or trigger actions.
Enforce least agency at the action layer: Scope permissions to the specific task, data set, and time window, then re-evaluate before each sensitive action. Treat continuous authorisation as the control point for agents that can pivot mid-session.
Instrument agent provenance, not just alerts: Capture tool calls, sub-agent activity, and decision trails in a replayable format so investigators can reconstruct how the action unfolded. Pair that telemetry with human-led containment and disclosure decisions.
Treat agent configs as governed code: Version-control, review, and sign agent configuration the same way you protect application code. A tampered config can change scope, routing, or tool access without touching the underlying model.
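
For the last item, a minimal sketch of refusing to load an agent configuration that differs from its reviewed version is shown below. It uses a pinned SHA-256 digest as a simple stand-in for a full signature scheme, which is an assumption for illustration; config_digest, load_reviewed_config, and the example configuration fields are hypothetical.

```python
import hashlib
import hmac
import json


def config_digest(config: dict) -> str:
    """Hash a canonical JSON form so key order and whitespace do not change the digest."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def load_reviewed_config(config: dict, approved_digest: str) -> dict:
    """Return the config only if it matches the digest recorded at review time."""
    if not hmac.compare_digest(config_digest(config), approved_digest):
        raise ValueError("agent config does not match the reviewed version")
    return config


# Hypothetical example: the digest is recorded when the change is approved.
reviewed = {"agent": "support-bot", "tools": ["search_tickets"], "max_scope": "read"}
approved = config_digest(reviewed)
```

Adding a tool or widening scope after review changes the digest, so the altered configuration is rejected at load time instead of silently changing what the agent can reach.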

Key takeaways

AI agent security is really a governance problem about identity, scope, and runtime discretion.
When agents can act beyond intended scope, static secrets and periodic review no longer provide enough control.
Practitioners need cryptographic identity, short-lived access, and action-level provenance before agent deployments expand further.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST Zero Trust (SP 800-207) sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10		Agent identity, tool use, and runtime scope are central to this article.
OWASP Non-Human Identity Top 10	NHI-01	Cryptographic identity and secret handling are directly relevant to agent access.
NIST Zero Trust (SP 800-207)	PR.AC-4 (NIST CSF 1.1 subcategory)	Continuous authorisation and receiving-end isolation map to zero trust access decisions.

Enforce continuous verification for each sensitive action and isolate resources at the service boundary.

Key terms

Least Agency: Least Agency is the principle that an AI agent should have only the permissions needed for the current task, data set, and moment in time. In practice, it extends least privilege into runtime decisioning, where access can be narrowed, rechecked, and withdrawn as the agent’s activity changes.
Attribution Gap: An attribution gap is the condition where an action cannot be reliably tied to a unique, verifiable identity. For AI agents, this means logs and approvals can show that something happened, but not which actor used which permissions or why the action was taken.
Continuous Authorisation: Continuous authorisation is an access model that re-evaluates permissions at each meaningful action instead of only at login or session start. For agents, this matters because tool choice and data scope can change mid-session, so one-time approval quickly becomes stale.
Replayable Provenance: Replayable provenance is a trace record detailed enough to reconstruct how an agent reached a decision and which tools it used. It goes beyond basic logging by preserving the action sequence, supporting investigations, and making hidden delegation paths visible.

Deepen your knowledge

AI agent identity, least agency, and runtime authorisation are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are governing agent access alongside service accounts and secrets, it is a practical place to start.

This post draws on content published by Pomerium: Understanding Anthropic's Zero Trust for AI Agents Guide. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-06-04.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org