What breaks when bearer tokens are forwarded between AI agents?

Bearer tokens break down because possession alone becomes enough to reuse access. If one agent can forward a token to another component, the original authority can travel far beyond the intended session. That is why token binding matters: it turns a reusable credential into one that is only valid for the intended holder.

Why This Matters for Security Teams

Forwarding a bearer token between AI agents changes the security model from “who is this workload?” to “whoever holds the token can act.” That is a poor fit for autonomous systems, where one agent may chain tools, hand off work, or invoke downstream services outside the original context. In practice, this is how a narrow permission becomes a portable capability. Guidance from the OWASP Agentic AI Top 10 and NHI research such as the AI Agents: The New Attack Surface report point to the same risk: agent autonomy expands the blast radius of reusable credentials.

Bearer tokens are especially dangerous in multi-agent workflows because possession is enough, and possession is easy to copy, forward, log, cache, or leak. Once a token is detached from its intended holder, the receiving agent may inherit rights it was never meant to have. That undermines least privilege, auditability, and revocation timing at the same time. This is also why token binding, workload identity, and request-time authorization are becoming central design requirements rather than optional hardening.
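The difference between "possession is enough" and "only the intended holder can use it" can be illustrated with a minimal Python sketch. Everything here is hypothetical: the in-memory stores, the function names, and the use of a shared symmetric key as the holder's proof material are simplifications for illustration. Production systems would more likely rely on asymmetric proof-of-possession schemes such as mutual TLS or DPoP.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory stores standing in for an authorization server.
BEARER_TOKENS = {}  # token -> scope
BOUND_TOKENS = {}   # token -> (scope, holder key)


def issue_bearer(scope):
    token = secrets.token_urlsafe(16)
    BEARER_TOKENS[token] = scope
    return token


def accept_bearer(token):
    # Possession is the only check: any presenter is accepted.
    return token in BEARER_TOKENS


def issue_bound(scope, holder_key):
    token = secrets.token_urlsafe(16)
    BOUND_TOKENS[token] = (scope, holder_key)
    return token


def prove_possession(token, holder_key, request):
    # The caller signs the token and request with its own key.
    message = (token + "|" + request).encode()
    return hmac.new(holder_key, message, hashlib.sha256).hexdigest()


def accept_bound(token, request, proof):
    entry = BOUND_TOKENS.get(token)
    if entry is None:
        return False
    _scope, holder_key = entry
    expected = prove_possession(token, holder_key, request)
    # Only a caller that holds the bound key can produce a matching proof.
    return hmac.compare_digest(expected, proof)


if __name__ == "__main__":
    agent_a_key = secrets.token_bytes(32)
    agent_b_key = secrets.token_bytes(32)
    request = "GET /contacts"

    bearer = issue_bearer("crm:read")
    print(accept_bearer(bearer))  # accepted no matter which agent presents it

    bound = issue_bound("crm:read", agent_a_key)
    print(accept_bound(bound, request, prove_possession(bound, agent_a_key, request)))
    print(accept_bound(bound, request, prove_possession(bound, agent_b_key, request)))
```

In this sketch, agent B can hold a copy of the bound token and still be rejected, because it cannot produce a proof with agent A's key. The bearer variant has no equivalent check to fail.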

In practice, many security teams discover token forwarding only after an agent has already reused access in a downstream system, rather than through intentional control design.

How It Works in Practice

The practical fix is not just “use better tokens.” It is to stop treating the token as the identity of the agent and instead treat it as one part of a broader, bound, runtime decision. For agentic systems, current guidance suggests combining workload identity, short-lived credentials, and policy evaluation at request time. That means the agent proves what it is, what task it is performing, and under what context it is allowed to proceed.

In a mature setup, the agent authenticates with a workload identity primitive such as SPIFFE or an OIDC-backed service identity, then receives a just-in-time token scoped to one task and one audience. The token should be ephemeral, audience-restricted, and ideally bound to the caller so that forwarding it does not preserve authority. The decision point should also check tool, data, and session context before every sensitive action. This aligns with the direction described in the NIST AI Risk Management Framework and the operational guidance in the CSA MAESTRO agentic AI threat modeling framework.
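As a rough sketch of what "ephemeral, audience-restricted, and bound to the caller" might look like, the Python below mints and verifies a task-scoped token using only the standard library. The token format, the claim names (`task`, `cnf`), the `ISSUER_KEY`, and the idea that the verifier already knows the presenter's key thumbprint (for example from a mutual TLS session) are all assumptions made for illustration. This is not a JWT implementation and not the API of any particular identity provider.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical issuer secret; a real issuer would use managed asymmetric keys.
ISSUER_KEY = b"demo-issuer-key"


def _sign(body: bytes) -> str:
    return hmac.new(ISSUER_KEY, body, hashlib.sha256).hexdigest()


def mint_task_token(workload_id, task_id, audience, holder_thumbprint,
                    ttl_seconds=60, now=None):
    now = time.time() if now is None else now
    claims = {
        "sub": workload_id,        # who the workload is
        "task": task_id,           # the single task this token covers
        "aud": audience,           # the single service that may accept it
        "exp": now + ttl_seconds,  # short lifetime
        "cnf": holder_thumbprint,  # binding to the intended holder
    }
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    return body.decode() + "." + _sign(body)


def verify_task_token(token, audience, presenter_thumbprint, now=None):
    """Return the claims if every check passes, otherwise None."""
    now = time.time() if now is None else now
    body, sep, signature = token.rpartition(".")
    if not sep or not hmac.compare_digest(_sign(body.encode()), signature):
        return None  # malformed or tampered
    claims = json.loads(base64.urlsafe_b64decode(body))
    if claims["aud"] != audience:
        return None  # presented to the wrong service
    if now >= claims["exp"]:
        return None  # expired
    if claims["cnf"] != presenter_thumbprint:
        return None  # forwarded to a different holder
    return claims
```

A token minted for the hypothetical audience `crm-api` and holder thumbprint `thumb-a` verifies only when both match and the lifetime has not elapsed. The same token presented by a holder with thumbprint `thumb-b` returns `None`, which is the property that makes forwarding ineffective.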

Issue credentials per task, not per environment, and revoke them immediately after completion.
Bind the token to the intended holder or execution context so forwarding fails.
Evaluate policy at runtime using policy-as-code rather than precomputing broad roles.
Separate identity proof from authorization so the agent cannot inherit extra rights through transit.
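The four controls above can be sketched as a small request-time decision function. This is an illustrative Python sketch, not a real policy engine: the `POLICY` table, the `ACTIVE_TASKS` registry, the workload names, and the `authorize` and `complete_task` functions are all hypothetical. It assumes the caller's identity has already been proven elsewhere, so the function only decides what that identity may do right now.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    workload_id: str  # identity, assumed to be already authenticated
    task_id: str
    tool: str
    action: str
    data_class: str


# Hypothetical policy expressed as data and evaluated on every request.
POLICY = [
    {"workload": "agent/planner", "tool": "crm",
     "actions": {"read"}, "data_classes": {"internal"}},
    {"workload": "agent/billing", "tool": "ledger",
     "actions": {"read", "write"}, "data_classes": {"internal", "financial"}},
]

# Hypothetical registry of live task credentials: task -> workload it was issued to.
ACTIVE_TASKS = {"task-42": "agent/planner"}


def authorize(req: Request) -> bool:
    # The task credential must still be live and must belong to this caller.
    if ACTIVE_TASKS.get(req.task_id) != req.workload_id:
        return False
    for rule in POLICY:
        if (rule["workload"] == req.workload_id
                and rule["tool"] == req.tool
                and req.action in rule["actions"]
                and req.data_class in rule["data_classes"]):
            return True
    return False  # default deny


def complete_task(task_id: str) -> None:
    # Revoke the per-task credential as soon as the work is done.
    ACTIVE_TASKS.pop(task_id, None)
```

Because the decision is recomputed for each request, revoking a task through `complete_task` takes effect on the next call, and a different workload presenting the same task identifier is denied.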

NHIMG’s OWASP NHI Top 10 research repeatedly shows that static credential handling becomes brittle once agents can pass work across components. These controls tend to break down in high-throughput orchestration layers because token exchange, caching, and retry logic can silently reintroduce bearer reuse.

Common Variations and Edge Cases

Tighter token binding often increases integration overhead, requiring organisations to balance stronger misuse resistance against interoperability and developer friction. That tradeoff is real, especially where legacy APIs, third-party SaaS tools, or message queues expect plain bearer semantics. Best practice is evolving, and there is no universal standard for every agent-to-agent path yet.

Some environments will need a hybrid model. For example, an internal orchestration service may use bound tokens for sensitive tools while still allowing limited bearer tokens for low-risk reads. Others may need token exchange patterns, where an upstream agent delegates only a narrow capability rather than forwarding the original credential. This is also where audit trails matter: if the same token appears across multiple agents, investigators lose the ability to distinguish delegation from theft. The Salesloft OAuth token breach and the Guide to the Secret Sprawl Challenge show how quickly reusable credentials spread once they are exposed.
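A delegation step of this kind might be modeled as follows. The Python sketch is a simplified illustration in the spirit of a token exchange, not an implementation of any specific protocol: the `Grant` type, the `delegate` function, and the scope and agent names are hypothetical. It shows two properties the paragraph relies on: the downstream grant can only narrow the upstream scopes, and each grant records its delegation chain so an audit can tell delegation apart from a copied credential.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    subject: str       # workload on whose behalf the work is done
    holder: str        # the only agent allowed to use this grant
    scopes: frozenset  # capabilities carried by this grant
    chain: tuple       # delegation path, oldest holder first


def delegate(parent: Grant, new_holder: str, requested_scopes) -> Grant:
    requested = frozenset(requested_scopes)
    # Delegation may only narrow authority, never widen it.
    if not requested <= parent.scopes:
        raise PermissionError("requested scopes exceed the parent grant")
    return Grant(
        subject=parent.subject,
        holder=new_holder,
        scopes=requested,
        chain=parent.chain + (new_holder,),
    )


if __name__ == "__main__":
    root = Grant(
        subject="agent/planner",
        holder="agent/planner",
        scopes=frozenset({"crm:read", "crm:write"}),
        chain=("agent/planner",),
    )
    child = delegate(root, "agent/summarizer", {"crm:read"})
    print(sorted(child.scopes))  # ['crm:read']
    print(child.chain)           # ('agent/planner', 'agent/summarizer')
```

The downstream agent never sees the upstream credential. It receives a new grant with its own holder and a visible chain, so the same token no longer appears across multiple agents.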

Where this guidance breaks down most often is in multi-agent systems that rely on shared caches, asynchronous retries, or long-lived service accounts, because those patterns make token provenance hard to preserve and revocation hard to enforce.
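One way to keep a cache from quietly turning bound tokens back into shared ones is to key entries by holder, audience, and task, and to make revocation a cache operation. The Python sketch below is a hypothetical illustration of that idea. The class and method names are invented, and it does not address distributed caches or retry queues, which would need equivalent treatment.

```python
import time


class HolderScopedTokenCache:
    """Cache that never serves one holder's token to another holder."""

    def __init__(self):
        # (holder, audience, task) -> (token, expires_at)
        self._entries = {}

    def put(self, holder, audience, task, token, expires_at):
        self._entries[(holder, audience, task)] = (token, expires_at)

    def get(self, holder, audience, task, now=None):
        now = time.time() if now is None else now
        key = (holder, audience, task)
        entry = self._entries.get(key)
        if entry is None:
            return None
        token, expires_at = entry
        if now >= expires_at:
            # Expired entries are dropped instead of being reused on retry.
            del self._entries[key]
            return None
        return token

    def revoke_task(self, task):
        # Remove every cached token issued for the given task.
        for key in [k for k in self._entries if k[2] == task]:
            del self._entries[key]
```

Keying by holder preserves provenance inside the cache, and `revoke_task` gives revocation a concrete place to take effect rather than waiting for cached entries to age out.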

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A01	Covers agent misuse of reusable credentials and tool chaining risks.
CSA MAESTRO	M1	Addresses runtime authorization and delegation risks in agent workflows.
NIST AI RMF	GOVERN	Supports accountability and controls for autonomous AI identity behavior.

Evaluate each agent action at runtime and limit delegation to narrowly scoped capabilities.
