Authorization for AI agents: why token-based controls matter

By NHI Mgmt Group Editorial Team | Published 2026-05-05 | Domain: Agentic AI & NHIs | Source: Curity

TL;DR: Enterprises need short-lived, least-privilege OAuth 2.0 access tokens for AI agents because natural language commands can otherwise trigger unauthorized backend actions, according to Curity’s analysis of an Azure Developer CLI template. The practical shift is from permanent agent access to token-driven authorization, with human approval and token exchange as the control points.

At a glance

What this is: Curity’s analysis of an Azure Developer CLI template, showing how AI agent workflows can be secured with token-based authorization, short-lived access tokens, and protocol-aware controls.

Why it matters: AI agents behave like non-human identities, so IAM and NHI practitioners need access decisions that are scoped, ephemeral, and enforced at the API boundary.

👉 Read Curity's analysis of authorization for AI agents in Azure and MCP workflows

Context

AI agent authorization is becoming a governance problem, not just a development pattern. When an agent can translate a natural language request into tool calls, the main failure mode is not model quality alone. It is whether the agent can be constrained to the right customer, region, scope, and approval path before it touches backend APIs. That is an NHI control issue because the agent is acting with execution authority, not merely producing text.

The Curity article frames Azure AI Foundry, A2A, and MCP as a practical enterprise stack, but the real architectural question is who issues the access token, how long it lives, and what attributes it carries. For teams already working through AI agent governance, that makes this a close fit with the broader NHI lifecycle and authorization model discussed in the OWASP NHI Top 10.

The template’s starting position is typical of where many enterprises are heading: developers want a simple integration path, while identity teams need to prevent overbroad access. That split between developer velocity and security enforcement is now the default condition for agentic AI programmes.

Key questions

Q: How should security teams implement authorization for AI agents in enterprise workflows?

A: Start by treating the agent as a non-human identity with task-scoped access, not as a trusted application component. Enforce authorization at each API and tool boundary using short-lived tokens, explicit claims, and approval gates for higher-risk actions. The goal is to let the agent act only within a narrowly defined operational envelope.

Q: Why do AI agents complicate existing IAM and authorization models?

A: AI agents complicate IAM because they turn natural language into execution, which can cross systems faster than human review can intervene. Traditional standing access models assume stable actors and predictable workflows. Agents are more dynamic, so the control point must shift to ephemeral authorization, contextual claims, and continuous validation.

Q: What breaks when AI agents are given permanent API credentials?

A: Permanent credentials create standing privilege, which expands the blast radius of a compromised prompt, misrouted tool call, or malicious workflow. Once those credentials are embedded in an agent path, they are difficult to contain and easy to reuse. Short-lived tokens and scoped claims reduce that exposure significantly.

Q: How do organisations safely let AI agents perform higher-risk actions?

A: Use human approval and just-in-time privilege before issuing a higher-privilege token. The agent can prepare the request, but the final action should only occur after a person or policy gate authorizes the escalation. That keeps accountability intact and prevents autonomous overreach.

Technical breakdown

How A2A and MCP shape AI agent authorization

The template uses A2A to carry a natural language command from an external application to a backend agent, then uses MCP to let that agent invoke an MCP server and related tools. That matters because each hop creates a distinct trust boundary. The application may be authenticated, but the agent still needs to be authorized for the specific operation, and the tool server must verify the token context before executing anything. In practice, protocol chaining increases the chance of confused-deputy behaviour unless every component enforces its own authorization decision.

Practical implication: Treat every agent-to-tool hop as a separate authorization point, not a single trust decision.
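A minimal Python sketch of per-hop enforcement, assuming each component receives token claims that have already been cryptographically verified. The component names, the scope strings, and the authorize_hop helper are hypothetical and are not taken from Curity's template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One component in the chain and what it requires of an incoming token."""
    audience: str        # value this component expects in the token's "aud" claim
    required_scope: str  # scope this component demands for the operation


def authorize_hop(claims: dict, hop: Hop) -> bool:
    """Return True only if the token was issued for this hop and carries its scope."""
    audiences = claims.get("aud", [])
    if isinstance(audiences, str):  # "aud" may be a single string or a list
        audiences = [audiences]
    scopes = claims.get("scope", "").split()
    return hop.audience in audiences and hop.required_scope in scopes


# Hypothetical hops: names and scopes are illustrative assumptions.
AGENT = Hop(audience="agent", required_scope="agent:invoke")
MCP_SERVER = Hop(audience="mcp-server", required_scope="stocks:read")

# A token issued for the application-to-agent hop only.
app_token = {"aud": "agent", "scope": "agent:invoke"}
# A separate token issued for the agent-to-MCP-server hop.
tool_token = {"aud": ["mcp-server"], "scope": "stocks:read"}

print(authorize_hop(app_token, AGENT))        # True
print(authorize_hop(app_token, MCP_SERVER))   # False: wrong audience and scope
print(authorize_hop(tool_token, MCP_SERVER))  # True
```

Because the MCP server runs its own check, a token accepted by the agent is not automatically accepted one hop further along, which is the property that limits confused-deputy behaviour.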

Why short-lived OAuth tokens are the control primitive

Curity’s example uses short-lived, least-privilege OAuth 2.0 access tokens instead of permanent API keys. That is the right direction because AI agents are inherently task-scoped and should not hold standing credentials that outlive the workflow. Token attributes such as customer_id, region, scope, and client_type allow APIs to decide whether a request is valid in context. If the token is integrity-protected, for example by a signature from the authorization server, the agent cannot silently rewrite those claims. This shifts security from static secret possession to verifiable, attribute-based authorization.

Practical implication: Issue ephemeral tokens with the minimum claims needed for one task, then reject any request outside that token context.
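To illustrate why integrity protection and a short lifetime matter, here is a simplified Python sketch that uses an HMAC over a JSON payload. It is a stand-in for a real signed access token (in practice the authorization server would typically issue a signed JWT), and the signing key, claim values, and function names are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-signing-key"  # assumption: demo key only, never hardcode real keys


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_token(claims: dict, ttl_seconds: int, now: float) -> str:
    """Serialize the claims with an expiry and append an HMAC-SHA256 signature."""
    payload = json.dumps({**claims, "exp": now + ttl_seconds}, sort_keys=True).encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return f"{_b64(payload)}.{_b64(signature)}"


def verify_token(token: str, now: float) -> Optional[dict]:
    """Return the claims if the signature matches and the token is unexpired."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # claims were altered, or the token was not issued by us
    claims = json.loads(payload)
    if now >= claims["exp"]:
        return None  # token lifetime is over
    return claims


now = time.time()
# Hypothetical claim values for one task, valid for five minutes.
token = issue_token({"customer_id": "178", "region": "USA", "scope": "stocks:read"}, 300, now)

# Simulate an agent rewriting the region claim while reusing the old signature.
payload_b64, signature_b64 = token.split(".")
forged_claims = json.loads(base64.urlsafe_b64decode(payload_b64))
forged_claims["region"] = "EU"
forged = f"{_b64(json.dumps(forged_claims, sort_keys=True).encode())}.{signature_b64}"

print(verify_token(token, now) is not None)  # True
print(verify_token(forged, now))             # None: signature no longer matches
print(verify_token(token, now + 301))        # None: token has expired
```

The rewritten claim is rejected because the signature no longer matches, and the unmodified token stops working once its five-minute lifetime has passed.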

Human approval and token exchange for higher-risk actions

The article also points to a future state where an agent may start with low privilege, then trigger human approval before receiving a higher-privilege access token through token exchange. That pattern is important because many AI workflows are conditional: the agent can prepare, classify, or recommend, but not complete the sensitive step until a human or policy gate authorizes escalation. Architecturally, this is closer to just-in-time privilege than to open-ended delegation. It is also the cleanest way to preserve accountability when multiple components and even multiple organisations are involved.

Practical implication: Design escalation paths so privileged tokens are minted only after explicit approval and only for the exact downstream action.
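One way to picture the escalation step is as an OAuth 2.0 Token Exchange (RFC 8693) request that is only built once approval exists. The sketch below assembles the form body such a request would carry. The grant type and token type URNs come from RFC 8693, while the function name, the approved flag, and the scope and audience values are hypothetical. In a real deployment the approval decision would be recorded and checked by the authorization server or a policy engine, not asserted by the agent itself.

```python
from urllib.parse import parse_qs, urlencode

# Identifiers defined by RFC 8693 (OAuth 2.0 Token Exchange).
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


class ApprovalRequired(Exception):
    """Raised when escalation is attempted without an approval decision."""


def build_exchange_request(low_privilege_token: str, requested_scope: str,
                           audience: str, approved: bool) -> str:
    """Build the form-encoded body for a token exchange, but only after approval."""
    if not approved:
        raise ApprovalRequired("escalation needs explicit approval first")
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": low_privilege_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "scope": requested_scope,  # only the exact downstream action
        "audience": audience,      # only the API that performs it
    }
    return urlencode(params)


# Hypothetical values: the agent asks for one write scope on one API.
body = build_exchange_request("low-token", "orders:write", "orders-api", approved=True)
print(parse_qs(body)["scope"])  # ['orders:write']
```

The body would be sent to the authorization server's token endpoint. Requesting a single scope and a single audience keeps the resulting token tied to the one approved action.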

Threat narrative

Attacker objective: The attacker aims to turn an AI agent’s tool access into unauthorized data access or unsafe backend execution.

Entry occurs when a malicious or untrusted user supplies a prompt that causes the agent to invoke a backend tool with unintended parameters.
Escalation occurs if the agent is given permanent credentials or overly broad tokens that let it exceed the intended customer, region, or scope.
Impact occurs when the backend API honours the request and exposes unauthorized data or executes an unauthorized action.
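The impact step can be blocked at the API if tool parameters are bound to the token rather than trusted from the prompt. The following Python sketch assumes verified claims arrive as a dictionary. The bind_to_token helper, the Denied exception, and the example values are hypothetical.

```python
class Denied(Exception):
    """Raised when a tool call falls outside the token's context."""


# Claims that the API takes from the token, never from the agent's parameters.
BOUND_CLAIMS = ("customer_id", "region")


def bind_to_token(claims: dict, params: dict) -> dict:
    """Return the call parameters with bound values taken from the token.

    A parameter that contradicts the token is treated as an attack or a bug
    and rejected, rather than silently corrected.
    """
    bound = dict(params)
    for name in BOUND_CLAIMS:
        if name not in claims:
            raise Denied(f"token is missing required claim: {name}")
        if name in params and params[name] != claims[name]:
            raise Denied(f"{name} in request does not match token")
        bound[name] = claims[name]
    return bound


# Hypothetical verified claims for the current user session.
claims = {"customer_id": "178", "region": "USA", "scope": "stocks:read"}

# A normal call: the API fills in customer and region from the token.
print(bind_to_token(claims, {"symbol": "ACME"}))
# {'symbol': 'ACME', 'customer_id': '178', 'region': 'USA'}

# A prompt-induced call that targets another customer is refused.
try:
    bind_to_token(claims, {"symbol": "ACME", "customer_id": "999"})
except Denied as error:
    print(error)  # customer_id in request does not match token
```

Even if a malicious prompt persuades the agent to send another customer's identifier, the backend refuses the call because the value conflicts with the token.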

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
Salesloft OAuth token breach — hackers stole OAuth tokens to access Salesforce data via Salesloft.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

AI agent security now sits inside the IAM control plane, not beside it. Natural language is only the trigger. The real security question is whether identity systems can bind intent to a narrowly scoped token, enforce claims at each hop, and revoke access fast enough to prevent agent drift. Practitioners should stop treating agent security as an application feature and treat it as an authorization design problem.

Token attributes are becoming the policy surface for agentic workflows. Customer, region, purpose, and client_type are not decorative claims. They are the mechanism that lets APIs distinguish an intended request from an unsafe one. This shifts the centre of gravity from secret possession to claim validation, which is where NHI governance needs to be for autonomous systems. Practitioners should design for attribute-driven enforcement rather than static role assignment alone.

Ephemeral agent access creates an identity blast radius problem. The more an organisation relies on short-lived tokens for agents, the more important it becomes to model where those tokens can be replayed, delegated, or exchanged. Identity blast radius is the maximum scope of damage an agent can cause before its access expires or is revoked. Practitioners should map that blast radius before expanding agent autonomy.

Human approval remains a governance control, not a workflow annoyance. The article’s approval-and-escalation model reflects a broader truth: some AI actions should be preparatory only until a person authorizes the final token. That is how enterprises preserve accountability when autonomous components are allowed to assist but not to decide everything. Practitioners should reserve privileged token issuance for bounded, auditable escalations.

Platform engineering and identity engineering must converge for agentic AI to scale safely. The deployment story here is not just about infrastructure reuse. It is about separating base infrastructure, developer code, and token policy so security can scale with AI adoption. Practitioners should align platform teams and IAM teams on one operating model before agent deployments multiply.

From our research:
64% of valid secrets leaked in 2022 are still valid and exploitable today, proving that detection alone is not enough without automated revocation, according to The State of Secrets Sprawl 2026.
28.65 million new hardcoded secrets were detected in public GitHub commits in 2025 alone, a 34% year-over-year increase and the largest single-year jump ever recorded, according to The State of Secrets Sprawl 2026.
See also Guide to the Secret Sprawl Challenge for practical steps to reduce secret exposure across code and runtime paths.

What this signals

Identity blast radius is the right lens for agentic AI programmes. Once agents can call tools, the question is no longer whether authentication exists, but how far a token can travel, what claims it can carry, and how quickly it can be revoked. With 24,008 unique secrets exposed in MCP configuration files in 2025 alone, the governance gap is already appearing in the very protocol layer many teams are adopting. Align the rollout of agent workflows with policy design, revocation automation, and token telemetry.

Programmes that let developers wire AI assistants directly to APIs will need tighter ownership boundaries between platform teams and IAM teams. The operational model should assume that agent autonomy will expand before governance maturity does. That means standardising token issuance patterns, approval workflows, and internal endpoint validation now, while using the NIST AI Risk Management Framework to anchor accountability and control ownership.

The practical signal for readers is simple: if an agent can reach multiple systems through one token path, the organisation has created a reusable trust corridor. That corridor should be narrowed with context-rich claims, rotation discipline, and explicit approval for any step that changes privilege. The more multi-component the workflow becomes, the more important it is to keep access ephemeral and auditable.

For practitioners

Map agent trust boundaries end to end: Document every hop from user intent to agent execution to MCP server call, then identify where authentication, authorization, and approval must be enforced separately. Treat each boundary as a point where overbroad access can leak into backend APIs.
Replace permanent agent credentials with short-lived tokens: Use OAuth 2.0 access tokens with the smallest practical scope and lifetime for each agent task. Avoid API keys for autonomous workflows, and ensure internal endpoints validate the same claims as internet-facing ones.
Define token claims for customer, region, and purpose: Require APIs to enforce contextual claims such as customer_id, region, scope, and purpose so an agent cannot cross tenants or expand its own rights. Keep the policy logic close to the API so token validation is unavoidable.
Build human approval into privilege escalation paths: Use just-in-time privilege for sensitive actions so the agent receives higher access only after explicit approval. Keep escalation tokens narrowly scoped and auditable, with clear expiration and revocation rules.
Separate developer integration from identity policy ownership: Let developers focus on application logic while an identity team owns token design, claim structure, and authorization rules. That separation makes it easier to scale AI initiatives without handing control decisions to each project team.

Key takeaways

AI agents should be governed as non-human identities with tightly bounded authorization, not as ordinary application integrations.
Short-lived OAuth tokens with contextual claims reduce the risk of agent overreach, but only if every API enforces them consistently.
Human approval and just-in-time escalation remain necessary when an agent’s next action can change privilege or expose sensitive data.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST AI RMF and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	NHI-01	Agent tool use and privilege boundaries are central to this template.
NIST AI RMF		Agent approval, ownership, and escalation fit AI governance requirements.
NIST Zero Trust (SP 800-207)	PR.AC-4	Internal and external endpoints both need continuous authorization checks.

Map each agent tool call to a least-privilege control and block unauthorized actions by default.
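A default-deny mapping of this kind can be very small. In the Python sketch below, the tool names and scope strings are hypothetical, and any tool without an entry in the map is blocked regardless of what the token carries.

```python
# Hypothetical mapping from tool name to the single scope it requires.
TOOL_SCOPES = {
    "get_portfolio": "portfolio:read",
    "place_order": "orders:write",
}


def is_tool_call_allowed(tool_name: str, granted_scopes: set) -> bool:
    """Allow a tool call only if the tool is mapped and its scope was granted."""
    required = TOOL_SCOPES.get(tool_name)
    if required is None:
        return False  # unmapped tools are blocked by default
    return required in granted_scopes


granted = {"portfolio:read"}  # scopes carried by the agent's current token

print(is_tool_call_allowed("get_portfolio", granted))   # True
print(is_tool_call_allowed("place_order", granted))     # False: scope not granted
print(is_tool_call_allowed("delete_account", granted))  # False: tool not mapped
```

Keeping the map explicit means a newly added tool stays unreachable until someone assigns it a scope, which fits the ownership split between developers and the identity team described earlier.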

Key terms

AI Agent: An AI agent is software that can act on a goal, call tools, and make execution decisions with some level of autonomy. In security terms, it behaves like a non-human identity because it may hold credentials, initiate requests, and change state without direct human action.
Short-Lived Access Token: A short-lived access token is a time-bound credential that authorizes a specific action or scope for a limited period. It reduces standing privilege by limiting how long an NHI can use access and by forcing systems to re-evaluate trust frequently.
Token Exchange: Token exchange is a pattern for trading one credential for another with different scope, audience, or privilege. For agentic workflows, it enables a low-trust action to be upgraded only after policy checks or human approval, which helps preserve accountability and limit blast radius.
Identity Blast Radius: Identity blast radius is the maximum scope of damage an identity can cause before access expires, is revoked, or is blocked. For AI agents and other NHIs, it is shaped by token lifetime, privilege scope, claim design, and how many systems trust the same credential path.

Deepen your knowledge

AI agent authorization and token design are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building controls for agentic workflows, it is worth exploring.

This post draws on content published by Curity: authorization for AI agents in Azure and MCP workflows. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-05-05.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org