Token spend is an identity problem for AI agent governance

By NHI Mgmt Group Editorial Team | Published 2026-06-22 | Domain: Agentic AI & NHIs | Source: WorkOS

TL;DR: AI agent rollouts can consume 5 to 30 times the tokens of comparable chatbot interactions, and some enterprises have already exceeded their budgets by 4 to 11 times within 90 days, according to WorkOS and Gartner analysis. The governance gap starts in authorisation: without per-agent identity, tool-level scoping, and session boundaries, spend cannot be attributed to the agent that caused it, so cost data stays unusable for control.

At a glance

What this is: An analysis arguing that AI token costs are primarily an identity and authorisation problem rather than a finance problem, because agent activity cannot be attributed, scoped, or terminated cleanly without dedicated controls.

Why it matters: IAM, NHI, and autonomous governance teams need the identity layer to produce the audit trail, boundary control, and session visibility that make AI spend governable before usage outruns budgets.

By the numbers:

A single user request can fan out into 10 or 20 model calls, with input tokens, not output, driving most of the bill.
Gartner analysis puts agentic tasks at roughly 5 to 30 times the token consumption of equivalent chatbot interactions.
Most enterprise agent rollouts exceed their pilot budget by 4 to 11 times within the first 90 days of broad deployment.
Gartner projects worldwide AI spending to grow roughly 47 percent this year, with agentic inference as the largest contributor.

Context

AI agent spend becomes hard to govern when the system can make repeated model calls, call tools in sequence, and continue a task without a clear identity boundary. In practice, that means the unit of cost is not the user, not the model, and not the workflow alone, but the agent session that sits between them.

That is why token governance belongs in IAM and NHI design, not only in FinOps dashboards. When an agent inherits a human credential or shares a service account, the organisation loses the ability to attribute usage, enforce scope, and shut off access cleanly at the end of the task.

Key questions

Q: How should security teams govern AI agent token spend without losing accountability?

A: Treat token spend as an identity control problem. Give each agent its own credential, scope its tool access tightly, and use session boundaries so every model call can be attributed to a specific executor and task. That makes cost governance, access review, and revocation part of the same control plane.

Q: Why do shared credentials make AI cost controls fail in practice?

A: Shared credentials collapse multiple agents into one identity, so finance cannot tell which workflow consumed which tokens and security cannot revoke one agent without disrupting others. The result is invisible usage, weak accountability, and broken chargeback. Distinct identities are the prerequisite for both governance and cost allocation.

Q: When should organisations use session-scoped tokens for AI agents?

A: Use session-scoped tokens whenever an agent performs multi-step or tool-using work that can continue after the visible user action is complete. Session boundaries prevent orphaned access, reduce runaway spend, and create a clean audit line between task completion and continued execution.

Q: What do IAM teams need to measure to know whether agent governance is working?

A: Measure whether every agent action can be tied to a unique identity, a scoped permission set, and a closed session. If any one of those joins is missing, attribution is incomplete and cost governance is still dependent on manual reconciliation rather than control evidence.
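
As a rough illustration of that measurement, the sketch below computes the share of agent actions that carry all three joins. The event fields (`agent_id`, `scope`, `session_closed`) and the function name are hypothetical and not taken from the source; a real audit stream would have its own schema.

```python
def attribution_coverage(events):
    """Return the fraction of events that have an agent identity,
    a non-empty permission scope, and a closed session."""
    if not events:
        return 0.0
    complete = sum(
        1
        for e in events
        if e.get("agent_id") and e.get("scope") and e.get("session_closed") is True
    )
    return complete / len(events)


# Illustrative audit events (assumed shape, invented values).
events = [
    {"agent_id": "agent-support-01", "scope": ["crm.search"], "session_closed": True},
    {"agent_id": None, "scope": ["crm.search"], "session_closed": True},  # shared login
    {"agent_id": "agent-billing-01", "scope": [], "session_closed": True},  # no scope recorded
    {"agent_id": "agent-research-01", "scope": ["web.fetch"], "session_closed": False},  # still open
]

print(f"attribution coverage: {attribution_coverage(events):.0%}")
```

Only the first event has all three joins, so anything the other three spent would still need manual reconciliation.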

Technical breakdown

Why agentic token usage explodes beyond chatbot economics

A chatbot usually maps to one prompt and one response. An agentic workflow can branch through multiple reasoning cycles, tool calls, retrieval steps, and retries before the task is complete. That recursion is what makes token demand non-linear. The billing pattern is therefore shaped less by user volume than by task complexity, retrieval breadth, and repeated model invocation. Once tool use becomes chained and parallelised, the cost curve separates from the simple interaction model most enterprise controls were built around.

Practical implication: budget models need per-task and per-agent consumption thresholds, not just monthly model spend limits.
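
A minimal sketch of what a per-task, per-agent threshold might look like. The `TaskBudget` class, the identifiers, and the token figures are all hypothetical and chosen only for illustration. The tracker sums input and output tokens across every model call in one task and refuses the call that would cross the limit.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a model call would push a task past its token allowance."""


@dataclass
class TaskBudget:
    # Hypothetical structure; limits are illustrative, not recommendations.
    agent_id: str
    task_id: str
    max_tokens: int
    used_tokens: int = 0
    calls: int = 0

    def record_call(self, input_tokens: int, output_tokens: int) -> None:
        """Add one model call to the running total, or refuse it."""
        cost = input_tokens + output_tokens
        if self.used_tokens + cost > self.max_tokens:
            raise BudgetExceeded(
                f"{self.agent_id}/{self.task_id}: call {self.calls + 1} would use "
                f"{self.used_tokens + cost} of {self.max_tokens} tokens"
            )
        self.used_tokens += cost
        self.calls += 1


budget = TaskBudget(agent_id="agent-invoice-01", task_id="task-42", max_tokens=50_000)

# One user request fanning out into up to 20 model calls, each re-sending context.
completed = 0
try:
    for _ in range(20):
        budget.record_call(input_tokens=4_000, output_tokens=500)
        completed += 1
except BudgetExceeded as exc:
    print(f"stopped after {completed} calls: {exc}")
```

With these assumed numbers, input tokens dominate each call, and the task is cut off partway through the fan-out instead of surfacing later in a monthly total.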

Why identity is the control plane for token attribution

Token attribution depends on knowing which agent acted, which credential it used, and which tool it was authorised to call. If multiple agents share a human login or a generic service account, audit logs collapse into one indistinguishable identity and cost chargeback becomes guesswork. Distinct agent identities turn model calls into attributable security events. Tool-level authorisation also gives finance a usable boundary because access scope and cost scope become the same thing.

Practical implication: issue unique identities to agents before scale-out, or cost attribution will remain structurally incomplete.
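
To make the attribution point concrete, the sketch below aggregates the same three model calls twice: once logged under a shared service account and once under distinct agent identities. The record shape, credential names, tool names, and token counts are invented for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class UsageRecord(NamedTuple):
    credential_id: str  # identity the model call was authenticated as
    tool: str
    tokens: int


def tokens_by_credential(records):
    """Sum token usage per credential, as a chargeback report would."""
    totals = defaultdict(int)
    for record in records:
        totals[record.credential_id] += record.tokens
    return dict(totals)


# The same three calls, logged under a shared account...
shared = [
    UsageRecord("svc-shared", "crm.search", 12_000),
    UsageRecord("svc-shared", "billing.lookup", 3_000),
    UsageRecord("svc-shared", "crm.search", 45_000),
]
# ...and under one identity per agent.
per_agent = [
    UsageRecord("agent-support-01", "crm.search", 12_000),
    UsageRecord("agent-billing-01", "billing.lookup", 3_000),
    UsageRecord("agent-research-01", "crm.search", 45_000),
]

print(tokens_by_credential(shared))
print(tokens_by_credential(per_agent))
```

The shared log yields a single total with no way to tell which workflow drove it, while the per-agent log separates the heavy consumer from the light ones.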

How session-scoped access changes cost governance

Session-scoped tokens create a task boundary that mirrors how agent work actually happens. When the task ends, access ends, and the credential cannot keep generating spend in the background. This matters because persistent access is what makes runaway recursion expensive. A closed session also gives operations a natural join point between audit logs, model calls, and chargeback records. That linkage is what transforms consumption data into governance evidence.

Practical implication: align agent access with explicit session boundaries so every spend event has a start, end, and accountable owner.
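
One way to picture a session boundary is the sketch below, which assumes a hypothetical `AgentSession` object and a placeholder `call_model` function; no real provider API is called. The session ends either when the task closes it explicitly or when its time-to-live lapses, and every model call checks that boundary first.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentSession:
    # Hypothetical session record; field names are assumptions.
    agent_id: str
    task_id: str
    ttl_seconds: float
    token: str = field(default_factory=lambda: secrets.token_urlsafe(16))
    started_at: float = field(default_factory=time.monotonic)
    closed: bool = False

    def is_active(self, now: Optional[float] = None) -> bool:
        """A session is active until it is closed or its TTL has elapsed."""
        now = time.monotonic() if now is None else now
        return not self.closed and (now - self.started_at) < self.ttl_seconds

    def close(self) -> None:
        """Mark the task as finished; no further calls are allowed."""
        self.closed = True


def call_model(session: AgentSession, prompt: str) -> str:
    """Placeholder for a model call that is gated on the session boundary."""
    if not session.is_active():
        raise PermissionError(
            f"session for {session.agent_id}/{session.task_id} has ended"
        )
    return f"[model response to: {prompt}]"


session = AgentSession("agent-support-01", "task-42", ttl_seconds=300)
print(call_model(session, "summarise ticket"))
session.close()  # task complete, so access ends with it

try:
    call_model(session, "keep working in the background")
except PermissionError as exc:
    print(f"refused: {exc}")
```

Because the session carries both the agent and the task identifier, the same record can serve as the join key between audit logs, model calls, and chargeback entries.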

NHI Mgmt Group analysis

AI token governance is an identity problem before it is a finance problem. The article shows that cost visibility collapses when a system cannot tell which agent made the call, on whose behalf it acted, and what it was allowed to do. That is an IAM failure mode, not a budgeting nuance. Practitioners should treat attribution as an access-control outcome, not a downstream reporting project.

Per-agent identity is the named concept that makes chargeback possible. When every agent runs under a shared human account or a reusable service account, spend, action, and accountability merge into one record. Separate agent identity restores the ability to bind usage to a specific executor, which is essential for governance across NHI and autonomous workloads. The practical conclusion is simple: if identity is not distinct, cost is not governable.

Tool-level scoping matters because cost and privilege are now coupled. An agent that can call any connected tool can generate spend across systems that finance cannot see and IAM cannot easily separate. Fine-grained authorisation limits both blast radius and token blast radius at the same time. Teams that leave scope broad are not just over-permissioned, they are over-billable.

Session boundaries are becoming the operational dividing line for autonomous work. The article makes clear that persistent credentials let agents continue spending after the visible task has finished. That means access review, offboarding, and revocation logic must operate at the session level for AI workloads, not only at the user or application level. Without that shift, budget overruns will keep arriving faster than the review cycle can react.

MCP and adjacent agent tooling do not solve governance on their own. Protocols can move messages, but they do not create attribution, scoping, or revocation discipline. The market is heading toward a stack where identity primitives, not transport alone, determine whether agent adoption is controllable. Practitioners should evaluate agent platforms by what identity evidence they emit, not by how quickly they can wire up a tool call.

From our research:
91.6% of secrets remain valid five days after the targeted organisation is notified, showing a critical gap in remediation procedures, according to the Ultimate Guide to NHIs.
Only 5.7% of organisations have full visibility into their service accounts, which is why identity attribution problems persist even when teams believe they have control.
For lifecycle context, see the Ultimate Guide to NHIs, which covers how visibility, rotation, and offboarding form the control base for machine and agent identities.

What this signals

Token governance will increasingly sit inside identity operations, not beside them. As agent fleets grow, the organisations that can answer who acted, under what scope, and within which session will be able to build chargeback and approval controls faster than those relying on finance-only telemetry. The practical signal is that IAM teams should expect ownership of attribution evidence for AI workloads.

Per-agent identity is becoming a durable governance pattern for autonomous systems. The market is moving toward controls that treat agent credentials as first-class assets with lifecycle, scope, and revocation semantics. Teams that wait for a universal standard will likely inherit a backlog of untagged spend and unreviewable access.

With only 5.7% of organisations having full visibility into their service accounts, the identity visibility gap that already affects NHI programmes will also limit AI agent cost control unless teams redesign the access layer first.

For practitioners

Issue distinct credentials to every agent: Stop allowing AI agents to authenticate as the human who configured them or as a shared service account. Distinct credentials make audit logs useful, enable per-agent spend limits, and preserve accountability when multiple agents operate in parallel.
Scope agent access at the tool level: Grant each agent only the tools it genuinely needs for a task, rather than broad application or environment access. Tool-level scoping constrains both privilege and token consumption, which is what makes the cost boundary enforceable.
Tie every task to a session boundary: Use session-scoped tokens that expire when the task ends, and require explicit renewal for any continued operation. That prevents orphaned credentials from generating spend after the user-visible workflow has finished.
Join finance telemetry to identity logs: Correlate model usage, credential issuance, and tool invocation in the same audit stream so chargeback is based on identity evidence rather than model API totals alone. This is the fastest path to trustworthy cost attribution.
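
The tool-level scoping recommendation above can be sketched as a deny-by-default allow-list that is checked before every tool invocation. The mapping, agent identifiers, and tool names here are hypothetical; a production system would normally load scopes from its authorisation service instead of a hard-coded dictionary.

```python
# Hypothetical per-agent tool scopes (illustrative values only).
AGENT_TOOL_SCOPES = {
    "agent-support-01": frozenset({"crm.search", "tickets.update"}),
    "agent-billing-01": frozenset({"billing.lookup"}),
}


def authorise_tool_call(agent_id: str, tool: str) -> bool:
    """Allow a tool call only if the tool is in the agent's explicit scope.

    Unknown agents get an empty scope, so the default answer is deny.
    """
    allowed = AGENT_TOOL_SCOPES.get(agent_id, frozenset())
    return tool in allowed


for agent_id, tool in [
    ("agent-support-01", "crm.search"),
    ("agent-support-01", "billing.lookup"),
    ("agent-unknown", "crm.search"),
]:
    decision = "allow" if authorise_tool_call(agent_id, tool) else "deny"
    print(f"{agent_id} -> {tool}: {decision}")
```

Because a denied call never reaches the tool or the model, the same check bounds privilege and spend together.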

Key takeaways

AI agent spend becomes unmanageable when identity, authorisation, and cost attribution are separated from each other.
Non-linear token usage makes traditional budgeting unreliable because one request can expand into many model calls and tool interactions.
Distinct agent identities, tool-level scope, and session boundaries are the practical controls that turn token spend into governed behaviour.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-01	Agent identities and shared credentials are central to the article's attribution problem.
NIST CSF 2.0	PR.AA-05 (PR.AC-4 in CSF 1.1)	Least-privilege authorisation is required to bound tool access and cost.
NIST Zero Trust (SP 800-207)		Session boundaries and continuous verification support agent task scoping.

Map agent permissions to least privilege and review them as part of access governance.

Key terms

Agent Identity: A distinct credential and audit identity assigned to a software agent so its actions can be traced and governed separately from a human user or shared service account. In autonomous and NHI programmes, agent identity is the foundation for attribution, revocation, and scope control.
Tool-Level Authorisation: A permission model that grants an agent access only to specific tools or actions rather than broad application access. It is more precise than service-level authorisation and is essential when task cost, privilege, and auditability must stay aligned.
Session-Scoped Token: A credential that exists only for the duration of a defined task session and expires when the work is complete. For AI agents, session scoping reduces orphaned access, limits runaway spend, and creates a clean boundary for audit and chargeback.
Token Attribution: The process of linking model consumption to the identity, task, and permission set that produced it. When attribution is weak, finance sees spend but not cause, and security cannot connect access decisions to cost behaviour.

This post draws on content published by WorkOS: "The token bill is an identity problem".

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-06-22.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org