They often treat cost as a finance-only issue and overlook the identity layer that drives usage. Without attribution, shadow AI discovery, and policy enforcement at the request path, teams can reduce waste in one area while leaving the real source of token growth untouched.
Why This Matters for Security Teams
AI cost control is usually framed as token spend, model choice, or procurement discipline, but the real waste often begins with identity. When teams cannot tell which workload, user, or agent caused a request, they cannot separate legitimate automation from shadow usage, repeated retries, or misconfigured tool calls. That turns budgeting into a reactive exercise instead of a control problem. The same blind spot shows up in NHI programs, where visibility gaps undermine both cost and risk management, as discussed in the State of Non-Human Identity Security.
Security teams also miss that AI usage is not static. A single prompt can fan out into multiple tool calls, retrieval steps, and chained actions, each with its own cost profile and identity trail. Without policy at the request path, reviewing invoices after the fact does little to stop runaway usage. Current guidance from the NIST Cybersecurity Framework 2.0 points toward continuous governance, not periodic cost cleanup. In practice, many security teams discover excessive AI spend only after shared keys, unmanaged agents, or duplicated workflows have already consumed budget.
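To make the fan-out point concrete, the sketch below rolls the token usage of every chained step up to the prompt that triggered it. The span layout (`id`, `parent_id`, `tokens`) is a hypothetical telemetry shape chosen for illustration, not a format taken from any particular tracing product, and it assumes the parent links contain no cycles.

```python
def rollup_by_root(spans):
    """Sum token usage per originating request.

    Each span is a dict with hypothetical keys:
      id        - unique step identifier
      parent_id - id of the step that triggered it, or None for the root prompt
      tokens    - tokens consumed by this step alone
    Assumes parent links are acyclic.
    """
    parent = {s["id"]: s["parent_id"] for s in spans}

    def root_of(span_id):
        # Walk up the chain until a step with no recorded parent is reached.
        while parent.get(span_id) is not None:
            span_id = parent[span_id]
        return span_id

    totals = {}
    for s in spans:
        root = root_of(s["id"])
        totals[root] = totals.get(root, 0) + s["tokens"]
    return totals


spans = [
    {"id": "req-1", "parent_id": None, "tokens": 200},
    {"id": "tool-a", "parent_id": "req-1", "tokens": 500},
    {"id": "retrieval", "parent_id": "tool-a", "tokens": 1000},
    {"id": "req-2", "parent_id": None, "tokens": 300},
]
print(rollup_by_root(spans))
```

In this illustrative data the first prompt accounts for far more usage than its own 200 tokens suggest, which is the kind of gap a per-request view would hide.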
How It Works in Practice
Effective AI cost control starts by treating identity as the billing boundary. Every model request should be attributable to a workload, service, agent, or approved human workflow, with that attribution carried through logs, policy decisions, and chargeback reports. This is where non-human identity governance and AI governance converge: if the caller cannot be identified, the organisation cannot enforce per-task limits, per-app quotas, or differentiated rules for sanctioned versus unsanctioned use. The State of Secrets in AppSec shows how fragmented control and poor secret hygiene keep organisations from seeing where usage really originates.
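A minimal sketch of identity as the billing boundary follows: each usage record carries the caller identity, and anything without one is reported in its own bucket rather than silently absorbed. The record fields, the model name `model-a`, and the per-1,000-token prices are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Hypothetical prices per 1,000 tokens as (input, output); illustrative only.
PRICE_PER_1K = {"model-a": (0.5, 1.5)}


@dataclass(frozen=True)
class UsageRecord:
    identity: Optional[str]  # workload, service, or agent identity; None if unknown
    model: str
    input_tokens: int
    output_tokens: int


def record_cost(record):
    """Cost of one request under the assumed price table."""
    price_in, price_out = PRICE_PER_1K[record.model]
    return (record.input_tokens / 1000 * price_in
            + record.output_tokens / 1000 * price_out)


def chargeback(records):
    """Total cost per identity; unattributed usage gets an explicit bucket."""
    totals = defaultdict(float)
    for record in records:
        key = record.identity or "UNATTRIBUTED"
        totals[key] += record_cost(record)
    return dict(totals)


records = [
    UsageRecord("agent-billing", "model-a", 2000, 1000),
    UsageRecord(None, "model-a", 1000, 0),
]
print(chargeback(records))
```

Keeping an explicit `UNATTRIBUTED` bucket is a design choice in this sketch: its size is a direct measure of how much spend cannot yet be governed by identity.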
Practitioners usually need four controls working together:
- Attribution at the identity layer, so each request maps to a specific app, agent, or team.
- Shadow AI discovery, so unsanctioned endpoints, keys, and brokered access paths are found early.
- Policy enforcement at the request path, so costly or risky calls are blocked before they run.
- Short-lived secrets or ephemeral credentials, so overuse cannot persist through long-lived static access.
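The third control above, policy enforcement at the request path, could be sketched as a gate that runs before the model call is made. The class names, quota fields, and identity `support-copilot` are hypothetical, and a real deployment would likely persist counters and reset them on a schedule rather than hold them in memory.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    max_tokens_per_request: int
    daily_token_budget: int
    used_today: int = 0


class PolicyGate:
    """Decides whether a request may proceed, keyed by caller identity."""

    def __init__(self, quotas):
        self.quotas = quotas  # mapping of identity -> Quota

    def authorize(self, identity, estimated_tokens):
        quota = self.quotas.get(identity)
        if quota is None:
            # Callers that cannot be identified are denied by default.
            return (False, "unknown identity")
        if estimated_tokens > quota.max_tokens_per_request:
            return (False, "per-request limit exceeded")
        if quota.used_today + estimated_tokens > quota.daily_token_budget:
            return (False, "daily budget exhausted")
        # Only approved requests count against the budget.
        quota.used_today += estimated_tokens
        return (True, "ok")


gate = PolicyGate({"support-copilot": Quota(1000, 1500)})
print(gate.authorize("support-copilot", 800))
print(gate.authorize("support-copilot", 800))
print(gate.authorize("shadow-script", 10))
```

Because the decision happens before the call, a denied request costs nothing, which is the difference between enforcement and after-the-fact review.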
For autonomous workloads, this is especially important because one “user” can actually be an AI agent that chains multiple actions without human pause. The operational model should therefore combine workload identity, runtime authorization, and usage telemetry, rather than relying on monthly budget reviews. The DeepSeek breach is a useful reminder that hidden dependencies and unclear access paths can quickly turn into broader exposure when controls are not enforced where the request originates. These controls tend to break down in environments with shared API keys and informal developer sandboxes because attribution disappears the moment a single credential is reused across teams.
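One way to spot the shared-key breakdown described above is to look for credentials that appear under more than one team in request logs. The sketch assumes log events are available as `(key_id, team)` pairs, which is a hypothetical shape; in practice the team label might have to be inferred from source network, repository, or deployment metadata.

```python
from collections import defaultdict


def find_shared_keys(events):
    """Return keys used by more than one team, with the teams that used them.

    `events` is an iterable of (key_id, team) pairs, a hypothetical log shape.
    """
    teams_by_key = defaultdict(set)
    for key_id, team in events:
        teams_by_key[key_id].add(team)
    return {
        key_id: sorted(teams)
        for key_id, teams in teams_by_key.items()
        if len(teams) > 1
    }


events = [
    ("key-1", "payments"),
    ("key-1", "search"),
    ("key-2", "search"),
    ("key-2", "search"),
]
print(find_shared_keys(events))
```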
Common Variations and Edge Cases
Tighter AI cost controls often increase friction for developers and product teams, so organisations have to balance spend reduction against workflow speed. Best practice is evolving, but there is no universal standard for how aggressively to block, throttle, or approve AI requests across every use case. A research assistant, customer-support copilot, and autonomous agent should not be governed with the same thresholds, even if they call the same model.
One common mistake is assuming all cost overruns are abuse. In reality, legitimate spikes can come from inefficient prompting, duplicated retrieval, or poorly bounded agent loops. Another edge case is cross-tenant or third-party usage, where the enterprise owns the bill but not the runtime. In those scenarios, the strongest control may be contractual logging requirements combined with identity-aware policy and a review path for exceptions. Current NHI guidance suggests using the Ultimate Guide to NHIs as a standards reference for aligning identity governance with operational controls.
For teams implementing chargeback or showback, the practical test is simple: can cost be traced to a specific identity and action without manual investigation? If not, the organisation is managing AI spend as an accounting issue rather than a security control issue, and the hidden waste will keep reappearing in different places.
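The traceability test above can be turned into a rough metric: the share of spend that already carries both an identity and an action. The record keys (`cost`, `identity`, `action`) and the sample values are assumptions for illustration, not a reporting standard.

```python
def attribution_coverage(records):
    """Fraction of total cost traceable to both an identity and an action.

    Each record is a dict with a numeric "cost" and optional "identity" and
    "action" keys (hypothetical field names). Returns 1.0 when there is no spend.
    """
    total = sum(r["cost"] for r in records)
    if total == 0:
        return 1.0
    traced = sum(
        r["cost"] for r in records
        if r.get("identity") and r.get("action")
    )
    return traced / total


records = [
    {"cost": 6, "identity": "agent-research", "action": "summarise"},
    {"cost": 3, "identity": "agent-research", "action": None},
    {"cost": 1, "identity": None, "action": "embed"},
]
print(attribution_coverage(records))
```

A value well below 1.0 suggests that chargeback reports would still depend on manual investigation for a meaningful part of the bill.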
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Non-Human Identity Top 10 and OWASP Agentic AI Top 10 address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Non-Human Identity Top 10 | NHI-03 | Identity attribution and secret control are central to stopping unmanaged AI spend. |
| OWASP Agentic AI Top 10 | A-05 | Agentic workflows can chain calls and inflate cost without human visibility. |
| NIST AI RMF | | AI RMF governance covers accountability, monitoring, and controls for AI usage risk. |
Assign ownership for AI cost governance and continuously monitor usage, exceptions, and policy outcomes.