TL;DR: According to Kong, 84% of companies already see more than 6% gross margin erosion from AI costs, while only 15% can forecast AI spend to within ±10%. That makes fragmented consumption a finance and governance problem, not just an efficiency issue. Cost visibility is now a prerequisite for sustainable AI investment and monetisation.
At a glance
What this is: Kong frames agentic AI cost management as a control problem where fragmented usage, untracked consumption, and weak attribution drive margin erosion and block monetisation.
Why it matters: For IAM practitioners, the same governance gaps that hide AI spend also hide who or what is consuming access, whether the subject is an API key, workload, or agent identity.
By the numbers:
- 84% of companies report more than 6% gross margin erosion from AI costs.
- Only 15% of companies can forecast AI costs within ±10% accuracy.
- 61% of companies run AI workloads across a combination of public and private environments.
Context
Agentic AI cost management is the discipline of measuring, attributing, and enforcing spend across models, agents, APIs, and data paths. In Kong's framing, the immediate problem is not raw AI adoption but fragmented consumption that erodes margin and makes unit economics impossible to trust.
That matters to identity teams because the same operational fragmentation that obscures spend also obscures access. When tokens, service accounts, MCP calls, and agent workflows are scattered across teams, governance loses the ability to explain who accessed what, when, and for what business purpose.
The core issue is not simply more AI traffic. It is the absence of shared controls for attribution, visibility, and enforcement across the agentic stack.
Key questions
Q: How should organisations control AI costs in agentic environments?
A: Organisations should control AI costs by combining metering, attribution, and enforcement across the full request path. That means tracking model calls, tool usage, and data movement, then applying caps or routing rules when consumption exceeds policy. Without identity-linked attribution, cost control remains reactive and finance cannot trust the numbers.
Q: Why do fragmented AI deployments create margin risk?
A: Fragmented AI deployments create margin risk because each team can consume premium models, duplicate capabilities, and move data without shared visibility. The result is hidden spend that accumulates across many small decisions. Once the quarter closes, the organisation discovers that AI growth has outpaced its ability to explain or recover the cost.
Q: How do you know if AI cost visibility is actually working?
A: AI cost visibility is working when finance can forecast within a narrow error band, product teams can price features before launch, and operations can trace unusual spend to a specific workload or identity. If the organisation still needs forensic accounting after the fact, the visibility layer is incomplete.
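The forecasting test can be made concrete with a few lines of Python. This is a minimal sketch: the function names and the monthly figures are illustrative assumptions, not data from Kong's article, and the ±10% band simply mirrors the accuracy threshold quoted above.

```python
def forecast_error(forecast: float, actual: float) -> float:
    """Signed relative error of a spend forecast against actual spend."""
    if actual <= 0:
        raise ValueError("actual spend must be positive")
    return (forecast - actual) / actual


def within_band(forecast: float, actual: float, band: float = 0.10) -> bool:
    """True when the forecast lands within +/- band of actual spend."""
    return abs(forecast_error(forecast, actual)) <= band


# Hypothetical monthly AI spend in USD as (forecast, actual), for illustration only.
months = {
    "2025-10": (40_000.0, 43_000.0),
    "2025-11": (45_000.0, 61_000.0),
}

for month, (forecast, actual) in months.items():
    err = forecast_error(forecast, actual)
    ok = within_band(forecast, actual)
    print(f"{month}: error {err:+.1%}, within band: {ok}")
```

Tracking this figure per workload or identity, rather than only for the total bill, is what lets operations trace a miss back to its source.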
Q: Who should own AI FinOps in a security-led programme?
A: AI FinOps should be jointly owned by finance, platform engineering, product, and security because the controls span cost, access, and usage. Security brings policy and identity context, finance brings measurement discipline, and engineering owns implementation. A siloed model creates the same fragmentation the programme is trying to remove.
Technical breakdown
Why AI FinOps differs from traditional cloud cost control
Traditional FinOps assumes deterministic infrastructure where storage, compute, and reserved capacity can be forecast with reasonable stability. Agentic AI breaks that model because usage depends on prompt length, reasoning loops, tool calls, retrieval paths, and model choice at runtime. That makes cost per task variable rather than fixed. If an agent retries, branches, or chains multiple tool invocations, spend rises without any new feature being shipped. In practice, the unit of control is no longer the instance. It is the action chain.
Practical implication: move cost controls from infrastructure-only reporting to task-level attribution and runtime enforcement.
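A minimal sketch of task-level attribution follows. It assumes each model or tool call is already emitted as a metered event carrying a task identifier and the calling identity. The `UsageEvent` type, its field names, and the sample costs are hypothetical, chosen only to show how retries inflate the cost of one action chain.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    task_id: str        # one action chain, for example a single agent run
    identity: str       # API key, workload, or agent identity that made the call
    kind: str           # "model_call" or "tool_call"
    cost_microusd: int  # metered cost of this step in millionths of a US dollar


def cost_per_task(events):
    """Sum step costs by action chain so retries and branches become visible."""
    totals = defaultdict(int)
    for event in events:
        totals[event.task_id] += event.cost_microusd
    return dict(totals)


# Hypothetical events: task-2 issues its model call three times (two retries).
events = [
    UsageEvent("task-1", "agent-billing", "model_call", 20_000),
    UsageEvent("task-1", "agent-billing", "tool_call", 10_000),
    UsageEvent("task-2", "agent-support", "model_call", 20_000),
    UsageEvent("task-2", "agent-support", "model_call", 20_000),
    UsageEvent("task-2", "agent-support", "model_call", 20_000),
    UsageEvent("task-2", "agent-support", "tool_call", 10_000),
]

totals = cost_per_task(events)
for task_id, micro in totals.items():
    print(f"{task_id}: ${micro / 1_000_000:.2f}")
```

Costs are held as integers to avoid floating-point drift when many small steps are summed. Grouping on `identity` instead of `task_id` would give the per-identity view that security teams need.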
How fragmentation tax accumulates across models, MCP, and APIs
The fragmentation tax appears when teams build the same capability in parallel, route traffic through different models, or move data between disconnected environments without common oversight. Kong points to LLMs, MCP servers, agent-to-agent communication, APIs, and event streams as part of the same cost surface. That is important because costs can be created far from the user-facing feature, especially when one team owns the model gateway, another owns the workflow, and finance only sees the bill after the quarter closes. Fragmentation is therefore a governance failure, not a line-item problem.
Practical implication: map the full AI data path and assign ownership across engineering, finance, and product before spend fragments further.
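As a rough illustration of that mapping exercise, the sketch below scans an inventory of AI traffic paths for two warning signs: paths with no named owner and capabilities built by more than one team. The inventory structure, field names, and entries are all hypothetical. A real inventory would more likely come from a gateway or service catalogue.

```python
from collections import defaultdict

# Hypothetical inventory of AI traffic paths; every field name is illustrative.
paths = [
    {"path": "summarise-v1", "capability": "summarisation", "team": "support", "owner": "a.khan"},
    {"path": "summarise-x", "capability": "summarisation", "team": "sales", "owner": None},
    {"path": "search-mcp", "capability": "retrieval", "team": "platform", "owner": "j.ortiz"},
]


def unowned_paths(inventory):
    """Paths with no named owner, where spend and access can hide."""
    return [p["path"] for p in inventory if not p["owner"]]


def duplicated_capabilities(inventory):
    """Capabilities built by more than one team, a likely source of duplicate spend."""
    teams = defaultdict(set)
    for p in inventory:
        teams[p["capability"]].add(p["team"])
    return {cap: sorted(members) for cap, members in teams.items() if len(members) > 1}


print("Unowned:", unowned_paths(paths))
print("Duplicated:", duplicated_capabilities(paths))
```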
Why monetisation fails when consumption is not metered
You cannot price AI capabilities reliably if you cannot measure how they are used. Kong's article ties weak metering to a revenue problem: usage-based billing, tiered pricing, and caps all depend on consumption data at the feature or customer level. Without that visibility, organisations either give value away or set prices by instinct. For agentic systems, this also means the same access path that enables a feature can become an ungoverned economic leak if it is not tied back to an accountable unit.
Practical implication: connect metering, billing hooks, and access attribution before launching AI-powered products at scale.
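To show why pricing depends on metering, here is a minimal sketch of graduated, usage-based pricing. The tier boundaries and unit prices are invented for illustration and do not come from Kong's article. The point is only that the calculation needs a trustworthy per-customer usage count as input.

```python
def tiered_charge(units: int, tiers) -> float:
    """Price metered usage with graduated tiers.

    tiers is an ordered list of (upper_bound, unit_price) pairs.
    An upper_bound of None means the tier has no limit.
    """
    charge = 0.0
    lower = 0
    for upper, price in tiers:
        if units <= lower:
            break  # all usage has already been priced
        top = units if upper is None else min(units, upper)
        charge += (top - lower) * price
        lower = top
    return charge


# Hypothetical price plan: first 1,000 requests included, then two paid tiers.
TIERS = [
    (1_000, 0.0),
    (10_000, 0.01),
    (None, 0.005),
]

for customer, requests in {"acme": 500, "globex": 12_000}.items():
    print(f"{customer}: {requests} requests -> ${tiered_charge(requests, TIERS):.2f}")
```

If the request counts cannot be attributed to a customer or feature, there is nothing reliable to feed into a function like this, which is where pricing by instinct begins.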
NHI Mgmt Group analysis
Agentic AI cost management is becoming an identity governance problem, not only a finance problem. Once AI consumption is distributed across model calls, APIs, MCP servers, and agent workflows, the organisation has lost a single control point for attribution. That is the same structural weakness IAM teams already recognise in shadow access and unowned service identities. The implication is that AI governance must treat usage, access, and accountability as one control plane.
Fragmentation tax is a useful named concept because it captures the hidden access layer behind AI spend. The article shows that waste does not come only from expensive models, but from duplicated capabilities, idle infrastructure, and untracked consumption spread across teams. Those are symptoms of broken ownership boundaries. Practitioners should read that as evidence that programme maturity depends on cross-domain visibility, not isolated cost dashboards.
Cost visibility is the prerequisite for enforceable policy in agentic environments. If an organisation cannot attribute a prompt chain, it cannot reliably bound the economic impact of the identity executing it. That affects AI routing, customer billing, and internal chargeback, but also the ability to detect anomalous or abusive access patterns. The implication is that identity and cost telemetry now need to be governed together.
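One simple way to join cost and identity telemetry is to compare each identity's current spend against its own history. The sketch below uses a median-based threshold. The multiplier of 3, the identity names, and the spend figures are assumptions for illustration, and production systems would likely use a more robust detector.

```python
from statistics import median


def is_anomalous(history, today, factor=3.0):
    """Flag today's spend when it exceeds factor times the median of prior days."""
    if not history:
        return False  # no baseline yet, so treat as unknown rather than anomalous
    return today > factor * median(history)


# Hypothetical daily spend in USD per identity: (prior days, today).
daily_spend = {
    "agent-support": ([12.0, 11.5, 13.0, 12.4], 12.9),
    "svc-batch-key": ([4.0, 4.2, 3.9, 4.1], 19.5),
}

flagged = [
    identity
    for identity, (history, today) in daily_spend.items()
    if is_anomalous(history, today)
]
print("Investigate:", flagged)
```

A spike tied to a specific key or agent is both a cost signal and a possible indicator of abusive access, which is why the two data sets are more useful together.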
Traditional controls built for deterministic systems do not survive probabilistic agent behaviour. The article's point about forecast error is not just financial. It shows that runtime variance has become a governance variable, which means static approval models and coarse reporting cycles lag behind the actual decision pace of agentic systems. Practitioners need to treat the operating model itself as the control surface.
Agentic AI economics will increasingly shape the selection criteria for the identity stack. As teams try to fund AI through consumption-based pricing and tighter metering, the market will reward platforms that can expose, attribute, and enforce usage across the entire request path. That will force IAM, API, and AI control owners to converge on shared visibility standards rather than separate reporting islands.
From our research:
- 92% agree that governing AI agents is critical to enterprise security, yet only 44% have implemented any policies to do so, according to the AI Agents: The New Attack Surface report.
- Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
- For a broader control model, read the OWASP Agentic AI Top 10 for the runtime risks that make attribution and enforcement necessary.
What this signals
Fragmentation tax is now a governance signal, not just a finance signal: once AI consumption spreads across multiple owners, the programme loses the ability to attribute access, enforce policy, and explain spend. That is why cost visibility, identity context, and usage enforcement need to sit in the same operating model rather than separate dashboards.
The next maturity jump will come from teams that can connect billing hooks to identity telemetry and then prove which workloads, agents, and customers drove the cost. Without that linkage, usage-based pricing remains guesswork and audit evidence stays incomplete.
For practitioners building the control stack, start with the access path, not the invoice. The invoice is the symptom. The access path is where fragmentation, duplicate capability, and waste begin.
For practitioners
- Build end-to-end AI consumption attribution: Trace every AI request from user or application identity through model calls, tool use, retrieval, and downstream APIs so finance and security can see where cost and access both accumulate.
- Define ownership for each AI traffic path: Assign a named owner for model gateways, MCP integrations, agent workflows, and billing hooks so duplicate spend and untracked access cannot hide between team boundaries.
- Introduce enforcement where spend becomes risky: Use caps, model routing, anomaly alerts, and consumption thresholds to stop runaway loops, repeated retries, and unnecessary premium-model usage before they distort margin.
- Tie AI billing to identity and business context: Connect prompt chains and agent actions to customer, feature, or department identifiers so pricing, chargeback, and audit evidence are based on measured usage rather than estimates.
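The enforcement step in the list above can be sketched as a per-identity budget guard that routes to a cheaper model near the cap and denies requests once the cap is reached. This is a minimal illustration under stated assumptions: the `BudgetGuard` class, the 80% soft threshold, and the model names are hypothetical and do not describe any particular gateway's API.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetGuard:
    """Per-identity spend cap with a soft threshold that downgrades the model."""
    cap_usd: float
    soft_ratio: float = 0.8  # at or above this share of the cap, use the fallback model
    spent: dict = field(default_factory=dict)

    def record(self, identity: str, cost_usd: float) -> None:
        """Add metered cost to the running total for an identity."""
        self.spent[identity] = self.spent.get(identity, 0.0) + cost_usd

    def decide(self, identity: str, requested_model: str, fallback_model: str) -> str:
        """Return the model to use, or "deny" once the cap is reached."""
        used = self.spent.get(identity, 0.0)
        if used >= self.cap_usd:
            return "deny"
        if used >= self.soft_ratio * self.cap_usd:
            return fallback_model
        return requested_model


# Hypothetical identity and model names, for illustration only.
guard = BudgetGuard(cap_usd=100.0)
first = guard.decide("agent-support", "premium-model", "small-model")
guard.record("agent-support", 85.0)
second = guard.decide("agent-support", "premium-model", "small-model")
guard.record("agent-support", 20.0)
third = guard.decide("agent-support", "premium-model", "small-model")
print(first, second, third)
```

Keying the guard on identity rather than on team or project is a deliberate choice here: it keeps the cost control and the access record pointing at the same accountable unit.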
Key takeaways
- Agentic AI creates a cost-governance problem because fragmented usage can erode margin long before leadership sees a single obvious failure.
- Kong's figures show that most organisations already struggle to forecast AI spend accurately, which makes unit economics and monetisation unreliable.
- Practitioners need shared visibility across identity, usage, and billing so AI programmes can scale without turning into hidden expense centres.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | | Agentic workflows, tool use, and runtime variance drive the cost and governance problem here. |
| NIST AI RMF | GOVERN, MAP | AI governance and measurement align with the AI RMF GOVERN and MAP functions. |
| NIST CSF 2.0 | PR.AA-05 (PR.AC-4 in CSF 1.1) | Identity and access context is needed to attribute usage and enforce policy. |
Map AI request paths to agentic risk controls and enforce policy where runtime behaviour changes spend.
Key terms
- AI FinOps: AI FinOps is the discipline of measuring, attributing, and controlling the cost of AI systems at the level of usage and business value. It extends FinOps into probabilistic workloads where spend depends on model choice, prompt shape, retries, and tool execution rather than fixed infrastructure alone.
- Fragmentation Tax: Fragmentation tax is the hidden cost created when multiple teams duplicate AI capabilities, route traffic differently, and move data without shared visibility. It is not a formal fee. It is the accumulation of waste, blind spots, and inconsistent ownership that turns AI growth into margin erosion.
- Agentic Workflow: An agentic workflow is a sequence of AI-driven actions where the system can choose steps, tools, and timing during runtime. In governance terms, that makes cost and access harder to predict because the path is not fully fixed at provisioning time.
- Consumption Attribution: Consumption attribution is the practice of linking AI usage to a customer, feature, team, or workload so cost, billing, and accountability can be measured consistently. It is the control that turns raw usage data into something finance and security can govern.
This post draws on content published by Kong: Agentic AI Cost Management: Stopping Margin Erosion and the Fragmentation Tax. Read the original.
Published by the NHIMG editorial team on 2026-01-30.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org