TL;DR: According to Kong, 84% of companies already see more than 6% gross margin erosion from AI costs, while only 15% can forecast AI spend to within ±10%. That makes fragmented AI consumption a finance and governance problem, not just an efficiency issue, and it makes cost visibility a prerequisite for sustainable AI investment and monetisation.
NHIMG editorial — based on content published by Kong: Agentic AI Cost Management: Stopping Margin Erosion and the Fragmentation Tax
By the numbers:
- 84% of companies report more than 6% gross margin erosion from AI costs.
- Only 15% of companies can forecast AI costs to within ±10%.
- 61% of companies run AI workloads across a combination of public and private environments.
Questions worth separating out
Q: How should organisations control AI costs in agentic environments?
A: Organisations should control AI costs by combining metering, attribution, and enforcement across the full request path.
Q: Why do fragmented AI deployments create margin risk?
A: Fragmented AI deployments create margin risk because each team can consume premium models, duplicate capabilities, and move data without shared visibility.
Q: How do you know if AI cost visibility is actually working?
A: AI cost visibility is working when finance can forecast within a narrow error band, product teams can price features before launch, and operations can trace unusual spend to a specific workload or identity.
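To make the "narrow error band" test concrete, the check can be sketched as a relative-error calculation. This is a minimal illustration, assuming the band is measured against actual spend and using the ±10% figure cited above as the default; the function names and the monthly figures are hypothetical and do not come from Kong's article.

```python
def forecast_error(forecast: float, actual: float) -> float:
    """Signed relative error of a spend forecast, measured against actual spend."""
    if actual <= 0:
        raise ValueError("actual spend must be positive")
    return (forecast - actual) / actual


def within_band(forecast: float, actual: float, band: float = 0.10) -> bool:
    """True when the forecast lands within +/- band of actual spend."""
    return abs(forecast_error(forecast, actual)) <= band


# Hypothetical monthly AI spend in USD, for illustration only.
print(within_band(forecast=100_000, actual=108_000))  # under-forecast by about 7.4%
print(within_band(forecast=100_000, actual=125_000))  # under-forecast by 20%
```

Measuring against actual spend rather than against the forecast is a design choice here; either convention works as long as finance applies it consistently from month to month.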
Practitioner guidance
- Build end-to-end AI consumption attribution: Trace every AI request from user or application identity through model calls, tool use, retrieval, and downstream APIs so finance and security can see where cost and access both accumulate.
- Define ownership for each AI traffic path: Assign a named owner for model gateways, MCP integrations, agent workflows, and billing hooks so duplicate spend and untracked access cannot hide between team boundaries.
- Introduce enforcement where spend becomes risky: Use caps, model routing, anomaly alerts, and consumption thresholds to stop runaway loops, repeated retries, and unnecessary premium-model usage before they distort margin.
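As a rough sketch of how attribution and enforcement fit together, the example below records each metered request against a workload and an identity, and refuses a request that would push a workload past its cap. Everything here is an assumption for illustration: `UsageEvent`, `SpendLedger`, `BudgetExceeded`, the workload names, and the dollar figures are hypothetical and are not taken from Kong's article or any Kong product API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    identity: str    # user, service, or agent identity that made the call
    workload: str    # feature or agent workflow the call belongs to
    model: str       # model name as reported by the metering layer
    cost_usd: float  # metered cost of this single request


class BudgetExceeded(Exception):
    """Raised when a request would push a workload past its spend cap."""


class SpendLedger:
    """In-memory attribution ledger with per-workload caps (hypothetical)."""

    def __init__(self, caps_usd: dict[str, float]):
        self.caps_usd = caps_usd
        self.spend_by_workload: dict[str, float] = defaultdict(float)
        self.spend_by_identity: dict[str, float] = defaultdict(float)

    def record(self, event: UsageEvent) -> None:
        cap = self.caps_usd.get(event.workload)
        projected = self.spend_by_workload[event.workload] + event.cost_usd
        # Enforce before recording, so a runaway loop stops at the cap.
        if cap is not None and projected > cap:
            raise BudgetExceeded(
                f"{event.workload} would reach ${projected:.2f} (cap ${cap:.2f})"
            )
        self.spend_by_workload[event.workload] = projected
        self.spend_by_identity[event.identity] += event.cost_usd


# Illustrative run: the third call would exceed the $1.00 cap and is rejected.
ledger = SpendLedger(caps_usd={"support-agent": 1.00})
for cost in (0.60, 0.30, 0.30):
    try:
        ledger.record(UsageEvent("svc-helpdesk", "support-agent", "large-model", cost))
    except BudgetExceeded as exc:
        print("blocked:", exc)
```

A production setup would persist the ledger and apply caps at a shared gateway rather than in process, but the same two roll-ups (by workload and by identity) are what let finance and operations trace unusual spend to a specific source.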
What's in the full article
Kong's full article covers the operational detail this post intentionally leaves for the source:
- A deeper breakdown of AI FinOps metrics, including the specific cost categories teams should track across model, compute, storage, and egress.
- Practical examples of metering and billing hooks for agentic workflows, including how consumption can be tied back to products and customers.
- The article's own framework for deciding when to route work to smaller models, when to cap usage, and how to surface anomalies early.
- More detail on the cost dashboard structure Kong recommends for finance and platform teams.