TL;DR: Modern enterprise traffic now splits across external APIs, internal microservices, and AI calls, and separate tools create policy drift, fragmented observability, and security gaps, according to Kong. Its analysis frames unified control planes as the practical response to governance sprawl.
At a glance
What this is: Kong’s analysis of why API, microservice, and AI traffic increasingly need one control plane. Its key finding is that fragmented tooling creates policy drift, observability gaps, and inconsistent security.
Why it matters: Identity, access, and policy enforcement now apply across APIs, services, and AI workloads at once, so practitioners need one governance model rather than three disconnected ones.
By the numbers:
- The global API management market was valued at USD 7.44 billion in 2024 and is projected to reach USD 108.61 billion by 2033, a CAGR of 34.7%.
- Kubernetes production adoption has surged from 66% in CNCF's 2023 survey to 82% in 2025.
- Semantic caching can reduce API calls by up to 68.8% with cache hit rates ranging from 61.6% to 68.8%.
Context
Enterprise architecture now has three traffic classes that look similar on the wire but behave differently in governance terms: external APIs, east-west microservices, and AI or LLM requests. The identity problem is that teams often secure each pattern with separate tools, which makes policy enforcement inconsistent and observability incomplete.
For IAM, NHI, and platform teams, the important shift is not just that traffic has multiplied. It is that access control, authentication, and auditability now need to span human-driven API use, service-to-service calls, and AI workloads without creating separate trust models for each.
Kong’s argument is that a unified control plane can reduce the operational sprawl created by duplicated gateways and fragmented policy engines. That is a familiar pattern for security teams: once governance splits by traffic type, the programme starts optimising tools instead of control.
Key questions
Q: How should security teams govern API, service, and AI traffic together?
A: Security teams should govern these traffic types through a shared control plane that centralises policy enforcement, observability, and auditability. The goal is consistent identity and access handling across external APIs, east-west service calls, and AI requests. Without that alignment, teams inherit policy drift and fragmented evidence that make operations and compliance harder to defend.
Q: Why does gateway sprawl create identity governance risk?
A: Gateway sprawl creates identity governance risk because each gateway often introduces its own policy language, logging model, and exception handling. Over time, the same request can be authorised differently depending on the path it takes. That inconsistency weakens accountability, complicates audits, and makes it harder to prove that controls are working uniformly.
Q: When should organisations centralise control for AI traffic?
A: Organisations should centralise control for AI traffic when model access, prompt handling, or data exposure decisions are being managed by multiple teams or tools. If the same organisation must prove who accessed which model, with what content, and under what rules, a shared governance layer becomes necessary rather than optional.
Q: What is the difference between an API gateway and a unified control plane?
A: An API gateway primarily enforces traffic controls for request routing, authentication, and throttling. A unified control plane extends that idea across multiple traffic types and governance layers, so APIs, service mesh traffic, and AI requests can share policy, observability, and audit logic. The difference is scope and consistency.
Technical breakdown
Unified control plane for API, service, and AI traffic
A unified control plane centralises policy enforcement, observability, and governance across multiple traffic types while leaving the data plane to handle requests. In practice, that means one layer can apply authentication, rate limiting, routing, and audit policy to external APIs, internal service calls, and AI requests. The architectural value is consistency: teams define governance once and reuse it across traffic patterns that would otherwise drift apart. The identity implication is that policy becomes portable across workloads rather than locked inside separate gateway stacks.
Practical implication: map your policy decisions to one shared control layer before traffic-specific implementations diverge.
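To make "define governance once and reuse it" concrete, here is a minimal sketch of a single policy decision function applied to all three traffic classes. It is illustrative only: `TrafficType`, `Request`, `SharedPolicy`, and the caller and resource names are hypothetical and do not represent Kong's API or any specific product.

```python
from dataclasses import dataclass, field
from enum import Enum


class TrafficType(Enum):
    EXTERNAL_API = "external_api"
    SERVICE = "service"
    AI = "ai"


@dataclass(frozen=True)
class Request:
    caller: str                # authenticated identity: app, service, or agent
    traffic_type: TrafficType
    resource: str              # route, service name, or model name
    authenticated: bool


@dataclass
class SharedPolicy:
    # Hypothetical entitlement table: caller -> set of resources it may reach.
    entitlements: dict = field(default_factory=dict)

    def decide(self, request: Request) -> str:
        # The same two checks run regardless of traffic type.
        if not request.authenticated:
            return "deny:unauthenticated"
        if request.resource not in self.entitlements.get(request.caller, set()):
            return "deny:not_entitled"
        return "allow"


policy = SharedPolicy(entitlements={
    "partner-app": {"orders-api"},
    "billing-svc": {"ledger-svc"},
    "support-agent": {"chat-model"},
})

requests = [
    Request("partner-app", TrafficType.EXTERNAL_API, "orders-api", True),
    Request("billing-svc", TrafficType.SERVICE, "ledger-svc", True),
    Request("support-agent", TrafficType.AI, "chat-model", True),
    Request("support-agent", TrafficType.AI, "reasoning-model", True),
]

for r in requests:
    print(r.traffic_type.value, r.caller, r.resource, policy.decide(r))
```

The point of the sketch is that the traffic type is metadata on the request, not a reason to fork the decision logic. A real control plane would add rate limiting, routing, and audit hooks around the same decision point.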
Why gateway sprawl creates identity and security drift
Gateway sprawl is what happens when each traffic class gets its own platform, language, and operating model. That creates duplicated policy logic, inconsistent logging, and different enforcement semantics for the same business process. Over time, the organisation no longer has one access model. It has several partial models that do not line up during audit, incident response, or change management. For identity teams, this is not just tooling complexity. It is a governance defect that weakens assurance across API, service, and AI traffic.
Practical implication: inventory where access rules are duplicated across gateways and remove any policy logic that cannot be governed centrally.
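One way to start that inventory is to export each gateway's rules into a common shape and flag any rule that is defined in more than one place with different settings. The sketch below assumes such an export already exists; the gateway names, rule names, and settings are hypothetical.

```python
import json
from collections import defaultdict


def find_policy_drift(gateway_rules):
    """Return rules that appear in two or more gateways with differing settings.

    gateway_rules: {gateway_name: {rule_name: settings_dict}}
    Result:        {rule_name: {gateway_name: settings_dict}}
    """
    by_rule = defaultdict(dict)
    for gateway, rules in gateway_rules.items():
        for rule_name, settings in rules.items():
            by_rule[rule_name][gateway] = settings

    drift = {}
    for rule_name, per_gateway in by_rule.items():
        if len(per_gateway) < 2:
            continue  # defined in one place only: nothing to compare
        # Canonical JSON makes settings comparable regardless of key order.
        distinct = {json.dumps(s, sort_keys=True) for s in per_gateway.values()}
        if len(distinct) > 1:
            drift[rule_name] = dict(per_gateway)
    return drift


# Hypothetical export from three separately managed gateways.
exported = {
    "api-gateway": {
        "token_ttl": {"seconds": 900},
        "pii_redaction": {"enabled": True},
    },
    "service-mesh": {
        "token_ttl": {"seconds": 900},
    },
    "ai-gateway": {
        "pii_redaction": {"enabled": False},
        "token_budget": {"per_day": 100000},
    },
}

for rule, variants in find_policy_drift(exported).items():
    print(rule, variants)
```

In this example only `pii_redaction` is reported, because `token_ttl` matches wherever it appears and `token_budget` exists in a single gateway. Rules that surface here are candidates for moving into a shared governance layer.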
AI gateway controls and the identity layer for LLM traffic
AI gateway controls extend API management patterns to LLM traffic, but the risk profile changes because prompts may carry sensitive data, cost can scale per token, and model routing can vary by provider. Controls such as PII scrubbing, semantic caching, and multi-model routing reduce operational friction, yet they also create an identity question: who is allowed to invoke which model, on what terms, and with what data exposure? That makes AI traffic governance a policy and attribution problem as much as a transport problem.
Practical implication: treat model access, prompt handling, and audit trails as governed entitlements, not just application features.
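As a rough illustration of model access as an entitlement, the sketch below checks whether a caller may invoke a model, scrubs one kind of PII from the prompt, and emits an audit entry. Everything here is an assumption made for illustration: the entitlement table, the model names, and the single email regex, which is far narrower than production-grade PII detection.

```python
import re

# Simplified pattern for email addresses only; real PII scrubbing needs much more.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical entitlement table: caller identity -> models it may invoke.
MODEL_ENTITLEMENTS = {
    "support-agent": {"small-chat-model"},
    "analytics-svc": {"small-chat-model", "large-reasoning-model"},
}


def authorise_and_scrub(caller, model, prompt):
    """Deny unentitled model calls, then redact emails before forwarding."""
    if model not in MODEL_ENTITLEMENTS.get(caller, set()):
        raise PermissionError(f"{caller} is not entitled to invoke {model}")
    scrubbed, redactions = EMAIL.subn("[REDACTED_EMAIL]", prompt)
    # The audit entry records who called which model and how much was redacted.
    audit = {"caller": caller, "model": model, "redactions": redactions}
    return scrubbed, audit


scrubbed, audit = authorise_and_scrub(
    "support-agent",
    "small-chat-model",
    "Reply to jane.doe@example.com about her refund",
)
print(scrubbed)
print(audit)
```

Keeping the entitlement check, the redaction step, and the audit entry in one gateway-side function is what lets an organisation answer "who invoked which model, with what data" from a single record rather than from application logs.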
Threat narrative
Attacker objective: The objective is to exploit inconsistent governance so that sensitive traffic can move through the stack with weaker inspection, weaker attribution, and less reliable enforcement.
- Entry occurs when fragmented gateways and per-team controls allow traffic to bypass a consistent enforcement path across APIs, services, and AI calls.
- Escalation occurs when duplicated policy implementations drift, so access decisions, logging, and data handling no longer match from one layer to the next.
- Impact is inconsistent control over sensitive requests, which increases exposure, slows incident response, and makes compliance evidence incomplete.
Breaches seen in the wild
- Moltbook AI agent keys breach: the Moltbook breach exposed 1.5M AI agent keys.
- AI LLM hijack breach: attackers used stolen AWS access keys to hijack Anthropic LLMs on Bedrock.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
Unified traffic governance is becoming an identity problem, not just an infrastructure problem. Once APIs, microservices, and AI requests are all routed through separate control planes, the organisation loses one coherent policy story. That fragments auditability, complicates accountability, and forces identity teams to maintain different trust assumptions for systems that are functionally part of the same business flow. Practitioners should treat unified control as a governance model, not a platform preference.
Gateway sprawl creates control drift faster than most teams can reconcile it. Each additional gateway introduces a new policy language, a new logging model, and a new place for exceptions to accumulate. That is why the risk compounds rather than simply adding up: every new gateway has to be reconciled against all the others. A team may believe it has consistent enforcement until a request path crosses systems and the gaps become visible. Practitioners should look for duplicated policy logic as an early sign of governance decay.
AI traffic forces the control plane to carry identity and data-handling decisions together. AI calls are still API-like, but the governance stakes are different because prompts may include sensitive content, and model choice can change cost and exposure instantly. Most existing policy enforcement was designed for stable request patterns. That assumption weakens when model routing, caching, and data handling are being decided dynamically at runtime. Practitioners should re-evaluate whether their current governance model can represent that mix cleanly.
Operational simplicity is now a security requirement. The more teams have to correlate across dashboards, policy engines, and log formats, the less defensible their access story becomes. Unified control plane thinking matters because it reduces the number of places where identity and policy can diverge. Practitioners should judge platforms by whether they improve governance coherence, not by how many features they add.
Unified observability is the real benefit behind the platform narrative. The value is not merely one console. It is the ability to trace a request from caller identity through routing, policy decision, and response handling without stitching evidence together after the fact. That is the difference between operational visibility and forensic reconstruction. Practitioners should prioritise request traceability as a design requirement.
From our research:
- 92% agree governing AI agents is critical to enterprise security, yet only 44% have implemented any policies to do so, according to AI Agents: The New Attack Surface report.
- Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
- For a deeper identity lens, see OWASP Agentic AI Top 10 for the controls most likely to fail when runtime decisions and tool access converge.
What this signals
Unified control plane thinking will increasingly be judged by whether it reduces governance entropy, not just tool count. For platform and IAM teams, the practical question is whether one policy layer can carry the same identity, audit, and data-handling rules across all traffic classes. If it cannot, the organisation will keep paying the hidden cost of reconciliation after the fact.
The broader signal is that AI traffic is being folded into the same governance stack as APIs and microservices, which raises the bar for traceability and policy coherence. Teams should expect more scrutiny on how model access is authorised, how prompts are handled, and whether request paths can be reconstructed without manual log stitching.
Policy drift is now a measurable architecture risk. When 92% of organisations say governing AI agents is critical but only 44% have policies in place, the gap is already structural, according to our research on AI agents as an attack surface. Practitioners should assume that any multi-gateway environment will leak inconsistency unless governance is explicitly centralised.
For practitioners
- Inventory duplicated policy engines: Map where the same authentication, rate-limiting, or data-handling logic exists across API gateways, meshes, and AI gateways. Remove any control that cannot be enforced or audited from a shared governance layer.
- Standardise request traceability across traffic types: Require a common request identifier, logging schema, and audit trail from API, microservice, and AI paths so incident response does not depend on manual log correlation.
- Separate model access from application logic: Treat AI model invocation as a governed entitlement so policy, routing, and data-handling rules are not hardcoded into individual applications.
- Review prompt handling as sensitive-data governance: Apply the same data classification and redaction discipline to AI prompts that you already use for other sensitive request content, especially where PII or secrets can be introduced.
- Use central policy to reduce gateway drift: Consolidate rules where possible so security changes, compliance controls, and observability settings are updated once instead of being copied into multiple platforms.
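The traceability recommendation above can be sketched as one audit record shape shared by every traffic path, with a request identifier created at the edge and passed downstream. The field names and the `audit_record` and `trace` helpers are hypothetical; a real deployment would more likely align with an existing tracing or logging standard.

```python
import uuid
from datetime import datetime, timezone


def audit_record(traffic_type, caller, resource, decision, request_id=None):
    """Build an audit entry using the same schema for every traffic type."""
    return {
        # Reuse the upstream ID when present so hops can be joined later.
        "request_id": request_id or str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "traffic_type": traffic_type,
        "caller": caller,
        "resource": resource,
        "decision": decision,
    }


def trace(records, request_id):
    """Return every hop recorded for one request, in the order it was logged."""
    return [r for r in records if r["request_id"] == request_id]


log = []

# Edge: an external API call creates the request ID.
edge = audit_record("external_api", "partner-app", "orders-api", "allow")
log.append(edge)

# Downstream hops reuse the same ID instead of minting their own.
log.append(audit_record("service", "orders-api", "ledger-svc", "allow", edge["request_id"]))
log.append(audit_record("ai", "orders-api", "chat-model", "deny:not_entitled", edge["request_id"]))

# An unrelated request gets a different ID and stays out of the trace.
log.append(audit_record("service", "billing-svc", "ledger-svc", "allow"))

for hop in trace(log, edge["request_id"]):
    print(hop["traffic_type"], hop["caller"], hop["resource"], hop["decision"])
```

With a shared identifier and schema, reconstructing a request path becomes a filter over one log rather than a manual correlation across three formats.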
Key takeaways
- The core problem is not API traffic growth on its own, but the governance fragmentation created when APIs, microservices, and AI calls are managed separately.
- Duplicated policy engines, inconsistent observability, and mixed security controls create drift that weakens accountability and complicates compliance evidence.
- Practitioners should evaluate unified control plane strategies by how well they preserve identity, policy, and audit coherence across every traffic path.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A2 | AI gateway governance must account for tool and routing misuse. |
| NIST CSF 2.0 | PR.AA-05 (PR.AC-4 in CSF 1.1) | Consistent access enforcement across traffic types is a core access-control issue. |
| NIST Zero Trust (SP 800-207) | Not specified | Zero Trust principles apply to service-to-service and AI traffic verification. |
Apply continuous verification and least privilege across every traffic path, not just external APIs.
Key terms
- Unified Control Plane: A unified control plane is a single management layer that applies policy, observability, and governance across different traffic types. In identity terms, it reduces the chance that access rules, audit logs, and enforcement behaviour diverge as requests move between APIs, services, and AI workloads.
- Gateway Sprawl: Gateway sprawl is the operational condition where multiple gateways or control layers manage similar traffic without a shared policy model. The result is duplicated logic, inconsistent security behaviour, and weak audit consistency, which makes identity governance harder to prove and easier to fragment.
- AI Traffic Governance: AI traffic governance is the set of controls that decide who can invoke models, what content they can send, and how those interactions are logged and protected. It combines access control, data handling, and observability because AI requests can affect both security exposure and operating cost.
- Policy Drift: Policy drift is the gradual divergence between intended controls and what is actually enforced across systems. In multi-gateway environments, it appears when the same rule is implemented differently in separate tools, leaving the organisation with uneven access decisions and incomplete assurance.
Deepen your knowledge
NHI governance, agentic AI identity, and machine identity security are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or NHI governance in your organisation, it is worth exploring.
This post draws on content published by Kong: From Microservices to AI Traffic: Kong's Unified Control Plane When Architecture Gets Complicated. Read the original.
Published by the NHIMG editorial team on 2026-03-30.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org