By NHI Mgmt Group Editorial Team | Published 2026-05-04 | Domain: Agentic AI & NHIs | Source: Kong

TL;DR: As organizations move from LLM experiments to production AI, Kong says fragmented model, MCP, and API integrations create governance, observability, and security gaps that edge inspection and gateway control are meant to close. The core issue is not model choice but control-plane sprawl, where policy, routing, and data protection are no longer centrally enforceable.


At a glance

What this is: A Kong architecture post arguing that production AI needs a unified gateway and edge security layer to govern LLM and MCP traffic safely.

Why it matters: IAM, security, and platform teams now have to govern AI request flows, tool access, and policy enforcement together rather than as separate control problems.




Context

Production AI is no longer just a model-selection problem. Once LLMs, MCP servers, APIs, and orchestration layers are combined, the security question becomes how to enforce authentication, routing, observability, and policy across a distributed request path without losing control at the edge.

For IAM and platform teams, the shift is from protecting a single application boundary to governing an AI control plane that touches model access, tool invocation, and data exposure. That is the real architectural challenge behind AI gateway discussions, especially as agentic systems begin to sit on top of existing enterprise services.


Key questions

Q: How should security teams govern AI tool access in MCP-based environments?

A: Security teams should treat MCP tools as privileged capabilities and enforce authentication, authorisation, and logging before the request reaches the server. The main risk is not model output alone, but unrestricted tool invocation that lets an agent reach functions or data outside its intended scope. Tool-level policy should sit in the gateway, not be left to each application team.

Q: Why do AI gateways matter for enterprise IAM programmes?

A: AI gateways matter because they create a central point for identity, routing, and observability across model and tool traffic. Without that layer, AI systems tend to sprawl across multiple providers and integrations, making access controls inconsistent and audit evidence incomplete. For IAM teams, the gateway becomes the place where AI privilege is actually enforced.

Q: What breaks when prompt injection is handled only inside the model layer?

A: The control breaks because the malicious input has already entered the session before the model tries to interpret it. By that point, the system may already have exposed context, selected unsafe tools, or generated a harmful response. Effective defence needs inspection at the edge, plus identity and policy enforcement around tool use and data access.

Q: How do security teams decide whether to prioritise gateway controls or edge filtering first?

A: Teams should prioritise gateway controls first when the main issue is inconsistent access, routing, or tool governance, and edge filtering first when the main issue is prompt injection or data exfiltration at the boundary. Most production AI programmes need both. The right sequence depends on whether the failure is privilege sprawl or input abuse.


Technical breakdown

AI gateway architecture for LLM and MCP traffic

An AI gateway normalises traffic between applications, model providers, and tool ecosystems such as MCP. In this pattern, the gateway handles authentication, routing, logging, rate limiting, and policy enforcement before requests reach models or downstream services. That matters because AI applications often mix multiple providers and tools, which makes direct point-to-point integrations hard to govern. The architectural value is not just mediation. It is the ability to apply consistent controls to every model and tool request regardless of backend. This becomes especially important when developers want to swap providers or add new tools without rebuilding the security model.

Practical implication: centralise AI request mediation so model and tool access can be governed consistently instead of through one-off integrations.
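
As a rough illustration of the mediation pattern described above, the following Python sketch runs every request through the same authentication, rate-limiting, routing, and logging steps regardless of backend. It is a minimal sketch under stated assumptions, not Kong's implementation: the Gateway and Request classes, the status codes chosen, the key and route tables, and the upstream URL are all hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Request:
    api_key: str
    backend: str   # hypothetical naming, e.g. "model:chat" or "tool:search"
    payload: str


@dataclass
class Gateway:
    # All tables below are hypothetical illustration data.
    keys: dict                 # api_key -> caller identity
    routes: dict               # backend name -> upstream URL
    rate_limit: int = 3        # max requests per caller per window
    window_seconds: float = 60.0
    audit_log: list = field(default_factory=list)
    _hits: dict = field(default_factory=dict, repr=False)

    def handle(self, req: Request, now: Optional[float] = None):
        """Apply the same controls to every request, whatever the backend."""
        now = time.time() if now is None else now

        # 1. Authentication: map the credential to a caller identity.
        caller = self.keys.get(req.api_key)
        if caller is None:
            return self._finish(None, req, 401, "unauthenticated")

        # 2. Rate limiting: sliding window per caller.
        recent = [t for t in self._hits.get(caller, [])
                  if now - t < self.window_seconds]
        if len(recent) >= self.rate_limit:
            self._hits[caller] = recent
            return self._finish(caller, req, 429, "rate limit exceeded")
        self._hits[caller] = recent + [now]

        # 3. Routing: only backends registered at the gateway are reachable.
        upstream = self.routes.get(req.backend)
        if upstream is None:
            return self._finish(caller, req, 404, "unknown backend")
        return self._finish(caller, req, 200, upstream)

    def _finish(self, caller, req, status, detail):
        # 4. Logging: every decision is recorded, including rejections.
        self.audit_log.append(
            {"caller": caller, "backend": req.backend, "status": status}
        )
        return status, detail


gw = Gateway(keys={"k1": "agent-a"},
             routes={"model:chat": "https://upstream.example/chat"})
status, detail = gw.handle(Request("k1", "model:chat", "hello"))
```

Because the routing table lives in the gateway, swapping a provider in this sketch means changing one entry in routes while the authentication, rate-limit, and logging steps stay untouched.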

MCP security, tool access, and policy enforcement

MCP expands the attack surface because it turns tools and resources into callable capabilities for AI systems. If those tools are exposed too broadly, the agent can reach functions that were never intended for its context, creating privilege and data exposure risk. A gateway sitting in front of MCP servers can enforce OAuth, API keys, mTLS, and tool-level policy before the request reaches the server. That is a significant design point for agentic environments, where tool access is often more dangerous than model access itself. Security depends on constraining what the agent can invoke, not just what prompt it sends.

Practical implication: treat MCP tool exposure as an access-control problem and enforce per-tool authorisation before requests reach the server.
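
One way to picture per-tool authorisation is a check that runs on each message before it is forwarded to the MCP server. The sketch below relies on the fact that MCP tool invocations are JSON-RPC messages with the method "tools/call" and the tool name in params.name. Everything else is an assumption made for illustration: the TOOL_POLICY table, the caller identities, the tool names, and the choice to pass non-invocation messages through unchanged.

```python
import json

# Hypothetical policy: which caller identities may invoke which MCP tools.
TOOL_POLICY = {
    "support-agent": {"search_tickets", "read_ticket"},
    "billing-agent": {"read_invoice"},
}


def authorize_mcp_message(caller: str, raw_message: str):
    """Decide whether a JSON-RPC message may be forwarded to the MCP server.

    Returns (allowed, reason). Deny is the default for anything that
    cannot be parsed or matched against the policy.
    """
    try:
        message = json.loads(raw_message)
    except json.JSONDecodeError:
        return False, "malformed JSON"
    if not isinstance(message, dict):
        return False, "unexpected message shape"

    if message.get("method") != "tools/call":
        # Assumption: other methods (such as tools/list) pass through here.
        # A stricter gateway might also filter the listed tools per caller.
        return True, "not a tool invocation"

    params = message.get("params")
    tool = params.get("name") if isinstance(params, dict) else None
    if not isinstance(tool, str):
        return False, "missing tool name"

    if tool not in TOOL_POLICY.get(caller, set()):
        return False, f"{caller} may not call {tool}"
    return True, f"{caller} allowed to call {tool}"


call = json.dumps({
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "read_ticket", "arguments": {"id": 7}},
})
print(authorize_mcp_message("support-agent", call))
print(authorize_mcp_message("billing-agent", call))
```

The caller identity is assumed to come from an earlier authentication step (OAuth, API key, or mTLS), so the policy decision is tied to a verified identity rather than to anything the agent claims about itself.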

Edge filtering for prompt injection and data exfiltration

Firewall-style inspection at the edge evaluates AI request intent and blocks common GenAI abuse paths such as prompt injection, toxic output, and exfiltration attempts. The important architectural shift is that this control sits alongside the gateway, not inside the model. That separation matters because the model cannot reliably defend itself once malicious input has already entered the session. In practice, edge filtering becomes a compensating control for unsafe user prompts, compromised tool chains, and model responses that may leak sensitive data. It is a boundary control, not a substitute for governance deeper in the stack.

Practical implication: add edge inspection in front of AI workloads, but do not treat it as a replacement for identity and policy controls upstream.
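
The placement of this control can be sketched as two checks that run outside the model: one on the inbound prompt and one on the outbound response. The example below is a deliberately naive sketch using regular expressions. It is not how any particular edge product works, the pattern lists are hypothetical, and a production filter would rely on far richer detection than a handful of fixed rules.

```python
import re

# Hypothetical, deliberately small rule sets for illustration only.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
]
SECRET_PATTERNS = [
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                # AWS access key ID shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]


def inspect_request(prompt: str) -> bool:
    """Return True if the inbound prompt may proceed past the edge."""
    return not any(p.search(prompt) for p in INJECTION_PATTERNS)


def inspect_response(text: str) -> str:
    """Redact secret-like strings from an outbound model response."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


if inspect_request("Summarise this ticket for me"):
    # The model call would happen here; the response is filtered on the way out.
    print(inspect_response("The key is AKIAIOSFODNN7EXAMPLE."))
```

Both functions run before and after the model rather than inside it, which mirrors the separation described above. Pattern matching like this is easy to evade, which is one reason identity and tool-level policy still have to be enforced separately.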


NHI Mgmt Group analysis

AI gateway architecture is becoming the control plane for agentic access. Once AI systems span multiple models, MCP tools, and REST APIs, the security problem stops being model hardening and becomes orchestration governance. That is a familiar identity pattern in a new form: whoever controls the request path controls the practical privilege boundary. Practitioners should treat the gateway as an enforcement point, not a convenience layer.

MCP introduces an access problem, not just an integration problem. The article’s strongest implication is that tools exposed through MCP must be governed as callable privileges, not as harmless plumbing. If policy is applied only after the agent has selected a tool, the control has already failed. Security teams need to evaluate tool-level authorisation as part of the same governance model they use for API and workload access.

Prompt injection and data exfiltration remain boundary failures, not model failures. The architecture described here assumes edge controls can intercept malicious intent before it reaches the model or leaves it in the response path. That is a sensible placement, but it also shows how quickly AI security becomes a layered control problem across gateway, identity, and data flow. The practitioner takeaway is to stop treating model security as a standalone domain.

The identity question is shifting from who can log in to what can act, call, and route. In AI infrastructure, the important unit is increasingly the request chain, not the user session alone. That means AI governance will look more like NHI and workload control than classic human authentication. Teams that still separate AI security from IAM will miss the point where privilege is actually exercised.

From our research:

  • Only 13% of organisations feel extremely prepared for the reality of agentic AI despite the majority racing toward autonomous adoption, according to The 2026 Infrastructure Identity Survey.
  • 69% of security leaders agree identity management must fundamentally shift to address agentic AI systems, which is why AI gateway decisions now sit inside IAM strategy rather than beside it.
  • For a broader governance lens, see OWASP NHI Top 10 for the control patterns most likely to fail when agents and tools are linked together.

What this signals

AI control-plane sprawl is now an identity problem. As model providers, MCP servers, and API layers multiply, the governance challenge shifts from isolated application security to a single, enforceable privilege boundary for AI traffic. That is why gateway placement, auditability, and policy consistency matter more than any one model choice.

With 70% of organisations granting AI systems more access than they would give a human employee performing the exact same job, the market signal is clear: AI programmes are already exceeding the assumptions baked into traditional IAM reviews. Teams should expect more pressure to prove where AI access is set, inherited, and constrained.

Identity control will follow the request chain, not the user session. The next phase of AI governance will look increasingly like NHI and workload identity management, because the risk sits in what can call what, not just who can authenticate. Practitioners should prepare for gateway and tooling decisions to become part of their core access model.


For practitioners

  • Map the AI request path end to end: Document where prompts, model calls, MCP tool invocations, and response filtering occur so policy gaps are visible across the full path, not only at the application edge.
  • Enforce tool-level authorisation for MCP servers: Require authentication and per-tool policy checks before an agent can reach an MCP server, especially when the server exposes multiple capabilities by default.
  • Separate model access from data access controls: Treat model selection, retrieval access, and downstream data exposure as different control domains so a model switch does not silently expand privilege.
  • Instrument AI traffic for audit and traceability: Capture metrics, logs, and traces for model and tool activity so security and platform teams can reconstruct who accessed what, when, and through which route.
  • Validate edge filtering against real abuse cases: Test prompt injection, toxic output, and exfiltration scenarios in the same environment where the gateway and firewall will run, not only in isolated labs.
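
The audit and traceability item above can be made concrete with a small sketch in which every hop of one AI request shares a trace identifier, so the full path can be reconstructed later. The AuditEvent fields, the hop labels, and the agent and tool names are assumptions chosen for illustration. A real deployment would more likely emit these records through its existing logging or tracing stack.

```python
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    trace_id: str    # shared by every hop of one AI request
    caller: str      # identity of the agent or workload
    hop: str         # "model" or "tool" in this sketch
    target: str      # model or tool name
    decision: str    # "allow" or "deny"
    timestamp: str   # UTC, ISO 8601


def record(log, trace_id, caller, hop, target, decision):
    """Append one structured event to the log as a JSON line."""
    now = datetime.now(timezone.utc).isoformat()
    event = AuditEvent(trace_id, caller, hop, target, decision, now)
    log.append(json.dumps(asdict(event)))


def reconstruct(log, trace_id):
    """Return the ordered (hop, target, decision) path for one request."""
    events = (json.loads(line) for line in log)
    return [(e["hop"], e["target"], e["decision"])
            for e in events if e["trace_id"] == trace_id]


log = []
trace = str(uuid.uuid4())
record(log, trace, "support-agent", "model", "chat-model", "allow")
record(log, trace, "support-agent", "tool", "delete_ticket", "deny")
print(reconstruct(log, trace))
```

Recording denied calls alongside allowed ones is a deliberate choice in this sketch, since blocked tool invocations are often the most useful evidence when reviewing how an agent behaved.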

Key takeaways

  • AI infrastructure becomes governable only when model access, MCP tools, and edge inspection are controlled as one path.
  • The main risk is privilege sprawl across AI integrations, where each new model or tool can widen access faster than governance can track it.
  • Practitioners should anchor AI security in gateway policy, tool-level authorisation, and auditable traffic flows before scaling agentic use cases.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A2 | Covers tool misuse and agent boundary risks in MCP-based AI systems.
NIST CSF 2.0 | PR.AA-05 (PR.AC-4 in CSF 1.1) | Access control and authorization are central to AI gateway governance.
OWASP Non-Human Identity Top 10 | NHI-03 | Secret and credential exposure are relevant where AI traffic uses API keys and service auth.

Map MCP and gateway controls to agent tool-use risks and restrict callable capabilities before execution.


Key terms

  • AI Gateway: An AI gateway is a control layer that sits between applications and AI services to handle routing, authentication, logging, and policy enforcement. In production environments it becomes a governance point for model access, tool calls, and observability across multiple AI providers and protocols.
  • MCP Server: An MCP server exposes tools, resources, or prompts to AI applications through the Model Context Protocol. In agentic environments it effectively becomes a privilege surface, because the agent can only act through the capabilities the server makes available.
  • Prompt Injection: Prompt injection is an attack technique that manipulates the instructions or context given to an AI system so it behaves in an unintended way. The goal is often to force data leakage, bypass safety controls, or cause the model to take unsafe actions.
  • AI Control Plane: An AI control plane is the management layer that governs how AI workloads are authenticated, routed, observed, and constrained. It is the place where policy becomes enforceable across models, tools, and connected services, rather than being scattered across each integration.


This post draws on content published by Kong: "Building a Secure, Scalable AI Infrastructure with Kong and Akamai: A Technical Introduction".

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-05-04.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org