TL;DR: According to Pomerium, LiteLLM simplifies multi-model routing, while Pomerium adds identity-aware access control, audit logging, and context-based policy enforcement around LLM gateways and MCP-connected services. The real issue is not model abstraction but whether AI access is governed with the same identity controls as other production services.
NHIMG editorial — based on content published by Pomerium: LiteLLM vs. Pomerium: What's the Difference and Which One Do You Need?
By the numbers:
- When AWS credentials are exposed publicly, attackers attempt access within an average of 17 minutes and as quickly as 9 minutes in some cases.
Questions worth separating out
Q: How should security teams govern AI gateway access in enterprise environments?
A: Security teams should govern AI gateway access with contextual authorisation, short-lived credentials, and full session logging.
Q: Why do LLM gateways create more governance risk than a normal API proxy?
A: LLM gateways can route traffic to multiple models, data sources, and tool chains through a single session path, so one weakly governed session can reach far more than a single backend API.
Q: What do security teams get wrong about securing MCP-connected AI workflows?
A: Teams often focus on the model while ignoring the delegated access path to internal services.
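The short-lived credentials and session logging mentioned in the first answer can be illustrated with a small sketch. It assumes an HMAC-signed token scoped to one tool with a five-minute lifetime. The function names, token format, `SECRET`, and `TTL_SECONDS` are hypothetical and are not taken from Pomerium or LiteLLM.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-only-secret"  # assumption: a managed signing key in practice
TTL_SECONDS = 300             # assumption: five-minute credential lifetime


def issue_token(subject: str, tool: str, now: Optional[float] = None) -> str:
    """Issue a signed token that names one tool and expires after TTL_SECONDS."""
    now = time.time() if now is None else now
    payload = json.dumps(
        {"sub": subject, "tool": tool, "exp": now + TTL_SECONDS}, sort_keys=True
    ).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(sig).decode()
    )


def verify_token(
    token: str, tool: str, audit_log: list, now: Optional[float] = None
) -> bool:
    """Check signature, tool scope, and expiry; log every decision."""
    now = time.time() if now is None else now
    claims = {}
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
        expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
        valid = hmac.compare_digest(sig, expected)
        if valid:
            claims = json.loads(payload)
            valid = claims.get("tool") == tool and now < claims.get("exp", 0)
    except (ValueError, TypeError):
        valid = False  # malformed token: deny
    # Allowed and denied attempts are both recorded for later investigation.
    audit_log.append(
        {"ts": now, "sub": claims.get("sub"), "tool": tool, "allowed": valid}
    )
    return valid
```

A token issued for one tool is rejected when presented for another, and every attempt lands in the audit log whether it succeeds or not. That covers the delegated access path, not just the model call.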
Practitioner guidance
- Separate model abstraction from authorisation. Place the LLM gateway and the access enforcement layer under different control objectives so that routing, authentication, and policy evaluation can be managed independently.
- Require contextual checks on every AI session. Use identity, device posture, time, and group membership to decide whether a request can reach the gateway or an MCP-connected service.
- Treat MCP-connected tools as privileged integrations. Review every AI workflow that can reach internal systems, then apply short-lived credentials and logged approvals where tool access crosses trust boundaries.
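The contextual checks in the guidance above can be sketched as a deny-by-default decision function. This is an illustration only: `RequestContext`, `Policy`, and `evaluate` are hypothetical names, the 07:00 to 19:00 UTC window is an assumed example, and none of it reflects how Pomerium actually expresses policy.

```python
from dataclasses import dataclass
from datetime import datetime, time, timezone


@dataclass(frozen=True)
class RequestContext:
    """Hypothetical request context; field names are illustrative only."""
    user: str
    groups: frozenset
    device_compliant: bool
    timestamp: datetime  # expected to be timezone-aware


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy for one protected route (gateway or MCP service)."""
    allowed_groups: frozenset
    require_compliant_device: bool = True
    window_start: time = time(7, 0)   # assumed permitted hours, UTC
    window_end: time = time(19, 0)


def evaluate(policy: Policy, ctx: RequestContext) -> tuple:
    """Return (allowed, reason). Every check must pass, otherwise deny."""
    if not ctx.user:
        return False, "unauthenticated"
    if not (ctx.groups & policy.allowed_groups):
        return False, "group not permitted"
    if policy.require_compliant_device and not ctx.device_compliant:
        return False, "device posture check failed"
    now = ctx.timestamp.astimezone(timezone.utc).time()
    if not (policy.window_start <= now <= policy.window_end):
        return False, "outside permitted hours"
    return True, "ok"


if __name__ == "__main__":
    policy = Policy(allowed_groups=frozenset({"ml-engineers"}))
    ctx = RequestContext(
        user="alice@example.com",
        groups=frozenset({"ml-engineers"}),
        device_compliant=True,
        timestamp=datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc),
    )
    print(evaluate(policy, ctx))
```

Keeping this decision outside the gateway means routing configuration can change without touching who is allowed through, which is the separation the first bullet argues for.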
What's in the full article
Pomerium's full blog post covers the operational detail this post intentionally leaves for the source:
- How Pomerium applies authentication and policy enforcement across HTTP-based AI services and LLM gateways
- Examples of dynamic access rules using identity, device, time, and group context for AI requests
- Deployment patterns for securing LiteLLM behind an access proxy without exposing public endpoints
- Audit and logging considerations for teams that need evidence for compliance reviews and investigations
👉 Read Pomerium's comparison of LiteLLM and identity-aware access control →
AI gateway access control gaps: what LiteLLM and Pomerium change
Explore further
LLM routing and identity enforcement are different control problems. A gateway that normalises model APIs reduces developer friction, but it does not answer the governance question of who is allowed to use the gateway and under what conditions. Pomerium's framing exposes a common architecture mistake: organisations treat model aggregation as if it were access governance. The result is policy drift between the application layer and the identity layer. Practitioners should stop assuming that an LLM endpoint is secure because it is convenient to integrate.
A few things that frame the scale:
- 98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to the AI Agents: The New Attack Surface report.
- Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation, according to the AI Agents: The New Attack Surface report.
A question worth separating out:
Q: What is the difference between an LLM gateway and identity-aware access control?
A: An LLM gateway normalises how applications reach models, while identity-aware access control decides whether the request should be allowed at all. They solve different problems, and treating them as interchangeable leaves policy enforcement too close to the application layer.
👉 Read our full editorial: LiteLLM vs Pomerium: access control gaps in AI gateway stacks