AI gateway benchmarks and the governance gap for agentic workloads


(@nhi-mgmt-group)
Member Moderator
Joined: 1 year ago
Posts: 7569
Topic starter  

TL;DR: APIs are now the connective tissue for GenAI and agentic workflows. Kong reports that its AI Gateway outperformed Portkey and LiteLLM on throughput and latency in a controlled AWS EKS test. Performance matters, but identity governance and policy enforcement remain the deciding factors as AI usage moves into production.

NHIMG editorial — based on content published by Kong: AI Gateway Benchmark: Kong AI Gateway, Portkey, and LiteLLM

Questions worth separating out

Q: How should security teams govern AI gateway traffic in production?

A: Security teams should govern AI gateway traffic by treating the gateway as a runtime control point for identity, quota, observability, and routing.

Q: When does an AI gateway become a governance control rather than just a proxy?

A: An AI gateway becomes a governance control when it consistently enforces authentication, usage limits, and visibility across model, agent, and MCP traffic.

Q: What do teams get wrong about performance testing AI gateways?

A: Teams often test only latency and throughput and ignore whether policy remains enforceable at scale.
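One way to keep policy in the picture is to record, for every request in a load test, both its latency and whether the gateway's decision matched what policy required. The sketch below is illustrative only: `Sample`, `summarize`, and the field names are hypothetical and do not come from Kong's benchmark or from any gateway's API.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class Sample:
    latency_ms: float
    should_be_allowed: bool  # what the policy says should happen (assumed known to the test)
    was_allowed: bool        # what the gateway actually did under load


def summarize(samples):
    """Report latency percentiles alongside policy enforcement accuracy."""
    latencies = sorted(s.latency_ms for s in samples)
    # quantiles(n=100) returns 99 cut points; index 49 is p50, index 94 is p95.
    cuts = quantiles(latencies, n=100, method="inclusive")
    violations = sum(1 for s in samples if s.should_be_allowed != s.was_allowed)
    return {
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "policy_violations": violations,
        "enforcement_rate": 1 - violations / len(samples),
    }
```

A run that shows low latency but a non-zero `policy_violations` count is a governance failure rather than a performance win.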

Practitioner guidance

  • Test gateway control-plane performance under production-like load: Validate that the AI gateway can sustain enterprise traffic while keeping authentication, token enforcement, and observability inline.
  • Map AI gateway policy to identity controls: Define how consumer identity, quota policy, and routing rules interact before AI requests reach models or MCP services.
  • Measure whether centralised AI access is still enforceable: Track the points where teams bypass the gateway because it is too slow, too hard to use, or too limited in policy depth.
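
The interaction between consumer identity, quota, and routing can be sketched as a single inline decision that is always logged. This is a minimal illustration under stated assumptions, not how Kong, Portkey, or LiteLLM implement policy: `ConsumerPolicy`, `Gateway`, the route names, and the decision strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConsumerPolicy:
    token_quota: int     # maximum tokens per window (window handling omitted here)
    allowed_routes: frozenset  # hypothetical route names, e.g. "chat" or "mcp-tools"


@dataclass
class Gateway:
    policies: dict                                   # consumer_id -> ConsumerPolicy
    used_tokens: dict = field(default_factory=dict)  # consumer_id -> tokens consumed
    audit_log: list = field(default_factory=list)    # (consumer_id, route, decision)

    def authorize(self, consumer_id, route, requested_tokens):
        """Decide on a request and record the decision, whether allowed or denied."""
        decision = self._decide(consumer_id, route, requested_tokens)
        self.audit_log.append((consumer_id, route, decision))
        return decision

    def _decide(self, consumer_id, route, requested_tokens):
        policy = self.policies.get(consumer_id)
        if policy is None:
            return "deny:unknown_consumer"      # identity check comes first
        if route not in policy.allowed_routes:
            return "deny:route_not_allowed"     # routing rule tied to identity
        used = self.used_tokens.get(consumer_id, 0)
        if used + requested_tokens > policy.token_quota:
            return "deny:quota_exceeded"        # quota tied to identity
        self.used_tokens[consumer_id] = used + requested_tokens
        return "allow"
```

For example, a consumer with a 100-token quota on the `chat` route is allowed one 60-token request and denied a second, and both decisions land in `audit_log`. Traffic that never reaches `authorize` leaves no log entry at all, which is why bypass paths are worth measuring.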

What's in the full article

Kong's full blog covers the operational detail this post intentionally leaves for the source:

  • Exact benchmark architecture, including the AWS EKS layout and load-generation setup used for the tests
  • Per-gateway deployment details for Kong, Portkey, and LiteLLM, including the resource settings used in each run
  • Latency, throughput, and CPU charts that show how each gateway behaved under the same 12-CPU ceiling
  • The article's own interpretation of why the Kong runtime stayed stable when the other gateways did not

👉 Read Kong's AI gateway benchmark comparing Kong, Portkey, and LiteLLM →
