LiteLLM gateway authorization exposes the gap in AI agent controls

By NHI Mgmt Group Editorial Team | Published 2026-06-09 | Domain: Agentic AI & NHIs | Source: Cerbos

TL;DR: AI gateways increasingly centralize model access, tool exposure, and spend control, but Cerbos argues that static scopes cannot answer whether a specific principal should use a specific model or MCP tool with specific inputs. Policy-driven, request-time authorization becomes the missing control when agent traffic concentrates in one proxy.

At a glance

What this is: Cerbos argues that AI gateways like LiteLLM centralize access but still leave the hardest authorization decisions unresolved.

Why it matters: IAM teams now need to govern model calls, tool visibility, and argument-level permissions across NHI and agentic workflows, not just authenticate callers.

Context

AI gateway authorization is becoming a governance problem, not just a routing problem. When one proxy sits between users, agents, models, and MCP tools, static API key scopes no longer tell you whether a principal should be allowed to perform a specific action with specific inputs.

For IAM and NHI teams, the question shifts from who can reach the gateway to what can be decided at request time. That makes policy evaluation, auditability, and attribute-based context central to agent and workload access control.

Key questions

Q: How should security teams authorize AI agents at a gateway?

A: Use request-time policy rather than static model scopes alone. A gateway can authenticate and route traffic, but it should not be the only place where authorization happens. Security teams should evaluate caller identity, tool exposure, and request attributes together so the decision reflects who is asking, what they want to do, and whether the context permits it.

Q: Why do AI gateways create new authorization risks for NHI governance?

A: Because they concentrate many decisions into one control point while still relying on coarse scopes in many deployments. That leaves a gap between authentication and true authorization. NHI governance has to move beyond “can the caller reach the proxy” and ask whether the caller should be allowed to invoke a particular model, tool, or argument pattern.

Q: What breaks when tool authorization is based only on roles?

A: Role checks stop at broad entitlement and miss the context that makes a tool call safe or unsafe. A user may be allowed to issue refunds in general but not for a ticket they do not own or above a threshold. Without request context, role-based access control becomes too blunt for agent and MCP workflows.

Q: Who should own policy when AI agents are using shared proxies?

A: Identity and security teams should own the policy layer, not the gateway configuration alone. The proxy can enforce decisions, but governance belongs in versioned policy that can be tested, reviewed, and audited. That keeps model access, tool exposure, and argument-level rules consistent across changing agents and protocols.

Technical breakdown

Why static gateway scopes fail for agent authorization

AI gateways commonly handle authentication, routing, and spend control, but those are not the same as authorization. Static model lists or API key scopes can say which endpoint a caller may reach, yet they cannot evaluate whether the caller may invoke a specific tool or pass a particular argument. Once agents and MCP servers are in the path, the security decision has to move from coarse gateway policy to request-time evaluation against principal attributes, resource attributes, and invocation context. That is why externalised authorization matters: the decision is made when the action is attempted, not when the proxy is configured.

Practical implication: Treat the gateway as an enforcement point, but move the decision logic into policy so the control can inspect caller identity and request context.
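
The contrast between a static scope and a request-time decision can be sketched as follows. This is a minimal illustration under stated assumptions, not the API of Cerbos or LiteLLM: the Principal and Resource types, the decide function, the attribute names (team, allowed_teams, data_classification), and the key and model identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping


@dataclass(frozen=True)
class Principal:
    id: str
    roles: frozenset
    attrs: Mapping[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class Resource:
    kind: str  # "model" or "tool"
    id: str
    attrs: Mapping[str, Any] = field(default_factory=dict)


# Static scope: fixed when the proxy is configured (hypothetical key and model names).
STATIC_SCOPES = {"key-123": {"model-a"}}


def static_scope_allows(api_key: str, model_id: str) -> bool:
    """Answers only 'may this key reach this model?'."""
    return model_id in STATIC_SCOPES.get(api_key, set())


def decide(principal: Principal, action: str, resource: Resource,
           context: Mapping[str, Any]) -> bool:
    """Request-time decision over principal, resource, and invocation context."""
    if resource.kind == "model" and action == "invoke":
        team_ok = principal.attrs.get("team") in resource.attrs.get("allowed_teams", ())
        data_ok = (context.get("data_classification") != "restricted"
                   or "data-steward" in principal.roles)
        return team_ok and data_ok
    return False  # default deny for anything the policy does not cover


agent = Principal("agent-7", frozenset({"support"}), {"team": "support"})
model = Resource("model", "model-a", {"allowed_teams": ("support",)})

print(static_scope_allows("key-123", "model-a"))
print(decide(agent, "invoke", model, {"data_classification": "internal"}))
print(decide(agent, "invoke", model, {"data_classification": "restricted"}))
```

In this sketch the static scope permits both requests because it only knows the key and the model, while the request-time decision denies the one that carries restricted data.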

Argument-aware MCP authorization at the proxy layer

The article’s key technical point is that authorization has to cover both tool discovery and tool invocation. If a model can see a refund tool, it can try to use it, so the first control is to strip the tool from the list before the model ever sees it when the caller is not entitled to it. For tools that remain visible, Cerbos then evaluates the actual invocation arguments, such as ticket ownership and refund limits, before allowing the MCP call to proceed. This is a classic policy pattern: bind the action to the principal and the resource state, then evaluate the request at runtime. It reduces overbroad delegation without changing the downstream tool server.

Practical implication: Bind tool use to resource attributes such as ticket ownership and amount thresholds instead of relying on coarse role membership.
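
The refund example can be sketched as an argument-aware check. This is an illustrative sketch, not the policy from the Cerbos article: the authorize_refund function, the support_agent role name, and the field names (assignee_id, refund_limit, amount) are assumptions made for the example.

```python
from typing import Any, Mapping, Tuple


def authorize_refund(principal: Mapping[str, Any], ticket: Mapping[str, Any],
                     args: Mapping[str, Any]) -> Tuple[bool, str]:
    """Return (allowed, reason) for a refund tool call; checks run in order."""
    # Coarse entitlement: is the caller allowed to issue refunds at all?
    if "support_agent" not in principal["roles"]:
        return False, "missing_role"
    # Resource state: does the caller own the ticket being refunded?
    if ticket["assignee_id"] != principal["id"]:
        return False, "ticket_not_owned"
    # Invocation argument: is the requested amount within the caller's cap?
    if args["amount"] > principal["refund_limit"]:
        return False, "over_refund_limit"
    return True, "allowed"


alice = {"id": "alice", "roles": {"support_agent"}, "refund_limit": 100}
own_ticket = {"id": "T-1", "assignee_id": "alice"}
other_ticket = {"id": "T-2", "assignee_id": "bob"}

print(authorize_refund(alice, own_ticket, {"amount": 40}))
print(authorize_refund(alice, other_ticket, {"amount": 40}))
print(authorize_refund(alice, own_ticket, {"amount": 250}))
```

The same caller and the same tool produce three different outcomes, which is the distinction a role check alone cannot express.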

Audit trails for gateway and policy decisions

Gateway logs and policy decision logs serve different purposes. The gateway shows that a request was denied or that a tool was stripped, while the authorization engine records why the decision was made, including principal attributes, matched roles, policy evaluation, and the invocation arguments. That separation is valuable because investigators need both the transport-layer event and the governance-layer rationale. In AI agent environments, this is especially important because model, tool, and caller decisions often occur in quick succession and need to be reconstructed independently.

Practical implication: Keep decision logs separate from proxy logs so investigators can reconstruct who asked, what was attempted, and why the policy denied it.
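
One way to keep the two records separate but joinable is to share a request identifier between them. The sketch below is an assumption about how such records could look, not the log format of any particular gateway or policy engine: the function names, field names, and the rule name refund.owner_only are hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import Any, Mapping


def gateway_event(request_id: str, route: str, status: int) -> str:
    """Transport-layer record: what the proxy did with the request."""
    return json.dumps({"request_id": request_id, "route": route, "status": status})


def decision_event(request_id: str, principal: Mapping[str, Any], action: str,
                   resource: str, arguments: Mapping[str, Any],
                   effect: str, matched_rule: str) -> str:
    """Governance-layer record: who asked, what was attempted, and why it was decided."""
    return json.dumps({
        "request_id": request_id,  # join key back to the gateway record
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "principal": dict(principal),
        "action": action,
        "resource": resource,
        "arguments": dict(arguments),
        "effect": effect,
        "matched_rule": matched_rule,
    })


gw_line = gateway_event("req-42", "/mcp/tools/call", 403)
pd_line = decision_event(
    "req-42",
    {"id": "alice", "roles": ["support_agent"]},
    "issue_refund",
    "ticket:T-2",
    {"amount": 40},
    "DENY",
    "refund.owner_only",
)
print(gw_line)
print(pd_line)
```

An investigator can start from the 403 in the gateway stream and use request_id to find the decision record that explains it.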

NHI Mgmt Group analysis

AI gateway authorization is now an identity governance problem, not a proxy configuration problem. LiteLLM concentrates model access, tool exposure, and spend control into one hop, but concentration does not equal governance. The real control question is whether request-time policy can decide if a specific principal may perform a specific action with specific inputs. Practitioners should treat gateway policy as an identity decision layer, not a routing convenience.

Argument-aware tool control is the right mental model for MCP access. The article shows why coarse allow lists are insufficient once tools carry business meaning. A caller may be allowed to use refunds in general and still be blocked from refunding a ticket they do not own or exceeding a personal cap. That is a strong example of contextual NHI governance, where identity, resource state, and invocation intent must align before execution.

Tool visibility is part of the attack surface. If an agent never sees a tool, it cannot decide to use it, which is a cleaner control than relying on post-decision denial. That shifts policy design toward minimising unnecessary tool exposure at the gateway boundary. Practitioners should review where their current agent stack leaks capability discovery before they focus only on call-time authorization.
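
Filtering the tool list before it reaches the model can be sketched in a few lines. This is a simplified illustration: the TOOL_VISIBILITY mapping, the role names, and the tool names are hypothetical, and real MCP tool entries carry more fields than the name used here.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping from tool name to the roles allowed to see it.
TOOL_VISIBILITY: Dict[str, Set[str]] = {
    "lookup_order": {"viewer", "support_agent"},
    "issue_refund": {"support_agent"},
}


def visible_tools(roles: Iterable[str], tools: List[dict]) -> List[dict]:
    """Keep only tools the caller may see; tools with no entry are hidden by default."""
    caller_roles = set(roles)
    return [t for t in tools if caller_roles & TOOL_VISIBILITY.get(t["name"], set())]


catalog = [{"name": "lookup_order"}, {"name": "issue_refund"}, {"name": "delete_account"}]

print([t["name"] for t in visible_tools({"viewer"}, catalog)])
print([t["name"] for t in visible_tools({"support_agent"}, catalog)])
```

Hiding unlisted tools by default means a newly added tool stays invisible until someone writes a rule for it.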

Gateway enforcement only works when policy is externally owned and versioned. The article’s policy-deployment model matters because it separates operational proxy changes from authorization changes. That reduces drift, improves reviewability, and gives IAM teams a stable place to test and audit rules across models, tools, and agents. The practical implication is that agent access governance belongs in policy workflows, not embedded middleware.
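
Treating policy as versioned, testable data might look like the sketch below. It is an assumption for illustration only and does not reflect the Cerbos policy format: the POLICY structure, the version label, the role caps, and the helper names are hypothetical.

```python
from typing import Iterable, List, Mapping, Tuple

POLICY = {
    "version": "2026-06-01",  # hypothetical label, e.g. a tag in source control
    "refund_cap_by_role": {"support_agent": 100, "support_lead": 500},
}

# Expected outcomes reviewed alongside the policy: (roles, amount, expected).
POLICY_TESTS: List[Tuple[set, int, bool]] = [
    ({"support_agent"}, 50, True),
    ({"support_agent"}, 150, False),
    ({"support_lead"}, 150, True),
    ({"viewer"}, 1, False),
]


def refund_allowed(policy: Mapping, roles: Iterable[str], amount: int) -> bool:
    """Allow a refund if any of the caller's roles has a cap covering the amount."""
    caps = [policy["refund_cap_by_role"][r] for r in roles
            if r in policy["refund_cap_by_role"]]
    return bool(caps) and amount <= max(caps)


def run_policy_tests(policy: Mapping) -> list:
    """Return the test cases whose outcome differs from the expectation."""
    return [(roles, amount, expected) for roles, amount, expected in POLICY_TESTS
            if refund_allowed(policy, roles, amount) != expected]


print(run_policy_tests(POLICY))
```

Running the same table against a proposed policy change shows which expectations it would break before the change reaches the gateway.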

Argument-aware authorization is the concept this architecture makes unavoidable. Conventional authorization joins a principal to an invocation on the assumption that service requests are static and known in advance. That assumption fails when agents select tools dynamically and present context-rich arguments at runtime. The implication is that NHI governance must move from permissioning identities to evaluating intent-bearing requests.

From our research:
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%), according to the AI Agents: The New Attack Surface report.
Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
Forward signal: Use OWASP Agentic Applications Top 10 to map tool exposure, policy drift, and agent misuse before gateway convenience becomes governance debt.

What this signals

Argument-aware authorization will become the default expectation for agent gateways. As agent traffic concentrates in shared proxies, the old distinction between authentication and authorization becomes operationally insufficient. Teams should expect more pressure to evaluate requests at the argument level, especially for tools that move money, change records, or expose downstream systems.

With 80% of organisations already seeing AI agents act beyond intended scope, per the AI Agents: The New Attack Surface report, the control gap is no longer hypothetical. The practical signal is whether your programme can explain not just who connected, but what the agent was allowed to do at the moment it acted.

Tool discovery is becoming a governance boundary. If an agent can see a tool, the organisation has already expanded the attack surface. That pushes NHI and IAM teams toward policies that minimise exposure by default, then grant tool visibility only when the calling context justifies it.

For practitioners

Externalise gateway authorization decisions: Keep model access, tool exposure, and tool-call approval in policy rather than static proxy configuration so the decision can inspect caller identity, resource state, and request context.
Bind MCP tool use to business attributes: Require attributes such as assigned ticket, owner, amount cap, or case state before allowing high-impact tool calls, especially where agents can invoke tools directly through a shared proxy.
Strip unused tools before the model sees them: Deny tool visibility at the gateway when a tool is not relevant to the caller or task, because exposure itself expands the attack surface even if later calls are blocked.
Separate transport logs from policy decision logs: Retain both the gateway denial record and the policy engine’s rationale, in separate logs that can be correlated, so reviewers can trace who asked, what was attempted, which rule applied, and why the request failed.

Key takeaways

AI gateways centralise access, but they do not by themselves answer the core authorization question for agents and MCP tools.
Request-time policy, not static scopes, is the control model that can bind a caller to a specific action and input context.
Teams that cannot explain tool visibility, denied calls, and policy rationale will struggle to govern agentic access at scale.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Agent tool exposure and request-time misuse map to agentic AI authorization risk.
OWASP Non-Human Identity Top 10	NHI-03	Gateway-scoped secrets and policy drift are direct NHI governance concerns.
NIST CSF 2.0	PR.AA-05	Least-privilege access decisions for models and tools align with access governance.

Apply least-privilege rules to model calls and MCP tool access at the authorization layer.

Key terms

AI gateway: An AI gateway is a control point that sits between callers and model or tool services, handling routing, authentication, spend control, and sometimes policy enforcement. In practice, it becomes part of identity governance when it decides which principals can reach which models, tools, or invocation paths.
Externalized authorization: Externalized authorization means the decision to allow or deny an action lives outside the application or proxy code, usually in a policy engine that evaluates identity and context at request time. For AI agents, this matters because model calls and tool invocations change too quickly for static scopes to be enough.
MCP tool invocation: MCP tool invocation is the act of an agent or application calling a tool exposed through the Model Context Protocol. The security issue is not just whether the caller can reach the tool, but whether the caller should be allowed to use that tool with the specific arguments it submits.
Argument-aware authorization: Argument-aware authorization evaluates the actual parameters in a request, not just the identity of the caller or the name of the tool. That makes it useful for AI and NHI governance because business context such as ticket ownership, amount limits, or case state can determine whether execution is safe.

This post draws on content published by Cerbos: LiteLLM gateway authorization for AI agents and MCP tools.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-06-09.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org