TL;DR: AI gateways can authenticate users and route traffic, but they often cannot decide what those identities may do across models, tools, MCP methods, and delegated agent calls, according to Cerbos. Fine-grained authorization shifts that decision to policy, where contextual access can be enforced at every hop instead of drifting into agent logic.
At a glance
What this is: This guide argues that AI gateways solve routing and identity verification, but not the fine-grained authorization decisions AI requests require.
Why it matters: IAM, PAM, and NHI teams need policy enforcement that follows users, services, and agents through the full request chain, not just the gateway edge.
Context
AI gateways have become the control point for enterprise AI traffic, but they are only a partial control plane. They can authenticate callers, apply rate limits, cache responses, and route requests, yet still leave authorisation decisions buried in application code or agent logic. For identity teams, that creates a familiar governance gap: the system knows who is calling, but not what that identity is allowed to do in context.
The problem cuts across NHI, agentic AI, and human IAM because the same request can involve a person, an agent, a tool, and a downstream service account. Once delegation starts to flatten the principal chain, traditional access reviews and coarse allowlists stop describing reality. That is why AI gateway policy needs to be evaluated as an identity control, not just an application feature.
Key questions
Q: How should security teams govern AI gateway authorization across models, tools, and agents?
A: Use the gateway as the enforcement point, but evaluate every request against contextual policy before it reaches the model or tool. That policy should combine principal identity, resource attributes, environment, and delegation chain so access is decided in context rather than by static allowlists.
Q: Why do AI gateways create governance gaps for IAM and PAM teams?
A: They verify identity at the edge, but they often do not decide whether that identity may use a model, tool, or downstream service in a specific business context. That leaves privilege decisions scattered across code, which weakens auditability and makes access reviews less meaningful.
Q: What breaks when agent-to-agent delegation is not attenuated?
A: The delegated agent can inherit more authority than the original task justified, especially if tokens are passed downstream unchanged. Without explicit scope reduction and delegation limits, the chain of grants can expand beyond the human’s intended permissions.
Q: Who is accountable when an AI agent acts on a user’s behalf through a gateway?
A: Accountability stays with the originating identity and the policy that authorised the delegation, not with an invisible agent proxy. The gateway should record both the acting agent and the human or service behind it so access decisions are traceable.
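To make that traceability concrete, here is a minimal sketch of a decision record a gateway might emit. The function name `audit_record` and every field name are hypothetical, chosen for illustration; they are not a Cerbos or gateway schema.

```python
import json
from datetime import datetime, timezone


def audit_record(originator: str, chain: list[str], action: str,
                 resource: str, allowed: bool) -> str:
    """Serialise one decision. `chain` lists agents from the first delegate
    to the one acting now; it is empty when the originator calls directly."""
    return json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "originator": originator,                       # human or service behind the call
        "acting_agent": chain[-1] if chain else originator,
        "delegation_chain": chain,                      # every hop stays visible
        "action": action,
        "resource": resource,
        "decision": "allow" if allowed else "deny",
    }, sort_keys=True)


# Hypothetical identities and resource names, for illustration only.
print(audit_record("alice", ["planner", "retriever"], "invoke", "tool:search", True))
```

Recording the originator and the full chain in the same entry means a reviewer does not have to reconstruct who stood behind an agent from separate logs.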
Technical breakdown
Why authentication at the gateway is not authorisation
AI gateways typically validate tokens, claims, and identity context before routing a request upstream. That proves the caller is known, but it does not answer whether the caller may use a given model, tool, or dataset in the current business context. Coarse controls such as role allowlists and regex routes can gate basic access, yet they cannot combine principal attributes, resource attributes, and environmental context into one decision. That is the difference between recognising an identity and governing its effective privilege.
Practical implication: treat the gateway as an enforcement point, but place policy logic in a system that can evaluate context, not just identity.
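As an illustration of one decision that combines principal, resource, and environment attributes, consider the sketch below. The attribute names, the `analyst` role, and both rules are assumptions invented for the example; a real deployment would express them in its policy engine rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    roles: frozenset          # principal attribute
    department: str           # principal attribute
    sensitivity: str          # resource attribute: "public" or "restricted"
    owner_department: str     # resource attribute
    network: str              # environment attribute: "corp" or "external"


def is_allowed(req: AccessRequest) -> bool:
    """Default deny: return True only when an explicit rule matches."""
    if "analyst" not in req.roles:
        return False
    if req.sensitivity == "public":
        return True
    if req.sensitivity == "restricted":
        # Restricted resources need a matching department AND a trusted network.
        return req.department == req.owner_department and req.network == "corp"
    return False


# Same principal and resource; only the environment differs.
on_corp = AccessRequest(frozenset({"analyst"}), "finance", "restricted", "finance", "corp")
off_corp = AccessRequest(frozenset({"analyst"}), "finance", "restricted", "finance", "external")
print(is_allowed(on_corp), is_allowed(off_corp))
```

A role allowlist alone would treat both requests identically, because the role is the same in each; only the environment attribute separates them.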
How agent-to-agent delegation changes the authorisation problem
Agent-to-agent delegation turns a single request into a chain of grants. If the parent agent passes its token downstream unchanged, the sub-agent inherits authority that may exceed the original task. The real issue is not just exposure of credentials, but attenuation of privilege across hops. Without explicit checks on delegated scope, sub-delegation rights, and bounded chain depth, the delegation chain becomes broader than the intent that authorised it.
Practical implication: verify every delegated hop against the originating user’s scope and deny onward delegation unless policy explicitly permits it.
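The three checks named above (delegated scope, sub-delegation rights, and bounded chain depth) can be sketched as follows. The `Grant` type, the `delegate` function, and the depth limit of 3 are hypothetical choices for illustration, not part of any specific gateway or token format.

```python
from dataclasses import dataclass

MAX_CHAIN_DEPTH = 3  # assumed limit; a real policy would set this per use case


@dataclass(frozen=True)
class Grant:
    holder: str
    scopes: frozenset
    may_delegate: bool
    depth: int  # 0 for the originating user


def delegate(parent: Grant, child: str, requested: set,
             allow_onward: bool = False) -> Grant:
    """Issue a child grant, or raise PermissionError if it would widen authority."""
    if not parent.may_delegate:
        raise PermissionError(f"{parent.holder} may not delegate")
    if parent.depth + 1 > MAX_CHAIN_DEPTH:
        raise PermissionError("delegation chain too deep")
    extra = set(requested) - parent.scopes
    if extra:
        raise PermissionError(f"scopes exceed parent grant: {sorted(extra)}")
    # Onward delegation is off unless explicitly requested for this hop.
    return Grant(child, frozenset(requested), allow_onward, parent.depth + 1)


user = Grant("alice", frozenset({"read:docs", "write:docs"}), True, 0)
agent = delegate(user, "agent-a", {"read:docs"})
print(agent.scopes, agent.may_delegate)
```

Because each grant must be a subset of its parent's, every grant in the chain is also a subset of the originating user's scope, so checking one hop at a time is enough.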
Dynamic MCP tool discovery and per-method access control
MCP proxying introduces another layer of policy pressure because the tool catalog itself can become dynamic. If low-privilege users see the same methods as administrators, the discovery surface already leaks capability even before execution. Per-method ACLs work better when the gateway can filter the catalog according to principal, environment, and task context. That prevents destructive or sensitive methods from appearing where they should never be reachable.
Practical implication: expose only the MCP tools a principal can actually invoke, and keep destructive methods behind explicit break-glass policy.
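A minimal sketch of catalog filtering follows. It does not use an MCP SDK; the `Tool` type, the tool names, the roles, and the `break_glass` flag are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    required_role: str
    destructive: bool = False


# Hypothetical catalog; a real proxy would fetch this from the upstream server.
CATALOG = [
    Tool("search_tickets", "viewer"),
    Tool("update_ticket", "editor"),
    Tool("delete_project", "admin", destructive=True),
]


def visible_tools(roles: set, break_glass: bool = False) -> list:
    """Return only the tool names this caller could actually invoke."""
    names = []
    for tool in CATALOG:
        if tool.required_role not in roles:
            continue  # caller lacks the role, so the tool is not listed at all
        if tool.destructive and not break_glass:
            continue  # destructive tools stay hidden outside break-glass
        names.append(tool.name)
    return names


print(visible_tools({"viewer"}))
print(visible_tools({"viewer", "editor", "admin"}))
print(visible_tools({"viewer", "editor", "admin"}, break_glass=True))
```

The same predicate should also guard invocation, so that a caller who guesses a hidden tool name is still denied at call time.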
Breaches seen in the wild
- Moltbook AI agent keys breach: 1.5M AI agent keys were exposed.
- AI LLM hijack breach: attackers used stolen AWS access keys to hijack Anthropic models hosted on Amazon Bedrock.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
The AI gateway has become an identity choke point, but not yet an authorisation authority. Authentication at the edge is useful, but it does not resolve the decision about whether a caller may use a model, tool, or data source in context. That gap pushes policy into agent code, downstream services, and ad hoc rules that identity teams cannot govern consistently. The practical conclusion is that gateway design now sits inside the identity stack, not beside it.
Delegation without attenuation is the failure mode that matters most in agentic workflows. The article shows how agents can act on behalf of a user, then hand authority to sub-agents without shrinking the scope. That is a governance problem because the chain of grants becomes broader than the original authorisation intent. Practitioners should read this as a warning that delegated access must remain bounded at every hop.
Per-method MCP control is becoming the new baseline for AI request governance. Once gateways proxy tool discovery as well as model calls, the question changes from "who can connect?" to "what can they discover and invoke?" Tool visibility, environment context, and task scope now define whether a request is safe to execute. The implication for IAM and PAM teams is that coarse model allowlists no longer describe real exposure.
Policy drift inside agent logic is the clearest signal that the control plane is broken. When authorisation moves out of the gateway and into code paths that vary by agent or service, enforcement stops being independently testable. That creates a split between identity proof and privilege decision that is hard to audit and harder to certify. Teams should treat that drift as a design defect, not an implementation detail.
From our research:
- 98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to AI Agents: The New Attack Surface report.
- 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation, according to the same report.
- For a deeper control perspective, see OWASP Agentic Applications Top 10 for the risk patterns that gateway policy needs to constrain.
What this signals
Identity policy for AI gateways is shifting from perimeter checks to runtime governance. As requests move through models, tools, MCP methods, and sub-agents, the control question becomes whether the policy engine can evaluate context at the moment of use. Teams that still treat gateway controls as simple routing or authentication filters will miss the real exposure surface.
Per-method discovery control is the new proxy for least privilege in agentic systems. If a tool is visible, it is usually close to usable, so the catalog itself becomes part of the privilege model. That is why AI gateway programmes should track not only what agents can call, but what they can discover, delegate, and re-expand over time.
With only 52% of companies able to track and audit AI agent data access, according to the AI Agents: The New Attack Surface report, the governance gap is already operational. Identity teams should expect more requests to arrive through agents that look authenticated but are not yet properly authorisation-scoped.
For practitioners
- Move authorisation decisions out of agent code: Keep allow and deny logic in a central policy layer that the gateway can call before routing each AI request. This makes model access, tool use, and delegated calls testable and auditable across the stack.
- Enforce attenuation on every delegated hop: Require each sub-agent grant to be a strict subset of the delegator’s authority and the originating user’s permissions. Deny onward delegation unless the original policy explicitly allows it.
- Filter MCP tool discovery by principal and context: Return a reduced tool catalog to low-privilege callers and reserve destructive methods for explicit break-glass roles. This prevents discovery from exposing capabilities that the principal cannot actually invoke.
- Fail closed when the policy decision point is unavailable: Design the gateway so that high-capability agents cannot continue on stale or implicit grants if the policy service is unreachable. Define a safe-degradation path before deployment, not after an outage.
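The fail-closed item can be sketched as a thin wrapper around the policy call. Everything named here (`enforce`, `PolicyUnavailable`, and the `check` callable standing in for the policy decision point client) is hypothetical, and a real gateway would also add timeouts and alerting.

```python
from typing import Callable


class PolicyUnavailable(Exception):
    """Raised by the (hypothetical) policy client when no decision can be obtained."""


def enforce(check: Callable[[str, str], bool], principal: str, action: str) -> bool:
    """Allow only on an explicit positive decision; any outage means deny."""
    try:
        return bool(check(principal, action))
    except (PolicyUnavailable, TimeoutError, ConnectionError):
        # No cached or implicit grant is consulted here: unreachable means denied.
        return False


def unreachable_pdp(principal: str, action: str) -> bool:
    raise ConnectionError("policy service unreachable")


print(enforce(lambda p, a: True, "agent-a", "invoke:tool"))
print(enforce(unreachable_pdp, "agent-a", "invoke:tool"))
```

Catching only outage-type exceptions is a deliberate choice in this sketch: a programming error inside the check still surfaces as an exception instead of being silently turned into a deny.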
Key takeaways
- AI gateways solve routing and identity verification, but they do not by themselves solve contextual authorisation across models, tools, and delegated agents.
- Delegation chains and MCP tool exposure are where privilege drift becomes visible, because authority can expand faster than governance can review it.
- Practitioners should centralise policy, enforce attenuation, and treat gateway authorisation as an identity control with audit and fail-closed requirements.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST Zero Trust (SP 800-207) sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | | Covers tool misuse and delegated agent behaviour in AI gateways. |
| OWASP Non-Human Identity Top 10 | NHI-03 | AI gateway calls rely on non-human identities and delegated credentials. |
| NIST Zero Trust (SP 800-207) | PR.AC-4 (a NIST CSF 1.1 subcategory identifier) | Gateway policy is a zero trust enforcement point for every AI request. |
Map gateway policy to agentic risk controls for tool use, delegation, and request-time enforcement.
Key terms
- AI Gateway Authorization: The policy decision layer that determines what an authenticated AI caller may do after the gateway receives the request. It goes beyond login and routing by evaluating the model, tool, resource, context, and delegation chain before the call is allowed to continue.
- Delegated Scope Attenuation: The practice of reducing authority each time an agent or service passes a task to another identity. In AI systems, the delegated scope must remain a subset of the original grant, or the chain can expand privilege beyond the human or service that authorised it.
- Policy Enforcement Point: The component that blocks or allows access based on a policy decision. In AI gateway designs, the gateway often serves as the enforcement point while a separate policy decision service evaluates the request against identity, resource, and context data.
- MCP Tool Discovery: The process by which an MCP server or proxy exposes available tools to a caller. If discovery is not filtered by principal and context, it can reveal capabilities that the identity is not actually authorised to use, creating an exposure before execution begins.
Deepen your knowledge
AI gateway authorisation and delegated AI access are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are designing policy for models, tools, or agent chains, it is worth exploring.
This post draws on content published by Cerbos: AI gateway authorization and fine-grained policy for models, tools, and agents. Read the original.
Published by the NHIMG editorial team on 2026-05-26.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org