Organisations should place a central control layer between applications and model providers so authentication, routing, logging, and policy are enforced consistently. That prevents each team from inventing its own access pattern and makes AI usage auditable across the enterprise. A gateway also gives security and platform teams one place to manage trust boundaries.
Why This Matters for Security Teams
AI applications that connect directly to model providers are not just calling an API. They are creating a new trust boundary where prompts, tool calls, routing decisions, and secrets all intersect. When each development team wires its own connection pattern, organisations lose consistent authentication, logging, and policy enforcement, which weakens auditability and slows incident response. NHI Management Group’s Top 10 NHI Issues shows how fragmented identity handling quickly becomes a security problem, not just an operations issue. That risk is amplified by the broader control expectations in the NIST Cybersecurity Framework 2.0, which emphasises governance, protection, and traceability across systems. In practice, many security teams discover model abuse and shadow AI paths only after secrets have already been exposed or traffic has already bypassed the intended control layer.
How It Works in Practice
The recommended pattern is a central control layer, often called an AI gateway or model mediation layer, that sits between applications and model providers. Its purpose is to normalise how AI traffic is authenticated, authorised, inspected, logged, and routed. That means a single place to enforce policy instead of every application team inventing its own method for reaching a model endpoint. For organisations dealing with identity sprawl, this is closely aligned with the lifecycle discipline described in the Ultimate Guide to NHIs — Lifecycle Processes for Managing NHIs. It also supports the audit perspective in Ultimate Guide to NHIs — Regulatory and Audit Perspectives, because the gateway can provide a durable record of who requested what, when, and under which policy.
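As a rough illustration of the "who requested what, when, and under which policy" record, the sketch below builds one audit entry per request. The field names, the `make_record` and `to_log_line` helpers, and the choice to store a SHA-256 digest of the prompt rather than the raw text are all assumptions made for this example, not a prescribed schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    """One gateway log entry (hypothetical schema)."""
    caller_identity: str  # who: the application or service identity
    model: str            # what: the model the request was routed to
    prompt_sha256: str    # what: digest of the prompt, not the raw text
    timestamp: str        # when: UTC, ISO 8601
    policy_id: str        # which policy produced the decision
    decision: str         # "allow" or "deny"


def make_record(caller_identity, model, prompt, policy_id, decision, now=None):
    """Build an audit record; `now` can be injected to make tests deterministic."""
    now = now or datetime.now(timezone.utc)
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return AuditRecord(caller_identity, model, digest, now.isoformat(),
                       policy_id, decision)


def to_log_line(record):
    """Serialise the record as one JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    rec = make_record("svc-billing", "model-a", "hello", "policy-7", "allow")
    print(to_log_line(rec))
```

Whether to log full prompts or only digests depends on the data classification rules in force; the digest here is simply one way to keep the log itself from becoming a sensitive data store.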
In practical terms, a well-governed gateway should handle:
- Authentication of the application or service identity before any model call is made.
- Request routing to approved model providers, regions, or tenants.
- Policy checks for prompt content, data classification, and tool access before execution.
- Central logging for prompts, responses, refusals, and anomalous behaviour.
- Secret handling so API keys and tokens do not live in application code.
This is especially important because direct-to-model designs often hide the real blast radius. NHI Management Group’s DeepSeek breach coverage and related research on exposed secrets show how quickly sensitive material can become operationally exploitable once control is fragmented. These controls tend to break down when teams bypass the gateway for latency reasons or when legacy apps hold long-lived credentials in code and configuration.
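The gateway responsibilities listed above can be sketched as a single request path. This is a minimal sketch under stated assumptions: the `Gateway` class, the `ALLOWED_ROUTES` and `BLOCKED_MARKERS` tables, the identity and model names, and the static service tokens are all hypothetical, and the outbound provider call is stubbed out rather than implemented.

```python
import hmac
from dataclasses import dataclass, field

# Hypothetical registry: service identity -> models it is approved to reach.
ALLOWED_ROUTES = {
    "svc-support-bot": {"provider-a/general"},
    "svc-analytics": {"provider-a/general", "provider-b/eu-only"},
}

# Hypothetical content rule: markers that must not be sent to a provider.
BLOCKED_MARKERS = ("CONFIDENTIAL", "BEGIN PRIVATE KEY")


@dataclass
class Gateway:
    service_tokens: dict  # identity -> expected token (stand-in for real workload auth)
    provider_keys: dict   # model -> provider API key, held only by the gateway
    audit_log: list = field(default_factory=list)

    def handle(self, identity, token, model, prompt):
        decision = self._decide(identity, token, model, prompt)
        # Central logging: every request is recorded, including refusals.
        self.audit_log.append(
            {"identity": identity, "model": model, "decision": decision}
        )
        if decision != "allow":
            return {"status": "denied", "reason": decision}
        # A real gateway would attach self.provider_keys[model] to the
        # outbound request here, so the key never lives in application code.
        return {"status": "forwarded", "model": model}

    def _decide(self, identity, token, model, prompt):
        # 1. Authenticate the calling service before anything else.
        expected = self.service_tokens.get(identity)
        if expected is None or not hmac.compare_digest(
            expected.encode("utf-8"), token.encode("utf-8")
        ):
            return "unauthenticated"
        # 2. Route only to models approved for this identity.
        if model not in ALLOWED_ROUTES.get(identity, set()):
            return "route_not_approved"
        # 3. Apply a simple content policy check to the prompt.
        if any(marker in prompt for marker in BLOCKED_MARKERS):
            return "policy_violation"
        # 4. Refuse if the gateway holds no credential for the target model.
        if model not in self.provider_keys:
            return "no_provider_credential"
        return "allow"
```

The ordering is a design choice: authentication runs first so that unauthenticated callers learn nothing about routes or policy, and the audit entry is written for denials as well as approvals.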
Common Variations and Edge Cases
Tighter gateway control often increases routing complexity and can add latency, so organisations must balance central oversight against application performance and developer friction. Best practice is evolving here, especially for low-latency inference, multi-model failover, and highly distributed workloads. Some teams use a shared gateway only for external model calls, while internal model hosting follows a different control path. That can work, but only if policy, logging, and identity proof remain consistent across both paths.
A second edge case is tool-using AI applications. If the application can call search, databases, or workflow systems, the gateway must govern not only model access but also downstream tool permissions. Otherwise, the model becomes a bridge into unrelated systems. Guidance suggests that organisations should also treat model credentials as high-risk secrets and apply rotation, short-lived tokens, and strong separation of duties, especially where multiple environments share the same provider account. The recurring failure mode is not the model call itself, but the exception path created for one team that gradually becomes the default path for everyone.
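One way to picture tool governance combined with short-lived tokens is the sketch below. The `TOOL_GRANTS` table, the `TokenIssuer` class, the identity and tool names, and the five-minute lifetime are assumptions chosen for illustration; a production system would normally rely on an existing identity provider rather than an in-memory store.

```python
import secrets
import time

# Hypothetical mapping: application identity -> tools it may invoke.
TOOL_GRANTS = {
    "svc-support-bot": {"search"},
    "svc-ops-agent": {"search", "workflow"},
}

TOKEN_TTL_SECONDS = 300  # assumed lifetime for this example


class TokenIssuer:
    """Issues opaque tokens that expire after TOKEN_TTL_SECONDS."""

    def __init__(self, clock=time.time):
        self._clock = clock   # injectable so expiry can be tested
        self._tokens = {}     # token -> (identity, expires_at)

    def issue(self, identity):
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (identity, self._clock() + TOKEN_TTL_SECONDS)
        return token

    def identity_for(self, token):
        """Return the identity bound to a live token, or None."""
        entry = self._tokens.get(token)
        if entry is None:
            return None
        identity, expires_at = entry
        if self._clock() >= expires_at:
            del self._tokens[token]  # expired tokens are dropped on first use
            return None
        return identity


def authorise_tool_call(issuer, token, tool):
    """Allow a tool call only for a live token whose identity holds the grant."""
    identity = issuer.identity_for(token)
    return identity is not None and tool in TOOL_GRANTS.get(identity, set())
```

Because the grant is checked against the identity behind the token rather than against the model, a compromised prompt cannot widen the set of tools the application is allowed to reach.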
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Non-Human Identity Top 10 | NHI-01 | Central gateways reduce uncontrolled NHI sprawl and direct model access. |
| NIST CSF 2.0 | PR.AA-05 (PR.AC-4 in CSF 1.1) | Direct-to-model apps need consistent access enforcement and traceability. |
| NIST AI RMF | | AI governance must address accountability, transparency, and monitoring. |
Force model access through a governed control point and inventory every non-human identity that can reach a model.