Direct integrations spread credentials, logging, and policy decisions across many teams, which creates inconsistent access control and weak auditability. The risk is not only security exposure but also operational and financial opacity. When access is decentralised, it becomes much harder to prove who used what model, for what purpose, and under which policy.
Why This Matters for Security Teams
Direct LLM integrations create governance risk because every product team can make its own choices about credentials, prompts, logging, and approvals, even when those choices affect the same model and the same data. That fragmentation makes it hard to prove least privilege, detect over-collection, or answer basic audit questions about who accessed which content and why. The issue is not just technical sprawl, but accountability sprawl.
This is now a board-level concern as agentic and LLM-enabled systems expand faster than oversight. NHIMG research in the AI Agents: The New Attack Surface report shows that only 52% of companies can track and audit the data their AI agents access, leaving a large compliance blind spot. Current guidance from the OWASP Top 10 for Agentic Applications 2026 and the NIST AI Risk Management Framework points to the same operational reality: governance fails when access and decision rights are distributed without a shared control plane.
In practice, many security teams discover the gap only after an application has already sent sensitive data to the wrong model, rather than through a deliberate access review.
How It Works in Practice
A safer pattern is to treat model access like a governed workload rather than a developer convenience. Instead of embedding long-lived API keys in each app, organisations centralise identity, policy, and telemetry so every call to an LLM is evaluated at request time. That means the system knows the workload identity, the intended task, the data class involved, and the policy that applies before the request leaves the environment.
This is where workload identity, ephemeral secrets, and policy-as-code become practical controls. Short-lived credentials reduce the blast radius if an integration is copied, logged, or exfiltrated. Runtime policy evaluation also makes it possible to block high-risk prompts, disallow certain data types, or require stronger approval paths for sensitive workflows. Standards-oriented guidance from the NIST Cybersecurity Framework 2.0 supports this kind of shared control model, while implementation patterns discussed in OWASP NHI Top 10 and the CSA MAESTRO agentic AI threat modeling framework emphasise the same point: the control should sit above the integration, not inside each application.
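As an illustration of short-lived credentials bound to workload identity and task scope, the following is a minimal sketch rather than a production design. The names `issue_token`, `verify_token`, and `SIGNING_KEY` are hypothetical, and the hard-coded key is an assumption for the demo; a real deployment would likely rely on an established token format and a managed key store.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Assumption: demo-only key. In practice this would come from a KMS or secret manager.
SIGNING_KEY = b"demo-only-signing-key"


def issue_token(workload_id: str, task_scope: str, ttl_seconds: int = 300,
                now: Optional[float] = None) -> str:
    """Issue a signed credential that expires after ttl_seconds."""
    issued_at = time.time() if now is None else now
    claims = {"sub": workload_id, "scope": task_scope, "exp": issued_at + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def verify_token(token: str, required_scope: str, now: Optional[float] = None) -> bool:
    """Accept only tokens with a valid signature, an unexpired TTL, and a matching scope."""
    current = time.time() if now is None else now
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return False  # no separator, so not a token we issued
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False  # tampered or signed with a different key
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return claims["exp"] > current and claims["scope"] == required_scope
```

Because the token carries its own expiry and scope, a credential that is copied or logged stops working within minutes and cannot be reused for a different task.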
- Centralise model access through a broker or gateway instead of direct app-to-model keys.
- Issue short-lived credentials tied to workload identity and task scope.
- Log prompts, model choice, outputs, and policy decisions in a consistent format.
- Apply data classification and approval rules before the request reaches the model.
This approach breaks down when teams can bypass the broker with local keys or direct SaaS integrations, because policy enforcement and audit evidence then fragment again.
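The broker pattern in the list above can be sketched roughly as follows. This is an illustrative outline, not a reference implementation: the `POLICY` table, the data class names, the model names, and the `send` callback are all assumptions introduced for the example.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Callable, List, Optional, Tuple

# Hypothetical policy table: data class -> models allowed to receive it.
POLICY = {
    "public": {"general-model", "internal-model"},
    "internal": {"internal-model"},
    "restricted": set(),  # never sent to any model
}


@dataclass
class ModelRequest:
    workload_id: str
    task: str
    data_class: str
    model: str
    prompt: str


def evaluate(request: ModelRequest) -> Tuple[bool, str]:
    """Decide at request time whether this data class may go to this model."""
    allowed_models = POLICY.get(request.data_class)
    if allowed_models is None:
        return False, "unknown data class"
    if request.model not in allowed_models:
        return False, f"{request.data_class} data not permitted for {request.model}"
    return True, "allowed by policy"


def broker_call(request: ModelRequest,
                send: Callable[[ModelRequest], str],
                audit_log: List[str]) -> Optional[str]:
    """Apply policy before the request leaves, then log one consistent record."""
    allowed, reason = evaluate(request)
    output = send(request) if allowed else None  # denied requests never reach the model
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **asdict(request),
        "decision": "allow" if allowed else "deny",
        "reason": reason,
        "output": output,
    }
    audit_log.append(json.dumps(record, sort_keys=True))
    return output
```

Every call, allowed or denied, produces a record in the same shape, which is what lets a security team answer who used which model, for what task, and under which policy decision.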
Common Variations and Edge Cases
Tighter control usually improves visibility, but it also increases integration overhead, so organisations have to balance speed against governance maturity. That tradeoff is especially visible in analytics teams, rapid prototyping environments, and customer-facing products that mix multiple models and vendors.
There is no universal standard for this yet, but current guidance suggests that exceptions should be explicit, time-bound, and measurable. For example, a low-risk internal summarisation tool may tolerate broader access than a system that can retrieve customer records or trigger downstream actions. In higher-risk workflows, the question is not whether an application can call a model directly, but whether that call is routed through a policy layer that can explain the decision. NHIMG coverage such as the Top 10 NHI Issues and the Moltbook AI agent keys breach shows how quickly unmanaged keys and weak ownership become operational failures.
Direct integrations are defensible only when data is non-sensitive, the model has no tool access, and the business can tolerate limited audit depth. They become unsafe when teams combine multiple vendors, external plugins, or autonomous agents that can chain actions across systems.
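One way to make exceptions explicit, time-bound, and measurable is to record each one as data with a named owner and an expiry. The sketch below is an assumption-laden illustration; `PolicyException` and `exception_report` are hypothetical names, not part of any cited framework.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable


@dataclass(frozen=True)
class PolicyException:
    workload_id: str
    reason: str        # explicit: why the exception exists
    owner: str         # explicit: who is accountable for it
    expires_at: datetime  # time-bound: when it lapses without renewal

    def is_active(self, now: datetime) -> bool:
        return now < self.expires_at


def exception_report(exceptions: Iterable[PolicyException], now: datetime) -> Dict[str, int]:
    """Measurable: count active versus expired exceptions at a point in time."""
    items = list(exceptions)
    active = sum(1 for e in items if e.is_active(now))
    return {"active": active, "expired": len(items) - active}
```

Tracking the active count over time gives a simple signal of whether exceptions are being retired or quietly becoming the default.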
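The conditions above can be expressed as a simple review check. This is only a sketch of the stated criteria; the field names in `IntegrationProfile` are hypothetical, and a real review would weigh more context than a boolean result can capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationProfile:
    handles_sensitive_data: bool
    model_has_tool_access: bool
    limited_audit_acceptable: bool
    vendor_count: int = 1
    uses_external_plugins: bool = False
    uses_autonomous_agents: bool = False


def direct_integration_defensible(profile: IntegrationProfile) -> bool:
    """Return True only when all baseline conditions hold and no unsafe factor is present."""
    baseline = (
        not profile.handles_sensitive_data
        and not profile.model_has_tool_access
        and profile.limited_audit_acceptable
    )
    unsafe = (
        profile.vendor_count > 1
        or profile.uses_external_plugins
        or profile.uses_autonomous_agents
    )
    return baseline and not unsafe
```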
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A2 | Direct integrations expand attack surface and obscure agent access paths. |
| CSA MAESTRO | | MAESTRO addresses agent threat modeling and control-plane governance for AI workflows. |
| NIST AI RMF | | AI RMF governance applies to accountability, transparency, and policy consistency. |
Route model calls through governed controls and restrict direct, unlogged access paths.
Reviewed and updated by the NHIMG editorial team on June 23, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org