How should security teams govern LLM applications that call tools and data sources?

Why This Matters for Security Teams

LLM applications that call tools or data sources should be treated as agentic systems with execution authority, not as passive chat interfaces. Once an LLM can retrieve records, send messages, open tickets, or trigger workflows, it becomes part of the enterprise trust boundary. The key question is no longer only what the model can say, but what it can cause the system to do. That is why governance must cover identity, permissioning, logging, and approval paths across the full tool chain, consistent with the OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework.

Security teams often miss that prompt injection, indirect prompt injection, and tool misuse are authorization problems as much as they are model-safety problems. If the model can be induced to call a tool with broader access than the user intended, the system has already crossed into privileged execution. NHIMG research on the OWASP NHI Top 10 and the AI LLM hijack breach shows why tool-connected LLMs are attractive targets: they sit close to secrets, APIs, and business workflows. In practice, many security teams discover abuse only after an agent has already queried the wrong dataset or executed an unintended action, not through deliberate testing.

How It Works in Practice

Governance starts by separating user intent from agent capability. The user may ask for an outcome, but the agent should only receive the minimum workload identity and scoped permissions needed for the next step. That means no standing broad tokens, no shared service accounts, and no direct access to sensitive sources unless the action is explicitly approved. For higher-risk operations, current guidance suggests combining real-time policy evaluation with just-in-time (JIT) credential issuance, so each tool call is checked in context and each secret expires after the task completes. This aligns with the intent of the CSA MAESTRO agentic AI threat modeling framework and the NIST AI 600-1 Generative AI Profile.

In implementation terms, teams should:

Use workload identity for the agent process, not a human surrogate account.

Issue JIT credentials per tool invocation, with short TTLs and automatic revocation.

Apply policy-as-code at request time, using context such as user, dataset, action type, and risk level.

Require human approval for write actions, external communications, fund movement, or secrets access.

Log both the model decision and the downstream tool execution for audit and incident review.
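The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the allow-list, the action categories, and names such as ToolRequest, evaluate, and issue_credential are hypothetical, and the credential models expiry only (revocation would need a real secrets backend).

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical action types that always need a human approver.
APPROVAL_REQUIRED = {"write", "external_message", "fund_movement", "secrets_access"}

# Hypothetical allow-list keyed by the agent's workload identity.
ALLOWED = {
    "support-agent": {("crm", "read", "cases"), ("ticketing", "write", "cases")},
}

# In-memory audit trail; a real system would ship this to durable storage.
AUDIT_LOG: list = []


@dataclass(frozen=True)
class ToolRequest:
    user: str       # human on whose behalf the agent acts
    agent_id: str   # workload identity of the agent process
    tool: str
    action: str
    dataset: str


@dataclass(frozen=True)
class Credential:
    token: str
    scope: str
    expires_at: float


def evaluate(req: ToolRequest, approved_by: Optional[str] = None) -> str:
    """Decide one tool call at request time and record the decision."""
    if (req.tool, req.action, req.dataset) not in ALLOWED.get(req.agent_id, set()):
        decision = "deny"
    elif req.action in APPROVAL_REQUIRED and approved_by is None:
        decision = "needs_approval"
    else:
        decision = "allow"
    AUDIT_LOG.append({"request": req, "approved_by": approved_by, "decision": decision})
    return decision


def issue_credential(req: ToolRequest, ttl_seconds: float = 60.0,
                     now: Optional[float] = None) -> Credential:
    """Mint a short-lived credential scoped to exactly one tool call."""
    issued_at = time.time() if now is None else now
    scope = f"{req.tool}:{req.action}:{req.dataset}"
    return Credential(secrets.token_urlsafe(16), scope, issued_at + ttl_seconds)


def is_valid(cred: Credential, now: Optional[float] = None) -> bool:
    """A credential is usable only before its expiry time."""
    current = time.time() if now is None else now
    return current < cred.expires_at
```

In this sketch a caller would invoke evaluate first and call issue_credential only when the result is "allow", so the platform rather than the model decides whether a secret is ever minted.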

NHIMG reporting on the Moltbook AI agent keys breach and the DeepSeek breach reinforces a simple lesson: secrets exposed to agentic workflows are often reused faster than governance teams can notice. These controls tend to break down when tools are chained across multiple services with inconsistent identity propagation, because the original user context is lost by the time the final action executes.
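One way to keep the original user context intact across chained tools is to pass a signed context envelope with every hop and have each downstream tool verify it before acting. The following is a hedged sketch using only the Python standard library; the envelope layout, the function names, and the hard-coded demo key are assumptions for illustration, and a production system would rely on a managed key or an established token format.

```python
import hashlib
import hmac
import json

# Demo-only signing key (assumption); never hard-code keys in real systems.
SIGNING_KEY = b"demo-only-key"


def sign_context(ctx: dict) -> str:
    """Return an HMAC-SHA256 signature over a canonical JSON form of ctx."""
    payload = json.dumps(ctx, sort_keys=True).encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def make_envelope(user: str, agent_id: str, task_id: str) -> dict:
    """Bind the originating user, agent identity, and task into one envelope."""
    ctx = {"user": user, "agent_id": agent_id, "task_id": task_id}
    return {"ctx": ctx, "sig": sign_context(ctx)}


def verify_envelope(envelope: dict) -> dict:
    """Return the context if the signature matches, else refuse."""
    ctx = envelope.get("ctx")
    sig = envelope.get("sig")
    if not isinstance(ctx, dict) or not isinstance(sig, str):
        raise PermissionError("missing user context")
    expected = sign_context(ctx).encode()
    if not hmac.compare_digest(expected, sig.encode()):
        raise PermissionError("user context was altered in transit")
    return ctx


def downstream_tool(envelope: dict, action: str) -> str:
    """Hypothetical final hop: acts only when the original context verifies."""
    ctx = verify_envelope(envelope)
    return f"{action} executed for {ctx['user']} via {ctx['agent_id']}"
```

The design choice here is that a hop with a missing or altered context fails closed, so the final action cannot run under an identity the original user never held.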

Common Variations and Edge Cases

Tighter approval and short-lived credentials often increase operational friction, so organisations must balance velocity against blast-radius reduction. There is no universal standard for this yet, especially for multi-agent pipelines, but best practice is evolving toward layered controls rather than one central policy gate. That means low-risk read-only retrieval may be fully automated, while high-impact write operations, credential access, and cross-domain data movement require stronger verification and step-up approval.
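The layered approach can be expressed as a small mapping from action type to required controls. This is an illustrative sketch only: the tier names, the action labels, and the required_controls function are hypothetical, and each organisation would set its own thresholds.

```python
# Hypothetical risk tiers; the labels are assumptions for illustration.
LOW_RISK = {"read"}
HIGH_RISK = {"write", "credential_access"}

STRONG_CONTROLS = ["policy_check", "step_up_verification", "human_approval"]


def required_controls(action_type: str, crosses_domain: bool = False) -> list:
    """Return the controls a tool call must pass before it may run."""
    # High-impact actions and cross-domain data movement get the strongest layer.
    if action_type in HIGH_RISK or (crosses_domain and action_type in LOW_RISK):
        return list(STRONG_CONTROLS)
    # Low-risk read-only retrieval can run on an automated policy check alone.
    if action_type in LOW_RISK:
        return ["policy_check"]
    # Anything unclassified is denied by default.
    return ["deny"]
```

Denying unclassified actions by default keeps a newly added tool from silently inheriting the low-friction path.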

Edge cases matter. In customer-support copilots, an agent may need access to several systems but only for a specific case and a limited time. In engineering workflows, an agent may need repo access, CI access, and deployment tooling, yet each should be separately scoped. In regulated environments, it is often better to deny a tool call and force a narrower workflow than to over-provision the agent “just in case.” Where organisations use MCP or similar tool-routing layers, the security model should still be evaluated as an identity and authorization problem, not merely as a transport problem. That is the operating assumption behind the OWASP Top 10 for Agentic Applications 2026 and NIST Cybersecurity Framework 2.0.

In practice, the safest pattern is to assume the model may make the right language choice but the wrong access choice, so the platform must make the final authorization decision every time.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Agentic tool use creates prompt-injection and action-abuse risk.
CSA MAESTRO		MAESTRO maps agent workflows, trust boundaries, and control points.
NIST AI RMF		AI RMF is the clearest governance lens for accountable agent behaviour.

Model each agent workflow, then add least-privilege and approval controls at each boundary.

