How should security teams implement LLM security across copilots and agents?

Why This Matters for Security Teams

LLM security across copilots and agents is really an identity, data, and action-control problem. Copilots can leak sensitive prompts or retrieved context, but agents raise the stakes because they can chain tools, call APIs, and trigger downstream workflows without a human in the loop. The OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework both point to runtime governance, not just model selection, as the control point that matters most.

That framing aligns with NHIMG research showing how quickly abuse follows exposed NHI material: in the report LLMjacking: How Attackers Hijack AI Using Compromised NHIs, attackers attempted access to exposed AWS credentials in an average of 17 minutes. The operational lesson is simple: if an LLM touchpoint can reach secrets, data, or privileged tools, attackers will look for the shortest route through it. In practice, many security teams discover the gap only after an agent has already been used to exfiltrate data or invoke an unsafe action.

How It Works in Practice

A workable control stack starts with discovery and classification. Security teams should inventory every copilot, chatbot, retrieval pipeline, plugin, and autonomous agent, then map each one to the data it can see and the actions it can trigger. That inventory needs to include the non-human identities behind the system, not just the application name. NHIMG’s AI Agents: The New Attack Surface report shows why this matters: 80% of organisations report agents have already performed actions beyond intended scope, and only 52% can track and audit the data those agents access.
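As a rough illustration of the inventory step, the sketch below records each system alongside its non-human identity, the data it can see, and the actions it can trigger, then flags entries that combine sensitive data with risky actions. The `AgentRecord` fields, the `high_risk_entries` helper, and all sample names are hypothetical and not drawn from any NHIMG tooling.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    """One inventory entry. All field names are illustrative assumptions."""
    name: str
    kind: str        # e.g. "copilot", "chatbot", "agent"
    identity: str    # the non-human identity the system runs as
    data_scopes: set = field(default_factory=set)
    actions: set = field(default_factory=set)


def high_risk_entries(inventory, sensitive_scopes, risky_actions):
    """Return names of entries that can read sensitive data AND take a risky action."""
    return [
        record.name
        for record in inventory
        if record.data_scopes & sensitive_scopes and record.actions & risky_actions
    ]


# Hypothetical sample inventory.
inventory = [
    AgentRecord("hr-copilot", "copilot", "svc-hr-copilot", {"hr_records"}, {"draft_email"}),
    AgentRecord("ops-agent", "agent", "svc-ops-agent", {"customer_db"}, {"http_post", "run_job"}),
]

flagged = high_risk_entries(inventory, {"customer_db", "hr_records"}, {"http_post"})
print(flagged)  # ['ops-agent']
```

Keeping the identity on the same record as the data and action scopes is the point of the exercise: it makes the question "which credential lets this agent do that?" answerable from the inventory alone.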

From there, implement controls in three layers:

Input controls: filter prompts, mask secrets, and tokenize sensitive fields before they reach the model or retrieval layer.

Runtime controls: inspect tool calls, enforce allowlists, and block unsafe sequences such as file access followed by outbound transfer.

Identity controls: use workload identity and short-lived credentials so each agent session is tied to a bounded task, not a standing secret.
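The input and runtime layers above can be sketched in a few lines. This is a minimal example under stated assumptions: the secret pattern only covers AWS-style access key IDs, and the tool names, allowlist, and blocked sequence are hypothetical placeholders for whatever a real deployment exposes.

```python
import re

# Assumption: only AWS-style access key IDs are masked in this sketch.
SECRET_PATTERN = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

# Hypothetical tool names.
ALLOWED_TOOLS = {"search_docs", "read_file", "http_post"}
# (earlier tool, later tool) pairs that must not occur in the same session.
BLOCKED_SEQUENCES = {("read_file", "http_post")}


def mask_secrets(prompt: str) -> str:
    """Input control: redact matching secrets before the prompt reaches the model."""
    return SECRET_PATTERN.sub("[REDACTED]", prompt)


def check_tool_call(history: list, tool: str) -> bool:
    """Runtime control: enforce the allowlist, then reject blocked sequences."""
    if tool not in ALLOWED_TOOLS:
        return False
    return not any((previous, tool) in BLOCKED_SEQUENCES for previous in history)


print(mask_secrets("use key AKIAABCDEFGHIJKLMNOP for the upload"))
print(check_tool_call(["search_docs"], "http_post"))  # True
print(check_tool_call(["read_file"], "http_post"))    # False
```

A production filter would need far broader secret detection and a richer notion of "sequence" than a pair lookup; the sketch only shows where each check sits relative to the model and the tool layer.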

For agents, current guidance suggests intent-based authorisation at request time rather than static RBAC alone. That means the policy engine evaluates what the agent is trying to do, what data it is requesting, and whether the context matches the approved task. Standards work in this area is still evolving, but policy-as-code patterns from the CSA MAESTRO agentic AI threat modeling framework and the NIST AI Risk Management Framework both support this runtime approach. The key is to log the full chain: prompt, retrieval results, tool invocation, policy decision, and downstream effect. These controls tend to break down when legacy apps expose broad APIs to agents, because the agent can then traverse too many trust boundaries too quickly.
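One way to picture request-time authorisation is a policy lookup keyed on the approved task rather than on a role. The sketch below is an assumption-laden illustration, not an implementation of MAESTRO or the NIST AI RMF: `ToolRequest`, `TASK_POLICY`, and `authorise` are hypothetical names, and the decision record stands in for one link of the logged chain.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToolRequest:
    agent_id: str
    approved_task: str  # the task this agent session was started for
    tool: str           # what the agent is trying to do
    data_scope: str     # what data it is requesting


# Hypothetical policy-as-code table: tools and scopes permitted per approved task.
TASK_POLICY = {
    "summarise_ticket": {"tools": {"read_ticket"}, "scopes": {"tickets"}},
}


def authorise(request: ToolRequest) -> dict:
    """Evaluate the request against its approved task and return a loggable decision."""
    rule = TASK_POLICY.get(request.approved_task)
    allowed = (
        rule is not None
        and request.tool in rule["tools"]
        and request.data_scope in rule["scopes"]
    )
    return {"request": asdict(request), "decision": "allow" if allowed else "deny"}


decision = authorise(ToolRequest("agent-1", "summarise_ticket", "send_email", "tickets"))
print(json.dumps(decision, sort_keys=True))  # decision is "deny": tool is outside the task
```

The same role could be permitted to send email in another context; what makes this request fail is that the tool does not belong to the task the session was approved for.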

Common Variations and Edge Cases

Tighter LLM control often increases latency, engineering overhead, and false positives, so organisations have to balance safety against developer velocity. That tradeoff is especially visible in internal copilots versus autonomous agents. Copilots can often be governed with stronger redaction, session logging, and human approval. Agents usually need stricter runtime limits, shorter credential lifetimes, and a narrower tool surface because they can act independently.

There is no universal standard for this yet, but best practice is evolving around task-scoped access, ephemeral secrets, and continuous policy evaluation. In higher-risk environments, security teams should treat each agent as a workload identity with tightly bounded permissions, then rotate or revoke access automatically when the task ends. NHIMG’s Ultimate Guide to NHIs — 2025 Outlook and Predictions is useful context here, especially where agent credentials and service identities overlap.
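Task-scoped, ephemeral access can be illustrated with a credential object that carries its own scopes, expiry, and revocation flag. This is a toy in-memory sketch; `TaskCredential`, `issue`, and the scope strings are hypothetical, and a real deployment would rely on its platform's workload identity and token service rather than anything like this.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskCredential:
    token: str
    task_id: str
    scopes: frozenset
    expires_at: float
    revoked: bool = False

    def is_valid(self, scope: str, now: Optional[float] = None) -> bool:
        """Valid only if not revoked, not expired, and the scope was granted."""
        now = time.time() if now is None else now
        return (not self.revoked) and now < self.expires_at and scope in self.scopes


def issue(task_id: str, scopes, ttl_seconds: int = 300, now: Optional[float] = None) -> TaskCredential:
    """Mint a short-lived credential bound to a single task."""
    now = time.time() if now is None else now
    return TaskCredential(secrets.token_urlsafe(32), task_id, frozenset(scopes), now + ttl_seconds)


cred = issue("ticket-summary-42", {"tickets:read"}, ttl_seconds=60)
print(cred.is_valid("tickets:read"))   # True while the task is running
cred.revoked = True                    # revoke as soon as the task ends
print(cred.is_valid("tickets:read"))   # False
```

The `now` parameter exists only so expiry can be exercised deterministically; the design choice that matters is that validity is checked on every use, not just at issue time.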

Edge cases include multi-agent systems, delegated tools, and vendor-hosted copilots. Those environments often create blind spots because one agent’s output becomes another agent’s input, and the security team loses direct visibility into the full chain of action. The safer pattern is to keep sensitive retrieval sources behind explicit policy checks, require step-up approval for destructive actions, and preserve immutable audit trails for investigation. The guidance is strongest when the workflow is bounded; it becomes much less reliable when agents are allowed to self-discover tools or negotiate access dynamically across environments.
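The step-up approval and audit-trail pattern can be sketched as follows. Two caveats: the tool names and the `run_tool` gate are hypothetical, and the hash chain shown here provides tamper evidence inside one process rather than true immutability, which would normally come from write-once storage or an external log service.

```python
import hashlib
import json

DESTRUCTIVE = {"delete_records", "rotate_keys"}  # hypothetical tool names


class AuditLog:
    """Append-only log where each entry commits to the hash of the previous one."""

    def __init__(self):
        self.entries = []

    def append(self, event: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode()).hexdigest()
        self.entries.append({"event": event, "prev": prev, "hash": digest})

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = "0" * 64
        for entry in self.entries:
            body = json.dumps(entry["event"], sort_keys=True)
            expected = hashlib.sha256((prev + body).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


def run_tool(log: AuditLog, agent: str, tool: str, approved_by=None) -> bool:
    """Allow destructive tools only with a named approver; log every attempt."""
    allowed = tool not in DESTRUCTIVE or approved_by is not None
    log.append({"agent": agent, "tool": tool, "approved_by": approved_by, "allowed": allowed})
    return allowed


log = AuditLog()
print(run_tool(log, "agent-a", "delete_records"))                    # False
print(run_tool(log, "agent-a", "delete_records", approved_by="sam")) # True
print(log.verify())                                                  # True
```

Denied attempts are logged as well as allowed ones, since in a multi-agent chain the refused call is often the most useful evidence during an investigation.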

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A01	Covers prompt abuse, tool misuse, and unsafe agent action chains.
CSA MAESTRO	TRM	Directly addresses threat modeling for agentic AI workflows and tool access.
NIST AI RMF	GOVERN	Supports accountability, oversight, and measurable AI risk controls.

Model each agent workflow, then constrain data, tools, and escalation paths at runtime.

