Prioritise controls that limit what the model can be induced to do with tools and data. If the agent can browse, call APIs, or execute code, those capabilities should be tested under attack first, because that is where prompt injection becomes an operational incident rather than a theoretical weakness.
Why This Matters for Security Teams
Choosing the wrong AI agent controls means spending time on policies that look mature on paper but do little to stop real misuse. For autonomous agents, the highest-risk paths are usually tool calls, data retrieval, code execution, and credential handling, not the model output alone. That is why control selection should start with what the agent can be induced to do, not with generic application checklists. The OWASP NHI Top 10 and the NIST AI Risk Management Framework both reinforce that AI risk is contextual, not abstract.
NHIMG research shows the scale of the issue: 80% of organisations report that their AI agents have already performed actions beyond their intended scope, including unauthorised access, sensitive data sharing, and credential exposure, according to the AI Agents: The New Attack Surface report. That is a prioritisation signal, not a surprise finding. Security teams should focus first on controls that reduce blast radius when prompt injection, tool abuse, or overbroad permissions turn a helpful agent into an operational incident. In practice, many security teams discover the control gap only after an agent has already touched data it should never have seen.
How It Works in Practice
Start by mapping each agent to its actual authority. A simple chatbot has a very different risk profile from an agent that can browse the web, call internal APIs, approve workflows, or run code. The most important controls are the ones that restrict those capabilities at runtime, because static RBAC often assumes fixed duties while agent behaviour changes with prompts, tool output, and task context. Current guidance suggests prioritising controls that are evaluated when the request happens, not just when the account is created.
Use a tiered approach:
- Classify agents by tool access, data sensitivity, and execution authority.
- Apply least privilege to each tool, API, and dataset separately.
- Issue short-lived secrets or JIT credentials for tasks that truly need access.
- Require workload identity for the agent, so the system proves what it is before it acts.
- Log prompts, tool calls, policy decisions, and data access for audit and containment.
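The first step in the list above, classifying agents, can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed scoring model: the `AgentProfile` fields, the `HIGH_REACH_TOOLS` set, the tier names, and the thresholds are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    LOW = 1
    ELEVATED = 2
    HIGH = 3


@dataclass(frozen=True)
class AgentProfile:
    name: str
    tools: frozenset            # tool names the agent can invoke
    reads_sensitive_data: bool
    can_execute_or_approve: bool  # runs code or approves workflows


# Hypothetical tool names treated as high-reach in this sketch.
HIGH_REACH_TOOLS = frozenset({"browse", "call_api", "run_code"})


def classify(agent: AgentProfile) -> Tier:
    """Assign a tier from tool access, data sensitivity, and execution authority."""
    has_reach = bool(agent.tools & HIGH_REACH_TOOLS)
    if agent.can_execute_or_approve or (has_reach and agent.reads_sensitive_data):
        return Tier.HIGH
    if has_reach or agent.reads_sensitive_data:
        return Tier.ELEVATED
    return Tier.LOW


agents = [
    AgentProfile("faq-chatbot", frozenset(), False, False),
    AgentProfile("research-agent", frozenset({"browse"}), False, False),
    AgentProfile("ops-agent", frozenset({"call_api", "run_code"}), True, True),
]

# Highest tier first, so adversarial testing starts with the riskiest agents.
test_order = sorted(agents, key=classify, reverse=True)
```

Sorting by tier gives a simple testing queue; a real inventory would likely draw these attributes from an identity or asset system rather than hard-coded profiles.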
This is where agent-specific frameworks help. The OWASP Agentic AI Top 10 focuses attention on prompt injection, excessive agency, and insecure tool use, while the CSA MAESTRO agentic AI threat modeling framework helps teams trace how one unsafe action can cascade across connected systems. NHIMG’s Analysis of Claude Code Security is useful here because code-capable agents illustrate how quickly tool access becomes security exposure. Where possible, pair policy-as-code with runtime enforcement so the agent only receives the minimum authority needed for the current task. These controls tend to break down when agents are allowed broad network access, persistent credentials, and unsupervised tool chaining in production.
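One way to picture minimum authority for the current task is a grant that names the agent, lists the tools it may use, and expires quickly, with every tool call checked against it at request time. The sketch below assumes an in-process check; `TaskGrant`, `issue_grant`, and `is_allowed` are hypothetical names, and a production system would typically delegate this to a policy engine and a secrets broker.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskGrant:
    agent_id: str
    allowed_tools: frozenset
    expires_at: float  # Unix timestamp; keeps the grant short-lived


def issue_grant(agent_id: str, tools: set, ttl_seconds: float,
                now: Optional[float] = None) -> TaskGrant:
    """Create a grant scoped to one agent, a set of tools, and a time window."""
    issued = time.time() if now is None else now
    return TaskGrant(agent_id, frozenset(tools), issued + ttl_seconds)


def is_allowed(grant: TaskGrant, agent_id: str, tool: str,
               now: Optional[float] = None) -> bool:
    """Evaluate the request when it happens, not when the account was created."""
    current = time.time() if now is None else now
    return (
        grant.agent_id == agent_id
        and tool in grant.allowed_tools
        and current < grant.expires_at
    )


# A five-minute grant for a single tool (fixed timestamps for repeatability).
grant = issue_grant("ops-agent", {"read_ticket"}, ttl_seconds=300, now=1000.0)
in_scope = is_allowed(grant, "ops-agent", "read_ticket", now=1100.0)
out_of_scope = is_allowed(grant, "ops-agent", "run_code", now=1100.0)
after_expiry = is_allowed(grant, "ops-agent", "read_ticket", now=1400.0)
```

Here only the in-scope, in-window request passes; the unlisted tool and the expired grant are both refused, which is the behaviour that persistent credentials cannot provide.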
Common Variations and Edge Cases
Tighter control selection often increases friction for builders and operators, so teams have to balance safety against delivery speed. That tradeoff is real, especially in environments where agents support customer-facing workflows, developer productivity, or incident response. Best practice is evolving, and there is no universal standard for every agent type yet.
One common edge case is an agent that looks low-risk because it cannot write data, but can still read sensitive sources and exfiltrate their contents through summaries, tickets, or API calls. Another is the internal agent with an apparently trusted service account that can move laterally because it inherits broad network and IAM reach. A third is multi-agent orchestration, where no single agent has dangerous power, but the chain of agents creates cumulative privilege that is difficult to see in isolation.
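The multi-agent case can be made visible by reviewing the combined capabilities of a chain rather than each agent alone. The following is a deliberately coarse sketch: the capability labels and agent names are hypothetical, and taking a simple union ignores ordering and actual data flow, so it will over-report rather than under-report.

```python
# Hypothetical capability labels for this sketch.
SENSITIVE_READ = "read_sensitive"
EXTERNAL_WRITE = "send_external"  # summaries, tickets, outbound API calls


def chain_capabilities(chain, caps):
    """Union the capabilities of every agent in an orchestration chain."""
    combined = set()
    for agent in chain:
        combined |= caps.get(agent, set())
    return combined


def has_exfiltration_path(chain, caps):
    """Flag chains that can both read sensitive data and send it outward."""
    combined = chain_capabilities(chain, caps)
    return SENSITIVE_READ in combined and EXTERNAL_WRITE in combined


caps = {
    "retriever": {SENSITIVE_READ},
    "summariser": set(),
    "notifier": {EXTERNAL_WRITE},
}

# No single agent holds both capabilities, but the chain does.
alone = [has_exfiltration_path([name], caps) for name in caps]
together = has_exfiltration_path(["retriever", "summariser", "notifier"], caps)
```

In this example each agent passes the check alone while the full chain is flagged, which is the cumulative-privilege pattern described above.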
For those scenarios, the priority is not just stronger authentication. It is the combination of runtime policy evaluation, scoped secrets, explicit tool allowlists, and clear approval boundaries for irreversible actions. The MITRE ATLAS adversarial AI threat matrix is helpful for modelling how adversaries chain tactics across model and infrastructure layers, while the DeepSeek breach and LLMjacking: How Attackers Hijack AI Using Compromised NHIs show how exposed secrets and weak identity handling quickly become agent abuse. In practice, the hardest environments are those with legacy IAM, shared service accounts, and agents that span multiple vendors and trust zones.
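An explicit tool allowlist combined with an approval boundary for irreversible actions can be expressed as a small decision function. This is an illustrative sketch only: `ToolPolicy`, `decide`, the tool names, and the idea of passing an approver identifier are assumptions, not a reference to any specific product or framework API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


@dataclass(frozen=True)
class ToolPolicy:
    allowlist: frozenset     # tools the agent may call at all
    irreversible: frozenset  # tools that require a human approver


def decide(policy: ToolPolicy, tool: str,
           approved_by: Optional[str] = None) -> Decision:
    """Deny anything off the allowlist; gate irreversible tools on approval."""
    if tool not in policy.allowlist:
        return Decision.DENY
    if tool in policy.irreversible and not approved_by:
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW


# Hypothetical policy: two permitted tools, one of which cannot be undone.
policy = ToolPolicy(
    allowlist=frozenset({"search_docs", "delete_records"}),
    irreversible=frozenset({"delete_records"}),
)

read_only = decide(policy, "search_docs")
unapproved = decide(policy, "delete_records")
approved = decide(policy, "delete_records", approved_by="oncall-lead")
unlisted = decide(policy, "send_email")
```

Defaulting to deny for unlisted tools means a newly added capability stays unreachable until someone deliberately places it on the allowlist.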
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A3 | Prioritises prompt injection and excessive agency, the core control-selection problem here. |
| CSA MAESTRO | | Maps how agent decisions, tools, and dependencies create compound risk. |
| NIST AI RMF | | Supports context-based AI governance and risk prioritisation. |
Rank agent controls by tool reach, then test the highest-privilege actions under adversarial prompts first.
Reviewed and updated by the NHIMG editorial team on July 5, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org