TL;DR: Agentic AI security is the discipline of protecting autonomous agents that interact with APIs, data sources, and users in real time, and WitnessAI argues that prompt injection, excessive permissions, and weak identity controls expand the attack surface. Access review processes assume stable privilege and predictable workflows; autonomous agents break that assumption because they can select actions and execute them within a single runtime session.
At a glance
What this is: This is an analysis of agentic AI security that argues autonomous agents create new attack surface, identity, and governance risks that traditional controls do not fully cover.
Why it matters: It matters to IAM practitioners because AI agents blur the line between workload identity, privileged access, and autonomous action, forcing governance models to account for runtime decisions and auditability.
By the numbers:
- 80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, inappropriately sharing sensitive data, and revealing access credentials.
- 92% agree governing AI agents is critical to enterprise security, yet only 44% have implemented any policies to do so.
👉 Read WitnessAI's analysis of agentic AI security and autonomous agent risk
Context
Agentic AI security is the discipline of governing autonomous systems that can choose actions, call tools, and execute tasks across connected services. The identity problem is no longer limited to who logs in, but to what the agent can do once it is active across APIs, SaaS platforms, and internal data stores.
The current control model is under strain because many programmes still treat agents like enhanced automation rather than decision-making actors. That breaks assumptions around least privilege, auditability, and approval boundaries, especially when the agent can initiate work, access sensitive data, and trigger downstream actions without a human in the loop.
Key questions
Q: How should security teams govern autonomous AI agents in production?
A: Security teams should govern autonomous AI agents as distinct identities with explicit task boundaries, auditable actions, and tightly scoped permissions. The priority is not just model safety but runtime control over what the agent can read, decide, and execute across connected systems. That means inventory, logging, approval logic, and exception handling must be built for agent behaviour, not human workflows.
Q: Why do AI agents complicate least privilege and access reviews?
A: AI agents complicate least privilege because their access needs can change during a session as they interpret new inputs and choose tools dynamically. Access reviews assume stable entitlements that can be certified after the fact, but an agent may already have consumed, combined, or acted on access before review happens. Governance must therefore move toward runtime constraint and continuous validation.
Q: What breaks when agent identity is not tracked properly?
A: When agent identity is not tracked properly, incident response loses attribution, compliance loses evidence, and security teams lose the ability to prove which actor performed which action. Shared or invisible agent identities turn a controllable workflow into an accountability gap. The result is not just weaker logs, but weaker governance over the entire access path.
Q: How can organisations reduce prompt injection risk in agentic AI?
A: Organisations can reduce prompt injection risk by treating prompts, retrieved content, and external inputs as untrusted data rather than instructions. They should block unsafe tool calls, limit what content can influence execution, and separate data retrieval from privileged action. The practical goal is to prevent untrusted text from becoming an authorisation trigger.
Technical breakdown
Prompt injection and instruction hijacking in agentic systems
Prompt injection works by placing malicious instructions into data or user input that the agent treats as part of its operating context. In agentic systems, this is more than content manipulation because the model may re-rank priorities, invoke tools, or disclose data based on the injected instruction. The danger increases when the agent has broad retrieval access or can chain actions across services. Security boundaries must therefore treat prompts, retrieved content, and tool outputs as untrusted inputs, not as harmless context. Practical implication: constrain what the agent can read and act on at each step, especially when external content can influence execution.
Practical implication: constrain what the agent can read and act on at each step, especially when external content can influence execution.
Unrestricted permissions and over-provisioned agent identity
Many agent deployments inherit access from the environment instead of being issued a distinct identity with scoped permissions. That leads to broad API reach, over-permissioned datasets, and tool access that exceeds the task at hand. Once a single agent identity is compromised or misdirected, the blast radius can extend across multiple services because the agent is effectively acting as a privileged orchestrator. This is an identity governance issue as much as a security issue. Practical implication: treat every agent as a separately governed identity with explicit entitlement boundaries and reviewable access paths.
Practical implication: treat every agent as a separately governed identity with explicit entitlement boundaries and reviewable access paths.
Agent identity, authentication, and audit trails
Agent identity management is the set of controls that make a specific agent recognisable, accountable, and traceable across systems. That includes signed requests, token handling, and durable logs that connect an action to a particular agent instance and purpose. Without those controls, incident response cannot reliably tell which agent touched which data or whether an action was expected. In effect, auditability becomes a policy claim instead of an evidential record. Practical implication: design logging and authentication so each agent action can be tied back to an identity, a permission set, and a business purpose.
Practical implication: design logging and authentication so each agent action can be tied back to an identity, a permission set, and a business purpose.
Threat narrative
Attacker objective: The attacker aims to redirect the agent into performing unauthorised actions while preserving the appearance of normal system behaviour.
- Entry occurs when a malicious prompt, poisoned content, or compromised dependency reaches the agent through a connected API, retrieval source, or user interaction.
- Escalation happens when the agent interprets the injected instruction as legitimate context and uses its granted tool access to move beyond the original task.
- Impact follows when the agent exfiltrates data, alters outputs, or triggers downstream actions across connected systems without intended oversight.
Breaches seen in the wild
- Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
- AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
Agentic AI security is really identity governance for runtime decision-making. The article correctly frames the problem as more than model safety because the core risk is not just what the agent knows, but what it can decide to do with that knowledge. Once an agent can select tools and execute actions across systems, the security question becomes whether its identity, access, and accountability model can keep pace. Practitioners should treat agentic AI as a governance problem first and a tooling problem second.
Least privilege for agents is often defined too early to be meaningful. Traditional IAM assumes the scope of a subject can be defined at provisioning time, but agent behaviour is often context-dependent and session-specific. That makes static entitlement design fragile when the system can change task direction mid-flight based on new inputs. The implication is that access policy for autonomous systems has to be expressed as bounded runtime authority, not as a one-time grant.
Agent identity gaps create an accountability vacuum across the control plane. If the organisation cannot prove which agent performed an action, then audit, incident response, and compliance all degrade together. The article points to runtime monitoring and audit trails, but the deeper issue is that many enterprises have not yet made agents first-class identities in their governance model. Practitioners should assume that invisible or shared agent identities will become the weak point in any serious deployment.
Prompt injection is a governance failure when it can steer privileged action. The technique matters because it can turn untrusted content into an execution directive for a system that already has access. That makes the boundary between data and instruction a control surface, not a language quirk. Security teams should recognise that a prompt boundary without permission boundary is not a real security boundary.
Runtime controls now matter more than pre-deployment approval for AI agents. The article’s emphasis on continuous monitoring, red teaming, and policy enforcement reflects a broader shift in how identity risk is managed for systems that act continuously. The field should move from static deployment reviews toward ongoing behavioural validation, because agent risk emerges in motion, not at checkout time.
From our research:
- 80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, inappropriately sharing sensitive data, and revealing access credentials, according to AI Agents: The New Attack Surface report.
- Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
- Use OWASP Agentic AI Top 10 to frame runtime controls and NIST AI Risk Management Framework to structure governance ownership for autonomous behaviour.
What this signals
Agentic AI is becoming a governance problem before it becomes a mature security discipline. The organisation that treats agents as ordinary automation will miss the point where behaviour becomes dynamic and review cycles stop matching operational reality. Use OWASP Agentic AI Top 10 to frame the control gaps, then align ownership to NIST AI Risk Management Framework governance expectations.
Runtime authority is the concept to watch: once an agent can combine retrieval, decision-making, and execution, static approval models start to fail. The programme implication is that identity, policy, and observability must be designed around the session, not the deployment ticket.
With 80% of organisations already reporting out-of-scope agent actions, the operational question is no longer whether to govern agents, but how quickly access boundaries and audit trails can be made first-class parts of the identity programme.
For practitioners
- Inventory every active agent identity Document each agent, its owning team, connected APIs, retrieval sources, and downstream systems so no autonomous actor exists outside governance. Map the inventory to access reviews and exception handling.
- Scope agent permissions to task boundaries Assign the minimum API, dataset, and tool access needed for each use case, then separate read, write, and execute paths so one compromised agent cannot roam across services.
- Treat prompts and retrieved content as untrusted inputs Filter external content before it reaches the agent, and block tool execution when input provenance is unknown or when content attempts to redirect policy or data access.
- Make agent actions auditable end to end Require signed requests, durable logs, and action correlation so investigators can connect every agent decision to a specific identity, permission set, and business purpose.
- Test agent behaviour under adversarial conditions Run red team exercises against prompt injection, dependency compromise, and escalation paths, then compare observed behaviour with the intended authority model.
Key takeaways
- Agentic AI security is an identity governance problem because autonomous systems can make and execute decisions across multiple services in real time.
- Out-of-scope behaviour is already common in current deployments, which means the governance gap is active rather than hypothetical.
- Practitioners need runtime controls, auditable agent identities, and tight permission boundaries before autonomy spreads further.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 address the attack and risk surface, while NIST AI RMF and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | Covers prompt injection and tool misuse risks in agentic systems. | |
| NIST AI RMF | Addresses governance, accountability, and monitoring for AI systems. | |
| NIST Zero Trust (SP 800-207) | PR.AC-4 | Least privilege and continuous verification are central to agent access control. |
Assign governance owners and monitor agent behaviour continuously under an AI RMF model.
Key terms
- Agent identity management: Agent identity management is the practice of giving each AI agent a distinct, governable identity with traceable permissions and accountability. In autonomous systems, it must connect runtime actions to a specific agent instance, not just a shared service account or platform role.
- Prompt injection: Prompt injection is an attack in which untrusted text alters an AI system’s intended behaviour by influencing its instruction context. For autonomous agents, the risk is higher because injected content can redirect tool use, data access, or execution without a human noticing in time.
- Runtime validation: Runtime validation is the continuous checking of an agent’s behaviour while it is operating, rather than only before deployment. It matters because agent decisions can change with context, so security controls need to detect unsafe action before it becomes a completed transaction.
- Agentic AI security: Agentic AI security is the set of governance and technical controls used to protect AI systems that can make decisions and act on them. It combines identity, access, monitoring, and policy enforcement so the agent remains within authorised boundaries during execution.
Deepen your knowledge
NHI governance, agentic AI identity, and machine identity lifecycle are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or NHI governance in your organisation, it is worth exploring.
This post draws on content published by WitnessAI: What is Agentic AI Security? Read the original.
Published by the NHIMG editorial team on 2025-12-24.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org