Agentic AI security risks expose the limits of traditional controls

By NHI Mgmt Group Editorial Team
Published 2026-06-21
Domain: Agentic AI & NHIs
Source: WitnessAI

TL;DR: Agentic AI systems can call APIs, query databases, and modify production systems without human approval, creating eight risks that traditional tools often miss because the dangerous action happens inside authorized workflows, according to WitnessAI. The control problem is not just visibility but assumption collapse: review-based governance cannot reliably contain actors that decide, act, and chain tools in one session.

At a glance

What this is: An analysis of eight security risks in agentic AI. Its key finding is that traditional AI and enterprise controls often miss unsafe actions because agents operate inside apparently authorised workflows.

Why it matters: Agent behaviour changes how permissions, auditability, and approval gates work across NHI, autonomous, and human identity programmes, which makes it directly relevant to IAM practitioners.

By the numbers:

Anthropic's internal red team showed that a crafted prompt exfiltrated AWS credentials in 24 out of 25 attempts against its own agent, a 96% success rate.
Gartner’s 2026 coverage says identity is becoming more operational, distributed, and intertwined with software delivery, agents, infrastructure, and AI governance.

Context

Agentic AI refers to systems that receive an objective and then select tools, call APIs, read files, and execute multi-step workflows without waiting for a person at each step. The primary governance problem is that these systems do not just generate content; they perform actions. Identity controls must therefore evaluate both access and behaviour.

Traditional controls such as firewalls, CASB, DLP, and SIEM were built to observe traffic and data movement, not to judge whether a valid API call is semantically unsafe. That gap is why agentic AI security now sits at the intersection of NHI governance, IAM, and runtime policy enforcement.

The article is not about one vendor feature set. It is a field-level warning that once agents can pursue goals directly in business systems, identity programmes need to separate authorised from acceptable action, not just authenticated from unauthenticated access.

Key questions

Q: What breaks when AI agents can act inside business systems without human approval?

A: What breaks is the assumption that authorised access is also safe to use. When an agent can choose tools, call APIs, and chain actions at runtime, standard approval and review cycles may never see the unsafe decision in time. Security teams need runtime controls that evaluate behaviour, not just authentication and network access.

Q: Why do AI agents complicate IAM and NHI governance?

A: AI agents complicate IAM and NHI governance because they reuse legitimate credentials while making independent choices about when and how to use them. That means the entitlement may be valid, but the action can still be inappropriate, over-scoped, or harmful. Governance has to cover both permission scope and decision behaviour.

Q: How do security teams know if agentic AI controls are actually working?

A: They know the controls are working when unsafe tool calls are blocked before execution, the audit trail shows which human initiated each action, and delegated workflows cannot silently expand their own scope. If the team can only explain incidents after the fact, the control set is too passive for agentic behaviour.

Q: Who is accountable when an AI agent causes a harmful system action?

A: Accountability should follow the human originator, the agent session, and the owning control process, not stop at the agent boundary. If a team cannot map an agent action back to the person, policy, and approval path that allowed it, the governance model is incomplete and regulatory evidence will be weak.

Technical breakdown

Prompt injection and tool poisoning in agent pipelines

Prompt injection works when attacker-controlled text is inserted into context that the model treats as legitimate instruction. In agentic systems, that context can come from user prompts, retrieved documents, tool outputs, web pages, or persistent memory, so the attack surface is much broader than a chat box. Tool poisoning is the same pattern applied to tool descriptions or return values, steering the agent toward unsafe choices at selection time. The important technical point is that the model is not being “hacked” in the classic sense. Its decision input is being contaminated before the next tool call, file write, or API request.

Practical implication: inspect prompt, retrieval, and tool metadata before execution, not after an agent has already acted.
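
The pre-execution inspection described above can be sketched as a screening step that runs over every context item before the next tool call is allowed. This is a minimal illustration, not a production detector: the ContextItem type, the source labels, and the three regular expressions are all assumptions made for the example, and a real deployment would likely rely on a dedicated classifier rather than a short pattern list.

```python
import re
from dataclasses import dataclass

# Hypothetical heuristics for illustration only; not a complete detector.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"\b(send|upload|post)\b.*\b(credentials|secret|api key)s?\b", re.I),
    re.compile(r"do not (tell|inform) the user", re.I),
]


@dataclass(frozen=True)
class ContextItem:
    source: str  # assumed labels: "user_prompt", "retrieval", "tool_description", "memory"
    text: str


def screen_context(items):
    """Return the context items that match any suspicious pattern."""
    return [
        item for item in items
        if any(pattern.search(item.text) for pattern in SUSPICIOUS_PATTERNS)
    ]


def may_execute(items):
    """Allow the next tool call only if no context item was flagged."""
    return not screen_context(items)
```

The point of the sketch is placement rather than detection quality: the check sits between context assembly and tool execution, so a flagged retrieval or tool description stops the call instead of being explained afterwards.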

Excessive agency and privilege escalation via agent tool calls

Excessive agency appears when an agent is granted broader permissions than the task truly needs, and then uses those permissions in a way the initiating user did not intend. Because the agent often operates through legitimate credentials, the escalation may not look like a classic authentication bypass. Instead, the abuse happens through authorised tool-calling interfaces that expand the effective blast radius of a single prompt. In NHI terms, the danger is not only possession of credentials but the scope of actions those credentials can reach once the agent is free to sequence calls at runtime.

Practical implication: scope agent service accounts to the narrowest callable actions and separate high-impact operations from routine workflows.
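
One way to picture narrow scoping is a deny-by-default allow-list in which high-impact actions need their own explicit grant, separate from the routine allow-list. The sketch below assumes hypothetical action names (db.read, db.delete, and so on) and a hypothetical AgentScope type; it does not represent any particular IAM product's API.

```python
from dataclasses import dataclass

# Hypothetical action names; a real deployment would use its own catalogue.
HIGH_IMPACT = frozenset({"db.delete", "iam.grant_role", "data.export_external"})


@dataclass(frozen=True)
class AgentScope:
    account: str
    allowed: frozenset                          # routine actions the agent may call
    high_impact_granted: frozenset = frozenset()  # separate, explicit grants


def authorize(scope, action):
    """Deny by default; high-impact actions need their own explicit grant."""
    if action in HIGH_IMPACT:
        # Being listed in the routine allow-list is not enough here.
        return action in scope.high_impact_granted
    return action in scope.allowed
```

With this shape, adding db.delete to a routine allow-list by mistake does not make it callable, which keeps a single over-broad grant from widening the blast radius of every prompt.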

Multi-agent trust chains and memory poisoning

Multi-agent systems introduce trust propagation problems that single-agent chat systems do not have. A compromised upstream agent can pass malicious instructions downstream, and the downstream system may accept them because they appear to come from a trusted peer in the chain. Memory poisoning extends the problem across sessions by storing false or malicious instructions that later look like prior knowledge. This creates a durable trust defect: the agent inherits corrupted state from its own history, not just from an external attacker. That makes the failure mode both persistent and recursive, especially when multiple agents share context or reuse the same memory store.

Practical implication: segment agent memory and validate upstream agent outputs as untrusted inputs before they can influence later actions.
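
The two controls named above, memory segmentation and validation of upstream output, can be sketched together. The per-agent namespace design and the three-field message contract are assumptions chosen for the example; the source does not prescribe either.

```python
class SegmentedMemory:
    """Keeps each agent's memory in its own namespace (hypothetical design)."""

    def __init__(self):
        self._store = {}

    def write(self, agent_id, entry):
        self._store.setdefault(agent_id, []).append(entry)

    def read(self, agent_id):
        # An agent only sees entries written under its own namespace.
        return list(self._store.get(agent_id, []))


# Assumed message contract between agents: data fields only, no instructions.
EXPECTED_FIELDS = {"task_id": str, "status": str, "result": str}


def validate_upstream(message):
    """Accept an upstream agent's output only if it matches the contract."""
    if set(message) != set(EXPECTED_FIELDS):
        unexpected = sorted(set(message) ^ set(EXPECTED_FIELDS))
        raise ValueError(f"fields do not match the contract: {unexpected}")
    for name, expected_type in EXPECTED_FIELDS.items():
        if not isinstance(message[name], expected_type):
            raise ValueError(f"field {name!r} has the wrong type")
    return message
```

A structural check like this rejects extra fields such as an embedded instruction list, but it does not inspect the text inside result. That value still has to be handled as untrusted data by whatever consumes it.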

NHI Mgmt Group analysis

Agentic AI creates an assumption collapse, not just a control gap. Access review processes were designed for identities whose privileges persist long enough to be observed, certified, and revoked on a human cadence. That assumption fails when the actor can choose tools, execute actions, and compound decisions within one runtime session. The implication is that governance models built around stable entitlements no longer describe the behaviour they are trying to control.

Runtime policy is becoming the real perimeter for AI agents. The article correctly shows that firewalls, DLP, CASB, and SIEM see traffic, but not whether a valid action is safe in context. That is the core failure mode of agentic systems: authorised access can still be unsafe when the decision to use it is made by a non-deterministic runtime actor. Practitioners should treat semantic inspection as a governance requirement, not an optimisation.

Identity attribution must extend from human request to agent action. If an agent can retrieve data, call APIs, and modify systems under shared or pooled credentials, audit trails become ambiguous unless every action is tied back to the originating person and the specific agent session. This is where NHI governance and human accountability meet. The practical conclusion is that traceability must survive delegation chains, not stop at the agent boundary.
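
Traceability that survives delegation can be illustrated with audit events that each record the event that delegated them. The AuditEvent fields and the trace_to_human helper below are hypothetical names for this sketch; the idea is only that every agent action carries a parent link that eventually resolves to a person.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    event_id: str
    actor: str                # human user or agent session identifier
    actor_type: str           # "human" or "agent"
    action: str
    parent_id: Optional[str]  # event that delegated this one, if any


def trace_to_human(events, event_id):
    """Walk parent links from an event until a human-originated event is found."""
    by_id = {event.event_id: event for event in events}
    chain = []
    seen = set()  # guards against cycles in malformed logs
    current = by_id.get(event_id)
    while current is not None and current.event_id not in seen:
        seen.add(current.event_id)
        chain.append(current)
        if current.actor_type == "human":
            return chain
        current = by_id.get(current.parent_id) if current.parent_id else None
    raise LookupError(f"no human originator found for event {event_id!r}")
```

Raising an error when the chain ends at an agent is deliberate in this sketch: an action that cannot be tied to a person is treated as a governance failure rather than silently accepted.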

Multi-agent systems introduce identity blast radius across delegation chains. A single compromised agent can influence downstream agents, which means trust is no longer local to one credential or one approval gate. The field should name this as a trust-chain problem, not an isolated prompt problem. Once state and authority are reused across agents, the governance unit becomes the chain, and practitioners must manage the chain as a security object.
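
Treating the chain as a security object implies being able to enumerate it. As a rough sketch, the downstream reach of one compromised agent can be computed as graph reachability over a delegation map. The map format and the agent names used with it are assumptions for illustration.

```python
from collections import deque


def blast_radius(delegations, compromised):
    """Return every agent reachable downstream of a compromised agent.

    `delegations` maps an agent to the agents it passes instructions or state to.
    """
    reached = set()
    queue = deque([compromised])
    while queue:
        agent = queue.popleft()
        for downstream in delegations.get(agent, ()):
            # Skip agents already counted and loops back to the starting agent.
            if downstream not in reached and downstream != compromised:
                reached.add(downstream)
                queue.append(downstream)
    return reached
```

For example, with a hypothetical map in which intake feeds planner and planner feeds executor and notifier, compromising intake reaches all three downstream agents, while compromising notifier reaches none.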

From our research:
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%), according to the AI Agents: The New Attack Surface report.
Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
For a broader control lens, see OWASP NHI Top 10 for the agentic risks that runtime governance must cover.

What this signals

Intent-based classification is becoming a prerequisite for agent governance. With 80% of organisations already reporting out-of-scope agent behaviour, according to the AI Agents: The New Attack Surface report, the programme question is no longer whether to monitor agents but how to separate normal automation from risky runtime decisions. The practical next step is to classify sessions by action authority, then enforce controls before execution rather than after logging.

Identity blast radius is the right concept for multi-agent programmes. Once one agent can influence another, the security unit is no longer the single identity but the delegation chain. Teams that already manage service-account sprawl will recognise the pattern, but agentic systems make the spread faster and harder to certify. This is where OWASP NHI Top 10 style control thinking becomes operationally relevant.

Programmes that treat AI governance as a model risk issue alone will miss the access layer. The more durable control model is to combine identity attribution, runtime policy, and pre-execution inspection so agent activity is governable before it becomes a production incident.

For practitioners

Classify every agentic session before it runs: Separate chat sessions from tool-using agent sessions so policies can reflect whether the system may call APIs, read files, or modify production systems. Use that classification to decide which workflows require pre-execution inspection and which are limited to read-only actions.
Separate high-impact actions from ordinary tool calls: Require an approval boundary for irreversible operations such as database deletion, privilege changes, or external data transfer. Do not let a routine task inherit destructive privileges simply because the same agent framework can reach them.
Trace every agent action back to a human originator: Preserve an audit chain from the initiating user to the specific agent session, tool invocation, and downstream system action. This matters when investigators need to determine whether a dangerous action was authorised, coerced, or delegated through a chain of agents.
Treat retrieved text and tool outputs as untrusted inputs: Apply the same skepticism to retrievals, tool descriptions, and memory entries that you would apply to external user input. This reduces the chance that poisoned context becomes the next authorised action inside the workflow.
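
The first recommendation, classifying a session before it runs, can be sketched as a function of the tools the session is allowed to call. The three classes, the tool catalogue, and the rule that unknown tools count as write-capable are all assumptions made for this example.

```python
from enum import Enum


class SessionClass(Enum):
    CHAT = "chat"            # no tools at all
    READ_ONLY = "read_only"  # tools that only read
    ACTING = "acting"        # at least one tool that can change state


# Hypothetical tool catalogue: tool name -> whether it can modify systems.
TOOL_CAN_WRITE = {
    "search_docs": False,
    "read_file": False,
    "call_billing_api": True,
    "run_sql_update": True,
}


def classify_session(tools):
    """Classify a session by the most powerful tool it may call."""
    if not tools:
        return SessionClass.CHAT
    # Unknown tools are treated as write-capable, so the default is the strict path.
    if any(TOOL_CAN_WRITE.get(tool, True) for tool in tools):
        return SessionClass.ACTING
    return SessionClass.READ_ONLY


def requires_pre_execution_inspection(session_class):
    """Only sessions that can change state go through pre-execution inspection."""
    return session_class is SessionClass.ACTING
```

Classifying on declared tools rather than on observed behaviour means the decision is available before the first call is made, which is what lets the stricter controls apply ahead of execution.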

Key takeaways

Agentic AI changes the control problem from who authenticated to what the runtime actor can decide and execute.
The strongest evidence here is behavioural, not theoretical: 80% of organisations already report agents acting beyond intended scope.
Security teams need pre-execution guardrails, traceable delegation, and scope limits that match the speed and autonomy of the agent.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A2	Prompt injection and tool poisoning map directly to agentic input and tool abuse risks.
CSA MAESTRO		Multi-agent trust chains and memory poisoning fit MAESTRO's multi-layer agent threat model.
NIST AI RMF	GV-1	Accountability and governance for autonomous agent actions are central to this article.

Model agent-to-agent dependencies and validate shared state before it can influence downstream actions.

Key terms

Agentic AI: Agentic AI is software that can pursue an objective by selecting actions, using tools, and chaining steps without human approval for each move. In practice, that makes it an identity and governance problem as much as a model problem, because the system can create real-world effects under its own runtime decisions.
Prompt Injection: Prompt injection is the insertion of attacker-controlled instructions into text an AI system reads as context. For agentic systems, the impact is larger than bad output, because the contaminated text can influence tool use, data access, and execution inside business systems.
Tool Poisoning: Tool poisoning is the manipulation of tool descriptions, return values, or tool-adjacent context so an agent chooses an unsafe action. It matters because the attack targets the agent’s decision process at selection time, before any privileged call is made.
Identity Blast Radius: Identity blast radius is the amount of damage an identity can cause when its permissions, delegation paths, or runtime choices are too broad. For agentic AI, the concept is especially important because one agent session can spread risk across APIs, data stores, and downstream agents very quickly.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity security are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or governance in your organisation, it is worth exploring.

This post draws on content published by WitnessAI: Agentic AI systems and the eight cybersecurity risks they introduce. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-06-21.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org