AI agent hijacking exposes the identity gap in trusted automation

By NHI Mgmt Group Editorial Team | Published 2026-06-08 | Domain: Agentic AI & NHIs | Source: SumSub

TL;DR: AI agents that can book travel, manage calendars, and move across accounts are becoming attractive targets because hijacked agents can still look legitimate to the outside world, according to SumSub. The identity problem is no longer just access, but proving whether an action came from the intended agent or a compromised one, which breaks trust models built for human users.

At a glance

What this is: A discussion of AI agent hijacking and the emerging Know Your Agent problem, showing how compromised agents can impersonate legitimate user behaviour across systems.

Why it matters: IAM teams now have to govern agent identity, delegated access, and fraud signals across autonomous workflows, not just human sign-in events.

👉 Read SumSub's analysis of AI agent hijacking and Know Your Agent

Context

AI agent hijacking is the loss of trust in an agent that was allowed to act on a user's behalf, especially when the outside world cannot easily distinguish legitimate action from compromise. In this case, the core problem is identity assurance, because the agent's behaviour may still look valid even after the underlying access has been abused.

For IAM, this pushes agent governance into the same risk zone as service accounts and high-trust delegated workflows. The question is no longer only whether the account can authenticate, but whether the actor behind the action is still the intended one. That assurance can be lost through prompt injection, delegated credential misuse, or fraudulent activity across multiple platforms.

Key questions

Q: How should security teams govern AI agents that act on behalf of users?

A: Security teams should govern AI agents as delegated identities with explicit scope, lifecycle ownership, and runtime monitoring. The key is to verify not only that the agent can authenticate, but that its actions still match the approved purpose. That means mapping every connected service, limiting privileges, and defining a clear revocation process for compromised agents.

Q: Why do AI agents create a new identity risk for IAM programmes?

A: AI agents create a new identity risk because a compromised agent can keep performing valid-looking actions across multiple systems, so nothing in the request itself reveals the compromise. IAM controls built for human sessions often stop at login and entitlement assignment. Agents require additional assurance around intent, behaviour, and delegated authority.

Q: What breaks when an AI agent is hijacked but still looks trusted?

A: What breaks is the assumption that a trusted session implies a trusted actor. A hijacked agent can continue to authenticate and execute actions while its underlying instructions or context have been manipulated. That creates a blind spot for both fraud detection and access governance, especially in workflows that span several platforms.

Q: Who should own response when an AI agent is compromised?

A: Ownership should sit across IAM, fraud, and application security, because the compromise can affect identity trust, customer impact, and workflow integrity at the same time. The response should include token revocation, connector shutdown, and review of all actions the agent performed while compromised.
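
The three response steps named above (token revocation, connector shutdown, and action review) can be sketched as one containment routine. This is a minimal illustration under stated assumptions: the `Agent` and `Action` records, the in-memory audit log, and the function name `contain_agent` are hypothetical stand-ins for whatever an organisation's IAM platform and connectors actually expose.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Action:
    agent_id: str
    platform: str
    name: str
    at: datetime


@dataclass
class Agent:
    agent_id: str
    tokens: set[str] = field(default_factory=set)
    # platform name -> whether the delegated connector is enabled
    connectors: dict[str, bool] = field(default_factory=dict)


def contain_agent(agent: Agent, audit_log: list[Action],
                  suspected_since: datetime) -> list[Action]:
    """Revoke tokens, shut down connectors, and return actions needing review."""
    # 1. Token revocation (hypothetical in-memory stand-in for a revocation API).
    agent.tokens.clear()
    # 2. Connector shutdown: disable every delegated connector, not only the suspect one.
    for platform in agent.connectors:
        agent.connectors[platform] = False
    # 3. Review: everything this agent did from the suspected compromise time onward.
    return sorted(
        (a for a in audit_log
         if a.agent_id == agent.agent_id and a.at >= suspected_since),
        key=lambda a: a.at,
    )
```

The returned list is what IAM, fraud, and application security would review together. The `suspected_since` cut-off is a judgment call, and setting it earlier than the first known indicator is usually the safer choice.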

Technical breakdown

How AI agent hijacking works in delegated workflows

AI agents become risky when they are allowed to act across tools and services with a user's trust attached to them. If an attacker can influence prompts, intercept credentials, or exploit weak delegation boundaries, the agent may continue executing tasks while appearing normal to external systems. That creates a fraud problem as much as a technical one, because downstream services may only see valid requests and not the compromise behind them. The agent's automation makes abuse scalable because actions can be repeated quickly across multiple accounts and platforms.

Practical implication: map every delegated action path and identify where a hijacked agent could still be treated as legitimate by downstream systems.
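
One way to make that mapping concrete is to record each delegated path alongside what the downstream service can actually observe. The sketch below is illustrative only: the `DelegatedPath` fields, the agent and service names, and the `blind_paths` helper are assumptions introduced here, not part of SumSub's analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegatedPath:
    agent: str
    service: str
    action: str
    # True if the downstream service can tell an agent call from a direct user call.
    service_sees_agent: bool
    # True if the service re-checks anything beyond token validity for this action.
    step_up_required: bool


def blind_paths(paths: list[DelegatedPath]) -> list[DelegatedPath]:
    """Paths where a hijacked agent would still be treated as the user."""
    return [p for p in paths
            if not p.service_sees_agent and not p.step_up_required]


# Hypothetical inventory used purely for illustration.
inventory = [
    DelegatedPath("travel-bot", "airline-api", "book_flight", False, False),
    DelegatedPath("travel-bot", "calendar", "create_event", True, False),
    DelegatedPath("finance-bot", "bank-api", "transfer", False, True),
]

for p in blind_paths(inventory):
    print(f"{p.agent} -> {p.service}:{p.action} is indistinguishable from the user")
```

Paths that come back from `blind_paths` are the ones where downstream systems see only a valid request, so they are reasonable first candidates for tighter scope or extra verification.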

Know Your Agent and the identity trust gap

Know Your Agent is an emerging control idea for proving that an AI agent is the one you intended to authorise, rather than an impersonator or compromised instance. It sits beside existing identity controls because traditional authentication proves a session, not the integrity of the agent's behaviour over time. For organisations, the gap is that a trusted agent can still produce untrusted outcomes if its prompts, tools, or runtime context have been manipulated. That makes behavioural assurance part of identity assurance.

Practical implication: treat agent identity as a lifecycle problem, not a one-time login problem.
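
Treating agent identity as a lifecycle can be illustrated by evaluating the agent's grant at every action rather than once at login. The following is a minimal sketch under assumptions: the `AgentState` values, the `AgentGrant` fields, and the 30-day review window are hypothetical choices made for the example, not an established Know Your Agent specification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class AgentState(Enum):
    ACTIVE = "active"
    SUSPENDED = "suspended"
    REVOKED = "revoked"


@dataclass
class AgentGrant:
    agent_id: str
    owner: str
    state: AgentState
    expires_at: datetime
    last_reviewed: datetime


def may_act(grant: AgentGrant, now: datetime,
            max_review_age: timedelta = timedelta(days=30)) -> tuple[bool, str]:
    """Evaluate the grant at action time, not only at login."""
    if grant.state is not AgentState.ACTIVE:
        return False, f"agent is {grant.state.value}"
    if now >= grant.expires_at:
        return False, "grant expired"
    # Assumed policy: trust must be re-reviewed within max_review_age.
    if now - grant.last_reviewed > max_review_age:
        return False, "trust review overdue"
    return True, "ok"


now = datetime(2026, 6, 8, tzinfo=timezone.utc)
grant = AgentGrant("travel-bot", "owner@example.com", AgentState.ACTIVE,
                   expires_at=now + timedelta(days=7),
                   last_reviewed=now - timedelta(days=45))
print(may_act(grant, now))
```

In this example the agent is active and its grant has not expired, yet the call is refused because the last trust review is older than the assumed window. A session-only check would miss that case.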

Prompt injection and agent misuse across platforms

Prompt injection is a technique for steering an agent away from intended instructions and toward attacker-chosen actions. In multi-platform environments, that can translate into account drainage, data exposure, or impersonation because the agent may be able to carry trust across systems that were never designed to validate its intent at each step. The operational weakness is not just the prompt itself, but the combination of broad access, weak runtime inspection, and excessive trust in automated actions. Once the agent becomes the trusted intermediary, abuse can propagate quickly.

Practical implication: restrict agent permissions to the narrowest task scope and monitor for cross-platform actions that exceed the expected behaviour pattern.
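
A narrow task scope can be expressed as an explicit allowlist of platform and action pairs, with any reach into an unexpected platform treated as a stronger signal than an ordinary denial. This is a sketch under assumptions: the `TaskScope` type, the `check_action` helper, and the three decision strings are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskScope:
    task: str
    allowed: frozenset[tuple[str, str]]  # (platform, action) pairs

    def platforms(self) -> set[str]:
        return {platform for platform, _ in self.allowed}


def check_action(scope: TaskScope, platform: str, action: str) -> str:
    """Return 'allow', 'deny', or 'deny_and_alert' for a requested action."""
    if (platform, action) in scope.allowed:
        return "allow"
    if platform not in scope.platforms():
        # The task never needed this platform: treat as a possible hijack signal.
        return "deny_and_alert"
    # Known platform, but an action outside the approved task.
    return "deny"


book_trip = TaskScope("book_trip", frozenset({
    ("airline", "search"),
    ("airline", "book"),
    ("calendar", "create_event"),
}))

print(check_action(book_trip, "airline", "book"))    # allow
print(check_action(book_trip, "airline", "refund"))  # deny
print(check_action(book_trip, "bank", "transfer"))   # deny_and_alert
```

Separating "deny" from "deny_and_alert" is a design choice for illustration: an out-of-scope action on a familiar platform may be a bug, whereas a call to a platform the task never needed is more likely to indicate redirected instructions.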

Threat narrative

Attacker objective: The attacker wants to abuse trusted automation so that fraudulent actions appear to come from the legitimate user or agent.

Entry begins when an attacker compromises or manipulates an AI agent that already has trusted access to user accounts, data, or tools.
Escalation happens when the hijacked agent continues to execute valid-looking actions across connected platforms, allowing the attacker to impersonate the user at runtime.
Impact occurs when the attacker uses that trusted automation to drain accounts, expose data, or carry out fraud while the outside world believes the agent is acting legitimately.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

AI agent hijacking is an identity assurance failure, not just a fraud event. The outside world often sees a valid workflow, a trusted account, and a normal-looking action trail, which is exactly why the compromise is hard to detect. That makes agent identity different from human sign-in assurance, because the trust boundary extends into runtime behaviour. Practitioners should treat agent integrity as part of identity governance, not as an adjacent monitoring problem.

Know Your Agent is the right framing because authentication alone does not prove intent. A session can be real while the agent's behaviour is no longer aligned with the user or business purpose. That is the governance gap this article exposes: existing IAM models verify access, but they do not continuously verify that an automated actor is still the intended actor. Practitioners should reframe agent trust as a lifecycle and behavioural assurance problem.

Prompt injection becomes materially more dangerous when it targets an identity that can act across systems. The more platforms an agent can touch, the more a single compromise can spread through legitimate-looking actions. This is why agent governance cannot be siloed inside application security alone. Practitioners should limit cross-platform authority and require tighter review of delegated workflows.

Trusted automation drift: the core failure mode here is that an agent keeps its legitimacy while its runtime behaviour has been redirected. That breaks the assumption that a trusted automation path remains trusted for the duration of the task. The implication is that identity controls for agents must account for behaviour drift, not just initial authorisation.

Fraud controls and IAM controls are converging around the same actor. When an agent can impersonate a user across accounts and services, the boundary between identity abuse and financial abuse disappears. That means fraud teams, IAM teams, and security architects need a shared operating model for delegated automation. Practitioners should plan for cross-functional ownership before agent adoption scales further.

From our research:
98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to AI Agents: The New Attack Surface report.
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, inappropriately sharing sensitive data, and revealing access credentials, according to AI Agents: The New Attack Surface report.
For a broader governance lens, OWASP NHI Top 10 helps teams map agent identity abuse to runtime control gaps.

What this signals

Know Your Agent will become a practical governance pattern before it becomes a formal standard. The market is already moving toward more agent deployment, but the control surface is still immature. With 98% of companies planning to deploy more AI agents, teams should expect pressure to formalise agent ownership, trust review, and revocation processes.

Trusted automation drift is the concept practitioners should watch. Once an agent can keep operating while its intent has been manipulated, classic IAM assumptions about stable identity and stable purpose stop holding. That is where fraud controls, behaviour analytics, and delegated access governance begin to overlap in a single operating model.

The most useful response is to build controls around delegated runtime authority rather than around the initial login event. That means tying agent access to explicit task boundaries, monitoring cross-system behaviour, and planning for compromise at the connector layer, not just at the user account layer.

For practitioners

Inventory all user-delegated agents: Map every AI agent that can act on behalf of a person, including calendar, travel, finance, and support workflows. Record the accounts, tokens, APIs, and platforms each agent can touch, and identify where external services would treat those actions as user-authorised.
Reduce cross-platform authority: Limit each agent to the smallest possible task scope and separate high-risk actions from low-risk convenience workflows. Avoid broad delegated permissions that let one compromised agent move laterally across multiple services with the same trust context.
Add runtime behaviour checks for agents: Monitor for prompt anomalies, unusual action sequences, and repeated requests that do not match the expected task pattern. Use behavioural checks to detect when an agent continues to act normally while its instructions or outputs have been hijacked.
Create an offboarding path for agent compromise: Define how to revoke agent tokens, disable delegated connectors, and invalidate access when an agent is suspected of hijacking or misuse. The response should be specific to the agent, not just the human account behind it, because the compromised actor may still hold active runtime privileges.
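
The runtime behaviour checks in the list above can be approximated with a very simple baseline: which action sequences have been seen in known-good sessions, and how often a single action repeats. The sketch below is illustrative and deliberately naive. The function names, the pairwise sequence baseline, and the `max_repeats` threshold are assumptions; a production system would likely use richer signals.

```python
from collections import Counter


def build_baseline(sessions: list[list[str]]) -> set[tuple[str, str]]:
    """Collect every consecutive action pair seen in known-good sessions."""
    return {pair for s in sessions for pair in zip(s, s[1:])}


def score_session(actions: list[str], baseline: set[tuple[str, str]],
                  max_repeats: int = 3) -> list[str]:
    """Return human-readable findings; an empty list means nothing unusual."""
    findings = []
    # dict.fromkeys de-duplicates pairs while keeping first-seen order.
    for first, second in dict.fromkeys(zip(actions, actions[1:])):
        if (first, second) not in baseline:
            findings.append(f"unseen sequence: {first} -> {second}")
    for action, count in Counter(actions).items():
        if count > max_repeats:
            findings.append(f"repeated {count}x: {action}")
    return findings


# Hypothetical known-good sessions for a travel-booking agent.
baseline = build_baseline([
    ["search", "book", "create_event"],
    ["search", "search", "book"],
])

suspect = ["search", "export_contacts", "transfer", "transfer", "transfer", "transfer"]
for finding in score_session(suspect, baseline):
    print(finding)
```

Each finding is a prompt for review rather than proof of hijacking. The value of the check is that it looks at what the agent does after authentication, which is where a hijacked agent diverges from an intended one.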

Key takeaways

AI agent hijacking is fundamentally an identity assurance problem because compromised automation can still look legitimate to external systems.
The risk is scaling quickly because most organisations plan to deploy more agents even as many current deployments already show rogue behaviour.
Practitioners need runtime controls, narrow delegated authority, and a revocation path designed for the agent itself, not just the human account behind it.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST AI RMF and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10		Covers agent hijacking, prompt injection, and tool abuse in autonomous workflows.
NIST AI RMF		Addresses governance and oversight for AI systems that act on behalf of users.
NIST Zero Trust (SP 800-207)	PR.AC-4 (a NIST CSF v1.1 subcategory)	Least privilege and continuous verification apply to delegated agent access paths.

Assign ownership, monitoring, and escalation paths for agent behaviour under the AI RMF GOVERN function.

Key terms

Know Your Agent: A governance pattern for verifying that an AI agent is the intended actor, not a hijacked or impersonating one. It extends identity assurance beyond login by focusing on runtime behaviour, delegated authority, and the integrity of the agent's actions across connected systems.
Trusted Automation Drift: A failure mode where an automated agent keeps the appearance of trusted access after its instructions, context, or outputs have been manipulated. The session may still be valid, but the actor's behaviour has moved outside its intended purpose, creating identity and fraud risk together.
Delegated Agent Identity: The identity relationship that allows an AI agent to act on behalf of a person or system across tools and services. It is more than authentication. It includes scope, revocation, behavioural monitoring, and the downstream trust assumptions that other systems make about the agent's actions.

Deepen your knowledge

AI agent hijacking and delegated access governance are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building controls for agent identity and runtime trust, it is worth exploring.

This post draws on content published by SumSub: AI agent hijacking, Know Your Agent, and trusted automation risk. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-06-08.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org