Who is accountable when an AI agent delegation chain causes an unauthorised action?

Why This Matters for Security Teams

When an AI agent makes an unauthorised change, the real accountability question is not philosophical. It is operational: who approved the delegation, who owned the policy, and who could prove what the agent was allowed to do at that moment. Static IAM and broad RBAC often fail here because agents act autonomously, chain tools, and adapt to context in ways human workflows do not. That is why current guidance increasingly points to intent-based authorisation and real-time policy checks, not fixed permission sets. The issue is especially visible in agentic risk research such as OWASP NHI Top 10 and the NIST AI Risk Management Framework, both of which emphasize governance, traceability, and controlled behaviour over trust by default.

In practice, many security teams encounter accountability gaps only after an agent has already crossed a delegation boundary and the forensic trail is incomplete.

How It Works in Practice

The strongest accountability model starts with workload identity for each agent, then adds per-task delegation records, short-lived secrets, and policy evaluation at the moment of action. In a well-run chain, every hop should carry the sender identity, recipient identity, task ID, delegation depth, scope, validation outcome, and expiry. That makes it possible to identify whether the failure happened in orchestration, in policy design, or in a specific agent that exceeded its intended scope. Research from the AI Agents: The New Attack Surface report shows why this matters: 80% of organisations say their AI agents have already acted beyond intended scope, which means delegated behaviour is not a corner case.
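
The per-hop record described above can be sketched as a small data structure plus a chain walk. This is a minimal illustration, not a reference implementation: the class name `DelegationHop`, the function `first_violation`, the field names, and the example identities and scopes are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, Optional, Sequence, Tuple


@dataclass(frozen=True)
class DelegationHop:
    """One hop in a delegation chain (hypothetical field names)."""
    sender_id: str
    recipient_id: str
    task_id: str
    depth: int
    scope: FrozenSet[str]
    validation_outcome: str  # "allowed" or "denied"
    expires_at: datetime


def first_violation(
    chain: Sequence[DelegationHop], at: datetime
) -> Optional[Tuple[DelegationHop, str]]:
    """Return the first hop that breaks the chain and why, or None."""
    parent_scope = None
    for expected_depth, hop in enumerate(chain):
        if hop.depth != expected_depth:
            return hop, "depth mismatch"
        if hop.validation_outcome != "allowed":
            return hop, "policy denied"
        if hop.expires_at <= at:
            return hop, "delegation expired"
        # A hop may narrow the scope it received, but never widen it.
        if parent_scope is not None and not hop.scope <= parent_scope:
            return hop, "scope widened"
        parent_scope = hop.scope
    return None


# Illustrative chain: the second hop asks for a scope the first never held.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
chain = [
    DelegationHop("user:alice", "agent:planner", "task-1", 0,
                  frozenset({"tickets:read", "tickets:write"}),
                  "allowed", now + timedelta(minutes=5)),
    DelegationHop("agent:planner", "agent:worker", "task-1", 1,
                  frozenset({"tickets:read", "db:delete"}),
                  "allowed", now + timedelta(minutes=5)),
]
violation = first_violation(chain, now)
```

Walking the chain in order separates an orchestration fault (wrong depth or an expired delegation) from a policy fault (a denied hop) and from an agent that widened its own scope.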

Practitioners should treat agent permissioning as a runtime decision, not a static role assignment. That means:

issue JIT credentials for a single task or narrow time window, then revoke them automatically;

bind the agent to workload identity so cryptographic proof, not shared service accounts, defines who it is;

evaluate policy at request time with context such as task intent, data sensitivity, and destination tool;

log every delegation hop so investigators can reconstruct the chain after a breach.
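
The first and last items in the list above can be sketched with an in-memory issuer that binds each credential to one agent and one task, expires it after a short window, and records every issue and revoke event. The names `JitIssuer` and `TaskCredential`, the 60-second lifetime, and the log tuple layout are assumptions for illustration. A real deployment would more likely rely on a secrets manager or workload identity platform than on a Python dictionary.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TaskCredential:
    token: str
    agent_id: str
    task_id: str
    expires_at: datetime


class JitIssuer:
    """Hypothetical in-memory issuer of short-lived, task-bound credentials."""

    def __init__(self, ttl_seconds: int = 300) -> None:
        self._ttl = timedelta(seconds=ttl_seconds)
        self._active = {}
        self.audit_log = []  # (event, agent_id, task_id, ISO timestamp)

    def issue(self, agent_id: str, task_id: str, now: datetime) -> TaskCredential:
        cred = TaskCredential(secrets.token_urlsafe(32), agent_id, task_id,
                              now + self._ttl)
        self._active[cred.token] = cred
        self.audit_log.append(("issue", agent_id, task_id, now.isoformat()))
        return cred

    def revoke(self, token: str, now: datetime) -> None:
        cred = self._active.pop(token, None)
        if cred is not None:
            self.audit_log.append(
                ("revoke", cred.agent_id, cred.task_id, now.isoformat()))

    def is_valid(self, token: str, task_id: str, now: datetime) -> bool:
        # Valid only if still active, bound to this task, and not expired.
        cred = self._active.get(token)
        return (cred is not None
                and cred.task_id == task_id
                and now < cred.expires_at)


issuer = JitIssuer(ttl_seconds=60)
t0 = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
cred = issuer.issue("agent:worker", "task-1", t0)
```

Because each token maps to exactly one agent and one task, revoking it affects a single actor and the audit log attributes every credential to a named identity.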

For implementation patterns, the OWASP Agentic AI Top 10 and the CSA MAESTRO agentic AI threat modeling framework both support this shift toward contextual control. The same logic applies to secret abuse cases discussed in AI LLM hijack breach, where exposed credentials can be consumed by attackers within minutes. These controls tend to break down when multiple agents share one service principal because attribution and revocation no longer map cleanly to a single actor.

Common Variations and Edge Cases

Tighter delegation control often increases orchestration overhead, requiring organisations to balance traceability against latency, developer friction, and operational complexity. That tradeoff becomes sharper in multi-agent workflows, where one agent passes work to another and the security team must decide whether accountability sits with the platform owner, the primary agent owner, or the approving business function. There is no universal standard for this yet, so best practice is evolving toward documented ownership and explicit approval boundaries rather than informal assumptions.

Edge cases also appear when agents are allowed to discover tools dynamically, when they operate across teams, or when they can read and rewrite prompts, instructions, or memory. In those environments, static RBAC can look correct on paper but still permit unauthorised behaviour through chained actions. The safer pattern is to define what the agent is trying to do, then authorise that intent against context, data class, and current risk. That aligns with the accountability expectations reflected in the OWASP Agentic Applications Top 10 and the NIST AI Risk Management Framework. Where delegation chains cross external APIs, human-in-the-loop approval may still be needed for high-impact actions, but that is a policy choice, not a replacement for technical evidence. The model is weakest when logging is incomplete and shared secrets hide which agent actually initiated the action.
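
Authorising an intent against context, data class, and current risk can be sketched as a single request-time check. Everything here is an assumption made for illustration: the `POLICY` table, the intent and tool names, the three data classes, and the 0.7 risk threshold are not taken from any of the frameworks named above.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical ordering of data classes, lowest sensitivity first.
SENSITIVITY = {"public": 0, "internal": 1, "restricted": 2}

# Hypothetical policy: what each declared intent may touch.
POLICY = {
    "summarise_ticket": {"tools": {"ticket_api"},
                         "max_data_class": "internal"},
    "rotate_secret": {"tools": {"vault_api"},
                      "max_data_class": "restricted",
                      "needs_human": True},
}


@dataclass(frozen=True)
class ActionRequest:
    agent_id: str
    intent: str
    tool: str
    data_class: str
    risk_score: float  # 0.0 (low) to 1.0 (high), supplied by the caller
    human_approved: bool = False


def authorise(req: ActionRequest, risk_threshold: float = 0.7) -> Tuple[bool, str]:
    """Evaluate one action at request time; deny by default."""
    rule = POLICY.get(req.intent)
    if rule is None:
        return False, "unknown intent"
    if req.tool not in rule["tools"]:
        return False, "tool not permitted for intent"
    if req.data_class not in SENSITIVITY:
        return False, "unknown data class"
    if SENSITIVITY[req.data_class] > SENSITIVITY[rule["max_data_class"]]:
        return False, "data class exceeds intent"
    if req.risk_score >= risk_threshold:
        return False, "risk too high"
    if rule.get("needs_human") and not req.human_approved:
        return False, "human approval required"
    return True, "allowed"


ok = authorise(ActionRequest("agent:a", "summarise_ticket",
                             "ticket_api", "internal", 0.2))
chained = authorise(ActionRequest("agent:a", "summarise_ticket",
                                  "vault_api", "internal", 0.2))
gated = authorise(ActionRequest("agent:b", "rotate_secret",
                                "vault_api", "restricted", 0.3))
```

The second request shows the chained-action case: the agent holds a legitimate intent but reaches for a tool that intent does not cover, so the check denies it even though a broad static role might have allowed the call. Returning a reason string with every decision gives investigators the technical evidence the paragraph above calls for.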

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A2	Agentic controls address unauthorized chained actions and delegation abuse.
CSA MAESTRO	TRM	MAESTRO models delegated agent risk and accountability across orchestration layers.
NIST AI RMF	GOVERN	AI RMF governance frames ownership, traceability, and accountability for autonomous systems.

Bind each agent action to intent, context, and explicit authorization before execution.
