Who is accountable when an authorised AI agent causes a breach?

Accountability usually sits with the organisation that assigned the access, defined the workflow, and failed to instrument runtime oversight. The hard part is proving whether the failure was an entitlement decision, a workflow design issue, or a missing behavioural control, which is why governance ownership must span IAM, security engineering, and application teams.

Why This Matters for Security Teams

An authorised agent can still create a breach if its access was too broad, its task boundaries were unclear, or nobody monitored what it did at runtime. That is why accountability cannot stop at “the agent was approved.” It spans the business owner, IAM, security engineering, and the application team that exposed tools or data. Current guidance suggests treating agent behaviour as a governance problem, not only an access problem, consistent with the NIST AI Risk Management Framework and the OWASP Agentic AI Top 10. The scale of the gap is visible in SailPoint’s AI Agents: The New Attack Surface report, cited in NHIMG research: 80% of organisations say their AI agents have already acted beyond intended scope, while only 44% have any governing policy in place. In practice, many security teams first confront responsibility disputes after an agent has already accessed data or chained tools into an incident, rather than settling them in advance through intentional control design.

How It Works in Practice

For autonomous workloads, the right question is not just who approved access, but who controlled the agent’s effective authority at the moment of action. That usually means combining workload identity, JIT credentials, intent-based authorisation, and runtime policy checks. The agent should authenticate as a workload, not as a human proxy, and receive short-lived secrets only for the task it is currently performing. Best practice is evolving toward cryptographic workload identity, such as SPIFFE/SPIRE or OIDC-based tokens, because those signals identify what the agent is and what it is allowed to do now, not what someone hoped it might do later.
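The idea of short-lived, task-bound secrets can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: `CredentialBroker`, `TaskCredential`, the 300-second default TTL, and the SPIFFE-style identifier in the usage note are all hypothetical names and values, and a real deployment would delegate issuance to a secrets manager or workload identity provider rather than an in-memory dictionary.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TaskCredential:
    token: str
    workload_id: str   # identity of the agent as a workload (format is illustrative)
    task: str          # the single task this credential is valid for
    expires_at: float  # absolute expiry time, seconds since the epoch


class CredentialBroker:
    """Hypothetical in-memory broker for just-in-time, task-scoped credentials."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._active: Dict[str, TaskCredential] = {}

    def issue(self, workload_id: str, task: str,
              now: Optional[float] = None) -> TaskCredential:
        # Issue a fresh random token bound to one workload and one task.
        now = time.time() if now is None else now
        cred = TaskCredential(secrets.token_urlsafe(32), workload_id, task,
                              now + self.ttl_seconds)
        self._active[cred.token] = cred
        return cred

    def is_valid(self, token: str, task: str,
                 now: Optional[float] = None) -> bool:
        # Valid only if known, bound to the same task, and not yet expired.
        now = time.time() if now is None else now
        cred = self._active.get(token)
        return cred is not None and cred.task == task and now < cred.expires_at

    def revoke(self, token: str) -> None:
        # Called on task completion so the secret cannot be reused afterwards.
        self._active.pop(token, None)
```

In use, the orchestrator would call `issue("spiffe://example.org/agent/triage", "summarise-ticket")` before the task starts and `revoke(cred.token)` when it ends. Because the credential is checked against the task as well as the expiry time, a token issued for one task is rejected if the agent presents it for a different one.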

Operationally, teams should separate three decisions: who owns the agent, who defines its task constraints, and who approves its data and tool access. Then enforce those decisions with policy-as-code at request time, using context such as task, dataset, environment, and destination. That approach aligns with the CSA MAESTRO agentic AI threat modeling framework, the NIST AI Risk Management Framework, and the OWASP Agentic Applications Top 10. The practical control set usually includes:

JIT issuance of secrets with short TTLs and automatic revocation on task completion.
Per-tool and per-dataset scoping, rather than broad role grants.
Runtime approval for sensitive actions such as bulk data export, write operations, or privilege escalation.
Central logging that records intent, context, and result for later investigation.

These controls tend to break down when agents are given persistent service accounts with unrestricted toolchains, because the agent can reuse broad authority across many unseen action paths.
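A request-time policy check along the lines of the control set above might look like the sketch below. Everything named here is an assumption made for illustration: the `Request` fields mirror the context the text lists (task, dataset, environment, destination), while `GRANTS`, `SENSITIVE_TOOLS`, `APPROVED_DESTINATIONS`, the agent and tool names, and the three decision strings are hypothetical. A production system would more likely express these rules in a dedicated policy engine.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Request:
    agent: str
    task: str
    tool: str
    dataset: str
    environment: str
    destination: str


# Hypothetical per-tool, per-dataset, per-environment grants keyed by (agent, task).
GRANTS: Dict[Tuple[str, str], Set[Tuple[str, str, str]]] = {
    ("triage-agent", "summarise-ticket"): {("ticket_api.read", "support-tickets", "prod")},
    ("finance-agent", "pay-invoice"): {("payments.transfer", "invoices", "prod")},
}
SENSITIVE_TOOLS = {"payments.transfer", "iam.grant_role"}  # need runtime approval
APPROVED_DESTINATIONS = {"internal"}

AUDIT_LOG: List[dict] = []  # central record of intent, context, and result


def decide(req: Request, human_approved: bool = False) -> str:
    """Return 'allow', 'deny', or 'needs_approval' and log the decision."""
    scope = GRANTS.get((req.agent, req.task), set())
    if (req.tool, req.dataset, req.environment) not in scope:
        decision = "deny"            # outside the per-tool, per-dataset grant
    elif req.destination not in APPROVED_DESTINATIONS:
        decision = "deny"            # data would leave an approved destination
    elif req.tool in SENSITIVE_TOOLS and not human_approved:
        decision = "needs_approval"  # sensitive action awaits runtime approval
    else:
        decision = "allow"
    AUDIT_LOG.append({"intent": req.task, "context": req, "result": decision})
    return decision
```

The design choice worth noting is that the grant is keyed by agent and task together, so the same agent gets a different effective scope for each task, and every decision, including denials, lands in the log for later investigation.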

Common Variations and Edge Cases

Tighter control often increases latency and operational overhead, so organisations must balance containment against workflow friction. There is no universal standard for this yet, especially for multi-agent systems where one agent delegates to another and accountability becomes shared by design. In those cases, current guidance suggests assigning a primary owner for the originating workflow and a secondary owner for the runtime control plane, then documenting where liability changes from design-time approval to execution-time failure.

The hard cases are usually edge environments: legacy SaaS connectors, shared API gateways, and analyst-facing copilots that can call production tools. In those environments, static RBAC is often too blunt because it cannot express the difference between “read a ticket” and “move funds,” even when both actions sit inside the same role. That is why NHIMG’s The 52 NHI breaches Report and the DeepSeek breach both point to the same pattern: long-lived secrets and weak runtime controls turn authorised access into breach fuel. For investigations, the most relevant question is often not “was the agent allowed in?” but “was it allowed to keep acting after its original intent changed?”
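The investigative question of whether an agent kept acting after its intent changed can be approximated by replaying its logged actions against the intent it declared. The sketch below assumes a hypothetical `INTENT_ACTIONS` mapping and invented action names; it only shows the shape of the check, and real intent models are unlikely to reduce to a flat allow-list.

```python
from typing import Dict, List, Optional, Set

# Hypothetical mapping from a declared intent to the actions it justifies.
INTENT_ACTIONS: Dict[str, Set[str]] = {
    "summarise-ticket": {"ticket.read", "ticket.comment"},
    "pay-invoice": {"invoice.read", "payments.transfer"},
}


def first_out_of_intent(declared_intent: str, actions: List[str]) -> Optional[int]:
    """Return the index of the first logged action the declared intent
    does not cover, or None if every action stays within the intent."""
    allowed = INTENT_ACTIONS.get(declared_intent, set())
    for index, action in enumerate(actions):
        if action not in allowed:
            return index
    return None
```

For a session declared as `summarise-ticket`, a log of `["ticket.read", "payments.transfer"]` would flag index 1 as the point where the agent moved beyond its original intent, even though a static role might have permitted both actions.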

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Agent mis-scoping and runtime abuse map to agentic application risks.
CSA MAESTRO	C2	MAESTRO frames governance for autonomous agent behaviour and control planes.
NIST AI RMF	GOVERN	AI RMF govern function covers accountability and oversight for AI systems.

Document accountable owners, escalation paths, and monitoring for every agent workflow.
