
Who is accountable when an AI agent takes action through an MCP server?

The accountable party is the human or team that authorised the agent’s access, but only if the organisation can prove that chain. Without immutable logs that connect the initiating identity to the tool call and the final action, accountability cannot be demonstrated, and legal or compliance teams lose the evidence they need.

Why This Matters for Security Teams

When an AI agent acts through an MCP server, the accountability question is not just “who approved it?” but “can the organisation prove who approved it, what it was allowed to do, and what it actually did?” That proof chain is where many programmes fail. Static RBAC is too blunt for autonomous, goal-driven systems, because an agent can change tools, chain actions, and behave differently by context. Current guidance increasingly points to intent-based authorisation and workload identity as better primitives than legacy human-centric access models, as reflected in the OWASP Agentic AI Top 10 and the CSA MAESTRO agentic AI threat modeling framework.

The practical issue is evidence. If the initiating identity, the policy decision, the MCP tool invocation, and the downstream side effect are not tied together in immutable logs, legal and compliance teams cannot reliably assign responsibility. The NIST AI Risk Management Framework is clear that governance must be measurable, not inferred after the fact. In practice, many security teams discover the gap only after an agent has already accessed data or executed a tool call without a defensible audit trail.

How It Works in Practice

Accountability should be engineered as a chain of custody, not assumed from a policy document. The human owner, the workflow owner, and the system owner may all share responsibility, but the organisation still needs a cryptographic and operational record that shows who authorised the agent, which workload identity was used, what scope was issued, and which MCP server executed the request. That is why workload identity matters: it proves what the agent is, while JIT credentials prove what it may do for this specific task.
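The chain of custody described above can be sketched as a hash-chained audit log. This is a minimal illustration, not a reference implementation: the AuditRecord field names, the append_record and verify_chain helpers, and the GENESIS_HASH placeholder are all assumptions introduced for the example.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

GENESIS_HASH = "0" * 64  # assumed placeholder hash for the first record


@dataclass(frozen=True)
class AuditRecord:
    # Field names are illustrative assumptions, not a standard schema.
    authorised_by: str      # human or team that approved the agent's access
    workload_identity: str  # identity the agent presented
    scope: str              # scope issued for this task
    mcp_server: str         # MCP server that executed the request
    tool_call: str          # tool that was invoked
    effect: str             # downstream side effect that was observed
    prev_hash: str          # digest of the preceding record

    def digest(self) -> str:
        # Canonical JSON so the same record always hashes to the same value.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_record(log: list, **fields) -> AuditRecord:
    """Append a record that commits to the digest of the previous one."""
    prev_hash = log[-1].digest() if log else GENESIS_HASH
    record = AuditRecord(prev_hash=prev_hash, **fields)
    log.append(record)
    return record


def verify_chain(log: list) -> bool:
    """Return False if any record no longer matches its successor's prev_hash."""
    expected = GENESIS_HASH
    for record in log:
        if record.prev_hash != expected:
            return False
        expected = record.digest()
    return True
```

Editing an earlier record changes its digest, so the next record's prev_hash stops matching and verify_chain returns False. The newest record has no successor, so its digest would still need to be anchored somewhere the log writer cannot modify, such as write-once storage.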

Best practice is evolving toward short-lived, context-aware access. Instead of a standing token with broad permissions, the agent receives ephemeral secrets or OIDC-bound credentials only for the current action. Policy decisions should be evaluated at request time using policy-as-code, not pre-baked into static roles that cannot reflect the agent’s live intent. For sensitive tool use, teams should require:

  • explicit human or policy approval before the first privileged action
  • per-task credential issuance with automatic revocation on completion
  • immutable logging that links user intent, agent identity, MCP server, and final effect
  • separation of duty between agent operators, platform admins, and approvers
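
As a rough sketch of the first two requirements, the following shows policy evaluated at request time, an approval gate for privileged tools, and a per-task credential that is revoked when the task ends. The POLICY table, the PRIVILEGED_TOOLS set, the TaskCredential type, and every identity and tool name are hypothetical; a real deployment would delegate these decisions to its own policy engine and identity provider.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass
class TaskCredential:
    token: str
    workload_identity: str
    tool: str
    expires_at: float
    revoked: bool = False


# Hypothetical policy: (workload identity, tool) pairs the agent may request.
POLICY = {("agent-billing", "read_invoice"), ("agent-billing", "issue_refund")}
# Hypothetical set of tools that need an explicit approver.
PRIVILEGED_TOOLS = {"issue_refund"}


def issue_credential(workload_identity, tool, approver=None,
                     ttl_seconds=60.0, now=None):
    """Evaluate policy at request time and mint a short-lived credential."""
    now = time.time() if now is None else now
    if (workload_identity, tool) not in POLICY:
        raise PermissionError(f"{workload_identity} may not call {tool}")
    if tool in PRIVILEGED_TOOLS and approver is None:
        raise PermissionError(f"{tool} requires explicit approval")
    return TaskCredential(
        token=secrets.token_urlsafe(32),
        workload_identity=workload_identity,
        tool=tool,
        expires_at=now + ttl_seconds,
    )


def is_valid(cred, tool, now=None):
    """A credential is valid only for its own tool, unrevoked and unexpired."""
    now = time.time() if now is None else now
    return not cred.revoked and cred.tool == tool and now < cred.expires_at


def run_task(workload_identity, tool, action, approver=None):
    """Issue a credential for one task and revoke it when the task ends."""
    cred = issue_credential(workload_identity, tool, approver)
    try:
        return action(cred)
    finally:
        cred.revoked = True  # revoked on success and on failure alike
```

Binding the credential to a single tool means that a token leaked during one task cannot be replayed against a different tool, and the finally block ensures revocation even when the action raises.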

That model aligns with findings in NHIMG research on agentic risk, including the OWASP NHI Top 10 and the Ultimate Guide to NHIs — 2025 Outlook and Predictions, which both emphasise that identity governance must follow non-human execution paths, not just user sign-in events. Where organisations can, they should also validate agent behaviour against the threat patterns described in the MITRE ATLAS adversarial AI threat matrix and the NIST AI Risk Management Framework.

These controls tend to break down when MCP servers are shared across teams without per-tenant scoping, because the tool layer becomes wider than the identity layer that is supposed to constrain it.
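
One way to picture per-tenant scoping is a check that narrows the tool layer to what the caller's identity permits. The sketch below assumes a hypothetical TOOL_TENANTS registry and a "tenant" claim on the workload identity; neither is part of MCP itself, and the tool names are invented for illustration.

```python
# Hypothetical registry: tool name -> tenants allowed to call it on a shared server.
TOOL_TENANTS = {
    "query_hr_records": {"hr"},
    "query_sales_pipeline": {"sales"},
    "search_docs": {"hr", "sales"},
}


def tools_for(tenant):
    """Return only the tools that this tenant is scoped to."""
    return {tool for tool, tenants in TOOL_TENANTS.items() if tenant in tenants}


def authorise_call(identity_claims, tool):
    """Deny by default: the call needs a tenant claim that covers the tool."""
    tenant = identity_claims.get("tenant")
    return tenant is not None and tool in tools_for(tenant)
```

With a check of this kind in front of every invocation, an agent holding a sales identity cannot reach query_hr_records even though both tools live on the same server.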

Common Variations and Edge Cases

Tighter control often increases operational overhead, requiring organisations to balance forensic certainty against developer speed. That tradeoff is real, especially in fast-moving agent deployments where teams want autonomous execution but still need a defensible accountability model. There is not yet a universal standard for how much delegation is acceptable, but current guidance suggests that higher-risk actions should never rely on a standing human approval alone.

Some environments complicate the answer. In a fully automated workflow, the accountable party may be the platform owner if the agent was granted access by policy, while in a business-owned workflow it may be the process owner who approved the agent’s objective. If an MCP server aggregates multiple tools, responsibility can also split across the agent builder, the server operator, and the data owner. That is why NHIMG research such as the AI LLM hijack breach and DeepSeek breach is useful: it shows how quickly privilege, data access, and execution can drift once an agent is operating with broad authority.

One important edge case is incident response. If logs are incomplete, accountability becomes a policy assertion rather than an evidentiary finding, and that is usually insufficient for legal review. Another is delegated use of a shared service account, where the real accountable party is the team that allowed the delegation pattern in the first place. In practice, the question is answered less by who clicked approve and more by whether the organisation can reconstruct the agent’s authority chain after the event.
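
Reconstructing the authority chain after an incident can be sketched as a completeness check over logged events. The stage names, the task_id correlation field, and the event layout below are assumptions made for illustration rather than a defined log format.

```python
# Assumed stages that a complete authority chain should contain, in order.
REQUIRED_STAGES = ("approval", "credential_issued", "tool_call", "effect")


def reconstruct_chain(events, task_id):
    """Return (chain, missing) for one task.

    chain holds the first event found for each stage, in stage order;
    missing lists the stages for which no evidence was logged.
    """
    by_stage = {}
    for event in events:
        if event.get("task_id") == task_id:
            # Keep the first event seen for each stage.
            by_stage.setdefault(event["stage"], event)
    chain = [by_stage[stage] for stage in REQUIRED_STAGES if stage in by_stage]
    missing = [stage for stage in REQUIRED_STAGES if stage not in by_stage]
    return chain, missing
```

An empty missing list means every link was recorded for that task. Any entry in it marks the point where accountability rests on assertion rather than evidence.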

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A1 | Agent autonomy and tool abuse create the core accountability risk here.
CSA MAESTRO | | MAESTRO frames agent governance, policy, and runtime controls for accountability.
NIST AI RMF | | AI RMF governance requires traceable ownership and accountable AI operation.

Bind every privileged action to intent, identity, and immutable logs before execution.