Who is accountable when an MCP agent accesses the wrong resource?

Accountability sits with the teams that defined consent, token handling, and policy review for the MCP deployment. If token passthrough, weak audience checks, or incomplete client approval allowed the request, that is a governance failure, not an agent anomaly. Frameworks such as NIST CSF and Zero Trust architecture expect explicit access validation.

Why This Matters for Security Teams

When an MCP agent reaches the wrong resource, the issue is rarely the model “going rogue” in isolation. The real failure is usually in the surrounding control plane: consent design, token scope, audience validation, policy review, and tool permissioning. That is why accountability lands with the teams operating the deployment, not with the agent itself. Current guidance from the OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework treats autonomous systems as governed workloads that must be constrained at runtime.

That matters because MCP deployments often blur the line between application logic and delegated identity. If the agent inherits broad access, uses a long-lived secret, or bypasses explicit audience checks, the wrong-resource access is an expected outcome of weak design. NHIMG’s OWASP NHI Top 10 coverage and the Ultimate Guide to NHIs both stress that identity governance must cover the machine workload, its secrets, and the policy engine around it. In practice, many security teams only discover this after an agent has already touched a sensitive system through an allowed but unintended path.

How It Works in Practice

Accountability should be assigned to the owners of the identity and policy controls that enabled the request. For MCP, that usually means the platform team, the application owner, and the security team that approved the trust model. The practical question is whether the agent had a valid workload identity, whether the token was scoped to the correct audience, and whether the tool call was evaluated against live policy before execution. This is where static RBAC often fails: a role can say what a service may do in general, but autonomous agents make intent-driven, context-dependent requests that change from one task to the next.
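The token questions in the paragraph above can be sketched as a small pre-execution check. This is a minimal illustration, not a reference implementation: it assumes the token signature has already been verified elsewhere, that `aud` and `exp` follow the usual JWT claim conventions, and that a custom `tools` claim lists permitted tool names. The function name, the `tools` claim, and the example URLs are all hypothetical.

```python
import time


def check_token_for_tool_call(claims, expected_audience, tool, now=None):
    """Return (allowed, reason) for a claims dict whose signature is already verified."""
    now = time.time() if now is None else now

    # "aud" may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_audience not in audiences:
        return False, "audience mismatch"

    # Reject tokens with a missing or past expiry.
    if claims.get("exp", 0) <= now:
        return False, "token expired"

    # "tools" is a hypothetical custom claim listing permitted tool names.
    if tool not in claims.get("tools", []):
        return False, "tool not permitted"

    return True, "ok"


# Illustrative claims for a hypothetical reporting agent.
claims = {
    "sub": "spiffe://example.org/agent/report-bot",
    "aud": "https://mcp.example.internal/reports",
    "exp": 2000,
    "tools": ["read_report"],
}
```

A token minted for the reports server fails this check when presented to any other audience, which is the property that token passthrough removes.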

A more resilient pattern is intent-based authorisation with just-in-time, ephemeral credentials. The agent presents workload identity, such as SPIFFE or OIDC-backed proof of what it is, then receives short-lived access only for the exact task and resource. Policy-as-code engines such as OPA or Cedar can evaluate the request at runtime using context such as user approval, task description, target resource, and risk tier. This aligns with the direction of the CSA MAESTRO agentic AI threat modeling framework and the NIST AI Risk Management Framework.
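To make the runtime decision concrete, here is a minimal Python sketch of an intent-based check. It is a stand-in for a real policy engine and does not use the OPA or Cedar APIs; the `ToolRequest` fields, the `TASK_POLICY` table, and the task and resource names are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolRequest:
    agent_id: str
    tool: str
    resource: str
    declared_task: str
    risk_tier: str       # "low" or "high" in this sketch
    user_approved: bool


# Hypothetical policy table: declared task -> (tool, resource) pairs it may use.
TASK_POLICY = {
    "summarise-q3-report": {("read_report", "reports/q3")},
    "archive-q3-report": {("move_report", "reports/q3")},
}


def decide(request):
    """Return (allowed, reason) for one tool call, evaluated at call time."""
    allowed_pairs = TASK_POLICY.get(request.declared_task, set())
    if (request.tool, request.resource) not in allowed_pairs:
        return False, "outside declared task intent"
    if request.risk_tier == "high" and not request.user_approved:
        return False, "high-risk action needs user approval"
    return True, "allowed"
```

The decision depends on the declared task and live context rather than on a standing role, so the same agent identity can be allowed for one task and denied for the next.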

  • Use per-task credentials with short TTLs and automatic revocation on completion.
  • Bind every token to an audience, resource, and tool permission set.
  • Log the user intent, policy decision, and downstream tool action for audit.
  • Separate the agent’s capability to decide from its capability to execute.
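The first three controls in the list above can be sketched together as an in-memory credential issuer. This is an illustrative sketch only: the class name, field names, and default TTL are assumptions, and a production system would rely on a real token service and durable audit storage rather than Python dictionaries and lists.

```python
import secrets
import time


class TaskCredentialIssuer:
    """Issues short-lived, per-task grants and records every decision."""

    def __init__(self, ttl_seconds=300):
        self.ttl_seconds = ttl_seconds
        self._active = {}      # opaque token -> grant details
        self.audit_log = []    # append-only list of audit events

    def issue(self, agent_id, user_intent, audience, resource, tools, now=None):
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        # Bind the grant to one audience, one resource, and a fixed tool set.
        self._active[token] = {
            "agent_id": agent_id,
            "audience": audience,
            "resource": resource,
            "tools": frozenset(tools),
            "expires_at": now + self.ttl_seconds,
        }
        self.audit_log.append({"event": "issued", "agent_id": agent_id,
                               "intent": user_intent, "resource": resource, "at": now})
        return token

    def authorize(self, token, audience, resource, tool, now=None):
        now = time.time() if now is None else now
        grant = self._active.get(token)
        allowed = (grant is not None
                   and now < grant["expires_at"]
                   and audience == grant["audience"]
                   and resource == grant["resource"]
                   and tool in grant["tools"])
        self.audit_log.append({"event": "tool_call", "tool": tool, "resource": resource,
                               "decision": "allow" if allowed else "deny", "at": now})
        return allowed

    def revoke(self, token):
        # Call on task completion so the grant cannot be reused.
        self._active.pop(token, None)
```

In this sketch the agent only ever holds the opaque token; the allow or deny decision is made by the issuer at call time, which is one simple way to keep the capability to decide apart from the capability to execute.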

NHIMG research shows why this discipline matters: SailPoint reported that 80% of organisations have seen AI agents perform actions beyond intended scope, including accessing unauthorised systems and sharing data. These controls tend to break down when a single agent chains tools across multiple services because the original approval context is lost between hops.
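One way to keep the approval context from being lost between hops is to pass it explicitly and re-check it at every service. The sketch below assumes a hypothetical `ApprovalContext` object and `call_tool` helper; neither is part of MCP, and the service and resource names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovalContext:
    approved_by: str
    task: str
    allowed_resources: frozenset


def call_tool(ctx, service, resource, trail):
    """Check the original approval on this hop, record it, and pass it on."""
    if ctx is None:
        raise PermissionError(f"{service}: no approval context on this hop")
    if resource not in ctx.allowed_resources:
        raise PermissionError(
            f"{service}: {resource} not covered by approval for '{ctx.task}'"
        )
    trail.append((service, resource, ctx.approved_by))
    return ctx  # the same immutable context is handed to the next hop


ctx = ApprovalContext(
    approved_by="alice",
    task="summarise-q3-report",
    allowed_resources=frozenset({"reports/q3", "charts/q3"}),
)
```

Because the context is immutable and required on every call, a downstream service cannot silently widen the scope that the user originally approved.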

Common Variations and Edge Cases

Tighter access control often increases operational overhead, requiring organisations to balance faster agent execution against stronger approval and auditing. That tradeoff is real, and there is no universal standard for every MCP workflow yet. In lower-risk environments, some teams allow broader standing access for convenience, but best practice is evolving toward just-in-time access whenever the agent can reach sensitive data or perform state-changing actions.

Edge cases usually appear when the agent acts on behalf of a human, when multiple agents share a single backend, or when secrets are cached in a client layer. In those scenarios, accountability can become shared across the product team, identity team, and approver workflow, but the governance failure still sits with the deployment design. The pattern is especially risky when access scoping is absent: Astrix Security’s Analysis of Claude Code Security and the Moltbook AI agent keys breach show how quickly overbroad credentials become an organisation-wide issue. In practice, the hardest failures involve shared MCP servers, where policy gaps and secret sprawl combine into accidental access paths that look like legitimate tool use.

For that reason, mature programs treat wrong-resource access as a control failure to be fixed with OWASP, NIST, and CSA-aligned governance, not as a post-incident debate about whether the agent “meant” to do it.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A2 | Addresses agent tool misuse and unintended actions through runtime guardrails.
CSA MAESTRO | TA-02 | Covers agent trust boundaries and delegated action accountability.
NIST AI RMF | GOVERN | Provides governance structure for accountability in autonomous AI systems.

Enforce runtime checks before each tool call and block any action outside declared task intent.