Who is accountable when an agent leaks data through an MCP server?

Why This Matters for Security Teams

When an agent leaks data through an MCP server, the incident is rarely just a “bad prompt” or a single operator mistake. MCP expands the agent’s effective reach into tools, data, and downstream systems, so accountability depends on who defined the trust boundary, who approved the tool exposure, and who owned the controls that should have limited the blast radius. Current guidance increasingly treats this as an identity and governance problem, not a narrow application bug.

That distinction matters because autonomous systems do not behave like static service accounts. They can chain tool calls, drift into unintended contexts, and expose secrets or records when authorization is too coarse. NHIMG’s AI Agents: The New Attack Surface report found that 80% of organisations say their AI agents have already performed actions beyond intended scope, including inappropriately sharing sensitive data. In practice, many security teams discover accountability gaps only while investigating a leak, rather than through intentional governance design.

How It Works in Practice

Accountability for MCP-related data leakage should be traced across the full control chain: the agent owner, the platform team operating the MCP server, the identity team issuing credentials, and the security function defining policy and monitoring. The most useful question is not “who clicked the wrong thing,” but “which control failed to prevent an authorised-but-unsafe action from becoming a breach?” That framing aligns with the NIST AI Risk Management Framework’s expectations for governance, mapping, and measurement, and with the CSA MAESTRO agentic AI threat modeling framework’s guidance on tracing agent actions through tool use and trust boundaries.

In practical terms, a defensible MCP deployment usually includes:

Tool-level authorization that limits which actions the agent can invoke.

Context validation that checks whether the request matches the task, data class, and environment.

Short-lived credentials for the agent and for the MCP server, rather than durable secrets.

Logging that ties every tool call to a workload identity and a policy decision.

Clear ownership for revocation, incident response, and post-incident review.
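As a rough illustration of the first, second, and fourth controls above, the sketch below checks a tool call against a per-agent policy and records the decision against a workload identity. The names `ToolPolicy`, `AuditLog`, and `authorize_tool_call` are hypothetical and are not part of MCP or any specific SDK. A real deployment would likely enforce this in the MCP server or a gateway and ship the log to a durable store.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical per-agent policy: tools, data classes, and environments in scope."""
    allowed_tools: frozenset
    allowed_data_classes: frozenset
    allowed_environments: frozenset


@dataclass
class AuditLog:
    """In-memory stand-in for an audit sink (assumption: real systems use durable storage)."""
    entries: list = field(default_factory=list)

    def record(self, workload_id: str, tool: str, decision: str, reason: str) -> None:
        # Tie every tool call to a workload identity and a policy decision.
        self.entries.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "workload_id": workload_id,
            "tool": tool,
            "decision": decision,
            "reason": reason,
        })


def authorize_tool_call(policy: ToolPolicy, log: AuditLog, workload_id: str,
                        tool: str, data_class: str, environment: str) -> bool:
    """Return True only if the tool, data class, and environment are all in scope."""
    if tool not in policy.allowed_tools:
        decision, reason = "deny", "tool_not_in_scope"
    elif data_class not in policy.allowed_data_classes:
        decision, reason = "deny", "data_class_not_permitted"
    elif environment not in policy.allowed_environments:
        decision, reason = "deny", "environment_not_permitted"
    else:
        decision, reason = "allow", "within_policy"
    # Denials are logged as well as approvals, so reviews can see what was attempted.
    log.record(workload_id, tool, decision, reason)
    return decision == "allow"
```

Logging the denial reason alongside the workload identity lets a post-incident review point to the specific control that did or did not fire.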

NHIMG’s The State of MCP Server Security 2025 underscores why this matters: 53% of MCP servers expose credentials through hard-coded values, and only 18% implement any form of access scoping for tool permissions. That means leak accountability often sits not with the agent itself, but with the teams that allowed a server to act as an over-privileged bridge between data and tools. These controls tend to break down in fast-moving prototype environments because MCP servers are promoted to production before scoped permissions, audit trails, and secret handling are fully enforced.
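One alternative to hard-coded values is a credential that expires on its own. The following is a minimal sketch, assuming a hypothetical in-process issuer; `ShortLivedCredential`, `issue_credential`, and `is_valid` are illustrative names, and the 300-second default lifetime is an arbitrary choice. In practice the token would come from an identity provider or secrets manager rather than being generated locally.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShortLivedCredential:
    token: str
    workload_id: str
    expires_at: float  # seconds since the Unix epoch


def issue_credential(workload_id: str, ttl_seconds: int = 300,
                     now: Optional[float] = None) -> ShortLivedCredential:
    """Issue a random token bound to a workload identity with a fixed lifetime."""
    issued_at = time.time() if now is None else now
    return ShortLivedCredential(
        token=secrets.token_urlsafe(32),
        workload_id=workload_id,
        expires_at=issued_at + ttl_seconds,
    )


def is_valid(credential: ShortLivedCredential, now: Optional[float] = None) -> bool:
    """A credential is valid strictly before its expiry time."""
    current = time.time() if now is None else now
    return current < credential.expires_at
```

The optional `now` parameter exists only to make expiry behaviour easy to test without waiting for real time to pass.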

Common Variations and Edge Cases

Tighter MCP governance often increases integration overhead, so organisations must balance speed of agent deployment against the cost of explicit control design. There is no universal standard for liability allocation yet, but current guidance suggests that shared accountability works best when it is assigned before launch and revisited whenever the agent’s toolset changes.

Edge cases are common when multiple teams own different layers of the stack. If the MCP server is managed by one team, the agent orchestration layer by another, and the data source by a third, blame assignment becomes fuzzy unless the trust boundary is documented in advance. The same issue appears when an agent is permitted to retrieve sensitive records for a legitimate task but the response is not constrained, redacted, or monitored. In those cases, the leak is usually an authorisation failure, a monitoring failure, or both. NHIMG’s 52 NHI Breaches Analysis and the OWASP Top 10 for Agentic Applications 2026 both point to the same operational lesson: when identity, tool scope, and telemetry are not aligned, incident reviews become attribution debates instead of control improvements.
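The unconstrained-response case can be made concrete with a small redaction step applied before a record is returned to the agent. This is only a sketch under stated assumptions: the field names in `SENSITIVE_FIELDS` are hypothetical, the function handles flat dictionaries only, and a real deployment would more likely rely on data classification labels than on a fixed list of keys.

```python
# Hypothetical field names; a real list would come from a data classification policy.
SENSITIVE_FIELDS = frozenset({"ssn", "api_key", "password"})


def redact_response(record: dict, sensitive_fields: frozenset = SENSITIVE_FIELDS):
    """Return a redacted copy of a flat record plus the sorted names of redacted fields."""
    redacted = {}
    removed = []
    for key, value in record.items():
        if key in sensitive_fields:
            redacted[key] = "[REDACTED]"
            removed.append(key)
        else:
            redacted[key] = value
    # The list of redacted fields can be logged so monitoring sees what was withheld.
    return redacted, sorted(removed)
```

Returning the redacted field names alongside the record gives monitoring something to alert on when an agent repeatedly requests data it should not receive in full.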

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A2	Tool misuse and overreach map to agentic authorization failure.
CSA MAESTRO	T1	MAESTRO covers trust boundaries and tool-chain risk in agentic systems.
NIST AI RMF	GOVERN	AI RMF governance covers ownership, accountability, and oversight for AI incidents.

Assign named owners for agent, MCP server, and data controls before deployment and review them regularly.
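A pre-deployment check for named owners could be as simple as the sketch below. The layer names in `REQUIRED_LAYERS` and the `missing_owners` helper are hypothetical; they mirror the three layers named above rather than any standard schema.

```python
# Hypothetical layer names mirroring the agent, MCP server, and data controls.
REQUIRED_LAYERS = ("agent", "mcp_server", "data_controls")


def missing_owners(owners: dict) -> list:
    """Return the layers that have no named owner (absent or blank), in a fixed order."""
    return [layer for layer in REQUIRED_LAYERS if not owners.get(layer, "").strip()]
```

A release pipeline could refuse to promote an MCP server while `missing_owners` returns a non-empty list, and rerun the check whenever the agent’s toolset changes.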

