MCP security testing must move beyond standard API assumptions

By NHI Mgmt Group Editorial Team | Published 2026-04-29 | Domain: Best Practices | Source: Aembit

TL;DR: MCP breaks standard API assumptions by introducing dynamic agent, server, and tool interactions across multiple trust boundaries, so generic scanners miss confused deputy, token passthrough, SSRF, and authorization-pattern failures, according to Aembit. The security model now depends on validating relationships, token flow, and per-request authorization, not just endpoint responses.

At a glance

What this is: This analysis explains why Model Context Protocol security requires MCP-specific testing, because standard API tools miss relationship-based and token-flow vulnerabilities.

Why it matters: IAM teams now have to validate consent, audience, and session boundaries for non-human identities that operate across multiple trust boundaries, not just protect passwords and endpoints.

By the numbers:

Non-human identities now outnumber humans by as much as 144:1.
That ratio grew 56% in just one year.

Context

Model Context Protocol security testing is not the same as testing a standard API. MCP adds agent-to-tool interaction across multiple trust boundaries, which means authorization, consent, token handling, and request validation all become relationship-aware rather than endpoint-only problems.

For identity teams, the issue is larger than protocol hygiene. MCP environments concentrate non-human identities, and the control plane has to account for short-lived tokens, client-specific consent, and tool invocation patterns that can change the security posture of workloads and AI agents in real time.

Key questions

Q: What breaks when MCP security is tested like a standard API?

A: Standard API testing misses relationship-based failures such as confused deputy abuse, token passthrough, and trust-boundary violations between clients and servers. MCP security depends on per-client consent, exact token audience validation, and request-level checks, so endpoint-only scans can produce false confidence while the deployment remains exploitable.

Q: Why do MCP systems make identity governance harder for NHI teams?

A: MCP systems make governance harder because non-human identities can move through several consent and tool layers before any user-facing action occurs. That expands the identity blast radius and makes a single trust decision more consequential. Teams need to govern the relationship chain, not just the token or the endpoint.

Q: How do security teams know whether MCP authorization is actually working?

A: Look for evidence that consent is stored per client, tokens are validated at each hop, and invalid audience or redirect values are rejected consistently. If a server accepts session-only state, relayed tokens, or vague consent prompts, authorization is functioning as a convenience layer rather than a control.

Q: Who is accountable when an MCP client grants access too broadly?

A: Accountability is shared by the teams that operate the client, the server, and the authorization policy, because MCP failures usually come from broken relationship handling rather than a single bad request. Security and platform teams should document who owns consent registration, token validation, and local execution restrictions.

Technical breakdown

Confused deputy testing in MCP

The confused deputy problem appears when a legitimate client is tricked into using its own authorized relationship to perform actions on behalf of another client. In MCP, the failure is not a broken endpoint but a missing per-client consent check across trust boundaries. Generic scanners often test one client in isolation, so they miss cross-client privilege transfer. Proper testing has to simulate at least two clients, different consent states, and the exact resources each client can reach. Cross-client privilege transfer is the mechanism that the specification's per-client consent requirement is meant to constrain.

Practical implication: Test whether consent is recorded and enforced per client application, not as a shared grant.
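
The two-client test described above can be sketched as follows. This is a minimal illustration, not an implementation from the MCP specification or from Aembit: ConsentStore, run_cross_client_check, and the client and scope names are all hypothetical. The point it shows is that consent is keyed by the (user, client) pair, so a grant to one client cannot satisfy a request from another.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentStore:
    """Hypothetical consent registry keyed by (user, client), not by user alone."""
    _grants: dict = field(default_factory=dict)

    def grant(self, user_id: str, client_id: str, scopes: set) -> None:
        # Record the grant for this exact user/client pair only.
        self._grants[(user_id, client_id)] = set(scopes)

    def is_allowed(self, user_id: str, client_id: str, scope: str) -> bool:
        # A grant for one client must never satisfy a request from another.
        return scope in self._grants.get((user_id, client_id), set())


def run_cross_client_check() -> dict:
    """Simulate two clients for the same user with different consent states."""
    store = ConsentStore()
    store.grant("alice", "client-a", {"files:read"})
    return {
        "same_client": store.is_allowed("alice", "client-a", "files:read"),
        "other_client": store.is_allowed("alice", "client-b", "files:read"),
        "unconsented_scope": store.is_allowed("alice", "client-a", "files:write"),
    }
```

A real test would run the same three checks against the deployed server rather than an in-memory store; only the first should succeed.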

Token passthrough and session-only auth failures

Token passthrough occurs when a server forwards a token instead of validating it directly at each hop. MCP explicitly rejects that pattern, and it also requires per-request token validation rather than relying on HTTP session state alone. This is a structural control, not a tuning preference. If a server accepts relayed tokens, or treats cookies and session state as sufficient authentication, the trust boundary has been collapsed. The same logic applies to audience validation: a token with a valid signature must still be rejected if its audience does not name the MCP service receiving it.

Practical implication: Verify that each hop validates the token directly against the authorization server and rejects session-only authentication.
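
A per-request check along these lines might look like the sketch below. It assumes the token's signature has already been verified against the authorization server and that the decoded claims are passed in as a dict; authorize_request and audience_matches are hypothetical names, and the audience URLs are placeholders. The "aud" claim is treated as either a string or a list of strings, as JWT allows.

```python
import time
from typing import Optional


def audience_matches(aud_claim, expected: str) -> bool:
    """Exact audience comparison; "aud" may be one string or a list of strings."""
    if isinstance(aud_claim, str):
        return aud_claim == expected
    if isinstance(aud_claim, (list, tuple)):
        return expected in aud_claim
    return False


def authorize_request(claims: Optional[dict], expected_audience: str,
                      now: Optional[float] = None) -> bool:
    """Return True only if this request carries its own valid token claims."""
    if claims is None:
        # No bearer token on this request (for example, session cookie only).
        return False
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        # Missing or expired expiry claim.
        return False
    # A token minted for a different service is rejected even if otherwise valid.
    return audience_matches(claims.get("aud"), expected_audience)
```

Calling authorize_request on every request, rather than once at session start, is what keeps session state from standing in for authentication.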

SSRF and local execution risks in MCP servers

MCP clients and local servers expand the attack surface because they may fetch metadata, follow redirects, call local tools, or execute commands. That creates room for SSRF, DNS rebinding, path traversal, and command injection if URL and input validation are weak. The specification requires HTTPS, exact redirect URI matching, schema validation, and strict handling of local file and command parameters. Stdio transport reduces exposure for local servers because it avoids opening a network listener, but it does not remove the need for explicit consent and parameter validation.

Practical implication: Block internal-address metadata fetches, validate redirects exactly, and constrain local paths and command execution through allowlists.
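
The checks in this section can be sketched with the Python standard library. The function names, hostnames, and paths below are illustrative assumptions. The sketch also assumes the caller resolves the hostname itself and then connects only to the addresses it validated, since validating one resolution and connecting to another would leave room for DNS rebinding.

```python
import ipaddress
from pathlib import Path
from urllib.parse import urlsplit


def is_blocked_address(ip_text: str) -> bool:
    """True for addresses a metadata fetch should never reach."""
    ip = ipaddress.ip_address(ip_text)
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) before classifying.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_reserved or ip.is_multicast or ip.is_unspecified
    )


def is_safe_metadata_url(url: str, resolved_ips: list) -> bool:
    """Require HTTPS and reject any URL whose host resolves to a blocked address."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        return False
    if not resolved_ips:
        return False
    return not any(is_blocked_address(ip) for ip in resolved_ips)


def redirect_uri_allowed(candidate: str, registered: set) -> bool:
    # Exact string comparison: no prefix, wildcard, or case-insensitive matching.
    return candidate in registered


def path_within_root(root: str, requested: str) -> bool:
    """True only if the requested path stays inside the allowed root after resolution."""
    base = Path(root).resolve()
    target = (base / requested).resolve()
    return target == base or base in target.parents
```

A test suite would feed these helpers the obvious hostile inputs: a link-local metadata address such as 169.254.169.254, a redirect URI with an extra path segment, and a file path containing "..".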

Threat narrative

Attacker objective: Turn legitimate MCP relationships into unauthorized tool access, data exposure, or command execution.

Entry occurs when an attacker exploits a confused deputy condition by persuading one MCP client to act with another client’s authorization.
Credential abuse follows when tokens are passed through intermediaries or session state is treated as sufficient, allowing unauthorized trust to persist across hops.
Impact lands when the attacker reaches tools, metadata endpoints, or local execution paths that were never meant to be accessible under the original consent scope.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
Salesloft OAuth token breach — hackers stole OAuth tokens to access Salesforce data via Salesloft.

NHI Mgmt Group analysis

Relationship testing is now a first-order identity control for MCP. MCP does not fail mainly at the endpoint layer. It fails where a trusted client, a token, and a tool all meet under different authorization expectations. That means the real governance object is the relationship between identity, client, and resource, not the request alone. Practitioners should treat per-client consent as a control boundary, not a UX detail.

Token passthrough is a control failure, not a protocol convenience. The specification’s prohibition matters because forwarding tokens destroys the ability to prove which actor actually exercised authority at each hop. Session-only authentication creates the same blind spot by replacing request-level verification with stateful trust. For NIST CSF and zero-trust programs, this is a governance gap in identity verification and trust continuity, not a tuning issue.

MCP security testing exposes an identity blast radius problem. Once agents can orchestrate tools, the security question becomes how far a single consent decision can propagate across clients, servers, and local execution paths. Generic API testing does not measure that blast radius because it assumes fixed request-response behavior. Practitioners need to reframe MCP assurance around the maximum damage a valid relationship can do when it is over-trusted.

Identity-first testing is the right lens for MCP workloads because non-human identities dominate the environment. In these systems, the control set has to cover attestation, short-lived tokens, and policy decisions that change with workload posture. The older assumption that authentication and authorization can be validated once at session start no longer holds. Security teams should treat MCP as a workload identity problem with protocol-specific failure modes, not as a standard web app problem.

The concept that ties these findings together is the identity blast radius. In MCP, one consent grant can span multiple tools, metadata lookups, and execution paths, so the reachable impact is much larger than the initial request suggests. That is the practical reason the specification insists on exact redirect checks, audience validation, and per-request token verification. The practitioner conclusion is simple: test the reach of each identity relationship, not just the correctness of each endpoint.
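
One way to make "reach" measurable is to model consent grants, tools, and downstream resources as a directed graph and compute what a single grant can transitively touch. The sketch below is an assumption-laden illustration: the node naming scheme, the edge map, and the blast_radius function are hypothetical and do not come from the MCP specification.

```python
from collections import deque


def blast_radius(start: str, edges: dict) -> set:
    """Breadth-first search for every node reachable from one consent grant."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical relationship map: a consent grant reaches tools,
# and each tool reaches further resources.
EXAMPLE_EDGES = {
    "consent:alice/client-a": ["tool:search", "tool:file-reader"],
    "tool:search": ["endpoint:metadata"],
    "tool:file-reader": ["path:/srv/reports"],
    "tool:shell": ["exec:local-commands"],
}
```

Calling blast_radius("consent:alice/client-a", EXAMPLE_EDGES) returns the two tools plus the metadata endpoint and the reports path, and excludes local command execution, because no edge connects this grant to tool:shell. Comparing that set with the intended consent scope gives a concrete test target.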

From our research:
Non-human identities now outnumber humans by as much as 144:1, according to the AI Agents: The New Attack Surface report.
A separate finding shows 80% of organisations report AI agents have already acted beyond intended scope, which is why protocol-level identity testing cannot be treated as optional.
For a broader control baseline, the 52 NHI Breaches report shows how identity failures become operational incidents once trust boundaries are crossed.

What this signals

Identity blast radius: MCP forces security teams to measure how far one valid consent decision can propagate across clients, servers, and local execution paths. That changes the focus from authentication success to trust-boundary containment, which is a better fit for workload IAM and zero-trust controls.

MCP also raises the bar for observability. If teams cannot correlate token validation, consent grants, redirect handling, and tool invocation logs, they will miss the earliest signs of relationship misuse. The practical shift is toward continuous validation of non-human identity behaviour, not periodic point-in-time assurance.

For readers building policy and control frameworks, this is the moment to align protocol testing with identity governance. OWASP guidance on agentic applications and workload identity practices such as SPIFFE-style attestation become more relevant when tool orchestration crosses multiple trust zones.

For practitioners

Test consent as a relationship, not a checkbox. Simulate multiple client applications with different privilege levels and verify that each client receives and retains only its own consent grants. A cross-client access attempt should fail even when the user identity is the same. This is the most direct way to catch confused deputy behavior.
Validate token flow at every trust boundary. Trace tokens through the MCP stack and confirm that each server validates them directly against the authorization server rather than relaying them. Reject session-only authentication and confirm that audience checks fail when a token is presented to the wrong service.
Lock down discovery, redirects, and local execution paths. Block internal IPs, loopback targets, and link-local addresses in discovery fields, enforce exact redirect URI matching, and sanitize local file and command inputs. Where local servers are used, prefer stdio transport and disable unnecessary network listeners.
Build observability for client-level identity behaviour. Log consent grants, token validation failures, redirect checks, tool invocations, and anomalous client-to-resource combinations. Correlate those events in the SIEM so you can see when a legitimate client starts behaving like a deputy for another relationship.
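
The correlation step in the last item can be sketched as a pass over normalized log events. The event schema here (ts, type, user, client, resource) and the function name are assumptions made for illustration; a real SIEM rule would use whatever fields the deployment actually emits.

```python
def find_unconsented_invocations(events: list) -> list:
    """Flag tool invocations with no earlier consent grant for the same
    (user, client, resource) combination."""
    consented = set()
    flagged = []
    # Process in timestamp order so a grant only covers later invocations.
    for ev in sorted(events, key=lambda e: e["ts"]):
        key = (ev["user"], ev["client"], ev["resource"])
        if ev["type"] == "consent_granted":
            consented.add(key)
        elif ev["type"] == "tool_invoked" and key not in consented:
            flagged.append(ev)
    return flagged
```

An invocation by client-b against a resource that only client-a was granted would be flagged, which is the log signature of a client acting as a deputy for another relationship.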

Key takeaways

MCP security fails most often at the relationship layer, where client consent, token flow, and tool authority intersect.
The scale of non-human identity growth makes protocol-specific testing essential, not optional, for workload IAM programmes.
Teams that verify per-client consent, direct token validation, and strict URL handling will reduce the identity blast radius of MCP deployments.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST Zero Trust (SP 800-207) and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-01	MCP token flow and consent failures map to NHI identity and secret handling risks.
NIST Zero Trust (SP 800-207)	Tenet 6 (Section 2.1)	Per-request validation and audience checks align with the tenet that authentication and authorization are dynamic and strictly enforced before access is allowed.
NIST CSF 2.0	PR.AA-05	MCP authorization needs least-privilege scoping and identity governance controls.

Validate per-client consent and direct token handling for all MCP trust boundaries.

Key terms

Confused Deputy: A confused deputy is a trusted component that is tricked into using its authority on behalf of the wrong requester. In MCP, the weakness appears when one client can consume another client’s authorization because consent is not enforced per relationship. The control failure is about misplaced trust, not a broken login.
Token Passthrough: Token passthrough is the practice of forwarding an authentication token through intermediaries instead of validating it at each trust boundary. In MCP this is prohibited because it prevents the server from proving who is actually authorised to act. The result is weaker accountability and a larger attack surface for stolen or replayed credentials.
Identity Blast Radius: Identity blast radius is the amount of damage a valid identity relationship can cause when it is over-trusted. In MCP, one consent decision can reach multiple tools, metadata services, and execution paths, so the blast radius is wider than a single request. The useful question is not whether access exists, but how far it can spread.
Per-Client Consent: Per-client consent means authorisation is recorded and enforced separately for each application or client, even when the same user is involved. For MCP, this prevents one trusted client from inheriting another client’s rights. It is a governance requirement that preserves the relationship boundary between identity, client, and resource.

Deepen your knowledge

MCP security testing and non-human identity governance are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If your team is building controls for agent-driven tool access, it is worth exploring.

This post draws on content published by Aembit: MCP security testing and how to verify the specification’s critical controls. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-04-29.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org