
How should security teams review an MCP gateway for SOC 2 and HIPAA?

Start by treating the gateway as a governed access path for agent traffic, not as a standalone product feature. Reviewers should ask where it runs, what data crosses the boundary, how tool permissions are enforced, what evidence is logged, and whether regulated content can remain inside the customer’s control boundary.

Why This Matters for Security Teams

An MCP gateway is not just a routing layer. For SOC 2 and HIPAA, it becomes part of the control plane that decides which agent can touch which systems, which prompts or tool outputs may cross boundaries, and what evidence exists after the fact. That means reviewers need to assess access control, auditability, least privilege, data handling, and change management as operational controls, not product claims. In short, the gateway should be reviewed like any other governed access path.

This is especially important because agentic workloads do not behave like static applications. In the State of MCP Server Security 2025, Astrix Security found that only 18% of MCP server deployments implement any form of access scoping for tool permissions. That gap matters when an MCP gateway brokers access to PHI, credentials, or internal tools. Standards for agent security are still emerging, and the OWASP Agentic AI Top 10 is a useful benchmark for the risk patterns security teams should expect.

In practice, many security teams discover gateway weaknesses only after an agent has already overreached across tools, logs, or data domains.

How It Works in Practice

A useful review starts with the trust boundary. Security teams should determine whether the MCP gateway runs inside the customer environment, in a vendor-managed service, or in a hybrid model, then map every data flow that passes through it. For SOC 2, that affects confidentiality, logging, access review, and monitoring evidence. For HIPAA, it affects whether PHI is stored, forwarded, transformed, or retained outside the covered entity or business associate boundary. The control question is not only “does it work?” but “can it prove who accessed what, when, and under what approval?”

Reviewers should validate that tool permissions are enforced at runtime, not merely documented in a policy. That includes scoping which tools the agent can call, whether secrets are injected per request, whether responses are filtered before reaching the model, and whether high-risk actions require step-up approval. The OWASP Agentic Applications Top 10 is relevant here because prompt injection, tool abuse, and excess agency often show up first at the gateway layer. The gateway should also emit logs that are tamper-evident and sufficiently detailed for incident response, including tool name, caller identity, request context, policy decision, and data classification tags.
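To make the logging requirement concrete, the following is a minimal Python sketch of a hash-chained audit log. The record fields mirror the ones listed above; the names AuditRecord, append_record, chain_head, and verify_chain are hypothetical and are not taken from any MCP gateway product. The sketch assumes the latest chain head is stored somewhere the gateway operator cannot rewrite, since the chain alone cannot detect a wholesale replacement.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

GENESIS_HASH = "0" * 64  # placeholder value that precedes the first record


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit record with the fields named in the text."""
    tool_name: str
    caller_identity: str
    request_context: str
    policy_decision: str        # for example "allow" or "deny"
    data_classification: tuple  # for example ("PHI",)
    prev_hash: str              # hash of the previous record in the chain


def record_hash(record):
    """Hash the record's canonical JSON form (sorted keys)."""
    payload = json.dumps(asdict(record), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_record(log, **fields):
    """Append a record that commits to the hash of its predecessor."""
    prev = record_hash(log[-1]) if log else GENESIS_HASH
    record = AuditRecord(prev_hash=prev, **fields)
    log.append(record)
    return record


def chain_head(log):
    """Hash of the newest record; store this outside the gateway."""
    return record_hash(log[-1]) if log else GENESIS_HASH


def verify_chain(log, trusted_head):
    """Return True only if every link matches and the head is the trusted one."""
    expected = GENESIS_HASH
    for record in log:
        if record.prev_hash != expected:
            return False
        expected = record_hash(record)
    return expected == trusted_head
```

A reviewer could ask the vendor to demonstrate the equivalent check: edit or delete one historical entry in a test environment and confirm that verification fails.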

  • Confirm whether PHI is ever cached, embedded in traces, or copied into support tooling.
  • Check whether access decisions are enforced by policy at request time, not by static roles alone.
  • Verify secret handling, especially rotation, redaction, and revocation after task completion.
  • Test whether denied tool calls are still observable in audit logs.
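The checks above on request-time enforcement and denied-call visibility can be sketched as follows. This is an illustrative Python example, not a real gateway API: ALLOWED_TOOLS, STEP_UP_TOOLS, ToolRequest, and decide are hypothetical names, and a real deployment would load policy from a managed store rather than module-level constants.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy tables, hard-coded here only for illustration.
ALLOWED_TOOLS = {"agent-intake": {"ehr.lookup", "calendar.read"}}
STEP_UP_TOOLS = {"ehr.lookup"}  # high-risk tools that need human approval


@dataclass(frozen=True)
class ToolRequest:
    agent_id: str
    tool_name: str
    approved_by: Optional[str] = None  # identity of the approver, if any


def decide(request, audit_log):
    """Evaluate one tool call at request time and always record the outcome."""
    allowed = ALLOWED_TOOLS.get(request.agent_id, set())
    if request.tool_name not in allowed:
        decision = "deny"
    elif request.tool_name in STEP_UP_TOOLS and request.approved_by is None:
        decision = "needs_approval"
    else:
        decision = "allow"
    # The entry is written for every outcome, so denials stay observable.
    audit_log.append({
        "agent_id": request.agent_id,
        "tool_name": request.tool_name,
        "decision": decision,
        "approved_by": request.approved_by,
    })
    return decision
```

The design point is that the audit write sits outside the branch logic: a denied or pending call produces the same evidence as an allowed one.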

For implementation guidance, the OWASP Top 10 for Agentic Applications 2026 and the Analysis of Claude Code Security both reinforce a common pattern: control effectiveness depends on runtime governance, not developer intent alone. These controls tend to break down when the gateway proxies multiple tenants or forwards PHI into downstream tools that lack their own enforcement and logging.

Common Variations and Edge Cases

Tighter gateway control often increases integration overhead, requiring organisations to balance auditability against development speed. That tradeoff is real for SOC 2 and HIPAA reviews because overly restrictive policies can disrupt legitimate agent workflows, while loose policies create compliance blind spots. Best practice is evolving, and there is no universal standard for exactly how much agent telemetry must be retained or how granular tool-level approvals should be.

One edge case is when the MCP gateway is only a broker and the actual PHI exposure happens in downstream systems. In that model, reviewers still need evidence that the gateway does not become an uncontrolled relay for sensitive data. Another edge case is shared infrastructure, where logs, prompts, and tool outputs may cross tenant boundaries unless isolation is explicit. Security teams should also check whether human operators can override tool permissions without leaving a durable audit trail, since that can undermine both SOC 2 evidence and HIPAA accountability.

Where the environment includes sub-processors, external model endpoints, or federated agents, the review should extend beyond the gateway itself and into the full data path. If that path cannot be bounded and logged end to end, the gateway is not ready to be treated as a compliant control point.
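One way to reason about the end-to-end condition is to model the data path as a list of hops and flag any hop that is either outside the reviewed boundary or not logged. The Python sketch below is illustrative only; Hop, unbounded_hops, and path_is_reviewable are hypothetical names, and what counts as "in scope" (for example, coverage by a business associate agreement) is an assumption the reviewer must define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One system on the data path (gateway, model endpoint, downstream tool)."""
    name: str
    in_scope: bool          # assumed meaning: inside the reviewed control boundary
    emits_audit_logs: bool  # the hop produces logs the reviewer can obtain


def unbounded_hops(path):
    """Names of hops that are out of scope or produce no audit evidence."""
    return [hop.name for hop in path
            if not (hop.in_scope and hop.emits_audit_logs)]


def path_is_reviewable(path):
    """A path qualifies only if it is non-empty and every hop is bounded and logged."""
    return len(path) > 0 and not unbounded_hops(path)


example_path = [
    Hop("mcp-gateway", in_scope=True, emits_audit_logs=True),
    Hop("external-model-endpoint", in_scope=False, emits_audit_logs=True),
    Hop("downstream-ehr-tool", in_scope=True, emits_audit_logs=False),
]
```

With example_path, unbounded_hops returns the external model endpoint and the downstream tool, which is the list of follow-up questions for the review.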

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Non-Human Identity Top 10 and OWASP Agentic AI Top 10 address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Non-Human Identity Top 10 | NHI-03 | MCP gateways often depend on secret handling and rotation, which is central to this control.
OWASP Agentic AI Top 10 | | Agent tool abuse and excessive agency are core MCP gateway review risks.
NIST AI RMF | | AI RMF supports governance, accountability, and monitoring for AI-mediated access decisions.

Inventory gateway secrets, enforce short TTLs, and automate rotation and revocation for all tool credentials.
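A secret inventory check along these lines could look like the Python sketch below. It is a minimal illustration under stated assumptions: the one-hour MAX_TTL is an arbitrary example threshold, and GatewaySecret, review_secret, and audit_inventory are hypothetical names rather than part of any secrets manager's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_TTL = timedelta(hours=1)  # assumed policy threshold, chosen for illustration


@dataclass(frozen=True)
class GatewaySecret:
    name: str
    issued_at: datetime
    ttl: timedelta
    revoked: bool = False


def review_secret(secret, now):
    """Return the policy findings for one inventoried secret."""
    findings = []
    if secret.ttl > MAX_TTL:
        findings.append("ttl_exceeds_policy")
    if not secret.revoked and now >= secret.issued_at + secret.ttl:
        findings.append("expired_but_not_revoked")
    return findings


def audit_inventory(secrets, now):
    """Map secret name to findings, omitting secrets with nothing to report."""
    report = {}
    for secret in secrets:
        findings = review_secret(secret, now)
        if findings:
            report[secret.name] = findings
    return report


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    inventory = [
        GatewaySecret("ehr-api-token", now - timedelta(hours=2), timedelta(minutes=30)),
        GatewaySecret("calendar-token", now - timedelta(minutes=5), timedelta(minutes=30)),
    ]
    print(audit_inventory(inventory, now))
```

Running a check like this on a schedule, and feeding the findings into automated rotation and revocation, turns the inventory into ongoing evidence rather than a point-in-time list.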