TL;DR: MCP turns LLMs into tool-using agents, but that also creates risks from prompt injection, token leakage, overbroad permissions, unverified endpoints, and replayable sessions, according to WorkOS. The practical lesson is that security teams must treat MCP connections as identity and trust boundaries, not just integration plumbing.
At a glance
What this is: This is a practical guide to securing Model Context Protocol deployments, with the central finding that MCP expands AI capability by bridging untrusted model output to sensitive tools and APIs.
Why it matters: It matters because the same trust, scope, token, and verification problems now affect NHI, autonomous, and human-access workflows wherever MCP is used to connect AI to enterprise systems.
👉 Read WorkOS's complete guide to MCP server and client security
Context
Model Context Protocol security is really about trust boundaries: once an LLM can invoke tools, every request becomes an identity decision as much as a software action. The primary problem is that model-generated input is untrusted, yet it can be allowed to touch APIs, databases, shells, and workflow systems through MCP.
For IAM and NHI teams, the concern is not just whether a client can authenticate, but whether the server can constrain tool scope, validate tokens, and verify endpoints before execution. That is why MCP security sits squarely at the intersection of NHI governance, access control, and runtime assurance.
Key questions
Q: How should security teams govern MCP tool access in enterprise environments?
A: Treat each MCP tool as a discrete identity and privilege boundary, not as a generic API extension. Map the tool to the minimum scope needed, separate read and write actions, and enforce approval or policy checks for high-risk operations. That approach limits blast radius when a model is manipulated or a token is stolen.
Q: Why do MCP deployments increase identity and access risk?
A: MCP increases risk because it connects untrusted model output to real systems that can act on it. That creates new paths for token exposure, overbroad authorization, rogue endpoints, and confused deputy behaviour. The issue is not AI novelty, but the expansion of machine-to-machine trust at runtime.
Q: What breaks when MCP tokens are long-lived or widely reusable?
A: Long-lived or reusable tokens turn a short exposure into persistent access. If a token appears in logs, URLs, or caches, an attacker can replay it and impersonate a legitimate client across systems. That makes revocation slow, attribution unclear, and containment much harder after exposure.
Q: What should teams do when an MCP server cannot prove its identity?
A: Do not trust it. Require signed metadata, verified registries, and explicit allowlisting before any connection is accepted. If a server cannot prove provenance, it should never receive tokens or sensitive requests, because discovery without identity verification creates a direct path to rogue tool use.
Technical breakdown
Prompt injection and tool poisoning in MCP
MCP makes the model an interpreter between user intent and external systems, which means prompt content and tool descriptions can both become control inputs. Prompt injection tries to steer the model into unsafe actions, while tool poisoning manipulates how the model interprets available capabilities. The security problem is not only malicious text, but malicious context shaping. If tool metadata is not treated as untrusted, the model may select the wrong action path, leak data, or escalate a request that should never have been executed.
Practical implication: Validate tool descriptions, constrain model-visible context, and require policy checks before any high-risk tool invocation.
Token leakage, replay, and audience scoping
MCP deployments rely on tokens to prove who a client is and what it may do, so a leaked token becomes a reusable identity artifact. The guide highlights risks from plaintext logs, URLs, caches, long-lived credentials, and poor audience validation. Sender-constrained tokens, short lifetimes, and strict audience claims matter because the same token should not be usable across unrelated servers. Without those boundaries, a stolen credential can impersonate a legitimate client and extend access far beyond the original session.
Practical implication: Use short-lived, audience-bound tokens, keep them out of logs and URLs, and bind them to the specific client and server pair.
Unverified endpoints and confused deputy OAuth flows
MCP clients may discover servers dynamically, which creates a trust problem if discovery is not anchored in signed metadata or an allowlist. A malicious endpoint can impersonate a legitimate service and receive tokens or sensitive requests. The confused deputy pattern makes this worse when a server proxies OAuth flows without strict consent and authorization checks. In that case, the server becomes the mechanism by which an attacker acquires privileges it should not have received. This is an identity failure as much as a protocol failure.
Practical implication: Verify server identity cryptographically, enforce explicit OAuth consent, and block any endpoint that cannot prove provenance.
Breaches seen in the wild
- Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
- AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
MCP security is an NHI problem before it is an AI problem. Once a model can call tools, every connection becomes a non-human identity with scope, trust, and revocation implications. The article’s risks map directly to classic NHI failures: token exposure, overbroad permissions, weak endpoint trust, and poor auditability. Practitioners should stop treating MCP as a new AI feature and start treating it as a new class of machine identity integration.
Tool metadata is now part of the attack surface. In traditional IAM, the sensitive object is often the credential. In MCP, the tool description, endpoint registry, and discovery path can be manipulated to steer behaviour before authentication even completes. That shifts governance left into registration, validation, and integrity controls. The implication is that trust decisions must extend to what the model sees, not just what the server accepts.
Least privilege becomes operationally harder when one integration can reach many downstream systems. MCP encourages broad utility, but broad utility is exactly what increases blast radius when a token or endpoint is compromised. That is why the real governance question is not whether MCP is supported, but whether each tool has a sharply bounded identity, purpose, and revocation path.
Runtime oversight matters because static approval models do not cover dynamic tool use well. MCP sessions can move quickly from request to action, especially when models are asked to chain tasks across systems. That compresses the window for manual review and makes audit, step-up control, and policy enforcement the only scalable guardrails. Teams should treat MCP as a control-plane problem, not a documentation exercise.
From our research:
- 53% of MCP servers expose credentials through hard-coded values in configuration files, according to The State of MCP Server Security 2025.
- In our NHI research, 96% of organisations store secrets outside of secrets managers in vulnerable locations including code, config files, and CI/CD tools.
- That gap makes secret handling a lifecycle problem as much as a technical one, and the Ultimate Guide to NHIs is the natural next reference point for rotation, offboarding, and visibility.
What this signals
Tool identity is becoming a governance object, not just an integration detail. As MCP adoption grows, teams will need to know which tools are authorised, which endpoints are verified, and which tokens are bound to which session. A practical way to frame the issue is tool trust debt: the accumulation of weak discovery, broad scopes, and stale credentials that eventually outgrows the control model.
With 96% of organisations storing secrets outside secrets managers in vulnerable locations including code, config files, and CI/CD tools, the surrounding control environment is already strained before MCP is added. That means the next phase of programme maturity is not simply better tooling, but tighter identity boundaries for every AI-connected system.
Security leaders should expect MCP reviews to converge with Zero Trust and NHI governance work rather than sit beside them. The relevant questions will be about provenance, scope, and session control, not just whether the integration functions correctly under test.
For practitioners
- Map every MCP tool to a unique access scope Assign each tool the narrowest feasible permissions and separate read from write operations so a single compromised tool cannot fan out across systems. Review scope creep regularly and remove any broad wildcard permissions that were added for convenience. Consider the smallest trustworthy access boundary rather than a shared privilege set.
- Require cryptographic verification for server discovery Only allow clients to connect to servers that publish signed metadata and can be validated against an approved registry or allowlist. Reject endpoints that cannot prove provenance, because unverified discovery is how rogue servers become trusted identities. This is especially important when MCP servers are added dynamically.
- Bind tokens to the intended audience and session Use short-lived credentials, sender-constrained tokens, and strict audience claims so a token captured in one place cannot be replayed against another server. Keep credentials out of logs, URLs, and cache layers, and rotate or revoke them as soon as exposure is suspected.
- Gate high-risk tool actions with policy approval Require explicit approval or automated policy checks for file writes, deletes, shell execution, and any other action that can alter state outside the model boundary. Use sandboxing and rate limits as additional containment so a bad tool call cannot become a full environment compromise.
Key takeaways
- MCP security is fundamentally an identity and trust problem because it lets untrusted model output drive real tool actions.
- Hard-coded credentials, broad scopes, and unverified endpoints create the main failure paths that attackers can exploit through MCP deployments.
- Practitioners should respond by narrowing tool scope, verifying server provenance, and binding tokens to specific audiences and sessions.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A2 | MCP tool misuse and prompt injection map directly to agentic application risk. |
| OWASP Non-Human Identity Top 10 | NHI-01 | MCP clients and servers rely on machine credentials that can be exposed or replayed. |
| NIST Zero Trust (SP 800-207) | PR.AC-4 | MCP depends on continuous trust checks for clients, servers, and sessions. |
Inventory MCP identities, scope them narrowly, and rotate any exposed credentials immediately.
Key terms
- Model Context Protocol: Model Context Protocol is a standard that lets AI systems connect to external tools and data sources in a structured way. In security terms, it creates a new trust boundary because model output can become executable action, so identity, scope, and verification controls matter at every hop.
- Confused Deputy: A confused deputy is a system that is tricked into using its own authority on behalf of an attacker. In MCP environments, that can happen when a server proxies OAuth flows or forwards tokens without proving the request was intended for that specific resource and action.
- Sender-Constrained Token: A sender-constrained token is a credential that can only be used by the party that originally received it. This reduces replay risk because stealing the token is not enough on its own, which is especially important for MCP sessions that move quickly across multiple systems.
- Tool Poisoning: Tool poisoning is the manipulation of tool metadata or descriptions so the model selects an unsafe action or trusts a malicious capability. It matters in MCP because the model reads those descriptions as context, which makes tool identity and metadata integrity part of the attack surface.
Deepen your knowledge
MCP security, tool scoping, and identity-bound access controls are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building AI-connected integrations with the same trust and privilege issues discussed here, it is worth exploring.
This post draws on content published by WorkOS: The complete guide to MCP security and how to secure MCP servers and clients. Read the original.
Published by the NHIMG editorial team on 2025-08-22.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org