By NHI Mgmt Group Editorial Team
Published 2026-03-05
Domain: Breaches & Incidents
Source: Noma Security

TL;DR: A critical flaw in Upstash’s Context7 MCP Server let poisoned “Custom Rules” flow through a trusted channel into AI coding assistants, enabling credential theft, exfiltration, and destructive file actions, according to Noma Security’s research. The case shows that MCP trust assumptions, not just model behavior, now shape the security of agentic development workflows.


At a glance

What this is: A security analysis of a Context7 MCP Server flaw that let attacker-controlled library rules reach AI coding assistants as trusted context.

Why it matters: MCP servers can become privileged delivery paths for NHI-like agent actions, so teams need controls for tool output trust, content sanitization, and execution boundaries.



Context

The Model Context Protocol (MCP) has made it easier for AI assistants to reach documentation and tools, but it also creates a new trust boundary inside developer workflows. When an MCP server serves content into an agent’s context, the agent can treat that content as instruction, which turns content integrity into an IAM and NHI governance problem rather than a pure application bug.

Context7 sits in that middle layer for coding assistants, where the line between documentation and executable instruction becomes thin. For IAM and security teams, the operational question is whether any tool that feeds agents can also constrain what those agents are allowed to infer, execute, or forward.

The starting position here is not unusual. Most organisations adopting agentic development tools are still evaluating the trust model after deployment rather than before it, which is exactly where MCP poisoning risk grows.


Key questions

Q: How should security teams govern MCP servers used by AI coding assistants?

A: Treat MCP servers as privileged trust boundaries, not simple data sources. Security teams should classify each server by the authority it can influence, sanitize any user-generated or third-party content before delivery, and limit the agent’s tool access so malicious context cannot easily become destructive action.

Q: Why do MCP-based agents create a bigger risk than ordinary documentation tools?

A: MCP-based agents can act on the content they receive, which turns inbound text into operational influence. If an attacker can poison that content, the agent may follow malicious instructions with shell, file, or network access, so the risk is execution through trusted context rather than passive misinformation.

Q: What is the difference between trust scoring and real access control for agents?

A: Trust scoring is a reputation signal, while access control is a policy boundary. A high score may suggest a source looks credible, but it does not stop poisoned content from reaching an agent. Real control requires sanitization, provenance checks, and explicit limits on what the agent can do after receiving content.

Q: When should organisations restrict AI assistants from reading external context sources?

A: Restrict external context access whenever the source can be influenced by third parties, user submissions, or weak moderation. If the assistant can turn that content into tool calls, the organisation should require review, filtering, or a narrower permission set before allowing the integration in production workflows.


Technical breakdown

How MCP trust confusion turns content into instruction

MCP is designed to move structured tool output into an agent’s context. The failure mode is that the agent often cannot tell whether that output is documentation, policy guidance, or attacker-controlled text. If the server aggregates user-generated content and serves it through a trusted channel, the channel itself becomes the trust signal. That is why a read-only MCP server can still be dangerous: it may not execute code, but it can shape what the agent executes next inside the IDE. In practical terms, the server is not the attacker’s runtime. It is the delivery path for trusted-looking instructions.

Practical implication: Treat every MCP tool output as potentially executable influence and apply content validation before it reaches the agent.
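As a rough illustration of validating tool output before it reaches the agent, the sketch below scans MCP-delivered text for instruction-like content. It is a minimal heuristic under stated assumptions, not Noma Security's or Upstash's method: the pattern names, the regular expressions, and the scan_tool_output function are all hypothetical, and pattern matching alone will miss paraphrased or obfuscated instructions.

```python
import re
from dataclasses import dataclass, field

# Illustrative patterns only (assumption): a real deployment would need a
# broader, maintained set and should not rely on regexes alone.
SUSPICIOUS_PATTERNS = {
    "instruction_override": re.compile(
        r"\b(ignore|disregard)\b.{0,40}\b(previous|prior|above)\b.{0,20}\binstructions?\b",
        re.I,
    ),
    "secret_hunting": re.compile(
        r"\.env\b|\bapi[_ -]?keys?\b|\bcredentials?\b|\bsecrets?\b", re.I
    ),
    "destructive_command": re.compile(
        r"\brm\s+-\w*r\w*\b|\bdelete\b.{0,30}\bfiles?\b", re.I
    ),
    "exfiltration": re.compile(
        r"\b(upload|push|post|send)\b.{0,60}\b(repo|repository|issue|url|webhook)\b",
        re.I,
    ),
}


@dataclass
class ScanResult:
    allowed: bool
    findings: list[str] = field(default_factory=list)


def scan_tool_output(text: str) -> ScanResult:
    """Flag instruction-like content in text returned by an MCP tool."""
    findings = [
        name for name, pattern in SUSPICIOUS_PATTERNS.items() if pattern.search(text)
    ]
    # Block delivery when anything matched; the caller decides whether to
    # quarantine, strip, or escalate for human review.
    return ScanResult(allowed=not findings, findings=findings)
```

A gateway placed between the MCP server and the assistant could call scan_tool_output on every response and withhold anything flagged. Treat a clean result as the absence of known patterns, not as proof that the content is safe.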

Why registry and delivery roles create a larger attack surface

Context7’s architecture matters because it combines content registration with content delivery. A registry lets third parties publish rules or documentation, while the MCP server delivers that material directly to the agent. When those roles are combined, an attacker only needs to plant malicious content once to reach every downstream consumer. Popularity signals such as stars or download counts do not change those mechanics. They may make a library look trusted to humans, but they do nothing to prove that the content being delivered is safe for an autonomous agent to consume.

Practical implication: Separate content publishing trust from agent delivery trust and gate both with independent review.
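One way to picture the separation of publishing trust from delivery trust is a gate that delivers registry content only when its digest matches what an independent reviewer approved. The sketch below is an assumption-laden illustration: ReviewedContentGate, its methods, and the library identifiers are hypothetical and do not describe how Context7 works.

```python
import hashlib


class ReviewedContentGate:
    """Deliver registry content only if a separate review step approved it."""

    def __init__(self) -> None:
        # library_id -> SHA-256 hex digest of the content that was reviewed
        self._approved: dict[str, str] = {}

    @staticmethod
    def digest(content: str) -> str:
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def approve(self, library_id: str, content: str) -> None:
        """Called by the review process, never by the publisher."""
        self._approved[library_id] = self.digest(content)

    def deliverable(self, library_id: str, content: str) -> bool:
        """True only when the content is byte-identical to the reviewed version."""
        approved = self._approved.get(library_id)
        return approved is not None and approved == self.digest(content)
```

Because the check is keyed on the content digest rather than on the publisher or on popularity, any edit made after review, including newly added custom rules, fails delivery until it is reviewed again.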

Custom rules as a hidden control plane for agent behaviour

Custom rules act like a control plane for the assistant because they can steer tool use, file access, and exfiltration behavior without the user explicitly authorising each step. In the reported case, poisoned rules were used to search for secrets, move data to an attacker-controlled repository, and delete local files as cleanup. That pattern shows how a seemingly harmless configuration feature can become a privilege amplifier when the agent already has access to shell commands, file systems, and network paths. The architectural lesson is simple: any text that can change tool selection can also change blast radius.

Practical implication: Inventory every agent instruction source and restrict which ones can influence tool execution.
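The idea of restricting which instruction sources can influence tool execution can be sketched as a simple taint rule: if any external content is present in the agent's context for a turn, sensitive tools are withheld for that turn. Everything here is hypothetical, including the Trust levels, the tool names, and the allowed_tools function; real assistants expose different permission models.

```python
from dataclasses import dataclass
from enum import Enum


class Trust(Enum):
    USER = "user"            # typed by the developer
    LOCAL_POLICY = "local"   # organisation-managed rules file
    EXTERNAL = "external"    # MCP output, registry rules, third-party docs


# Hypothetical tool names; substitute the ones your assistant exposes.
SENSITIVE_TOOLS = frozenset({"shell", "file_delete", "network_post"})


@dataclass(frozen=True)
class ContextItem:
    source: str
    trust: Trust
    text: str


def allowed_tools(requested: set[str], context: list[ContextItem]) -> set[str]:
    """Drop sensitive tools for a turn whose context includes external content."""
    tainted = any(item.trust is Trust.EXTERNAL for item in context)
    return requested - SENSITIVE_TOOLS if tainted else set(requested)
```

This is deliberately coarse. It trades some convenience for a smaller blast radius, and a production policy would more likely require explicit user approval for sensitive tools than remove them silently.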


Threat narrative

Attacker objective: The attacker aims to turn a documentation lookup into code execution, secret theft, and local system damage through the victim’s own AI assistant.

  1. Entry via library registration and poisoned custom rules published into the Context7 registry.
  2. Escalation when the trusted MCP delivery path places attacker-controlled instructions into the agent’s context inside the IDE.
  3. Impact through credential theft, exfiltration to an attacker-controlled repository, and destructive local file deletion.



NHI Mgmt Group analysis

Trust is now part of the attack surface in agentic development workflows. MCP changes the security problem from model output quality to the integrity of the path that delivers context to the agent. If that path can be poisoned, then an agent can be manipulated even when the underlying model behaves as designed. Practitioners should treat MCP trust decisions as identity decisions, because the server that speaks to the agent is effectively asserting authority.

Identity blast radius is the right concept for MCP-fed agents. The issue is not only whether a server is malicious, but how much privilege the receiving agent can exercise once the server’s content is accepted. When an instruction source can trigger shell access, file reads, or network calls, the blast radius is defined by the agent’s entitlements. Security teams should map those entitlements to NHI controls and reduce standing authority wherever possible.

Popularity metrics are not security controls for machine-to-machine trust. Stars, downloads, and reputation badges may help adoption, but they do not prove content safety or supply-chain integrity. That makes trust scoring a weak substitute for policy enforcement, content sanitization, and runtime inspection. Practitioners should stop using community popularity as a proxy for authorization.

Context poisoning is becoming a practical NHI governance issue. The industry often treats NHI as service accounts and API keys, but agentic systems also create instruction identities, delivery identities, and tool-use identities that must be governed together. That widens the policy surface beyond secrets management into data flow control and execution constraints. Teams should update their NHI model to include agents and the channels that feed them.

Vendor remediation does not remove architectural debt. A fast fix closes one flaw, but it does not change the fact that many agents still trust inbound MCP content too readily. The broader market signal is that agentic security tooling will need stronger inspection, policy enforcement, and provenance controls at the context layer. Practitioners should plan for layered controls, not point fixes.

From our research:

  • 98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to the AI Agents: The New Attack Surface report.
  • Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
  • Use OWASP Agentic Applications Top 10 to map agent trust, tool misuse, and context poisoning into a control model before deployment.

What this signals

With agent adoption accelerating faster than governance, the practical risk is not just bad prompts but unowned instruction pathways. In environments using agentic coding tools, teams should assume that every context feed is part of the control plane and bind it to policy, logging, and least privilege.

Context provenance debt: once third-party content can influence agent behavior, organisations inherit a long-lived governance liability that traditional IAM review cycles do not see. That makes provenance, sanitization, and runtime inspection operational requirements, not optional hardening steps.

As agent use expands, teams will need to align this problem with external guidance such as OWASP Agentic AI Top 10 and internal entitlement reviews. The programme question is whether the organisation can explain, at any moment, why an agent was allowed to trust a specific source and act on it.


For practitioners

  • Classify MCP servers as privileged trust boundaries: Inventory every MCP server used in development workflows, identify which ones can influence agent behavior, and assign them explicit trust tiers based on the data they deliver and the tools they can shape.
  • Sanitize all agent-facing content before delivery: Strip or constrain custom rules, user-generated text, and embedded instructions before they reach the assistant’s context, especially when the server aggregates third-party material.
  • Reduce agent tool entitlements to the minimum workable set: Limit shell, file, and network access for AI coding assistants so poisoned context cannot easily translate into credential theft or destructive cleanup actions.
  • Add runtime inspection for suspicious tool chains: Detect sequences such as secret discovery, outbound issue creation, and file deletion as a correlated chain rather than isolated commands.
  • Challenge popularity as a trust signal: Do not use stars, downloads, or account age as evidence that an MCP-integrated library is safe for autonomous use; require independent review and provenance checks instead.
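The runtime inspection item above can be sketched as an ordered-stage detector over an agent's tool events. This is a minimal illustration, assuming a hypothetical event log: the tool names, SECRET_MARKERS, and the classification rules are invented for the example and would need tuning against real telemetry.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ToolEvent:
    tool: str    # hypothetical names: "shell", "file_read", "issue_create", ...
    detail: str  # command line, path, or destination


# Illustrative markers only (assumption), matched case-insensitively.
SECRET_MARKERS = (".env", "id_rsa", "credentials", "api_key")

# Stages must appear in this order, but need not be adjacent.
CHAIN = ("secret_discovery", "outbound", "deletion")


def classify(event: ToolEvent) -> Optional[str]:
    """Map a single tool event to a chain stage, or None if it is unremarkable."""
    detail = event.detail.lower()
    if event.tool == "file_delete" or (
        event.tool == "shell" and detail.startswith("rm ")
    ):
        return "deletion"
    if event.tool in {"http_post", "issue_create"}:
        return "outbound"
    if event.tool in {"file_read", "shell"} and any(
        marker in detail for marker in SECRET_MARKERS
    ):
        return "secret_discovery"
    return None


def chain_detected(events: Iterable[ToolEvent]) -> bool:
    """True if secret discovery, outbound transfer, and deletion occur in order."""
    position = 0
    for event in events:
        if classify(event) == CHAIN[position]:
            position += 1
            if position == len(CHAIN):
                return True
    return False
```

Each step on its own is routine in a development session, which is why the detector only alerts on the correlated sequence. An ordered match like this can still be evaded by reordering steps, so it works best alongside entitlement limits rather than in place of them.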

Key takeaways

  • MCP trust is now a security control issue, because content delivered into an agent can shape execution just as directly as code.
  • Popularity metrics do not establish safety for agent-facing content, and they should never substitute for sanitization or policy enforcement.
  • Teams that deploy AI coding assistants need tighter trust boundaries, smaller tool entitlements, and runtime inspection before poisonable context becomes an incident.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | NHI-01 | Agent context poisoning maps to agent goal and tool misuse risks.
NIST CSF 2.0 | PR.AC-4 | Least privilege is essential when assistants can turn context into execution.
NIST Zero Trust (SP 800-207) | | Trusted context delivery should be continuously verified, not assumed.

Review inbound context sources and constrain which ones can alter agent actions.


Key terms

  • MCP Server: A Model Context Protocol server delivers structured content and tool results to an AI agent. In practice, it can become part of the agent’s trust boundary because the agent may treat delivered text as authoritative guidance, even when the source content originated from outside the organisation.
  • Context Poisoning: Context poisoning is the manipulation of information that an AI agent reads before acting. The malicious content does not need to be code. If it changes the agent’s instructions, tool choices, or assumptions, it can alter behaviour and expand the impact of a compromised delivery path.
  • Identity Blast Radius: Identity blast radius is the amount of damage that can occur when a non-human identity is trusted too broadly. For agents, it reflects how far malicious or unintended instructions can travel once the system is allowed to access files, shells, APIs, or network destinations.
  • Agent Trust Boundary: An agent trust boundary is the point at which external content becomes actionable inside an autonomous workflow. It includes the server, registry, or interface that feeds context to the agent, plus the permissions the agent can exercise after that context is accepted.

Deepen your knowledge

Context poisoning and agent trust boundaries are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If your team is now evaluating AI coding assistants and MCP integrations, the course helps establish the right control model.

This post draws on content published by Noma Security: Context7 MCP Server vulnerability research and remediation details. Read the original.
