Invisible MCP payloads expose a new AI supply chain risk

By NHI Mgmt Group Editorial TeamPublished 2025-09-10Domain: Agentic AI & NHIsSource: Noma Security

TL;DR: Invisible Unicode characters can alter MCP tool descriptions in ways humans miss but AI agents may still process, creating hidden instruction paths and unwanted tool execution, according to Noma Security. The risk shifts MCP governance from simple trust in visible metadata to integrity checks, privilege boundaries, and continuous monitoring.

At a glance

What this is: This analysis shows how invisible Unicode characters in MCP tool descriptions can hide instructions that AI agents may execute as if they were legitimate.

Why it matters: IAM and NHI teams need to treat MCP tool metadata as an attack surface, because hidden instructions can bypass human review and trigger unauthorized tool use.

👉 Read Noma Security's analysis of invisible character attacks in MCP

Context

Model Context Protocol, or MCP, standardizes how AI agents connect to tools and data sources, which also standardizes a new trust boundary for NHI governance. If tool descriptions, scripts, or server metadata can be altered without detection, the agent may act on information that the human reviewer never sees.

The operational problem is not limited to prompt injection in the chat layer. It extends into the tool supply chain, where invisible Unicode characters can change how an LLM interprets a function description and, by extension, what action it chooses to take. That makes MCP integrity, tool scoping, and content normalization core controls for teams running autonomous agents.

Key questions

Q: How should security teams protect MCP tools from hidden prompt injection?

A: Treat MCP tool metadata as untrusted input. Normalize text, remove invisible or control characters, validate the source of each tool, and restrict the agent to approved functions only. Add logging for tool selection and execution so hidden instructions cannot move unnoticed from description into action.

Q: When does MCP create more risk than value for AI agents?

A: MCP becomes higher risk when agents can reach sensitive systems without strong scope controls or integrity checks. If tool descriptions, wrappers, or server outputs can be altered, the protocol can carry hidden instructions as easily as valid requests. The risk is greatest where privileges are broad and review is weak.

Q: What is the difference between prompt injection and MCP tool injection?

A: Prompt injection targets the model conversation, while MCP tool injection targets the metadata and instructions tied to external tools. The second is often harder to spot because the user may never see the malicious text. Both can steer behavior, but tool injection can directly influence privileged actions.

Q: Should organisations allow AI agents to call privileged MCP functions?

A: Only with strict task boundaries, explicit approval, and continuous monitoring. If an MCP function can move money, access records, or change systems, the agent should receive just enough privilege for the task and nothing persistent. Broad standing access turns a hidden instruction into a high-impact event.

Technical breakdown

How invisible Unicode characters alter MCP tool interpretation

Invisible characters are Unicode code points that render as blank or near-blank text, such as zero-width spaces and control characters. A human scanning a tool description may see harmless prose, while the model or parser still processes hidden bytes that alter token boundaries or insert instructions. In MCP workflows, that matters because the agent consumes tool metadata before deciding which function to call. If the model treats hidden text as an instruction, the attack bypasses the user interface and lands in the agent’s reasoning path. The failure mode is not exotic. It is a parser mismatch between human perception, text rendering, and machine interpretation.

Practical implication: normalize, inspect, and sanitize tool metadata before any agent can read it.

Why MCP expands the AI supply chain attack surface

MCP creates a common integration layer between models and external systems, which reduces custom engineering but also concentrates trust. The server that publishes tools, the client that brokers access, and the agent that selects actions all become part of one control chain. A weakness in any layer can influence tool selection or execution. That is why MCP risk is broader than classic software supply chain risk. Traditional supply chain controls focus on package integrity and dependency provenance. MCP adds runtime intent, tool permissioning, and metadata integrity as first-class security concerns. In practice, the agent is only as trustworthy as the tool context it consumes.

Practical implication: extend supply chain controls to runtime tool context, not just code artifacts.

What secure MCP governance should enforce

Secure MCP governance needs to treat every tool as an identity-bearing object with a known source, allowed scope, and monitored behavior. Tool origin checks, least privilege, and continuous telemetry matter because the model will often make decisions faster than a human can review them. Teams should also consider static analysis for suspicious encodings, especially in tool schemas, wrapper scripts, and generated descriptions. If an agent can access payment, customer, or administrative functions, the governance bar must be higher than basic allowlisting. The key design goal is to prevent hidden instructions from becoming executable authority.

Practical implication: combine content integrity checks with least-privilege tool authorization and runtime monitoring.

Threat narrative

Attacker objective: The attacker wants the AI agent to execute a hidden command under the cover of legitimate-looking tool metadata.

Entry occurs when an attacker inserts invisible Unicode characters into an MCP tool description or related function text.
Escalation follows if the AI agent parses the hidden instruction and selects a privileged tool the user did not intend to invoke.
Impact is unauthorized action through the agent, such as data access, payment execution, or other tool-driven side effects.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
LiteLLM PyPI package breach — LiteLLM PyPI supply chain attack, credentials stolen from users.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Invisible instruction abuse is a control-plane problem, not just a prompt problem. Hidden Unicode payloads succeed because the agent consumes tool metadata as authority, while the human reviewer sees only a normal description. That means the real failure is trust in unverified context, which belongs squarely in NHI governance. Practitioners should treat tool descriptions, schemas, and wrappers as security-relevant inputs, not documentation.

MCP expands the identity surface because every tool call now carries implicit privilege. The agent does not merely retrieve data. It inherits an authorization decision each time it binds to a tool and acts on the result. That makes tool provenance, scope, and runtime oversight part of the access model. Teams should align MCP governance with least privilege and session-level control rather than assuming static allowlists are enough.

Invisible characters create a new form of trust debt in agentic systems. The environment appears readable and reviewable, but the machine can still be receiving a different instruction set. That gap weakens auditability and makes incident response harder because the malicious intent may live in non-printing bytes. The practical conclusion is simple: if content integrity cannot be verified, the agent should not be allowed to execute.

Security teams should assume MCP abuse will evolve from proof of concept to operational tradecraft. Once attackers learn that hidden text can influence tool selection, they will try it in generated docs, copied snippets, and tool catalogs. The field should respond with integrity controls, not reactive exception handling. Practitioners need a default stance that all agent-consumed metadata is untrusted until normalized and validated.

Agent governance must move from visibility to verifiability. It is no longer enough to see the tool list. Teams need evidence that the text, source, and privilege scope behind each tool are the same ones the agent will actually use. That pushes MCP programs toward reproducible controls, logging, and policy enforcement before autonomous actions are permitted.

From our research:
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%), according to AI Agents: The New Attack Surface report.
Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation, according to AI Agents: The New Attack Surface report.
For teams building policy and monitoring controls, the relevant next step is OWASP Agentic Applications Top 10, which helps map agent misuse to governance and runtime control gaps.

What this signals

Invisible Character Trust Gap: agentic systems now need integrity controls for the text they consume, not just the actions they take. When an AI agent can be influenced by non-printing bytes, traditional review workflows are no longer enough. Teams should move toward normalization, provenance checks, and execution policies that assume tool metadata may be adversarial.

The governance signal is broader than MCP itself. Autonomous systems are becoming dependent on machine-readable context that humans cannot reliably verify by eye, which makes auditability a structural requirement rather than a compliance afterthought. That is why AI agent programs should map this topic to the NIST AI Risk Management Framework and the OWASP Agentic AI Top 10 when defining review and approval gates.

The near-term programme risk is shadow context, not just shadow AI. If teams cannot prove that the tool text an agent reads is the text they approved, then their access model is already weaker than their policy says. That gap should drive tighter catalog governance, runtime monitoring, and red-teaming for hidden instruction paths.

For practitioners

Normalize tool metadata before agent ingestion Strip or reject invisible Unicode, control characters, and suspicious encodings from MCP tool descriptions, schemas, and wrapper files before the agent can read them.
Verify tool origin and integrity Require provenance checks for every MCP tool, including signed artifacts where possible and an allowlist for approved server sources and function definitions.
Limit agent authority to the minimum viable scope Bind each MCP client session to approved tools only, with task-scoped permissions that prevent a hidden instruction from reaching high-impact functions.
Monitor MCP traffic for anomalous tool use Log tool requests, arguments, and decision paths so unusual call sequences, unexpected privilege jumps, and repeated retries can be detected quickly.

Key takeaways

Invisible Unicode characters can convert tool metadata into an execution path that humans never intended.
MCP increases both integration efficiency and governance risk because it concentrates trust in the agent-tool boundary.
Teams need normalization, provenance, least privilege, and runtime monitoring before allowing autonomous tool use.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 address the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10		Agent tool misuse and prompt injection are central to this MCP attack path.
NIST AI RMF		AI governance needs integrity controls for agent context and tool decisions.
NIST CSF 2.0	PR.AC-4	Least privilege is needed when agents can invoke privileged tools through MCP.

Validate agent inputs and tool outputs, then constrain tool use to approved actions only.

Key terms

Model Context Protocol: MCP is a standard way for AI agents to connect to external tools, APIs, and data sources. It reduces custom integration work, but it also creates a shared trust boundary where tool identity, scope, and content integrity become security controls that must be enforced, not assumed.
Invisible Unicode Characters: Invisible Unicode characters are text code points that do not visibly render in most interfaces but still affect how systems parse or interpret content. In agentic workflows, they can hide instructions inside tool descriptions or scripts, creating a mismatch between what humans review and what machines consume.
Tool Metadata Integrity: Tool metadata integrity is the assurance that a tool’s name, description, schema, and source have not been altered in a way that changes intended behavior. For MCP and agent systems, this is a control problem because hidden or modified metadata can redirect privileged actions without obvious user consent.
Agentic Supply Chain Risk: Agentic supply chain risk is the exposure created when AI agents depend on external tools, generated artifacts, or third-party context to decide and act. The risk extends beyond software dependencies to include hidden instructions, untrusted metadata, and privilege abuse at runtime.

Deepen your knowledge

Invisible prompt injection and MCP tool trust are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building controls for agentic systems that can act on tool metadata, this course is a practical starting point.

This post draws on content published by Noma Security: Model Context Protocol, invisible characters, and hidden prompt injection. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-09-10.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org