Tool poisoning is an attack in which malicious instructions are hidden inside tool descriptions, examples, or schemas that an AI agent reads when deciding what to do. The danger is not only in the tool's code, but in the metadata that shapes the agent's behaviour and trust decisions.
Expanded Definition
Tool poisoning is a form of agentic application attack where malicious content is embedded in a tool’s description, examples, field names, or schema so an AI agent treats hostile instructions as trusted operational context. It targets the agent’s decision layer, not just the executable code.
In practice, the attack surface grows wherever an agent ingests tool metadata from MCP catalogs, plugin registries, API wrappers, or internal documentation. Usage in the industry is still evolving, and definitions vary across vendors, but the core issue is consistent: the agent reads untrusted text as if it were policy. That is why this risk sits alongside broader agent instruction-injection concerns described in the OWASP Agentic Applications Top 10, while control design should still follow baseline governance principles in the NIST Cybersecurity Framework 2.0.
The most common misapplication is treating tool metadata as harmless documentation, which occurs when teams allow unreviewed descriptions or examples to be consumed directly by autonomous agents.
Examples and Use Cases
Implementing tool metadata controls rigorously often introduces friction in developer workflows, requiring organisations to weigh faster agent integration against tighter review and publishing gates.
- A compromised tool description tells an agent to prefer a specific endpoint, quietly redirecting workflow execution to an attacker-controlled service.
- A poisoned schema example inserts a hidden prompt that causes the agent to reveal secrets or over-share context when it formats a request.
- An internal MCP tool registry contains malicious usage notes, and the agent follows them because the registry is trusted more than the user prompt. The risk aligns closely with the guidance in the OWASP Agentic Applications Top 10.
- A tool wrapper copied from public code includes adversarial examples that alter the agent’s routing logic, even though the underlying API remains legitimate.
- Security teams map review requirements to the NIST Cybersecurity Framework 2.0 so tool publishing, approval, and monitoring are treated as controlled processes rather than ad hoc documentation updates.
Why It Matters in NHI Security
Tool poisoning matters because AI agents often act with delegated authority, and that authority can extend to secrets, service accounts, and non-human identity workflows. If the agent trusts poisoned metadata, it may approve a destructive action, disclose sensitive context, or invoke tools outside intended boundaries. The operational failure is not just a bad prompt response; it is misuse of machine identity and access.
NHI governance becomes especially important here because too many organisations already have weak visibility into service accounts and adjacent secrets. NHI Mgmt Group research shows only 5.7% of organisations have full visibility into their service accounts, which means poisoned tools can hide inside poorly governed automation estates for long periods. Defences should therefore combine tool provenance checks, strict change control, least privilege, and agent-aware review of descriptions, schemas, and examples, as reinforced by the OWASP Agentic Applications Top 10 and the access control discipline in the NIST Cybersecurity Framework 2.0.
Organisations typically encounter the consequence only after an agent has already approved a harmful tool action or exposed data, at which point tool poisoning becomes operationally unavoidable to address.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A2 | Covers prompt and tool-instruction attacks that exploit agent trust boundaries. |
| OWASP Non-Human Identity Top 10 | NHI-02 | Addresses unsafe handling of secrets and adjacent trust material in automation. |
| NIST CSF 2.0 | PR.AC-4 | Least-privilege access management reduces damage from poisoned tool instructions. |
Review tool metadata as untrusted input and gate agent tool use behind approval and validation.
Related resources from NHI Mgmt Group
Deepen Your Knowledge
Reviewed and updated by the NHIMG editorial team on May 26, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org