AI tool poisoning is the manipulation of local AI assistants or coding tools so that they execute attacker-chosen behaviour, conceal prompts, or persist malicious changes. It is an identity problem because the assistant operates with delegated execution authority and can be steered by malicious input or changes to its environment.
Expanded Definition
AI tool poisoning is a local trust-abuse pattern, not just a prompt-injection problem. It happens when an AI assistant, coding agent, or plugin reads poisoned files, configuration, memory, or package content and then executes attacker-shaped actions under delegated authority. In non-human identity (NHI) terms, the assistant behaves like an agent with standing access, so a poisoned environment is an identity-control failure as much as a model-safety issue. Definitions vary across vendors, but the practical boundary is clear: if the tool can write files, run commands, call MCP-backed services, or persist settings, then malicious input can become durable control. That is why guidance in the NIST Cybersecurity Framework 2.0 around access control, monitoring, and recovery is relevant even when the attack arrives through the model layer. The most common misapplication is treating poisoning as harmless text contamination, a mistake that follows when teams leave the assistant’s execution context, repository boundaries, and tool permissions unsecured.
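The practical boundary described above can be expressed as a simple capability check. The following is a minimal sketch, not a vendor API: the capability names and the `AssistantProfile` type are hypothetical labels chosen to mirror the four abilities listed in the definition.

```python
from dataclasses import dataclass

# Hypothetical capability labels mirroring the definition above:
# writing files, running commands, calling MCP-backed services, persisting settings.
DURABLE_CAPABILITIES = {
    "write_files",
    "run_commands",
    "call_mcp_services",
    "persist_settings",
}


@dataclass(frozen=True)
class AssistantProfile:
    """Illustrative description of what an assistant is allowed to do."""
    name: str
    capabilities: frozenset


def durable_control_risks(profile: AssistantProfile) -> list[str]:
    """Return the capabilities that let poisoned input outlive a single response."""
    return sorted(profile.capabilities & DURABLE_CAPABILITIES)


# Example profiles (assumed for illustration only).
chat_only = AssistantProfile("chat-only", frozenset({"read_files"}))
coding_agent = AssistantProfile(
    "coding-agent", frozenset({"read_files", "write_files", "run_commands"})
)
```

An empty result suggests poisoned input is limited to contaminating text; any non-empty result marks the assistant as something that should be governed like an identity with standing access.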
Examples and Use Cases
Rigorous protection against AI tool poisoning often introduces friction: tighter workspace isolation and approval gates reduce the blast radius of compromised input, but they can also slow developer workflows. Typical scenarios include:
- A coding assistant consumes a malicious README or dependency note that instructs it to rewrite authentication code, then commits unsafe changes because the repository is trusted by default.
- An agent uses an MCP-connected tool to fetch project context, but a poisoned configuration file quietly redirects it toward exfiltration or destructive commands.
- A local desktop assistant persists a harmful “helpful” setting after opening a tampered prompt file, turning a one-time trigger into repeated bad behaviour.
- A developer workstation pulls in tainted snippets from a shared knowledge base, and the assistant reproduces insecure patterns as if they were approved team guidance. The OWASP Agentic Applications Top 10 is useful here because it frames tool access, autonomy, and execution abuse as core risks rather than edge cases.
- An exposed data source trains the assistant to reproduce patterns from sensitive material, echoing the failure pattern discussed in the DeepSeek breach, where contamination and exposure collided at scale.
Operationally, the distinction matters: prompt injection may steer a single response, while tool poisoning can steer the system that performs the work.
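One way to picture an approval gate for the scenarios above is a small policy function that sits between the assistant and its tools. This is a sketch under stated assumptions: the action names, the `ToolCall` type, and the `context_trusted` flag are all hypothetical, and a real assistant would need its own way of tracking whether untrusted content (a README, a fetched configuration file, a shared snippet) has entered the context.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"


@dataclass(frozen=True)
class ToolCall:
    action: str            # hypothetical action name, e.g. "read_file" or "run_command"
    target: str            # path or command the action applies to
    context_trusted: bool  # False once untrusted content has entered the context


# Assumed action sets for illustration; a real deployment defines its own.
READ_ONLY = {"read_file", "list_dir"}
SIDE_EFFECTS = {"write_file", "run_command", "git_commit", "network_request"}


def gate(call: ToolCall) -> Decision:
    """Decide whether a tool call runs, needs a human, or is refused."""
    if call.action in READ_ONLY:
        return Decision.ALLOW
    if call.action in SIDE_EFFECTS:
        # Side effects triggered while untrusted content is in context need approval.
        if call.context_trusted:
            return Decision.ALLOW
        return Decision.REQUIRE_APPROVAL
    # Unknown actions are denied by default.
    return Decision.DENY
```

The default-deny branch is the design choice that matters most here: an action the policy does not recognise is refused rather than assumed safe, which limits what a poisoned configuration can introduce.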
Why It Matters in NHI Security
AI tool poisoning matters because it turns delegated execution into a hidden attack surface. Once an assistant can edit code, move secrets, approve actions, or invoke downstream services, compromised context becomes a privilege problem. That is especially dangerous for NHI programs, where service accounts, tokens, and automation identities already create a concentrated blast radius. Research from The State of Secrets in AppSec shows that organisations average 6 distinct secrets manager instances and need 27 days on average to remediate a leaked secret, which means poisoned tools can operate long enough to cause real exposure before defenders notice. The same report notes that 43% of security professionals worry about AI systems learning and reproducing sensitive patterns from codebases, reinforcing that model contamination and secrets governance are linked. In practice, NIST Cybersecurity Framework 2.0 functions as the governance baseline, while NHI controls determine whether an agent can act without persistent privilege. Organisations typically discover the consequences only after a tool has already rewritten code, exposed credentials, or triggered an unsafe deployment, at which point addressing AI tool poisoning is no longer optional.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A1 | Agentic tool abuse and prompt-driven execution are central to AI tool poisoning risk. |
| OWASP Non-Human Identity Top 10 | NHI-02 | Poisoned tools often exploit weak secret and environment handling around NHIs. |
| NIST CSF 2.0 | PR.AA-05 (PR.AC-4 in CSF 1.1) | Least-privilege access is essential when assistants can execute actions through tools. |
Protect NHI secrets and execution contexts so assistants cannot inherit unsafe privileges from poisoned inputs.
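As a rough illustration of keeping secrets out of an assistant's execution context, the sketch below builds a reduced environment before a tool process is launched. The name pattern, the allowlist, and the `assistant-cli` command mentioned in the comment are assumptions for illustration; matching on variable names is a heuristic and does not replace a secrets manager or short-lived credentials.

```python
import os
import re

# Heuristic pattern for variable names that commonly hold credentials (an assumption,
# not an exhaustive list).
SECRET_NAME_PATTERN = re.compile(
    r"(TOKEN|SECRET|PASSWORD|API_KEY|CREDENTIAL)", re.IGNORECASE
)


def scrubbed_env(env, allow=("PATH", "HOME", "LANG")):
    """Return a copy of env with only allowlisted, non-secret-looking variables."""
    return {
        name: value
        for name, value in env.items()
        if name in allow and not SECRET_NAME_PATTERN.search(name)
    }


# Usage sketch (the command name is hypothetical):
#   import subprocess
#   subprocess.run(["assistant-cli", "--workspace", "."], env=scrubbed_env(os.environ))
```

Using an allowlist rather than a blocklist means a newly added credential variable is excluded by default, so the assistant does not inherit it simply because nobody remembered to filter it.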