What Is Zero-click agent exploit? Definition & Examples

Expanded Definition

A zero-click agent exploit is a manipulation path that causes an AI agent to take an action from untrusted input without a human approving the step. The risk is not limited to the model prompt itself; it includes email, documents, tickets, web pages, chat messages, retrieved context, and any tool output the agent treats as operationally meaningful. In practice, the ingestion boundary becomes part of the attack surface, which is why agent governance must consider both content trust and tool authority. This term is still evolving across vendors, but the security pattern is clear: if an agent can read it, reason over it, and act on it, an attacker may try to shape that sequence. OWASP’s OWASP Top 10 for Agentic Applications 2026 treats this class of issue as a core application-risk concern, while NIST’s NIST AI Risk Management Framework frames it as an issue of trustworthy operation, not just prompt safety.

The most common misapplication is assuming a zero-click exploit requires prompt injection only, when the condition often begins with a trusted integration feeding hostile data into an agent that has execution permission.

Examples and Use Cases

Implementing agent workflows with strong guardrails often introduces latency and review friction, requiring organisations to weigh automation speed against the cost of stricter validation and reduced autonomy.

An inbox triage agent reads a malicious email, interprets it as a valid instruction, and creates a ticket or sends a reply without any user click.

A support copilot ingests a poisoned knowledge-base article and uses its tool access to change a case status or disclose internal data.

A code assistant processes a repository comment or issue description and triggers a deployment step after mistaking attacker-controlled text for operator intent.

A procurement agent consumes a vendor document and sends an approval message, because the workflow treats document content as authoritative input.

A browser-using agent follows a crafted page that influences downstream actions, similar to patterns discussed in the AI LLM hijack breach and in the OWASP NHI Top 10.

These cases are best understood as trust-boundary failures between ingestion, reasoning, and execution. The relevant question is not whether the model was “fooled” in a narrow sense, but whether an untrusted source was allowed to influence an agent action path that should have been gated. That is also why the MITRE ATLAS adversarial AI threat matrix and the 52 NHI Breaches Analysis are useful references when mapping agent compromise scenarios.

Why It Matters in NHI Security

Zero-click agent exploits matter because they convert routine automation into an unauthorised actor with valid access. Once an AI agent can use service accounts, API keys, or delegated sessions, a compromised input path can become an identity compromise path. That is especially dangerous in environments where secrets are already overexposed: NHI Mgmt Group reports that 79% of organisations have experienced secrets leaks, with 77% of these incidents resulting in tangible damage. A zero-click exploit can turn that exposure into immediate misuse of tools, data, and workflows. The main governance failure is treating the agent as a passive consumer rather than an active principal with delegated authority, which breaks least privilege and complicates incident response. NHI controls, output validation, and tool permission scoping become essential, not optional, especially when coupled with agentic controls described in the CSA MAESTRO agentic AI threat modeling framework and the NIST AI Risk Management Framework.

Organisations typically encounter the consequences only after an agent has already sent, changed, exfiltrated, or approved something it should not have, at which point zero-click agent exploit analysis becomes operationally unavoidable to address.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	AA1	Covers malicious input that drives unintended agent actions through trusted workflows.
OWASP Non-Human Identity Top 10	NHI-02	Agent exploits often pivot through exposed secrets and over-privileged non-human identities.
NIST AI RMF		Defines risk governance for AI systems exposed to manipulation and unsafe action.

Classify agent ingestion and execution paths as governed risk surfaces with monitoring and controls.

#1 Authority in NHI Education, Research and Advisory, empowering organizations to tackle the critical risks posed by Non-Human Identities (NHIs), including AI Agents.

Zero-click agent exploit

Expanded Definition

Examples and Use Cases

Why It Matters in NHI Security

Standards & Framework Alignment

Related resources from NHI Mgmt Group