
Why are zero-click attacks especially dangerous for AI agents?

Zero-click attacks are dangerous because the attacker does not need the user to make a mistake or click anything. The malicious instruction can arrive inside ordinary email, documents, or calendar content, and the agent may execute it while following normal workflow logic. That makes the attack covert, scalable, and hard to detect with traditional controls.

Why Zero-Click Attacks Are So Effective Against AI Agents

Zero-click attacks exploit a structural weakness in agentic systems: the agent is designed to act without waiting for a human to approve every step. That means malicious content can be delivered through ordinary inbox, calendar, ticketing, or document workflows and still trigger execution. The issue is not just prompt injection. It is the combination of autonomous behaviour, tool access, and trust in the surrounding workflow. Current guidance suggests that this is exactly where static IAM assumptions start to fail.

The risk is amplified because agent actions often look routine. An AI agent may summarise a message, fetch related files, update a ticket, or draft a reply, all while processing attacker-controlled text. SailPoint’s AI Agents: The New Attack Surface report found that 80% of organisations have already seen agents act beyond their intended scope, which is a strong signal that the problem is not theoretical. For broader context on where agentic exposure sits in the NHI stack, see OWASP NHI Top 10 and our Top 10 NHI Issues analysis. In practice, many security teams discover agent abuse only after the agent has already chained tools and executed a harmful instruction, rather than through intentional testing.

How Zero-Click Delivery Turns Into Real Agent Abuse

Zero-click delivery becomes dangerous when the agent can interpret untrusted content as part of its working context. A calendar invite can include instructions that influence the agent’s next tool call. A document can carry hidden prompts that steer retrieval. An email can masquerade as a legitimate request that causes the agent to read files, send data, or invoke MCP-connected services. The attack succeeds because the agent is following its objective, not because the user approved a malicious action.

Defending against this requires more than filtering inputs. The stronger pattern is to separate content ingestion from action authority. That means using intent-based authorisation, short-lived JIT credentials, and workload identity so the agent proves what it is and what task it is allowed to perform at that moment. The NIST AI Risk Management Framework and the CSA MAESTRO agentic AI threat modeling framework both support this shift toward runtime governance. For identity failure patterns, compare this with AI LLM hijack breach and 52 NHI Breaches Analysis, where compromised secrets and identity exposure enabled fast attacker action.
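One way to picture the separation of content ingestion from action authority is a grant that is fixed when the task starts, before any untrusted content is read. The sketch below is illustrative only: `TaskGrant`, `ToolCall`, `authorize`, and the tool names are hypothetical and do not come from any specific framework or product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskGrant:
    """Authority decided at task start, before untrusted content is ingested."""
    task: str
    allowed_tools: frozenset


@dataclass(frozen=True)
class ToolCall:
    """A tool invocation the agent proposes after reading its context."""
    tool: str
    args: dict = field(default_factory=dict)


def authorize(grant: TaskGrant, call: ToolCall) -> bool:
    """Allow a call only if the grant covers it.

    Nothing in the ingested content is consulted here, so an injected
    instruction cannot widen what the task is allowed to do.
    """
    return call.tool in grant.allowed_tools


# Hypothetical task: summarise the inbox, which only needs read access.
grant = TaskGrant(task="summarise_inbox", allowed_tools=frozenset({"read_email"}))

# A poisoned email steers the agent toward sending data out.
injected = ToolCall(tool="send_email", args={"to": "attacker@example.com"})

print(authorize(grant, ToolCall(tool="read_email")))  # allowed by the grant
print(authorize(grant, injected))                     # denied: outside the grant
```

The design choice is that authority flows from the task the user requested, not from whatever text the agent happens to read while carrying it out.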

  • For every agent action, use runtime policy checks instead of pre-baked allowlists.
  • Issue ephemeral secrets per task, then revoke them when the task ends.
  • Bind agent identity to cryptographic workload identity, not just a session token.
  • Restrict tool scope so a reading step cannot silently become a writing or exfiltration step.

These controls tend to break down when agents are given broad mailbox, document, or SaaS permissions because the attack surface becomes the workflow itself, not a single endpoint.
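The per-task credential and tool-scope controls listed above can be sketched as a small in-memory broker. This is a minimal illustration under stated assumptions: `CredentialBroker`, `TaskCredential`, and the scope strings such as `mail:read` are hypothetical, and a real deployment would delegate issuance and revocation to a secrets manager or token service rather than a Python dictionary.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskCredential:
    token: str
    scopes: frozenset
    expires_at: float


class CredentialBroker:
    """Hypothetical broker that issues short-lived, narrowly scoped credentials."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._active = {}        # token -> TaskCredential

    def issue(self, scopes):
        """Mint a fresh credential for one task with only the scopes it needs."""
        cred = TaskCredential(
            token=secrets.token_urlsafe(32),
            scopes=frozenset(scopes),
            expires_at=self._clock() + self._ttl,
        )
        self._active[cred.token] = cred
        return cred

    def revoke(self, token):
        """Invalidate the credential as soon as the task ends."""
        self._active.pop(token, None)

    def allows(self, token, scope):
        """True only for an unexpired, unrevoked credential that holds the scope."""
        cred = self._active.get(token)
        if cred is None:
            return False
        if self._clock() >= cred.expires_at:
            self._active.pop(token, None)
            return False
        return scope in cred.scopes


broker = CredentialBroker(ttl_seconds=60)
cred = broker.issue({"mail:read"})
print(broker.allows(cred.token, "mail:read"))  # usable while the task runs
print(broker.allows(cred.token, "mail:send"))  # a read step cannot become a send
broker.revoke(cred.token)
print(broker.allows(cred.token, "mail:read"))  # unusable after revocation
```

Because the credential dies with the task, an instruction injected later has nothing long-lived to reuse.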

Where the Standard Control Model Breaks Down

Tighter control often increases operational overhead, requiring organisations to balance safety against latency, usability, and automation depth. That tradeoff becomes most visible in multi-agent systems, where one agent can pass tainted context to another and each step still appears legitimate. Best practice is evolving here, and there is no universal standard for this yet, but the direction is clear: static RBAC is too blunt for goal-driven systems that change behaviour at runtime.
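The multi-agent case can be illustrated with simple taint tracking: context that originated in an untrusted channel stays marked as tainted when one agent hands it to another. The sketch below is an assumption-laden illustration, not an established standard: `Context`, `combine`, `may_call`, and the tool names in `WRITE_TOOLS` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical tools with side effects outside the agent.
WRITE_TOOLS = {"send_email", "update_ticket", "share_file"}


@dataclass(frozen=True)
class Context:
    text: str
    tainted: bool  # True if any part came from an untrusted channel


def combine(*parts):
    """Merge contexts for a hand-off; the result is tainted if any input was."""
    return Context(
        text=" ".join(p.text for p in parts),
        tainted=any(p.tainted for p in parts),
    )


def may_call(tool, ctx, human_approved=False):
    """Write tools need clean context or explicit approval; other tools pass."""
    if tool not in WRITE_TOOLS:
        return True
    return (not ctx.tainted) or human_approved


# Agent A reads a calendar invite (untrusted) and hands a summary to agent B.
plan = Context("Prepare the weekly status report.", tainted=False)
invite = Context("Invite body with embedded instructions.", tainted=True)
handoff = combine(plan, invite)

print(handoff.tainted)                  # taint survives the hand-off
print(may_call("search_docs", handoff)) # non-write tool still allowed
print(may_call("send_email", handoff))  # write tool blocked on tainted context
```

The tradeoff described above shows up directly in `human_approved`: every blocked write either waits for a person or is dropped, which is where latency and usability costs appear.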

The most common edge case is “trusted collaboration.” Teams assume internal email, shared drives, or approved SaaS tools are safe inputs, but zero-click attacks exploit exactly those trusted channels. Another gap appears when secrets are long-lived. If an agent can reuse a token after a malicious instruction lands, the attacker gets persistence even after detection. NIST SP 800-63 guidance on identity assurance and the NIST AI Risk Management Framework both support stronger identity and governance discipline, while OWASP Agentic AI Top 10 highlights the need to treat tool use, context handling, and privilege boundaries as separate risk domains. For a real-world breach pattern, Moltbook AI agent keys breach is a clear reminder that exposed agent secrets turn one injected instruction into a fast-moving compromise.

For security leaders, the practical lesson is simple: zero-click attacks are dangerous because they bypass user judgment and target the agent’s execution logic instead. Detection, containment, and revocation have to happen at the identity and policy layer, not just at the inbox or endpoint.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | LLM01 | Zero-click attacks exploit prompt and tool abuse in agentic workflows.
CSA MAESTRO | T1 | MAESTRO models agentic threat paths, including tainted inputs and tool abuse.
NIST AI RMF | | AI RMF addresses governance and accountability for autonomous AI behaviour.

Gate every agent action with runtime policy and validate untrusted context before tool execution.