How should security teams use deception against agentic AI attacks?

Security teams should use deception to reshape what an autonomous system believes is real, valuable, and reachable. That means deploying decoys, misleading metadata, and false access paths in places where agentic reconnaissance is likely to start. Deception works best when it is tied to identity telemetry, so teams can see when an attacker is consuming decoys instead of progressing toward privileged systems.

Why Deception Matters Against Agentic AI Attacks

Agentic attackers do not need a perfect breach plan to create damage. They can probe, chain tools, and follow cues faster than human defenders can triage. Deception works because it interrupts that autonomy by feeding the system false signals about what is valuable, reachable, or trusted. For security teams, the goal is not to “hide everything,” but to make reconnaissance expensive and unreliable.

This is especially relevant where agent behavior is already difficult to observe. NHIMG’s AI Agents: The New Attack Surface report notes that 80% of organisations report AI agents have already performed actions beyond intended scope, including accessing unauthorised systems and revealing credentials. That kind of drift is exactly where decoys, false paths, and identity-linked telemetry can create early warning. Current guidance suggests pairing deception with controls discussed in OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework, which both emphasise risk-based evaluation rather than static trust assumptions.

In practice, many security teams notice agent-driven reconnaissance only once a decoy has been touched, by which point an adversary may already have mapped privileged data.

How Deception Should Be Wired Into Agentic Defences

Effective deception for agentic attacks starts with identity and telemetry, not with isolated honeypots. A decoy only matters if defenders can tell whether a human, a script, or an AI agent consumed it. That means instrumenting access events, token usage, tool calls, and unusual navigation patterns so the decoy becomes a signal source rather than a trap in name only. The best practice is evolving, but current guidance suggests treating deception as part of the control plane for autonomous workloads.
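
As a minimal sketch of that instrumentation, the Python below summarises one identity's access events so that a decoy touch arrives with context about pace and breadth. The `AccessEvent` fields, the function names, and the thresholds are all assumptions made for illustration; they are not taken from any framework or product, and a real deployment would tune or replace the heuristic.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass(frozen=True)
class AccessEvent:
    identity: str      # hypothetical workload or user identity taken from the token
    path: str          # resource that was requested
    timestamp: float   # seconds since epoch
    is_decoy: bool     # True when the path is a registered decoy


def summarise_identity(events: List[AccessEvent], identity: str) -> dict:
    """Summarise one identity's activity so a decoy touch carries context."""
    own = sorted((e for e in events if e.identity == identity),
                 key=lambda e: e.timestamp)
    # Gaps between consecutive requests, used as a rough pace signal.
    gaps = [b.timestamp - a.timestamp for a, b in zip(own, own[1:])]
    gap: Optional[float] = median(gaps) if gaps else None
    return {
        "identity": identity,
        "requests": len(own),
        "distinct_paths": len({e.path for e in own}),
        "decoy_touches": sum(e.is_decoy for e in own),
        "median_gap_seconds": gap,
    }


def looks_automated(summary: dict, max_gap: float = 1.0, min_paths: int = 5) -> bool:
    """Illustrative heuristic only: fast, broad navigation that also hit a decoy.

    The thresholds are assumed values, not recommendations.
    """
    gap = summary["median_gap_seconds"]
    return (
        summary["decoy_touches"] > 0
        and gap is not None
        and gap <= max_gap
        and summary["distinct_paths"] >= min_paths
    )
```

The summary is deliberately separate from the verdict, so the same telemetry can feed a stricter or looser rule without changing how events are collected.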

Security teams should place believable false assets where agentic discovery is likely to begin: API documentation, internal data catalogs, service metadata, non-production endpoints, and credential-adjacent paths. The point is to shape the agent’s belief system. A false secret, a fake service account, or a canary object can reveal whether the agent is enumerating, following links, or attempting privilege escalation. That is why NHIMG’s LLMjacking: How Attackers Hijack AI Using Compromised NHIs is a useful reference point: attackers move quickly once credentials or paths are exposed, so the defender wants a faster deception-to-detection loop.
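
One way to make a false secret recognisable when it is later presented is to embed a keyed tag in it. The sketch below uses only the Python standard library (`secrets`, `hmac`, `hashlib`); the token layout, the function names, and the idea of a detection-side signing key are assumptions for illustration, and a real lure would need to mimic the format of genuine credentials in the environment.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def _tag(key: bytes, decoy_id: str, nonce: str) -> str:
    """Keyed tag binding the decoy id and nonce to the detection-side key."""
    msg = f"{decoy_id}.{nonce}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()[:16]


def mint_canary(decoy_id: str, key: bytes) -> str:
    """Create a synthetic secret. It grants no access; it only identifies itself.

    Assumption: decoy_id contains no '.' characters.
    """
    nonce = secrets.token_hex(8)
    return f"{decoy_id}.{nonce}.{_tag(key, decoy_id, nonce)}"


def identify_canary(presented: str, key: bytes) -> Optional[str]:
    """Return the decoy id if the presented value is one of our canaries."""
    parts = presented.split(".")
    if len(parts) != 3:
        return None
    decoy_id, nonce, tag = parts
    expected = _tag(key, decoy_id, nonce)
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    if hmac.compare_digest(tag.encode(), expected.encode()):
        return decoy_id
    return None
```

Because the tag is derived from a key, the detection side does not need to store every issued canary to recognise one, which keeps the deception-to-detection loop short.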

  • Bind decoys to identity telemetry so access is attributable.
  • Use realistic but non-sensitive metadata to draw agentic reconnaissance.
  • Trigger alerts on interaction, not just on exfiltration.
  • Rotate and retire decoys so they do not become predictable.
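
The first and third points above can be sketched as a small registry that raises an alert on any interaction with a decoy and records which identity was involved. The class names, fields, and the "unattributed" label are hypothetical; the alert sink is whatever callable the team already uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class DecoyAlert:
    decoy_id: str
    identity: str   # workload or user identity, or "unattributed"
    action: str     # e.g. "list", "read"; any interaction counts


@dataclass
class DecoyRegistry:
    on_alert: Callable[[DecoyAlert], None]
    _decoys: Dict[str, str] = field(default_factory=dict)  # path -> decoy id

    def register(self, path: str, decoy_id: str) -> None:
        self._decoys[path] = decoy_id

    def observe(self, identity: Optional[str], path: str,
                action: str) -> Optional[DecoyAlert]:
        """Alert on any touch of a decoy path, not only on data leaving."""
        decoy_id = self._decoys.get(path)
        if decoy_id is None:
            return None
        # A missing identity is surfaced explicitly rather than dropped,
        # because an unattributable touch is itself a telemetry gap.
        alert = DecoyAlert(decoy_id, identity or "unattributed", action)
        self.on_alert(alert)
        return alert
```

A usage example under the same assumptions: `DecoyRegistry(on_alert=print)` followed by `register("/catalog/finance-backup", "d1")` would print an alert the first time any identity lists or reads that path.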

When this is done well, deception complements frameworks such as the MITRE ATLAS adversarial AI threat matrix and CSA MAESTRO agentic AI threat modeling framework, because both stress adversarial behaviour over simple perimeter defence. These controls tend to break down when decoys are not tied to workload identity in highly distributed environments, because defenders can see the lure but not reliably distinguish which autonomous system touched it.

Common Variations, Tradeoffs, and Failure Modes

Denser deception coverage often increases operational overhead, requiring organisations to balance detection value against maintenance cost. In agentic environments, that tradeoff is real because false paths, canaries, and synthetic data all need to remain plausible as systems change. If decoys become stale, agents may ignore them, while defenders may also lose trust in the alerts they generate.

One important variation is whether the environment uses a single agent, a multi-agent workflow, or a tool-using LLM pipeline. Multi-agent systems can amplify deception value because one compromised agent may expose the path for others, but they can also create noisy telemetry that obscures signal. Another edge case is regulated data handling: teams should avoid placing real secrets in traps, even temporarily, because the safer pattern is synthetic lookalikes plus strong access logging. Guidance is not universal here, but current practice favours short-lived decoys, clear ownership, and policy review before deployment. NHIMG’s Ultimate Guide to NHIs — Why NHI Security Matters Now and Top 10 NHI Issues are useful reminders that identity governance and deception cannot be separated.
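
The preference for short-lived decoys with clear ownership can be sketched as a periodic review that lists decoys past their lifetime or lacking an owner. The `Decoy` fields and the `review` function are hypothetical names, and the lifetime values would come from local policy rather than from this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Decoy:
    decoy_id: str
    owner: str              # team accountable for the decoy; empty means unowned
    created_at: datetime    # timezone-aware creation time
    ttl: timedelta          # assumed policy lifetime

    def expired(self, now: datetime) -> bool:
        return now >= self.created_at + self.ttl


def review(decoys: List[Decoy], now: Optional[datetime] = None) -> Dict[str, List[str]]:
    """Return decoy ids that should be rotated or that need an owner."""
    now = now or datetime.now(timezone.utc)
    return {
        "expired": [d.decoy_id for d in decoys if d.expired(now)],
        "unowned": [d.decoy_id for d in decoys if not d.owner.strip()],
    }
```

Running a review like this on a schedule is one way to keep stale decoys from eroding trust in the alerts they generate.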

One practical limit is that deception is weakest in environments where agents only access tightly brokered, heavily cached, or fully abstracted services, because there may be too few observable touchpoints for the decoy to influence attacker behaviour.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A07 | Agentic deception targets reconnaissance, prompt abuse, and tool misuse.
CSA MAESTRO | TR-2 | MAESTRO addresses adversarial behaviour and control-plane visibility for agents.
NIST AI RMF | (none listed) | AI RMF supports governing deceptive controls as part of risk treatment.

Document deception objectives, owners, and monitoring in your AI risk program.