TL;DR: AI agent adversary coverage was extended with 14 agent-focused techniques and subtechniques to address threats that can be manipulated through context, memory, configuration, tools, and data access, according to Zenity. The important shift is that agent behaviour now has to be governed as an identity and access problem, not only as a prompt-safety problem.
At a glance
What this is: Zenity’s collaboration with MITRE ATLAS adds 14 AI agent techniques that map how agents are discovered, influenced, and used to access tools and data.
Why it matters: IAM, NHI, and AI governance teams need a control model that treats agent behaviour as a live identity risk, not just a model-safety issue.
By the numbers:
- 80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%).
- 92% agree governing AI agents is critical to enterprise security, yet only 44% have implemented any policies to do so.
- 96% of technology professionals identify AI agents as a growing security threat, and 66% believe this risk is immediate.
👉 Read Zenity's analysis of AI agent attack techniques added to MITRE ATLAS
Context
AI agent security is becoming an identity governance problem because agents do not just generate content, they act on systems, tools, and data. When attack techniques shift from prompt abuse to configuration tampering, tool discovery, and exfiltration through legitimate actions, conventional LLM guardrails are no longer enough.
Zenity’s work with MITRE ATLAS matters because it turns that shift into a shared threat language for defenders. The article is best read as evidence that AI agents need the same kind of structured attack mapping that machine identities and privileged workflows already require in mature IAM programmes.
Key questions
Q: How should security teams govern AI agents that can use tools and data sources?
A: Security teams should govern AI agents as privileged actors with explicit tool boundaries, data boundaries, and logging. The practical test is whether each tool call is necessary, observable, and constrained to a narrow purpose. If an agent can discover, retrieve, or write data beyond that purpose, the governance model is too loose.
Q: What breaks when AI agent context or memory can be manipulated?
A: When context or memory is mutable and untrusted, the agent’s future decisions can be steered by attacker-controlled inputs. That breaks the assumption that an agent’s next action is based on authorised intent. Teams then lose reliable control over what the agent believes, which tool it chooses, and what data it may expose.
Q: How do you know if AI agent tool access is too broad?
A: Agent tool access is too broad when the operator cannot explain which data source, system, or write action each tool is allowed to reach. A second signal is when configuration reveals credentials or embedded access paths that the business owner did not expect. Broad reach without clear task scoping is the warning sign.
Q: What should organisations do when an AI agent can exfiltrate data through legitimate actions?
A: Organisations should narrow the set of tool actions that can carry sensitive data and apply separate approval or monitoring for write operations. If email, document creation, or CRM updates can be used to move data out of scope, those actions need explicit control points and auditability before the agent is allowed to use them.
Technical breakdown
Context poisoning in AI agents
Context poisoning describes adversaries manipulating the inputs, memory, or working context an AI agent relies on to decide what to do next. In practice, that can mean malicious instructions embedded in a thread, poisoned retrieval content, or altered agent memory that persists across sessions. The key security issue is not only output corruption, but action corruption. Once the agent treats tainted context as trustworthy, it can select tools, retrieve data, or trigger downstream operations that were never intended by the operator.
Practical implication: inventory every place an agent can ingest mutable context and treat those inputs as security boundaries, not convenience features.
Tool discovery and credential harvesting
The article’s new techniques show that agents expose more than prompts. Their configuration, embedded knowledge, connected tools, and retrieval sources can reveal where credentials live and what systems the agent can reach. That makes AI agent environments attractive for adversaries who want to enumerate privileges before using them. When credentials are stored in configuration files, internal documents, or connected services, an agent can become an unwitting discovery and extraction path rather than a passive interface.
Practical implication: review agent configuration, tool manifests, and connected data sources as sensitive assets with their own access controls.
Exfiltration through legitimate tool invocation
A distinctive AI agent failure mode is exfiltration by ordinary action. Instead of stealing data through an obviously malicious transfer, an adversary can prompt the agent to perform a write operation that looks legitimate, such as sending email, updating a record, or generating a file. Because the action is executed through an approved tool, the security signal can be weak or absent. This is why agent security extends beyond prompt content to the authorisation of specific tool actions and the data they can carry.
Practical implication: constrain which tool actions can move sensitive data, and separate read access from write-capable workflows wherever possible.
Threat narrative
Attacker objective: The attacker’s objective is to turn the agent’s own privileges and trusted workflows into a channel for data access and exfiltration without obvious compromise signals.
- Entry occurs when an adversary reaches an AI agent through poisoned context, exposed configuration, or accessible retrieval sources that the agent trusts.
- Escalation follows as the adversary learns the agent’s tools, embedded knowledge, activation triggers, and credential-bearing data sources.
- Impact occurs when the agent uses approved tools to retrieve sensitive data or transmit it through a seemingly legitimate action such as email, document creation, or CRM updates.
Breaches seen in the wild
- Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
- AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
AI agent attack mapping is now an identity control requirement, not a niche threat-model exercise. The addition of agent-focused techniques to MITRE ATLAS shows that defenders need a vocabulary for how agents are discovered, influenced, and made to act. That shift matters because it moves the discussion from model behaviour alone to the privileges, tools, and data paths attached to the actor. Practitioners should treat agent attack mapping as part of access governance, not as a separate AI-only discipline.
Context poisoning is the named failure mode that makes agent governance different from conventional application security. An agent can ingest hostile context, persist it in memory, and then act on it through trusted tools. That is not simply prompt injection in a new wrapper. It is a control failure in which the decision surface itself becomes attacker-controlled. The practitioner conclusion is that trust boundaries around memory, thread state, and retrieval content now need explicit ownership.
Tool discovery creates identity blast radius, because the agent’s effective privilege set is often larger than the operator sees. When configuration, embedded knowledge, and connected services reveal what an agent can reach, the real risk is not only misuse but hidden reach. This is the same governance problem NHI teams know from over-permissioned service accounts, now expressed through agent behaviour. Practitioners should interpret tool transparency as a prerequisite for least privilege, not as a documentation exercise.
Exfiltration via tool invocation shows why approval of the agent is not approval of every action it can take. The article’s techniques expose a gap between identity authentication and action authorisation. An agent may be allowed to operate, yet still be capable of moving sensitive information in ways the operator did not anticipate. That means governance has to move down to the action and data level. Practitioners should re-evaluate which tool calls are actually safe to permit at all.
Named concept: agentic identity attack surface. The article’s core contribution is not just more AI TTPs, but a clearer description of the attack surface created when identity, tools, memory, and retrieval are fused into one actor. That surface can be attacked without breaking the model itself. The implication for practitioners is that AI agent governance must be built around reachable systems, not around model prompts alone.
From our research:
- 80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%), according to AI Agents: The New Attack Surface report.
- Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
- For a broader benchmark on the governance gap, see Top 10 NHI Issues and compare agent controls with your current non-human identity programme.
What this signals
Agentic identity governance is converging with NHI governance faster than most programmes are structured to handle. If agents can discover tools, reuse memory, and invoke write-capable actions, they must be governed like privileged non-human identities with stronger attention to reachable systems than to model output. The immediate programme signal is to align discovery, ownership, and logging around the actor’s real blast radius, not its interface.
Only 44% of organisations have implemented any policies to govern AI agents, according to AI Agents: The New Attack Surface report. That gap suggests policy is lagging behind deployment, which means many teams are already operating with undocumented privilege and unclear accountability. The next practical step is to tie policy decisions to tool access, data reach, and review ownership before the estate grows further.
Context poisoning is becoming the operational boundary teams need to watch. As AI agents move closer to real workflows, the control point shifts from model response quality to the trustworthiness of memory, retrieval, and tool invocation. Security teams should expect their detection and review models to follow that shift, using frameworks such as the MITRE ATLAS adversarial AI threat matrix to keep pace.
For practitioners
- Map agent tool exposure as an access inventory List every tool, service, retrieval source, and credential-bearing configuration item an AI agent can touch, then assign ownership for each one. Use that inventory to separate benign read paths from write-capable actions and to identify where privileged data can flow through legitimate agent behaviour.
- Treat memory and thread state as hostile inputs Review where an agent persists context across turns or sessions, and define controls for poisoning, reuse, and retention. If a thread or memory store can influence future actions, it needs the same scrutiny as any other security-relevant input boundary.
- Separate tool permission from data permission Do not assume that allowing an agent to use a tool is equivalent to allowing it to move sensitive data through that tool. Restrict write actions, approve only narrowly scoped outputs, and monitor for legitimate-looking exfiltration paths such as email, document updates, and CRM writes.
- Integrate agent TTPs into threat modelling and detection Use the new ATLAS agent techniques to update threat models, detection logic, and red-team scenarios. Focus on agent discovery, configuration tampering, credential harvesting, and tool-based exfiltration so defenders can spot abuse patterns before they become routine.
Key takeaways
- AI agents now need to be governed as actors with tools, data access, memory, and action paths, not as prompt engines alone.
- The new ATLAS techniques show that context poisoning, credential discovery, and legitimate-looking exfiltration are core agent risk patterns, not edge cases.
- Practitioners should tighten tool boundaries, separate read and write paths, and update threat models before agent privilege becomes normalised.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and MITRE ATLAS address the attack and risk surface, while NIST AI RMF set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | Agent context poisoning and tool misuse map directly to agentic application threats. | |
| MITRE ATLAS | The article extends ATLAS with agent-focused techniques and subtechniques. | |
| NIST AI RMF | Agent governance and accountability depend on structured AI risk management. |
Map agent tools, memory, and retrieval paths against OWASP Agentic AI risks before expanding deployment.
Key terms
- Agentic Identity Attack Surface: The collection of tools, memory, retrieval sources, configuration, and action paths that an AI agent can reach while operating. It matters because the real risk is often the actor’s reachable privileges, not the model itself. For autonomous or semi-autonomous agents, this surface is the unit of governance.
- Context Poisoning: A technique in which attackers manipulate the information an AI agent relies on to make decisions, including thread content, memory, or retrieved context. The goal is to steer future actions without breaking the model directly. In agent governance, poisoned context is an input control problem with identity consequences.
- Tool Invocation: The act of an AI agent calling an external function, service, or workflow to perform work in a connected system. Tool invocation is security-relevant because it turns intent into action. If a tool can read, write, or transmit sensitive data, then its permissions and logging need identity-level controls.
- Retrieval Augmented Generation Database: A data store used to supply an AI system with internal or contextual information during generation. When agents can query it, the database becomes part of the actor’s effective access path. If sensitive material lands there without tight access governance, it can become a discovery and exfiltration source.
Deepen your knowledge
NHI governance, agentic AI identity, and machine identity security are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or governance in your organisation, it is worth exploring.
This post draws on content published by Zenity: Zenity Labs and MITRE ATLAS collaborate to advance AI agent security with the first release of agent-focused TTPs. Read the original.
Published by the NHIMG editorial team on 2025-10-21.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org