TL;DR: AI security risks have moved from theory to production reality, with incidents ranging from prompt injection and sensitive data exposure to excessive agent permissions and supply-chain abuse, according to Cyera Research. The governance gap is now an identity and data access problem, not just a model-safety problem.
At a glance
What this is: Cyera Research argues that real-world AI incidents now map directly to OWASP LLM risk categories, showing that production AI security failures are increasingly NHI governance failures.
Why it matters: IAM and NHI teams need to treat AI assistants and agents as privileged identities whose data access, tool use, and output handling require explicit controls.
👉 Read Cyera's analysis of AI security risks becoming production reality
Context
AI security now overlaps directly with NHI governance because many of the risky systems are not passive applications. They are agents or assistants that can read data, call tools, write files, and influence workflows with privileges that look more like service identities than user sessions. When those systems are allowed to operate across email, code, and cloud data, traditional IAM assumptions break down quickly.
Cyera frames the issue as a gap between AI access patterns and the controls most organisations already have in place. That gap is visible in the recurring incidents the post cites, where prompt injection, data exposure, and excessive permissions turn normal AI usage into an access problem. For IAM and security leaders, the starting point is not model quality but control of identity, privilege, and data scope.
Key questions
Q: How should security teams govern AI agents that can access business systems?
A: Treat AI agents as non-human identities with bounded permissions, explicit owners, and time-limited access. Map each agent to the data and tools it can reach, then require human approval for high-impact actions such as deletion, external sharing, or production changes. If the agent can act autonomously, least privilege matters more, not less.
Q: When does AI access become an identity risk instead of a productivity feature?
A: AI access becomes an identity risk when the system can read sensitive data, trigger actions, or persist context without clear limits. At that point, the issue is no longer model output quality. It is the blast radius of the identity behind the model and the data it can touch.
Q: What is the difference between protecting an AI model and protecting an AI identity?
A: Protecting a model focuses on prompts, outputs, and training integrity. Protecting an AI identity focuses on who or what can use the model, what data it can access, and which actions it can take. Both matter, but IAM and NHI teams usually own the identity side of the risk.
Q: Why do AI assistants make zero trust harder to implement?
A: AI assistants often need broad, dynamic access to data and tools, which can conflict with zero trust principles if access is not continuously verified. To stay aligned with zero trust, organisations need strong authentication, policy checks at runtime, and step-up controls before an assistant can perform sensitive actions.
Technical breakdown
Prompt injection turns untrusted content into executable intent
Prompt injection works because large language models do not reliably separate instructions from data. If an AI system reads repository files, emails, web pages, or tickets as context, an attacker can hide malicious instructions inside that content and influence the model’s behaviour. The risk increases sharply when the model is connected to tools or write permissions, because the manipulated output can become an operational action. In practical terms, prompt injection is not just a chat problem. It is a control-plane problem for any AI workflow that trusts external text as system context.
Practical implication: Treat every external or user-controlled input as untrusted context and constrain tool use with explicit approval gates.
Sensitive data disclosure is a data-access and memory problem
Sensitive information disclosure in AI systems happens when the model, its connectors, or its supporting infrastructure can access more data than it should. That includes chat histories, API keys, internal communications, and regulated data copied into prompts. The core issue is not only leakage in the output. It is also the accumulation of data across retrieval, caching, memory, and summarisation layers that were not designed for least privilege. In other words, the AI system often becomes a concentrator of sensitive data before it becomes a leak point.
Practical implication: Classify what the model can read, minimise connector scope, and block sensitive data from entering prompts whenever possible.
Excessive agent permissions create identity blast radius
AI agents become a governance problem when they are granted broad permissions to read, write, delete, or trigger workflows across systems. At that point, the security question is no longer whether the model is accurate. It is whether the identity behind the model has an appropriate blast radius. Excessive agency means a small misclassification, a poisoned instruction, or a compromised connector can produce changes at machine speed. That is the same failure pattern seen in over-privileged service accounts, except the decision cycle is much faster and harder to inspect.
Practical implication: Apply least privilege, JIT access, and human approval for high-impact actions before agents can touch production assets.
Threat narrative
Attacker objective: The attacker wants to turn AI trust boundaries into execution boundaries so that the agent performs harmful actions on the attacker’s behalf.
- Entry occurs when an attacker embeds malicious instructions in content that an AI system will ingest as trusted context, such as repository files, emails, or web content.
- Escalation follows when the model has connector permissions or write access, allowing the injected instruction to influence code, data access, or workflow execution.
- Impact occurs when the agent or assistant carries out destructive, deceptive, or exfiltration actions at machine speed under an apparently legitimate identity.
Breaches seen in the wild
- Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
- AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
AI security is now an NHI governance problem because autonomous systems behave like privileged identities. Once a model can read data, call tools, or write outputs that affect production, it belongs in the same governance conversation as service accounts and machine identities. The control question is no longer whether the AI is useful, but whether its identity, scope, and approval model are explicit. Practitioners should govern AI systems as identities with bounded authority, not as generic software features.
Prompt injection exposes a runtime governance gap that traditional application security does not close. The failure is not simply malicious text. It is the combination of untrusted content, overbroad context ingestion, and operational permissions. That is why static controls alone do not hold up when agents act across dynamic workflows. Security programmes need runtime guardrails, content trust boundaries, and action-level policy enforcement, not just pre-deployment review. Practitioners should assume the input channel is contested.
Sensitive data exposure in AI systems is often a symptom of poor data entitlements, not model misuse. When an assistant can summarise confidential emails or surface secrets from connected systems, the root issue is usually over-access. That means data classification, connector scoping, and access reviews must move into AI governance. The practical conclusion is simple: if the AI can see too much, the incident is already in progress.
Excessive agency creates identity blast radius, and that is the concept teams should measure. An AI agent with delete, write, or workflow privileges can do more damage in less time than a human operator in the same seat. This is why least privilege, JIT access, and human approval remain central even in agentic environments. Practitioners should measure how far a compromised agent can move, not just whether it can be detected after the fact.
From our research:
- 72% of organisations have experienced or suspect they have experienced a breach of non-human identities, according to The 2024 ESG Report: Managing Non-Human Identities.
- Enterprises that have experienced a compromised NHI averaged 2.7 separate incidents in the past 12 months, according to The 2024 ESG Report: Managing Non-Human Identities.
- For the operational angle, see NHI Lifecycle Management Guide for provisioning, rotation, and offboarding patterns that apply directly to AI identities.
What this signals
Ephemeral credential trust debt: the more quickly AI systems are given access to data and tools, the more difficult it becomes to prove that their privileges are still justified. That debt accumulates across connectors, memory layers, and delegated actions, so identity reviews need to move faster than traditional quarterly cadence. Teams should pair AI onboarding with continuous entitlement review and runtime logging, not annual policy refreshes.
If an AI assistant can summarise emails, inspect repositories, or trigger workflows, the organisation has already created a privileged non-human identity. That should trigger the same lifecycle questions raised by service accounts: who owns it, what data can it see, how is access revoked, and how do you know when its scope has drifted? For policy baselines, align the programme with the NIST Cybersecurity Framework 2.0 and use least privilege as the default operating model.
For practitioners
- Implement prompt trust boundaries Separate user input, retrieved content, and system instructions so the model cannot treat untrusted text as operating guidance. Limit which sources can enter the context window and require human review for actions triggered by external content.
- Reduce connector scope for AI assistants Grant the narrowest possible read and write permissions to every AI assistant, then revisit those permissions as use cases expand. Map each connector to a data owner and remove access to sensitive mailboxes, repositories, and files that are not essential.
- Enforce least privilege for agent actions Treat agentic workflows like privileged service identities. Use JIT access for destructive actions, require approval for production changes, and log each tool call so privilege can be reviewed after an incident or policy breach.
- Classify data before it reaches the model Block secrets, regulated data, and high-risk internal content from entering prompts unless there is a documented business need. Pair classification with DLP rules that cover prompts, outputs, and connected storage locations.
Key takeaways
- AI security failures increasingly look like NHI governance failures because agents and assistants operate with real privileges.
- The evidence points to a control gap in data access, connector scope, and runtime permissions rather than a single model flaw.
- Security teams should govern AI systems as identities with bounded authority, time limits, and approval gates for high-impact actions.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST AI RMF set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | The article maps real incidents to agentic AI risk categories. | |
| OWASP Non-Human Identity Top 10 | NHI-02 | Sensitive data exposure and over-privilege align with NHI identity governance risks. |
| NIST AI RMF | AI RMF supports governance, mapping, and monitoring for autonomous AI risk. |
Assign ownership, monitor behaviour, and document controls for each AI system that can act independently.
Key terms
- Agentic AI: Software that can decide and act with some execution authority instead of only returning text. In security terms, agentic AI behaves like a non-human identity because it can hold credentials, use tools, and change data or systems within the permissions it is given.
- Prompt injection: A technique where an attacker hides instructions inside content that an AI system processes as context. The model may treat those instructions as legitimate guidance, which can lead to unsafe outputs or real actions when the system is connected to tools, files, or workflows.
- Excessive agency: A condition where an AI system is given more operational authority than its task requires. The risk is not just poor output. It is that mistakes, manipulation, or compromise can produce destructive actions at machine speed across the systems the agent can reach.
- Identity blast radius: The amount of damage a compromised identity can cause before it is contained. For AI systems, this includes every data source, tool, and workflow the agent can touch, which makes blast radius a practical measure of whether access has been scoped responsibly.
Deepen your knowledge
AI agent governance and non-human identity controls are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building controls for assistants, coding agents, or workflow automations, it is worth exploring.
This post draws on content published by Cyera: When AI Security Risks Become Reality. Read the original.
Published by the NHIMG editorial team.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org