TL;DR: AI systems expand the attack surface across data, model, application, and infrastructure layers, while traditional tools struggle to detect poisoning, prompt injection, model theft, shadow AI, and data leakage, according to Cyera Research. The governance gap is no longer about better perimeter security, but about identity, data, and runtime controls built for autonomous systems.
At a glance
What this is: This guide breaks AI security into ten risk areas and argues that conventional security tools do not map cleanly to AI workloads or their dependencies.
Why it matters: For IAM and NHI practitioners, the key issue is that AI environments introduce new identities, new access paths, and new governance failure modes that existing controls often miss.
By the numbers:
- 89% of organizations have zero AI service visibility into what’s being used within the company.
👉 Read Cyera's guide to the 10 critical AI security risks and controls
Context
AI security is not just a stronger version of application security. AI systems depend on large data pipelines, model weights, exposed inference endpoints, and external integrations, which means the attack surface shifts as the system changes. That creates an NHI governance problem because AI workloads, APIs, and agents can behave like non-human identities with access that is hard to inventory, scope, and review.
The article frames ten risks, but the broader issue is control mismatch: classic tools were designed for static software and known trust boundaries, not for systems that learn, generate, and call external services. That is a typical starting point for organisations moving fast on AI adoption, and it is exactly where visibility and privilege discipline break down first.
Key questions
Q: How should security teams govern AI systems that act like non-human identities?
A: Security teams should treat AI systems as governed non-human identities with scoped permissions, monitored activity, and explicit ownership. That means inventorying every model, API key, token, connector, and service account involved in the workflow. The practical goal is to reduce identity blast radius, not just improve model security.
Q: What is the difference between prompt injection and model theft?
A: Prompt injection changes what an AI system does by steering its runtime behaviour, while model theft tries to reconstruct the model’s capabilities through repeated queries or probing. One targets action and output integrity. The other targets intellectual property and system replication. Both require controls at the interface, not only at storage.
Q: How can organisations reduce shadow AI risk without slowing adoption?
A: Organisations should reduce shadow AI risk by discovering unsanctioned tools first, then creating a clear approval path for approved services. Security teams need logging, rate limits, and policy enforcement on AI usage so users can adopt tools safely. The objective is visible governance, not blanket prohibition.
Q: Why do traditional security tools miss many AI security risks?
A: Traditional tools are tuned for static systems, known boundaries, and conventional traffic patterns. AI introduces dynamic prompts, model memory, external tool calls, and data-dependent behaviour that do not fit those assumptions. That gap is why AI security needs identity, runtime, and data controls together, not in isolation.
Technical breakdown
Why AI systems create a different identity and data plane
AI environments combine training data, model parameters, inference endpoints, and plugin or API integrations into a single operational surface. That surface is dynamic, so risk can emerge in the data feeding the model, the model itself, or the runtime path that lets users and tools interact with it. Traditional controls often inspect packets, endpoints, or file movement, but AI risk also lives in prompts, memory, and outputs. For NHI governance, this means the system is not one workload with one identity. It is a chain of service accounts, tokens, APIs, and embedded dependencies that all need separate control logic.
Practical implication: Map AI systems as identity chains, not as single applications, and assign governance to each access path.
Prompt injection, jailbreaking, and unauthorized functional use
Prompt injection works by steering the model to ignore its intended instructions or safety logic. Direct attacks tell the model to override rules, while indirect attacks hide malicious instructions in content the model later consumes. Jailbreaking is the broader pattern of bypassing guardrails to get the system to reveal data, generate disallowed content, or perform actions outside policy. The security issue is not only bad answers. It is unauthorized function use, where the model becomes a tool for credential harvesting, social engineering, or policy evasion. That makes prompt handling an access-control problem, not only a content-moderation problem.
Practical implication: Treat prompt channels as governed inputs and add validation, filtering, and runtime monitoring to them.
Shadow AI and model theft as governance blind spots
Shadow AI appears when employees or teams use AI services outside approved controls, creating blind spots in logging, review, and data handling. Model theft is a related risk where attackers probe public APIs, infer behaviour from outputs, and reconstruct capabilities without touching source code or internal data. Both risks exploit openness at the interface layer. If query volume is unlimited or unmonitored, the public endpoint becomes a discovery surface. For IAM and NHI teams, the lesson is that exposed AI services need the same entitlement discipline, monitoring, and rate controls that high-risk machine identities already require.
Practical implication: Inventory all AI services, restrict public interfaces, and enforce logging and rate limits as standard controls.
Threat narrative
Attacker objective: The attacker wants to turn AI access into data exposure, unauthorized actions, or reusable model intelligence.
- Entry occurs through exposed AI interfaces, shadow deployments, or poisoned content that reaches the model or its connected tools.
- Escalation follows when the attacker uses prompt steering, repeated querying, or unreviewed integrations to expand control beyond intended behaviour.
- Impact arrives as data leakage, model theft, unsafe automation, or manipulated outputs that damage operations and trust.
Breaches seen in the wild
- Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
- Reviewdog GitHub Action supply chain attack — reviewdog/action-setup GitHub Action supply chain attack exposed secrets.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
AI security governance fails when organisations treat AI as software instead of an identity-bearing system. AI pipelines contain service accounts, API keys, inference endpoints, and external connectors that all create access decisions. That means the control problem is not only model integrity, but who or what can act through the model. Practitioners should govern AI as an NHI estate, not as a single application category.
Shadow AI is the clearest signal that discovery has not caught up with adoption. If teams cannot see which models, SaaS tools, and API-connected agents are active, they cannot apply policy consistently or prove data handling boundaries. The practical consequence is that inventory, not policy wording, becomes the first control to harden.
Prompt injection creates a trust boundary problem that classic content filters do not solve. When inputs can redirect model behaviour, the issue becomes runtime authorisation and guardrail enforcement. Security teams should assume that any model exposed to external content can be steered unless controls exist at the input, policy, and execution layers.
Identity blast radius is the right concept for AI risk management. The real question is not whether a model is accurate, but how far a compromised token, connector, or prompt path can spread damage. That pushes organisations toward least privilege, short-lived credentials, and explicit approval for high-risk tool use.
From our research:
- Only 44% of organisations have implemented any policies to manage their AI agents, despite 92% agreeing that governing AI agents is critical to enterprise security, according to the 2026 Infrastructure Identity Survey.
- Only 13% of organisations feel extremely prepared for the reality of agentic AI, which shows how quickly adoption is outrunning control maturity.
- OWASP NHI Top 10 maps the most common agentic application risks to concrete defensive priorities.
What this signals
Identity blast radius now defines AI programme risk. With 70% of organisations granting AI systems more access than they would give a human employee performing the same job, the control question is no longer whether AI should be enabled, but how much damage a compromised agent, token, or connector can cause before detection. Teams should align AI access reviews to short-lived credentials and tightly scoped tool permissions, then monitor those paths as NHI.
Security leaders should expect AI governance to move from policy drafting to evidence collection. The practical challenge is proving which systems are active, which data they touch, and which approvals back their runtime actions. That makes asset discovery, entitlement review, and logged decision paths the minimum viable programme for AI oversight.
The strongest near-term signal is organisational inconsistency. Some teams are adopting AI faster than their control model can absorb, while others are still deciding where ownership sits. Programmes that anchor AI oversight in existing IAM and NHI controls will move faster than those building a separate governance stack from scratch.
For practitioners
- Implement AI asset discovery across all environments Inventory models, datasets, APIs, plugins, and shadow AI tools so every AI-related dependency is visible to security and IAM teams.
- Scope AI access to least privilege Review service accounts, tokens, and connector permissions for AI workloads, and remove broad access that is not required for the task.
- Add runtime controls for prompt and output handling Use input validation, output filtering, and anomaly detection to reduce prompt injection, jailbreaking, and unauthorized data exposure.
- Enforce rate limits and query monitoring Apply throttling, logging, and alerting on public AI endpoints to detect model extraction attempts and unusual usage patterns early.
- Build AI governance into existing review cycles Include AI systems in access reviews, data classification checks, and third-party risk assessments so they do not remain outside standard controls.
Key takeaways
- AI security risk is distributed across data, model, application, and infrastructure layers, so single-point controls miss too much.
- Shadow AI and overexposed interfaces create the largest governance blind spots because they hide who can act, query, or connect.
- IAM and NHI teams should extend least privilege, monitoring, and review cycles into AI workflows before adoption outruns visibility.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | NHI-01 | Prompt injection and unauthorized function use map directly to agentic application abuse. |
| OWASP Non-Human Identity Top 10 | NHI-03 | The guide stresses AI credential exposure, rotation, and overprivilege in model and API access. |
| NIST CSF 2.0 | PR.AC-4 | AI services need least privilege and identity governance under the Protect function. |
Add input validation, output controls, and execution guards to every externally reachable AI workflow.
Key terms
- Shadow AI: Shadow AI is the use of AI tools, models, or agents outside approved security and governance controls. It creates blind spots in logging, data handling, and access management, which means teams cannot reliably prove what was used, who used it, or what data moved through it.
- Prompt Injection: Prompt injection is a technique that manipulates an AI system through crafted input so it ignores intended instructions or safety rules. It can be direct or indirect, and the risk extends beyond bad answers because the model may reveal data, trigger tools, or take unintended actions.
- Model Theft: Model theft is the extraction of a model’s behaviour or capabilities through probing, repeated querying, or inference attacks rather than direct source-code access. The attacker aims to reconstruct a usable copy, which makes public APIs and weak monitoring a primary exposure point.
- Identity Blast Radius: Identity blast radius is the amount of damage a compromised non-human identity can cause before detection and containment. In AI environments, it depends on how broadly a token, connector, or service account can query data, call tools, and trigger downstream actions.
Deepen your knowledge
AI governance, shadow AI discovery, and identity blast-radius control are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are adapting IAM controls for agentic systems, it is worth exploring.
This post draws on content published by Cyera: Critical AI Security Risks and How to Prevent Them. Read the original.
Published by the NHIMG editorial team.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org