TL;DR: AI threat modeling adapts traditional threat analysis to AI systems by mapping model, data, and infrastructure risks such as poisoning, prompt injection, and unauthorized inference, according to WitnessAI. The discipline is now essential because current application-security patterns do not fully address AI lifecycle behavior or output-driven abuse.
At a glance
What this is: This is an independent analysis of AI threat modeling and its role in identifying model, data, and infrastructure risks across the AI lifecycle.
Why it matters: It matters because IAM, NHI, and security teams need a governance model for AI systems that can access data, expose outputs, and influence decisions in ways traditional app security does not cover.
👉 Read WitnessAI's guide to AI threat modeling across the AI lifecycle
Context
AI threat modeling is the structured process of identifying where an AI system can be attacked, what can be abused, and which controls reduce that exposure. In practice, it extends familiar threat-modeling methods to model inputs, model outputs, data flows, and inference APIs, which makes it directly relevant to AI governance and identity control.
The governance gap is that AI systems can combine access, inference, and workflow influence in ways that are harder to assess with conventional application-security reviews. For security teams, the question is no longer only whether the model works, but what it is allowed to do, what data it can reach, and how its outputs affect downstream systems.
Key questions
Q: How should security teams apply threat modeling to AI systems?
A: Security teams should model AI systems as a combination of data pipelines, inference surfaces, outputs, and connected workflows. That means mapping what the model can read, what it can generate, and what downstream actions those outputs can trigger, then adding controls for access, validation, monitoring, and approval where risk is highest.
Q: Why do AI systems create governance gaps that standard app security misses?
A: AI systems create governance gaps because their behavior depends on prompts, model state, external data, and connected tools, not just static code. A standard application review may miss information disclosure, prompt manipulation, or workflow abuse because the risk emerges from how the system is used at runtime.
Q: What do teams get wrong about AI threat modeling?
A: Teams often treat AI threat modeling as a one-time design exercise instead of a living governance process. That misses retraining, prompt updates, new integrations, and shifting access paths, which are often the real sources of risk once the AI system is in production.
Q: How can organisations know whether AI threat modeling is working?
A: AI threat modeling is working when it changes decisions about allowed data sources, approved tools, output handling, and human approval points. If the process produces diagrams but no access changes, no validation tests, and no response constraints, it is not reducing risk in practice.
Technical breakdown
AI threat modeling for model, data, and output risk
AI threat modeling treats the model, its data pipeline, and its inference surface as linked attack surfaces. That means analysing training data poisoning, prompt injection, model inversion, unauthorized inference, and sensitive output leakage as part of one system rather than separate issues. Traditional threat models often focus on code paths and infrastructure boundaries, but AI adds probabilistic behavior, external model dependencies, and output-driven misuse. The practical value is in forcing teams to map where the AI system can be influenced, what integrity controls exist, and where sensitive data can escape through responses or downstream automation.
Practical implication: Map every AI data flow and output path before deployment, then attach explicit access and validation controls to each stage.
Why AI lifecycle threat modeling changes control design
AI systems evolve after initial deployment because models are retrained, prompts are updated, tools are added, and dependencies change. That makes one-time review inadequate. A lifecycle threat model captures how risk shifts from training to inference and from internal testing to production use. It also helps teams identify where supply chain dependencies, open-source components, or hosted model services create inherited exposure. For identity teams, this matters because AI systems may hold credentials, query sensitive sources, or trigger actions that should be governed as access, not just as application behavior.
Practical implication: Reassess the threat model whenever the model, dataset, prompt set, or connected tools change.
STRIDE and attack trees remain useful when adapted to AI
STRIDE still provides a useful structure if it is applied to AI-specific failure modes. Spoofing can mean impersonating trusted input sources, tampering can mean poisoning training data or prompts, information disclosure can mean leaking sensitive context through generated output, and elevation of privilege can mean an AI workflow reaching systems beyond its intended scope. Attack trees add detail by showing how an attacker can chain prompt manipulation, API abuse, and workflow misuse. The key is to model AI-specific entry points rather than forcing generic software assumptions onto the system.
Practical implication: Use STRIDE and attack trees together to test whether AI controls fail at input, inference, or downstream action stages.
Threat narrative
Attacker objective: The attacker wants to use the AI system as a trusted interface for data exposure, workflow abuse, or decision manipulation.
- Entry occurs when an attacker manipulates AI inputs, abuses exposed AI APIs, or targets a weak trust boundary in the model pipeline.
- Escalation follows when the attacker turns model behavior, output access, or connected workflow permissions into broader data exposure or action abuse.
- Impact is achieved when the AI system leaks sensitive information, generates harmful output, or drives unauthorized downstream decisions.
Breaches seen in the wild
- Cisco DevHub NHI breach — IntelBroker exploited exposed Cisco credentials, API tokens and keys in DevHub.
- DeepSeek breach — DeepSeek breach exposed 1M+ log lines and sensitive secret keys.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
AI threat modeling is now an identity governance problem, not just an application-security exercise. Once an AI system can query data, generate output, or trigger downstream actions, the relevant question becomes who or what it is allowed to access. That shifts the center of gravity from code review to governance of AI permissions, data reach, and action scope. Practitioners should treat AI threat models as part of access design, not as a separate security artifact.
Model, data, and inference paths create a new governance surface that traditional threat models under-describe. Classic app threat modeling assumes the application is a bounded component with stable inputs and outputs. AI systems are more fluid because prompts, tools, retrieval sources, and model behavior all change the attack surface. The practical implication is that security programmes need to track AI identity exposure wherever the model can read, write, or act.
Runtime controls matter because AI risk is shaped by what the system can do after deployment. A threat model that stops at launch misses prompt evolution, model updates, and connected workflow expansion. That is why AI security governance must include continuous review of access, data paths, and tool connectivity. Practitioners should assume that AI risk grows as operational scope expands unless governance moves with it.
AI threat modeling should connect human IAM, NHI, and autonomous decision paths in one view. In many environments, the AI system is not the only identity issue. It may sit between human users, service accounts, and automation pipelines, which means a single weak assumption can cascade across all three domains. Practitioners should use AI threat models to expose where identity boundaries collapse under workflow automation.
Threat modeling only works when it changes decisions, not when it becomes documentation theater. The article correctly frames the discipline as a strategic tool, but the discipline has to drive concrete choices about allowed capabilities, connected data sources, and response boundaries. Security teams should use it to decide what the AI system must never be allowed to reach.
From our research:
- 1 in 4 organisations are already investing in dedicated NHI security capabilities, with an additional 60% planning to do so within the next twelve months, according to The State of Non-Human Identity Security.
- Lack of credential rotation is cited as the top cause of NHI-related attacks by 45% of organisations, followed by inadequate monitoring and logging at 37%, according to The State of Non-Human Identity Security.
- If AI security governance is moving toward identity-aware controls, practitioners should also review Top 10 NHI Issues for the operational patterns that keep recurring across machine access.
What this signals
AI threat modeling is becoming the control plane for AI governance. As AI systems move from experimentation into production workflows, security teams need a repeatable way to decide which data sources, outputs, and tool connections are acceptable. The issue is not whether a model can be attacked, but whether the programme can explain and constrain its access before the model reaches operational scale.
With 85% of organisations lacking full visibility into third-party vendors connected via OAuth apps, the identity problem is already broader than the model itself. AI systems often sit inside delegated access chains, which means governance fails when teams only look at the model boundary. Security and IAM teams should extend threat modeling to the surrounding identities, approvals, and integrations that make the AI useful in the first place.
Model risk and identity risk are converging into one operating problem. That means the next step is not another diagram, but a programme that ties AI access, secrets handling, and workflow approval into one review cycle. Teams that separate those controls will keep missing how AI systems actually reach data and execute actions.
For practitioners
- Build AI threat models around actual data flows Document how prompts, retrieval sources, model outputs, and downstream systems connect before production rollout. Include credentials, API calls, and any tool or workflow dependencies so the review reflects the real attack surface.
- Classify AI access as a governance control Treat read, write, and action permissions for AI systems like access entitlements. Define what the system can see, what it can influence, and where human approval is required before sensitive actions occur.
- Re-run the threat model after each AI change Update the assessment when prompts change, models are retrained, new tools are added, or connected data sources expand. A stale model is a broken control because the exposure surface moves with the system.
- Test for output leakage and unauthorized inference Exercise the model with adversarial inputs, sensitive prompts, and malformed requests to see whether it reveals protected information or bypasses intended boundaries. Validate both the model and the surrounding access controls.
Key takeaways
- AI threat modeling extends threat analysis into model behavior, data flow, and output abuse, which standard application security does not fully cover.
- The most material AI risks are runtime risks, because prompts, tools, and connected systems can change after initial deployment.
- Security teams should govern AI access, approval boundaries, and data reach as part of identity control, not as a standalone documentation exercise.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 address the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | AI model and output abuse map to agentic system threat surfaces. | |
| NIST AI RMF | Risk management for AI systems fits the article's lifecycle and governance focus. | |
| NIST CSF 2.0 | PR.AC-4 | AI access to data and tools depends on controlled entitlements. |
Model AI access paths, tool use, and output handling as attack surfaces that need explicit governance.
Key terms
- AI Threat Modeling: A structured process for identifying how an AI system can be attacked, misused, or made to reveal sensitive information. It extends traditional threat modeling to include prompts, model behavior, training data, outputs, and connected tools so teams can govern the system’s real attack surface.
- Model Inversion: An attack that tries to reconstruct sensitive training data or hidden attributes from a model’s outputs or behavior. In practice, it matters because even a well-trained model can leak information through repeated queries, overexposed outputs, or poorly controlled inference access.
- Prompt Injection: A manipulation technique where an attacker supplies crafted text that changes how a model interprets instructions. It is especially important in AI systems that act on retrieved content or user input, because the attack can redirect behavior without altering the underlying code.
- Inference API: The interface used to send inputs to a trained model and receive outputs in return. It is a key governance point because it can expose sensitive data, enable abuse at scale, and become a bridge between the model and downstream systems that rely on its answers.
Deepen your knowledge
NHI governance, agentic AI identity, and machine identity security are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or governance in your organisation, it is worth exploring.
This post draws on content published by WitnessAI: AI threat modeling for AI and ML security. Read the original.
Published by the NHIMG editorial team on 2026-01-22.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org