What should security teams control in AI-powered phishing simulations?

Why This Matters for Security Teams

AI-powered phishing simulations are not just email tests. They are controlled social-engineering exercises that can touch threat intelligence, identity data, vendor references, and delivery infrastructure. If teams lose control of any of those inputs, a training exercise can become a data exposure event, especially when simulation content is forwarded, copied, or reused outside the intended workflow. That is why the control question starts with provenance and scope, not just template quality.

This also matters because AI can generate convincing variants at scale, which increases the blast radius of weak governance. Current guidance suggests treating the simulation pipeline as a governed workflow rather than a one-time campaign. That means validating the source of threat intelligence, limiting identity attributes to what is necessary, and reviewing the channels used to send the simulation. The NIST Cybersecurity Framework 2.0 is useful here because it reinforces governance, risk oversight, and control of external dependencies, while NHIMG’s The State of Non-Human Identity Security highlights visibility gaps and over-privileged access as common failure modes in adjacent identity workflows. In practice, many security teams discover an unintended disclosure only after a phishing template has already circulated beyond the exercise audience.

How It Works in Practice

Effective control of AI-powered phishing simulations starts by separating four layers: intelligence input, targeting logic, template generation, and delivery. The intelligence source should be approved, traceable, and current, especially if it includes vendor names, internal systems, or realistic incident themes. The targeting layer should use only the identity attributes needed to make the simulation credible, such as role, department, or region, and should exclude sensitive attributes unless there is a documented training reason.
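The targeting layer described above can be sketched as an allow-list filter applied to each directory record before it reaches template generation. This is a minimal illustration rather than a reference implementation: the function names, the record fields, and the choice of `role`, `department`, and `region` as the allow-list are assumptions drawn from the examples in the paragraph, not from any particular simulation platform.

```python
from typing import AbstractSet, Dict, List, Mapping

# Assumed allow-list: the attributes the text names as usually sufficient.
# A real programme would document any additions with a training reason.
ALLOWED_ATTRIBUTES = frozenset({"role", "department", "region"})


def minimise_attributes(
    record: Mapping[str, str],
    allowed: AbstractSet[str] = ALLOWED_ATTRIBUTES,
) -> Dict[str, str]:
    """Return only the allow-listed attributes from a directory record."""
    return {key: value for key, value in record.items() if key in allowed}


def dropped_attributes(
    record: Mapping[str, str],
    allowed: AbstractSet[str] = ALLOWED_ATTRIBUTES,
) -> List[str]:
    """List the attribute names that were withheld, for audit logging."""
    return sorted(key for key in record if key not in allowed)


# Hypothetical directory record used only for illustration.
example_record = {
    "role": "Analyst",
    "department": "Finance",
    "region": "EMEA",
    "manager": "J. Doe",
    "salary_band": "C",
}
safe_record = minimise_attributes(example_record)
withheld = dropped_attributes(example_record)
```

Filtering before generation, instead of asking the model to ignore fields, means sensitive attributes never enter the prompt context at all.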

Template approval should be explicit, ideally with a human review step before any message is sent. AI can speed up drafting, but it should not be allowed to invent unsafe prompts, overfit to private role data, or mimic privileged workflows without oversight. Delivery should also be controlled through approved senders, rate limits, and scoped distribution lists so simulations do not leak into broader mail streams or external recipients.
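One way to picture the delivery controls above is a pre-send check that returns a list of policy violations. The sketch below is illustrative only: `DeliveryPolicy`, `check_delivery`, and the example addresses are hypothetical, and the rate limit is simplified to a per-batch cap where a production system would more likely enforce limits over time at the mail service.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class DeliveryPolicy:
    """Hypothetical policy object; field names are assumptions."""
    approved_senders: FrozenSet[str]
    allowed_recipient_domains: FrozenSet[str]
    max_messages_per_batch: int


def check_delivery(
    sender: str,
    recipients: Sequence[str],
    policy: DeliveryPolicy,
) -> List[str]:
    """Return human-readable violations; an empty list means the batch may go."""
    violations = []
    if sender.lower() not in policy.approved_senders:
        violations.append(f"sender not approved: {sender}")
    if len(recipients) > policy.max_messages_per_batch:
        violations.append(
            f"batch of {len(recipients)} exceeds limit of "
            f"{policy.max_messages_per_batch}"
        )
    for recipient in recipients:
        # Everything after the last "@" is treated as the domain.
        domain = recipient.rpartition("@")[2].lower()
        if domain not in policy.allowed_recipient_domains:
            violations.append(f"recipient outside scope: {recipient}")
    return violations


# Illustrative policy using reserved example domains.
policy = DeliveryPolicy(
    approved_senders=frozenset({"awareness@training.example.com"}),
    allowed_recipient_domains=frozenset({"example.com"}),
    max_messages_per_batch=2,
)
problems = check_delivery(
    "awareness@training.example.com", ["a@example.com"], policy
)
```

Returning every violation at once, not just the first, gives the reviewer the full picture of why a batch was blocked.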

Practitioners can use a simple control model:

Approve the threat source before generation begins.

Minimise identity attributes used for personalisation.

Review the final template for sensitive references and hidden instructions.

Constrain the delivery mechanism to the training platform or sanctioned mail service.
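The four steps above can be expressed as a release gate that blocks a campaign until every control has been signed off. The following is a sketch under stated assumptions: the `SimulationRelease` record and its field names are hypothetical, and in practice each flag would be set by a separate approval step, not by the campaign author.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SimulationRelease:
    """Hypothetical sign-off record; one flag per control in the model."""
    threat_source_approved: bool
    attributes_minimised: bool
    template_reviewed: bool
    delivery_constrained: bool


# Maps each flag to the control it represents, in the order listed above.
CONTROL_LABELS = {
    "threat_source_approved": "threat source approved before generation",
    "attributes_minimised": "identity attributes minimised",
    "template_reviewed": "template reviewed for sensitive references and hidden instructions",
    "delivery_constrained": "delivery constrained to sanctioned channel",
}


def unmet_controls(release: SimulationRelease) -> List[str]:
    """Return the labels of controls that have not been signed off."""
    return [
        label
        for field_name, label in CONTROL_LABELS.items()
        if not getattr(release, field_name)
    ]


def can_release(release: SimulationRelease) -> bool:
    """A campaign may be sent only when no control is outstanding."""
    return not unmet_controls(release)


pending = SimulationRelease(
    threat_source_approved=True,
    attributes_minimised=True,
    template_reviewed=False,
    delivery_constrained=True,
)
outstanding = unmet_controls(pending)
```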

NHIMG’s Ultimate Guide to NHIs — Standards is a useful reference point for thinking about governance boundaries, while the NIST Cybersecurity Framework 2.0 helps teams map those controls to oversight and monitoring expectations. These controls tend to break down when simulation tooling is connected to live user directories and unrestricted generative AI services because the content can be personalised faster than it can be reviewed.

Common Variations and Edge Cases

Tighter control often increases administrative overhead, so security teams have to balance realism against privacy, operational safety, and campaign speed. That tradeoff becomes sharper when simulations are highly targeted, such as executive impersonation, vendor-themed lures, or role-specific workflow prompts. Current guidance suggests avoiding any identity attribute that is not necessary to achieve the training objective, but there is no universal standard for how much realism is too much.

One edge case is forwarded content. If a simulation includes vendor names, internal project references, or role-based escalation cues, the message can create confusion or reveal sensitive structure if it escapes the training context. Another edge case is AI-generated template reuse. A prompt that is safe in one campaign may become unsafe if reused with a different audience, because the model may surface stale assumptions or disclose context that is no longer appropriate. Teams should also treat delivery metadata as part of the control surface, not an implementation detail, because headers, reply paths, and tracking links can expose more than the body text itself.
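To make the delivery-metadata point concrete, the sketch below inspects a raw message for address headers and links that point outside a sanctioned domain. It uses only the Python standard library. The allow-list, the function name `metadata_findings`, and the sample message are assumptions for illustration, and the body handling is simplified: it reads each part's payload as text without decoding transfer encodings.

```python
import re
from email import message_from_string
from email.utils import parseaddr
from typing import AbstractSet, List
from urllib.parse import urlsplit

# Assumed allow-list of sanctioned domains (reserved example domain).
ALLOWED_DOMAINS = frozenset({"training.example.com"})
ADDRESS_HEADERS = ("From", "Reply-To", "Return-Path")
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def metadata_findings(
    raw_message: str,
    allowed: AbstractSet[str] = ALLOWED_DOMAINS,
) -> List[str]:
    """Flag address headers and body links that leave the sanctioned domains."""
    msg = message_from_string(raw_message)
    findings = []
    for header in ADDRESS_HEADERS:
        value = msg.get(header)
        if value is None:
            continue
        address = parseaddr(value)[1]
        domain = address.rpartition("@")[2].lower()
        if domain not in allowed:
            findings.append(f"{header} uses unsanctioned domain: {domain}")
    for part in msg.walk():
        if part.is_multipart():
            continue
        # Simplification: payload is read as-is, without transfer decoding.
        for url in URL_PATTERN.findall(part.get_payload()):
            host = (urlsplit(url).hostname or "").lower()
            if host not in allowed:
                findings.append(f"link points to unsanctioned host: {host}")
    return findings


# Hypothetical simulation message used only for illustration.
sample = (
    "From: Awareness <awareness@training.example.com>\n"
    "Reply-To: helpdesk@corp.example.net\n"
    "Subject: Action required\n"
    "\n"
    "Review the notice at https://training.example.com/t/abc "
    "and https://track.example.org/c?id=1\n"
)
sample_findings = metadata_findings(sample)
```

A check like this is a coarse pre-release screen; it does not replace human review of what the headers and links reveal about internal structure.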

NHIMG research on the DeepSeek breach underscores how quickly AI-enabled workflows can propagate sensitive context when guardrails are weak. The practical test is simple: if a simulation asset can be misunderstood, forwarded, or repurposed outside the exercise, it needs stronger approval and narrower distribution before release.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A03	AI-generated simulations can leak prompt context and unsafe output.
CSA MAESTRO	GOV-02	Simulation workflows need governed approval, scope, and oversight.
NIST AI RMF		Phishing simulations are a governed AI use case with privacy and misuse risk.

Review AI prompts and outputs before release, and block sensitive context from template generation.

