Who should own response when LLM abuse becomes a phishing channel?

Why This Matters for Security Teams

When an LLM becomes a phishing channel, the problem is no longer limited to content moderation or prompt safety. It becomes a workflow abuse issue spanning identity, fraud, detection engineering, and customer protection. A compromised assistant can generate convincing lures, impersonation copy, or deepfake-ready scripts at scale, which means the blast radius extends far beyond the model itself. Current guidance suggests treating this as an enterprise abuse path, not a product feature defect.

NHI Management Group's AI Agents: The New Attack Surface report shows why ownership cannot sit in one silo: 80% of organisations report that AI agents have already acted beyond their intended scope, while only 52% can track and audit the data those agents access. That combination is exactly what makes phishing-channel abuse hard to contain. Security teams often focus on the model prompt, but the real control point is the identity and tool chain behind the assistant. The OWASP Top 10 for Agentic Applications 2026 and the NIST AI Risk Management Framework both reinforce that abuse needs coordinated governance, not just model filtering.

In practice, many security teams encounter phishing-enabled LLM abuse only after the assistant has already been used to draft lures, impersonate staff, or push malicious links through trusted channels.

How It Works in Practice

Ownership should map to the abuse path, not the technology label. Fraud teams usually own customer deception, IAM owns credentials and privilege boundaries, security operations owns alerting and containment, and bot management or abuse operations owns automated misuse patterns. The question is not who “owns the model,” but who can stop the next harmful action fastest. That usually requires a joint response model with one decision maker, one incident queue, and pre-agreed escalation thresholds.

Operationally, the first step is to define the assistant as a governed workload with explicit identity, scope, and telemetry. The control plane should record who invoked it, what tools it touched, what content it generated, and whether the output was used in a phishing flow. That aligns with the direction of the OWASP NHI Top 10 and NHIMG guidance on agentic abuse paths. Where the assistant can send email, open tickets, or message users, those actions should pass through policy checks at runtime rather than relying on pre-approved role access alone.
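As a rough illustration of the telemetry described above, the following Python sketch records one assistant invocation. The class name, field names, and helper functions are hypothetical and not taken from any framework cited here; storing a SHA-256 digest of the generated content instead of the raw text is also an assumption, chosen so the log itself does not become a store of phishing copy.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AssistantAuditRecord:
    """One invocation of a governed assistant (hypothetical schema)."""
    invoked_by: str               # human or service identity that called the assistant
    assistant_id: str             # identity of the assistant workload itself
    tools_touched: list           # names of tools the assistant called
    output_sha256: str            # digest of the generated content
    used_in_phishing_flow: bool = False  # set later by abuse or fraud review
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def build_record(invoked_by, assistant_id, tools_touched, generated_text,
                 used_in_phishing_flow=False):
    """Hash the generated text and wrap the invocation in an audit record."""
    digest = hashlib.sha256(generated_text.encode("utf-8")).hexdigest()
    return AssistantAuditRecord(
        invoked_by=invoked_by,
        assistant_id=assistant_id,
        tools_touched=list(tools_touched),
        output_sha256=digest,
        used_in_phishing_flow=used_in_phishing_flow,
    )


def to_json_line(record):
    """Serialise the record as one JSON line for a log pipeline."""
    return json.dumps(asdict(record), sort_keys=True)
```

A caller might emit `to_json_line(build_record("alice@example.com", "support-assistant", ["send_email"], draft_text))` after each invocation, so that security operations can correlate the entry with identity and session data.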

Use just-in-time access for any assistant capability that can reach inboxes, customers, or identity workflows.

Separate content generation from content publication so phishing material cannot be auto-delivered.

Apply real-time policy evaluation using policy-as-code before tool calls, not after the fact.

Feed security operations with abuse telemetry that can be correlated to identity, session, and device context.
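The four controls above can be sketched together as a single pre-call check. This is a minimal Python sketch under stated assumptions, not a reference implementation: the tool names, the `Grant` and `ToolCall` types, and the `evaluate` function are all hypothetical, and a production deployment would more likely express these rules in a dedicated policy engine.

```python
import time
from dataclasses import dataclass

# Assumption: tools that can reach inboxes, customers, or identity workflows.
SENSITIVE_TOOLS = {"send_email", "message_user", "reset_password"}


@dataclass(frozen=True)
class Grant:
    """A just-in-time grant that expires on its own."""
    assistant_id: str
    tool: str
    expires_at: float  # epoch seconds


@dataclass(frozen=True)
class ToolCall:
    assistant_id: str
    tool: str
    human_approved: bool = False  # publication approved separately from generation


def evaluate(call, grants, now=None):
    """Decide BEFORE the tool runs; return (allowed, reason) for telemetry."""
    now = time.time() if now is None else now
    if call.tool not in SENSITIVE_TOOLS:
        return True, "non-sensitive tool"
    has_grant = any(
        g.assistant_id == call.assistant_id
        and g.tool == call.tool
        and g.expires_at > now
        for g in grants
    )
    if not has_grant:
        return False, "no active just-in-time grant"
    if not call.human_approved:
        return False, "publication requires separate approval"
    return True, "allowed"
```

Returning a reason string alongside the decision is a design choice for this sketch: it gives security operations a field to log and correlate, which is the telemetry point made in the last item above.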

For implementation detail, the CSA MAESTRO agentic AI threat modeling framework and NIST AI Risk Management Framework both support this kind of shared accountability. These controls tend to break down when an LLM is embedded inside customer support, marketing automation, or delegated email workflows because those environments blur content generation, approval, and delivery into one action path.

Common Variations and Edge Cases

Tighter abuse controls often increase response time and review overhead, requiring organisations to balance user experience against safety and containment. That tradeoff is real, especially when business teams want low-friction automation and security teams want stronger gates. Current guidance suggests that ownership should still stay cross-functional, but the lead function may change depending on the abuse outcome: fraud for customer impersonation, IAM for credential misuse, security operations for active compromise, and platform engineering for integration hardening.
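The outcome-to-lead mapping described above could be written down as a small routing table so that escalation is decided before an incident rather than during one. The Python sketch below is illustrative only: the outcome labels and function name are hypothetical, and the fallback to security operations for unclassified outcomes is an assumption, not something the guidance prescribes.

```python
# Lead function per abuse outcome, following the mapping in the text above.
LEAD_BY_OUTCOME = {
    "customer_impersonation": "fraud",
    "credential_misuse": "iam",
    "active_compromise": "security_operations",
    "integration_hardening": "platform_engineering",
}

# Assumption: unclassified outcomes default to security operations for triage.
DEFAULT_LEAD = "security_operations"


def assign_lead(outcome):
    """Return the function that leads the response for a given abuse outcome."""
    return LEAD_BY_OUTCOME.get(outcome, DEFAULT_LEAD)
```

Ownership stays cross-functional in this model; the table only names who leads, so the other functions remain in the same incident queue.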

One edge case is internal misuse, where employees use an assistant to generate highly convincing spear-phishing content for lateral abuse. Another is external abuse of a public-facing assistant that becomes a brand impersonation engine. A third is indirect abuse, where the model itself is not breached but its outputs are chained into a social engineering workflow. In all three, the decisive issue is whether the organisation can prove what the assistant did and stop it quickly. NHIMG's research, LLMjacking: How Attackers Hijack AI Using Compromised NHIs, is a reminder that once credentials or access paths are exposed, the window between exposure and abuse can be very short. Because there is no universal standard for this yet, the practical answer is shared ownership with a single incident lead.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A04	Agent abuse and unsafe tool use are central to phishing-channel LLM misuse.
CSA MAESTRO	TM-1	MAESTRO addresses threat modeling for agentic workflows and abuse paths.
NIST AI RMF	GOVERN, MAP	The GOVERN and MAP functions apply to shared accountability for abuse and escalation.

Model phishing as an agent workflow risk and assign controls across identity, tools, and output paths.

