Who is accountable when AI-driven defence blocks legitimate users or misses fraud?

The organisation remains accountable, not the model. Security, fraud, and identity owners need a shared governance model that defines decision rights, exception handling, and auditability. If an AI system affects access or customer trust, it needs the same accountability discipline as any other identity control.

Why This Matters for Security Teams

When AI-driven defence blocks legitimate users or misses fraud, the technical failure is only half the issue. The harder question is accountability: who owns the policy, who approves exceptions, and who is responsible when the system makes a harmful decision? NIST’s Cybersecurity Framework 2.0 added Govern as a core function, reflecting that automated controls still operate within risk that humans have accepted. That matters for access, payment, and fraud workflows, where false positives can stop revenue and false negatives can create losses.

NHI Management Group sees the same pattern across identity programs: organisations deploy AI to reduce analyst load but do not define decision rights for edge cases, rollback, or customer challenge paths. The result is operational ambiguity after the fact rather than control before it. In practice, many security teams discover accountability gaps only after legitimate users have been blocked or fraud has been approved, not through intentional governance design.

How It Works in Practice

Accountability for AI-driven defence should sit with the organisation, but it needs to be distributed across named control owners. Security typically owns the model risk and detection logic, fraud owns harm thresholds and case review, and identity owns authentication, step-up, and recovery policies. The practical goal is not to blame a model, but to make every automated decision traceable to a human-approved policy.

A workable operating model usually includes four elements. First, define decision rights: which outcomes the AI can enforce automatically, which it can recommend, and which must be escalated. Second, require auditability: log the input context, model output, policy version, and final disposition. Third, establish exception handling: support staff need a documented override path for legitimate users, with time-bound review and post-incident analysis. Fourth, connect the control to existing identity governance so that a blocked session, challenged login, or fraud hold maps to a clear owner and service-level objective.
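The auditability element above can be sketched as a structured decision record. This is a minimal illustration, not a prescribed schema: the `DecisionRecord` class, its field names, the `to_log_line` helper, and the example values are all hypothetical, and a real deployment would write to whatever audit store the organisation already operates.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    """One automated decision, traceable to a human-approved policy.

    Hypothetical schema for illustration only.
    """
    subject_id: str         # user, session, or transaction the decision applies to
    input_context: dict     # signals the model evaluated (device, geo, amount, ...)
    model_output: str       # what the model returned, e.g. "block" or "allow"
    policy_version: str     # version of the human-approved policy in force
    final_disposition: str  # what actually happened after any override
    control_owner: str      # named owner accountable for this control
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def was_overridden(self) -> bool:
        # True when the final outcome differs from the model's output.
        return self.model_output != self.final_disposition

    def to_log_line(self) -> str:
        # Serialise to one JSON line; sorted keys keep log diffs stable.
        return json.dumps(asdict(self), sort_keys=True)


# Example: the model wanted to block, but policy downgraded it to a step-up challenge.
record = DecisionRecord(
    subject_id="session-123",
    input_context={"geo": "new_country", "device_known": False},
    model_output="block",
    policy_version="login-risk-v7",
    final_disposition="step_up",
    control_owner="identity-team",
)
print(record.to_log_line())
```

Recording the model output and the final disposition as separate fields is the design choice that matters here: it lets an auditor see where a human or policy changed the outcome, and which owner answers for it.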

This is where guidance from NIST Cybersecurity Framework 2.0 and the NHIMG research on LLMjacking is especially relevant: AI systems are only as accountable as the credentials, policies, and governance around them. When attackers abuse compromised non-human identities (NHIs), the organisation still owns the failure domain. The same applies when a model misclassifies a trusted user or misses synthetic fraud signals.

Security leaders should also separate “who approved the control” from “who reviews the alert.” That distinction matters in incidents, audits, and customer disputes. These controls tend to break down when identity, fraud, and security teams run separate escalation paths, because then no single team can reverse or justify a decision quickly enough.

Common Variations and Edge Cases

Tighter automated defence often increases friction, requiring organisations to balance fraud loss reduction against customer disruption and support overhead. The hardest cases are not obvious attacks, but edge conditions such as travellers, VIP accounts, shared devices, regulated step-up flows, and unusually high-value transactions. In those environments, current guidance suggests that a single pass or fail decision is too blunt for operational risk.

There is no universal standard for this yet, but best practice is evolving toward tiered accountability. Low-risk actions can remain fully automated, medium-risk actions should trigger human review, and high-risk actions should require explicit approval from a named control owner. That model also helps when one team tunes the model while another team absorbs the customer impact. The organisation should keep a decision log that records whether the outcome was a model recommendation, a policy override, or a manual exception.
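The tiered model described above can be sketched as a small routing function. This is an illustration under stated assumptions, not an established standard: it assumes the risk tier is derived from a model score between 0.0 and 1.0, and the thresholds, enum names, and `route` function are hypothetical. In practice the tier might equally depend on the action type or transaction value.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Handling(Enum):
    AUTOMATED = "automated"            # AI enforces without human involvement
    HUMAN_REVIEW = "human_review"      # an analyst reviews before or after action
    OWNER_APPROVAL = "owner_approval"  # named control owner must approve explicitly


class DecisionSource(Enum):
    """How the final outcome was reached, for the decision log."""
    MODEL_RECOMMENDATION = "model_recommendation"
    POLICY_OVERRIDE = "policy_override"
    MANUAL_EXCEPTION = "manual_exception"


# Assumed thresholds; each organisation would set and version its own.
MEDIUM_THRESHOLD = 0.4
HIGH_THRESHOLD = 0.8

HANDLING_BY_RISK = {
    Risk.LOW: Handling.AUTOMATED,
    Risk.MEDIUM: Handling.HUMAN_REVIEW,
    Risk.HIGH: Handling.OWNER_APPROVAL,
}


def classify(score: float) -> Risk:
    """Map a model risk score to a tier."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0.0, 1.0], got {score}")
    if score >= HIGH_THRESHOLD:
        return Risk.HIGH
    if score >= MEDIUM_THRESHOLD:
        return Risk.MEDIUM
    return Risk.LOW


def route(score: float) -> Handling:
    """Return the handling path required for a given risk score."""
    return HANDLING_BY_RISK[classify(score)]


print(route(0.55).value)
```

Keeping the tier-to-handling mapping in one table makes the policy itself reviewable: the team that tunes the model can change the score, but changing who must approve a high-risk action is a visible edit to `HANDLING_BY_RISK`.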

Two NHIMG research points are useful here: the DeepSeek breach shows how quickly exposed secrets and AI systems can create broad trust failures, while the CI/CD pipeline exploitation case study reinforces that control gaps often appear where governance was assumed, not verified. The edge case to watch is a highly automated environment with no named human backstop, because that is where accountability becomes indistinct exactly when it is most needed.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and the NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	GV.RR-02, GV.OV-01	Governance must assign accountability for AI defence outcomes (GV.RR-02) and review those outcomes through oversight (GV.OV-01).
NIST AI RMF	GOVERN	AI RMF governance covers accountability, oversight, and risk ownership.
OWASP Agentic AI Top 10	A2	Autonomous AI decisions need controls for misalignment and harmful actions.

Constrain AI actions with approval gates, logging, and rollback for harmful or blocked decisions.
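As a rough sketch of how an approval gate, logging, and rollback might fit together, consider the following. Everything here is hypothetical: the `BlockControl` class, its method names, and the in-memory audit list stand in for whatever enforcement point and audit store an organisation actually uses.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BlockControl:
    """Toy enforcement point with an approval gate, audit log, and rollback."""
    blocked: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)

    def block(self, user_id: str, requires_approval: bool,
              approved_by: Optional[str] = None) -> bool:
        # Approval gate: gated actions are refused unless a named approver is given.
        if requires_approval and approved_by is None:
            self.audit_log.append(("block_refused", user_id, None))
            return False
        self.blocked.add(user_id)
        self.audit_log.append(("block_applied", user_id, approved_by))
        return True

    def rollback(self, user_id: str, reviewer: str) -> bool:
        # Rollback: reverse a block and record who reversed it.
        if user_id not in self.blocked:
            return False
        self.blocked.remove(user_id)
        self.audit_log.append(("block_rolled_back", user_id, reviewer))
        return True


control = BlockControl()
control.block("user-42", requires_approval=True)  # refused, no approver named
control.block("user-42", requires_approval=True, approved_by="fraud-owner")
control.rollback("user-42", reviewer="support-lead")  # legitimate user restored
for entry in control.audit_log:
    print(entry)
```

The refused attempt is logged as well as the applied one, so the record shows both what the system did and what it was prevented from doing.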

