Real AI in cybersecurity still depends on human oversight

By NHI Mgmt Group Editorial Team | Published 2026-06-26 | Domain: Events | Source: Abnormal AI

TL;DR: Chapter 5 of The Convergence of AI + Cybersecurity series, an on-demand webinar from Abnormal AI, examines how to distinguish genuine AI from automation and rule-based systems. Machine learning experts and academics cover email-threat detection, human oversight, and vendor due-diligence questions. The real governance issue is not whether a tool uses AI language, but whether the control model matches the system’s actual decision-making behaviour.

At a glance

What this is: This on-demand webinar explains how to separate genuine AI in cybersecurity from automation and marketing language, with a focus on email threat detection and practitioner due diligence.

Why it matters: IAM, security architecture, and governance teams need to know whether they are evaluating human-controlled automation or systems that introduce new trust, oversight, and lifecycle requirements.

👉 Watch Abnormal AI's on-demand webinar on real AI in cybersecurity

Context

The core problem is ambiguity: many cybersecurity products describe themselves as AI-powered even when the underlying behaviour is closer to automation, classification, or rule-based workflow. For identity and security teams, that distinction matters because governance should follow actual runtime behaviour, not branding.

In cybersecurity, genuine AI is not defined by the label on the product page. It is defined by whether the system learns from data, makes probabilistic judgments, and still depends on human experts for deployment, tuning, and oversight. That framing is especially important for teams assessing email security, threat detection, and vendor claims about AI capability.

Key questions

Q: How should security teams evaluate AI claims in cybersecurity tools?

A: They should evaluate the tool by its actual decision behaviour, not by marketing language. Ask whether it learns from data, how it handles false positives, where humans intervene, and what evidence exists for performance in real environments. If the answer stays vague, treat the AI claim as unverified.

Q: Why does machine learning matter for email threat detection?

A: Machine learning helps detect evolving email threats because attackers constantly change wording, sender patterns, and link structure to evade static rules. The value is in pattern recognition across large volumes of messages, but the control still depends on human-defined thresholds, review, and response governance.

Q: When should organisations trust AI-enabled security controls?

A: They should trust them only when the system’s learning behaviour, input data, error handling, and human oversight are clear and measurable. If a product cannot explain what it learns, what it misses, and how analysts can intervene, it should not be treated as a mature control.

Q: What is the difference between AI-driven detection and automation in cybersecurity?

A: Automation follows predefined rules, while AI-driven detection uses data-driven models to infer patterns and score risk. That difference matters because automation is easier to predict, but AI can detect novel behaviour at the cost of more validation, tuning, and governance requirements.

Background and context

AI versus automation in cybersecurity controls

In security tooling, automation executes predefined rules, while machine learning infers patterns from data and assigns confidence to outputs. That difference is operational, not cosmetic. A rule-based system can block known indicators with high consistency, but it cannot generalise beyond what it was explicitly told to detect. A real AI-enabled detector can surface novel or subtle patterns, such as unusual email phrasing or anomalous sender behaviour, but it also introduces tuning, false-positive management, and model-validation requirements. The governance question is therefore not whether AI exists in the stack, but whether the control’s decision logic changes based on data rather than static policy.

Practical implication: classify security controls by decision behaviour, then apply the right validation, monitoring, and exception-handling process.
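
As a minimal sketch of the distinction above, the Python below contrasts a static deny-list rule with a logistic risk scorer. The domain names, feature names, weights, and bias are all hypothetical values chosen for illustration; in a real detector the weights would be learned from labelled data rather than typed in by hand.

```python
import math

# Hypothetical static rule: a fixed deny list written by a human.
BLOCKED_DOMAINS = {"evil.example"}


def rule_based_verdict(sender_domain: str) -> bool:
    """Block only what the rule explicitly names; nothing else is caught."""
    return sender_domain in BLOCKED_DOMAINS


# Hypothetical weights standing in for parameters a model would learn from data.
WEIGHTS = {"new_sender": 1.2, "urgent_language": 1.5, "lookalike_domain": 2.0}
BIAS = -3.0


def model_score(features: dict) -> float:
    """Return a risk score between 0 and 1 using a logistic function."""
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# The rule misses a lookalike domain it was never told about.
print(rule_based_verdict("evil.example"))   # listed, so blocked
print(rule_based_verdict("ev1l.example"))   # not listed, so allowed

# The scorer still assigns elevated risk, but only as a probability.
print(model_score({"lookalike_domain": 1.0, "urgent_language": 1.0}))
```

The rule's behaviour is fully predictable from its list, whereas the scorer's output depends on its parameters and inputs. That dependence is why the second kind of control needs validation, tuning, and monitoring that the first does not.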

Machine learning for email threat detection

Email threat detection is a strong use case for machine learning because attackers constantly vary subject lines, sender patterns, body text, and the way they disguise malicious intent to evade signature-based controls. ML systems typically combine features such as linguistic cues, sender reputation, attachment traits, link structure, and historical behaviour to estimate risk. That does not make them autonomous. The model still operates inside a human-defined security workflow, and analysts remain responsible for training data quality, threshold selection, escalation logic, and response. The value comes from pattern recognition at scale, not from independent action.

Practical implication: validate detection quality against realistic phishing and BEC patterns, not just vendor demo datasets.
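
The point that the model scores while humans set the workflow can be sketched as a small routing function. This is an illustrative example only: the threshold values and the action names (`deliver`, `analyst_review`, `quarantine`) are assumptions, not settings from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Human-defined cut-offs; the values here are hypothetical."""
    review: float = 0.5
    quarantine: float = 0.9


def route(score: float, thresholds: Thresholds = Thresholds()) -> str:
    """Map a model risk score to a workflow action chosen by policy owners."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= thresholds.quarantine:
        return "quarantine"
    if score >= thresholds.review:
        return "analyst_review"
    return "deliver"


for s in (0.10, 0.60, 0.95):
    print(s, route(s))
```

Changing a threshold changes what analysts see without retraining anything, which is one reason threshold selection belongs under documented human ownership.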

Vendor questions that expose weak AI claims

The fastest way to separate genuine AI from hype is to ask how the system was trained, what inputs it uses, how false positives are handled, and where human review is required. Good answers describe data sources, model limits, feedback loops, and operational boundaries. Weak answers stay at the level of “AI-based” or “AI-native” without explaining what actually changes in detection, classification, or response. For IAM and security leaders, that matters because procurement decisions should align to control outcomes, not branding language.

Practical implication: make AI governance questions part of procurement, risk review, and ongoing control assurance.

NHI Mgmt Group analysis

AI branding is not a control model: Security products that describe themselves as AI-powered can still behave like conventional automation. The governance mistake is assuming the label reveals the operating model. Practitioners should judge tools by whether they learn, adapt, and require human oversight, because that determines the real assurance burden.

Human experts remain part of the security control plane: The webinar’s strongest practical point is that machine learning does not replace operational judgment. Humans still define training quality, threshold tuning, escalation paths, and acceptable error rates. That means AI in cybersecurity is best understood as a decision-support layer, not an authority layer.

Trust should be tied to behaviour, not vendor vocabulary: Teams should not accept “AI-native” as evidence of better detection or stronger protection. The meaningful question is whether the system improves pattern recognition, reduces analyst load, and preserves auditability. If those three cannot be shown, the AI claim has little governance value.

Model transparency is the real differentiator: The most useful AI systems explain what data they use, what they miss, and where humans intervene. That transparency matters more than whether the product marketing sounds advanced. In practice, procurement teams should treat opacity as a risk signal, not a branding detail.

From our research:
43% of security professionals are concerned about AI systems learning and reproducing sensitive information patterns from codebases, according to The State of Secrets in AppSec.
Only 44% of developers are reported to follow security best practices for secrets management, according to The State of Secrets in AppSec.
For teams assessing AI-enabled controls, the broader identity question is how to govern secrets, workflows, and access boundaries together, not in isolation.

What this signals

AI governance for security tools will increasingly be judged on explainability and oversight, not branding. Teams that cannot map where a model learns, where a rule engine ends, and where human intervention starts will struggle to defend their control decisions in audit or incident review. The practical standard is behaviour-based assurance, not vendor language.

With 43% of security professionals already worried about AI systems reproducing sensitive patterns from codebases, the boundary between detection and data leakage is no longer theoretical. That concern should push security leaders to review how models are trained, what data they ingest, and how outputs are retained. The same issue appears in secrets management, where learning systems can surface hidden exposure paths.

AI-enabled security programmes should now be assessed alongside identity and secrets governance. If a control can observe patterns but cannot prove containment, the result is visibility without accountability. That is why tools, access, and data handling need to be evaluated as one operating model rather than separate risks.

For practitioners

Separate AI claims from control behaviour: Inventory security tools by how they decide, what they learn from, and where human review occurs. Treat rule engines, classifiers, and adaptive models as different governance objects rather than one generic AI category.
Test email detection against realistic adversary variation: Use varied phishing, business email compromise, and impersonation samples to see whether detection improves beyond static indicators. Measure false positives, miss rates, and analyst workload, not just headline accuracy.
Add AI due diligence to vendor review: Ask vendors to explain training inputs, feedback loops, threshold tuning, and audit evidence. If the answer does not describe operating boundaries clearly, the control is not ready for production trust.
Document human oversight responsibilities: Assign who approves model changes, who reviews high-risk alerts, and who owns exceptions when the system behaves unexpectedly. Governance fails when oversight exists in practice but not on paper.
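
The measurement step in the list above (false positives, miss rates, and analyst workload rather than headline accuracy) could be sketched as follows. The function name and the sample labels are hypothetical, and alert count is used here only as a rough stand-in for analyst workload.

```python
def detection_metrics(is_malicious: list, was_flagged: list) -> dict:
    """Compare ground-truth labels with detector verdicts on a test set."""
    if len(is_malicious) != len(was_flagged):
        raise ValueError("label and verdict lists must be the same length")

    tp = fp = fn = tn = 0
    for truth, flagged in zip(is_malicious, was_flagged):
        if truth and flagged:
            tp += 1      # malicious and caught
        elif truth:
            fn += 1      # malicious but missed
        elif flagged:
            fp += 1      # benign but flagged
        else:
            tn += 1      # benign and delivered

    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else 0.0

    return {
        "false_positive_rate": ratio(fp, fp + tn),
        "miss_rate": ratio(fn, fn + tp),
        "alerts_for_analysts": float(tp + fp),
    }


# Hypothetical evaluation set: two malicious and three benign messages.
labels = [True, True, False, False, False]
verdicts = [True, False, True, False, False]
print(detection_metrics(labels, verdicts))
```

Reporting the three numbers separately keeps a low miss rate from hiding a review queue that analysts cannot sustain.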

Key takeaways

The central issue is not whether cybersecurity tools use AI language, but whether their behaviour is genuinely data-driven and governable.
Machine learning can improve email threat detection, but only when humans still control training quality, thresholds, escalation, and response.
Procurement and risk teams should demand explainability, oversight, and evidence of performance before trusting an AI-enabled security control.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

NIST AI RMF, NIST CSF 2.0, and NIST SP 800-63 provide the governance and control guidance most relevant to this topic.

Framework	Control / Reference	Relevance
NIST AI RMF		AI claims here require governance, transparency, and human oversight.
NIST CSF 2.0	GV.OV-01	This article focuses on how organisations oversee security control behaviour and evidence.
NIST SP 800-63		Human oversight and assurance are central when AI tools support identity-related decisions.

Treat automated decisions that affect identity controls as governed workflows requiring review and auditability.

Key terms

Machine Learning: Machine learning is a method where software identifies patterns from data and uses them to make predictions or classifications without every rule being hard-coded. In cybersecurity, it is useful for spotting evolving threat patterns, but it still depends on training quality, thresholds, and human oversight.
Rule-Based System: A rule-based system makes decisions using predefined conditions written by humans, such as allow lists, deny lists, or pattern matching. It is predictable and easy to audit, but it cannot generalise beyond its instructions, which limits its ability to detect novel attacker behaviour.
Human-in-the-Loop: Human-in-the-loop means a person remains involved in review, approval, tuning, or exception handling after a system produces an output. In security operations, this keeps automated or AI-assisted controls accountable, especially when the model’s confidence, impact, or error rate requires judgment.
Email Threat Detection: Email threat detection is the process of identifying malicious, suspicious, or high-risk messages before users interact with them. It often combines signatures, rules, and machine learning, and effective programmes measure not only block rates but also accuracy, analyst load, and response quality.

This post draws on content published by Abnormal AI: Chapter 5 of The Convergence of AI + Cybersecurity series. Read the original.
