Behavioral AI exposes email attacks that bypass native Microsoft controls

By NHI Mgmt Group Editorial Team | Published 2026-05-14 | Domain: Governance & Risk | Source: Abnormal AI

TL;DR: According to Abnormal AI, its customers see an average of 462 advanced attacks per month per 1,000 mailboxes that bypass Microsoft native controls. Its behavioral model is trained on more than 1 billion signals and now powers 85% of detections across the platform. The governance lesson is that intent-based identity and communication profiling is now essential, because signature-only controls cannot keep pace with socially engineered abuse.

At a glance

What this is: This analysis argues that behavioral AI is outperforming rules-based email defense because it models identity, behavior, and content together to identify attacker intent.

Why it matters: For IAM practitioners, email compromise, vendor impersonation, and account abuse all rely on identity trust, not just malicious payloads, and those trust signals now sit inside identity governance scope.

By the numbers:

Abnormal customers average 462 advanced attacks per month per 1,000 mailboxes bypassing Microsoft native controls.
44% of employees who read a vendor email compromise message engage with it.
Since Attune launched, unique attack detections rose approximately 68%.

Context

Email security fails when defenders look only for malicious signatures, because the highest-value attacks are often crafted to resemble normal business communication. This article focuses on how behavioural AI changes that detection model by learning identity-specific patterns across email, authentication, tenant configuration, and application permissions.

For IAM teams, the important shift is that trust is no longer just an authentication problem. Vendor impersonation, executive compromise, and relationship abuse all depend on understanding whether a request fits the established behaviour of a person or third party, which makes identity context part of the security control plane.

Key questions

Q: How should security teams reduce business email compromise without drowning analysts in false positives?

A: Use behavioural detections that model each identity’s normal communication, authentication, and request patterns. That lets teams separate legitimate business changes from socially engineered abuse, even when the message contains no malicious link or attachment. The goal is not more alerts, but better intent recognition and fewer reviews for routine traffic.

Q: Why do rules-based email controls fail against modern phishing and vendor impersonation?

A: They depend on known indicators, but modern attacks often avoid those indicators entirely. If the email is clean, the domain is close enough, and the request fits a business process, signature logic can miss it. Behavioural analysis works better because it measures whether the request is normal for that relationship, not just whether the message looks suspicious.

Q: How can organisations tell whether email AI is actually improving security?

A: Look for measurable reductions in false positives, faster analyst review, and higher-confidence detections that explain why a message was flagged. If the system cannot show its reasoning or still requires daily tuning, it is adding complexity rather than resilience. Effective email AI should reduce workload while expanding coverage of intent-based attacks.

Q: What should teams prioritise when evaluating behavioural email security tools?

A: Prioritise per-identity baselines, explainable detections, and a feedback loop that improves with live production traffic. Those capabilities matter more than broad claims about AI because they determine whether the tool can detect abuse that looks normal in content but abnormal in context. Without them, the platform will struggle with invoice fraud, executive impersonation, and thread hijacking.

Technical breakdown

Behavioral AI for email security and identity context

Behavioral AI in email security does not rely on a single suspicious indicator. It combines communication patterns, authentication norms, message structure, relationship history, and request type to establish a baseline for each identity. That matters because a message can look legitimate at the payload level while still being abnormal for the sender, the recipient, or the vendor relationship involved. When the model evaluates identity, behavior, and content together, it can flag intent rather than just known indicators. This is a different detection philosophy from rules or reputation scoring, which depend on prior examples and surface traits.

Practical implication: teams should evaluate email controls on whether they use identity-specific behavioural baselines, not just threat signatures.
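
To make the idea of a per-identity baseline concrete, here is a minimal sketch in Python. It is not Abnormal AI's model. The class name, the two tracked signals (request type and reply domain), and the equal weighting are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class IdentityBaseline:
    """Running profile of what is normal for one sender (hypothetical schema)."""
    request_types: Counter = field(default_factory=Counter)
    reply_domains: Counter = field(default_factory=Counter)
    total: int = 0

    def observe(self, request_type: str, reply_domain: str) -> None:
        """Record one trusted historical message from this identity."""
        self.request_types[request_type] += 1
        self.reply_domains[reply_domain] += 1
        self.total += 1

    def rarity(self, request_type: str, reply_domain: str) -> float:
        """Return 0.0 (fully familiar) up to 1.0 (never seen for this sender)."""
        if self.total == 0:
            return 1.0  # no history: treat everything as unfamiliar
        seen_type = self.request_types[request_type] / self.total
        seen_domain = self.reply_domains[reply_domain] / self.total
        # Equal weighting of the two signals is an illustrative assumption.
        return 1.0 - (seen_type + seen_domain) / 2


# One baseline per identity, rather than one org-wide average.
baselines: Dict[str, IdentityBaseline] = defaultdict(IdentityBaseline)

# Build history for one vendor contact from past, trusted traffic.
for _ in range(20):
    baselines["ap@vendor.example"].observe("invoice", "vendor.example")

routine = baselines["ap@vendor.example"].rarity("invoice", "vendor.example")
unusual = baselines["ap@vendor.example"].rarity("bank_change", "vendor-pay.example")
print(routine, unusual)
```

The point of the design is that the same request type can score as routine for one sender and as unfamiliar for another, because each identity is compared only against its own history.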

Why rules-based detection misses socially engineered attacks

Rules-based systems work best when attacks reuse known artefacts such as malicious URLs, attachments, or sender infrastructure. They struggle when an attacker hijacks a real thread, uses a lookalike domain, or sends a plausible internal request, because the message may be technically clean while still being malicious. The article’s key point is that the attack objective stays stable even as wording, personas, and delivery paths change. That makes behaviour the more durable detection signal, especially for business email compromise and vendor fraud, where the message is designed to blend into routine workflow.

Practical implication: augment signature detection with relationship-aware analytics for invoice, banking-change, and executive-request workflows.
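
One relationship-aware check that a signature engine typically lacks is comparing a sender domain against the domains a vendor relationship has actually used. The sketch below flags near-miss domains with a plain Levenshtein edit distance. The function names, the threshold of 2, and the sample domains are assumptions for illustration, and a production system would also need to handle homoglyphs and subdomains.

```python
from typing import Optional, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling one-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def nearest_lookalike(domain: str, known_domains: Set[str],
                      max_distance: int = 2) -> Optional[str]:
    """Return the known domain that `domain` imitates, or None.

    An exact match is a known domain, not a lookalike. `known_domains`
    is assumed to be lowercased already.
    """
    domain = domain.lower()
    if domain in known_domains:
        return None
    best, best_dist = None, max_distance + 1
    for known in sorted(known_domains):
        dist = edit_distance(domain, known)
        if dist < best_dist:
            best, best_dist = known, dist
    return best


vendors = {"acme-supplies.example"}
print(nearest_lookalike("acme-supp1ies.example", vendors))
print(nearest_lookalike("acme-supplies.example", vendors))
```

A hit from this check is a reason to route a banking-change or invoice request to review, not proof of fraud on its own.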

Precision, automation, and analyst feedback loops

Detection quality is only operationally useful if it is precise enough to support automation. The article argues that low false positives come from evaluating intent at scale and then feeding analyst review back into the model. In practice, that creates a continuous tuning loop where customer reports, behavioural signals, and ambiguous cases improve future detections. The architecture matters because it shifts email defense from daily triage toward model-assisted operations, but only if the feedback loop is disciplined and the detections remain explainable to analysts.

Practical implication: require measurable false-positive performance and a clear analyst feedback loop before trusting automation in email defense.
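
A team could turn "measurable false-positive performance" into a simple gate fed by analyst verdicts. The following sketch computes precision over reviewed detections and only permits automated action above a threshold. The `Verdict` structure and the thresholds (99% precision, 200 reviewed detections) are hypothetical values, not figures from the article.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Verdict:
    flagged: bool              # the model flagged the message
    confirmed_malicious: bool  # the analyst's conclusion after review


def precision(verdicts: List[Verdict]) -> Optional[float]:
    """Share of flagged messages that analysts confirmed as malicious."""
    flagged = [v for v in verdicts if v.flagged]
    if not flagged:
        return None  # nothing flagged, so precision is undefined
    return sum(v.confirmed_malicious for v in flagged) / len(flagged)


def automation_allowed(verdicts: List[Verdict],
                       min_precision: float = 0.99,
                       min_flagged: int = 200) -> bool:
    """Allow automated remediation only with enough evidence and precision."""
    flagged_count = sum(v.flagged for v in verdicts)
    p = precision(verdicts)
    return p is not None and flagged_count >= min_flagged and p >= min_precision


# 200 reviewed detections, one of which the analyst judged benign.
reviewed = [Verdict(True, True)] * 199 + [Verdict(True, False)]
print(precision(reviewed), automation_allowed(reviewed))
```

Requiring a minimum number of reviewed detections keeps a small, lucky sample from unlocking automation prematurely.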

Threat narrative

Attacker objective: The attacker aims to exploit trusted communication channels to steal credentials, redirect payments, or gain account access without triggering signature-based detection.

Entry occurs when an attacker uses a lookalike domain or hijacked invoice thread to send a message that resembles a normal business request.
Credential or payment abuse follows when the recipient engages with the message and the attacker leverages that trust to redirect banking details or extend account access.
Impact lands as account compromise, financial diversion, or broader business email compromise that bypasses native controls because the request itself looked legitimate.
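
The entry step above can be partly tested from message headers alone. This sketch uses Python's standard `email` package to look for two takeover signals in a reply: a sender domain that never took part in the thread, and a Reply-To that diverts answers elsewhere. The function names, the signal wording, and the sample message are assumptions for illustration, and headers alone will not catch a hijack sent from a genuinely compromised mailbox.

```python
from email import message_from_string
from email.utils import parseaddr
from typing import List, Optional, Set


def domain_of(header_value: Optional[str]) -> str:
    """Extract the lowercased domain from an address header, or ''."""
    _, addr = parseaddr(header_value or "")
    return addr.rpartition("@")[2].lower() if "@" in addr else ""


def thread_takeover_signals(raw_message: str, thread_domains: Set[str]) -> List[str]:
    """Return human-readable reasons a reply looks out of place in its thread."""
    msg = message_from_string(raw_message)
    signals: List[str] = []
    if not msg.get("In-Reply-To"):
        return signals  # not a reply, so there is no thread context to compare
    from_domain = domain_of(msg.get("From"))
    reply_domain = domain_of(msg.get("Reply-To"))
    if from_domain and from_domain not in thread_domains:
        signals.append(f"new sender domain in thread: {from_domain}")
    if reply_domain and reply_domain != from_domain:
        signals.append(f"Reply-To diverts to: {reply_domain}")
    return signals


# Illustrative message: familiar sender, but replies go to another domain.
raw = (
    "From: AP Team <ap@vendor.example>\n"
    "Reply-To: ap@vendor-payments.example\n"
    "In-Reply-To: <abc123@vendor.example>\n"
    "Subject: Re: Invoice\n"
    "\n"
    "Please use the updated bank details.\n"
)
print(thread_takeover_signals(raw, {"vendor.example", "buyer.example"}))
```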

NHI Mgmt Group analysis

Behavioral email security is really identity security in another form. The article shows that the most effective detections are built on per-identity behavioural baselines, not org-wide averages. That is the same governance problem IAM teams face everywhere: the control fails when it assumes one normal pattern fits all users, vendors, and service relationships. Practitioners should treat messaging behaviour as identity evidence, not just content risk.

Signature-only defense leaves a structural gap against social engineering. A rule engine needs a prior example, but the most expensive attacks are designed to avoid leaving one. That means the real failure mode is not just missing indicators, but over-trusting communication that is syntactically valid and operationally plausible. Security teams should recognise this as a control-design problem, not a tuning problem.

Identity blast radius: the same behavioural pattern can be safe in one relationship and dangerous in another. The article’s strongest technical point is that a request must be judged against the sender’s established history, the recipient’s norms, and the relationship context. This is a useful named concept for practitioners because it explains why org-wide baselines underperform in vendor fraud and executive impersonation. Teams should scope detections around identity-specific blast radius, not mailbox volume.

Automation becomes credible only when false positives are near zero. The article links behavioural precision to operational efficiency, with most customers spending very little time in the platform. That matters because email security still fails many programmes by creating too much human review. The implication for practitioners is clear: if a model cannot support low-friction response, it is not ready to carry the detection load.

Continuous learning changes the procurement question. The relevant question is no longer whether the system can classify yesterday’s threats, but whether it can absorb new patterns without constant manual rule writing. That is a governance and operating-model question, not just a detection question. Practitioners should evaluate whether their email security stack improves from live telemetry at production scale or merely reacts after the fact.

From our research:
80% of identity breaches involved compromised non-human identities such as service accounts and API keys, according to our Ultimate Guide to NHIs.
91.6% of secrets remain valid five days after the targeted organisation is notified, showing a critical gap in remediation procedures.
That remediation lag is one reason practitioners should also review our NHI Lifecycle Management Guide for offboarding and rotation discipline.

What this signals

Identity-specific behaviour is becoming a core security signal, not an email-only optimisation. As attackers keep using legitimate-looking communication to trigger trusted actions, teams need controls that understand relationship context across human identity, vendors, and machine-triggered workflows. The practical shift is toward evaluating who normally says what, through which channel, and under what trust boundary, rather than relying on sender reputation alone.

84% of organisations still struggle to fully address NHI risk, according to our Ultimate Guide to NHIs. That matters here because the same trust assumptions that fail for service accounts also fail for email-based impersonation and account abuse. The governance gap is broader than one control surface, which is why email, IAM, and identity analytics now need shared operating assumptions.

Business email compromise is a relationship-governance problem as much as a detection problem. Teams that still separate mail security from identity governance will keep missing the point of thread hijacking, invoice fraud, and executive impersonation. The stronger programme response is to treat communication trust, identity behaviour, and privilege-sensitive requests as one control domain.

For practitioners

Rebuild detections around identity-specific baselines: Prioritise controls that compare a message against the sender’s communication history, authentication pattern, and relationship context rather than broad organisational norms.
Harden vendor-change workflows: Require secondary verification for banking detail changes, invoice amendments, and other relationship-sensitive requests, especially when the request arrives in an existing thread.
Test for signature blind spots: Run controlled simulations of thread hijacking, lookalike domains, and internal-account abuse to see whether current tools detect intent when payloads remain clean.
Measure false positives as an operational control: Track how much analyst time the platform consumes weekly and whether detections remain explainable enough to support low-friction automation.
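
The vendor-change recommendation above can be expressed as a small policy gate. This is a sketch under stated assumptions: the request-type labels, the `VendorRequest` fields, and the returned strings are hypothetical, and "out of band" here means any confirmation channel other than the email itself.

```python
from dataclasses import dataclass

# Hypothetical labels for relationship-sensitive requests.
SENSITIVE_TYPES = {"bank_change", "invoice_amendment"}


@dataclass
class VendorRequest:
    request_type: str
    in_existing_thread: bool
    verified_out_of_band: bool  # confirmed via a channel other than the email


def decide(req: VendorRequest) -> str:
    """Return 'process' or a 'hold: ...' reason for a vendor request."""
    if req.request_type not in SENSITIVE_TYPES:
        return "process"
    if req.verified_out_of_band:
        return "process"
    if req.in_existing_thread:
        # A familiar thread is where hijacked requests hide,
        # so it never substitutes for verification.
        return "hold: sensitive request in existing thread, verify out of band"
    return "hold: sensitive request, verify out of band"


print(decide(VendorRequest("bank_change", True, False)))
```

Note that arriving in an existing thread never lowers the bar. It only changes the reason recorded for the hold.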

Key takeaways

Behavioural AI changes email defense from signature matching to identity-aware intent detection.
The practical risk is not only missed malicious content, but normal-looking requests that exploit established trust relationships.
Security teams should measure whether their controls can explain, contain, and operationalise intent-based detections at low false-positive cost.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-01	Identity-specific abuse in email depends on unmanaged trust signals.
NIST CSF 2.0	PR.AA-05 (PR.AC-4 in CSF 1.1)	Access and trust validation are central to preventing identity-based abuse.
NIST Zero Trust (SP 800-207)	Section 2.1 (tenets of zero trust)	The article’s identity-context approach aligns with continuous verification principles.

Map email-facing identities and tighten controls where communication trust is exploitable.

Key terms

Behavioral baseline: A behavioral baseline is the normal pattern of activity expected from a specific identity, relationship, or account. In email security, it includes communication cadence, authentication behaviour, request type, and context so that deviations can be judged as intent signals rather than isolated anomalies.
Business email compromise: Business email compromise is an attack in which an adversary abuses trusted email relationships to redirect payments, request sensitive actions, or obtain access. The message may be technically clean, so the real defence is validating behaviour and relationship context, not just scanning for malicious payloads.
Identity-specific detection: Identity-specific detection evaluates risk against the normal behaviour of one person, vendor, or account instead of using broad organisational averages. That approach is stronger for impersonation and thread hijacking because it can recognise when a request is abnormal for the relationship even if it looks acceptable in isolation.
Thread hijacking: Thread hijacking is the abuse of an existing email conversation to deliver a malicious request inside a trusted context. It is effective because recipients already recognise the thread, so defenders need behavioural and relationship-aware controls to detect when the familiar conversation has been taken over.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity lifecycle are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or NHI governance in your organisation, it is worth exploring.

This post draws on content published by Abnormal AI: Key insights on behavioral AI detecting advanced email attacks. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-05-14.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org