Multilingual phishing detection still fails where language cues matter

By NHI Mgmt Group Editorial Team
Published 2025-11-06
Domain: Best Practices
Source: Abnormal AI

TL;DR: Localized phishing and BEC attacks exploit language-specific cues such as informal German phrasing and wrong Japanese honorifics, exposing the limits of English-trained detection systems and translation-based AI, according to Abnormal AI. The real issue is that security models built for translation, not communication nuance, miss the cultural signals attackers use to appear authentic.

At a glance

What this is: Abnormal AI argues that multilingual phishing detection fails when systems rely on English-first or translation-based models instead of understanding language-specific cues and cultural tone.

Why it matters: For IAM and security teams, defending against email identity abuse and executive impersonation, and training users to spot both, depends on recognising intent in the user’s native language, not just filtering words.

By the numbers:

Abnormal has analysed communication patterns across more than 100 languages.


Context

Multilingual phishing is an identity problem as much as an email-filtering problem. When attackers write in the language, tone, and formality expected by a target, they are abusing the trust signals that people use to decide whether a message is legitimate.

The weakness in many mail security stacks is not just content inspection. It is the assumption that translating text is enough to detect abuse, even though business email compromise often hinges on cultural nuance, honorifics, register, and relationship context.

Abnormal AI’s update is a useful marker because it treats language as behavioural evidence rather than a simple text-string problem. That is the right framing for global IAM and awareness programmes that must cover localised fraud patterns, not just English-language phishing.

Key questions

Q: How should security teams handle phishing and BEC in multilingual environments?

A: Security teams should validate their email controls against real local-language attacks, not just translated English samples. That means testing regional phrasing, formality, and impersonation patterns, then pairing detection with native-language awareness training. If the model cannot interpret the way people actually communicate, attackers can hide inside ordinary business tone.

Q: Why do English-trained email filters miss localised phishing attempts?

A: English-trained filters miss localised attacks because malicious intent is often carried by cultural cues, not just keywords. Wrong honorifics in Japanese or overly casual German phrasing can be decisive to a human reader but invisible to translation-based detection. Security teams need semantic models that understand business language in context.

Q: How do organisations measure whether multilingual phishing controls are working?

A: Measure detection accuracy, false positives, and analyst review load separately for each major language group, then compare those results against known local attack patterns. A control is not working if it performs well in English but misses or over-flags legitimate communication in regional teams. Coverage must be judged by operating language, not global averages.
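The per-language measurement described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method taken from Abnormal AI: the record format, the `per_language_metrics` helper, and the sample counts are all invented for the example.

```python
from collections import defaultdict


def per_language_metrics(records):
    """Compute detection rate and false-positive rate per language.

    records: iterable of (language, is_malicious, was_flagged) tuples.
    This record shape is an assumption made for the example.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for language, is_malicious, was_flagged in records:
        if is_malicious:
            key = "tp" if was_flagged else "fn"
        else:
            key = "fp" if was_flagged else "tn"
        counts[language][key] += 1

    metrics = {}
    for language, c in counts.items():
        malicious = c["tp"] + c["fn"]
        benign = c["fp"] + c["tn"]
        metrics[language] = {
            "detection_rate": c["tp"] / malicious if malicious else None,
            "false_positive_rate": c["fp"] / benign if benign else None,
        }
    return metrics


# Made-up evaluation sample: English mail is well covered, Japanese mail is not.
sample = (
    [("en", True, True)] * 9 + [("en", True, False)] * 1
    + [("en", False, False)] * 10
    + [("ja", True, True)] * 2 + [("ja", True, False)] * 2
    + [("ja", False, True)] * 1 + [("ja", False, False)] * 3
)

metrics = per_language_metrics(sample)

# The global figure that a single dashboard number would report.
malicious_records = [r for r in sample if r[1]]
overall_detection = sum(1 for r in malicious_records if r[2]) / len(malicious_records)
```

With this invented sample, the overall detection rate is about 0.79, while the per-language view shows 0.9 for English and 0.5 for Japanese, with a Japanese false-positive rate of 0.25. That is the kind of regional gap a global average conceals.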

Q: What should organisations include in native-language phishing awareness training?

A: They should include the scam motifs employees actually encounter, such as invoice fraud, supplier impersonation, executive requests, and login prompts written in local tone. Training should reflect the language, formality, and channel-shifting tactics used in real attacks. Generic templates leave users unprepared for the cues attackers rely on.

Technical breakdown

Why translation-based phishing detection misses localised intent

Translation-based systems normalise text into a common language and then look for suspicious patterns. That approach weakens when the attacker relies not on obvious keywords but on tone, politeness, and context, such as overly casual German phrasing or the wrong level of Japanese honorifics. Those cues are meaningful to humans but are often diluted by translation or token-level matching. A model trained only on English phishing also struggles with compound nouns, sentence structure, and relationship nuance that signal whether a business message is normal or socially engineered.

Practical implication: treat multilingual intent as a detection requirement, not a localisation nice-to-have. Do not rely on translation alone for multilingual email defence; validate whether the detection layer understands local language cues.
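A toy example shows how register can vanish in translation. German distinguishes the formal "Sie" from the informal "du", while English renders both as "you". The sketch below is a deliberately crude illustration of that information loss, not a recommended detector: the messages and their English renderings were written by hand for this example, and `uses_informal_german` is a hypothetical helper.

```python
import re

# Informal second-person forms in German (the "du" family).
# The formal equivalent is "Sie"; English collapses both into "you".
INFORMAL_DE = re.compile(r"\b(du|dich|dir|dein\w*)\b", re.IGNORECASE)


def uses_informal_german(text):
    """Return True if the German text contains an informal second-person form."""
    return bool(INFORMAL_DE.search(text))


# Hypothetical message pair. The English renderings are hand-written stand-ins
# for what a translation step might produce; no translation API is called.
formal = {
    "de": "Können Sie die Rechnung bitte heute prüfen?",
    "en": "Can you please check the invoice today?",
}
informal = {
    "de": "Kannst du die Rechnung bitte heute prüfen?",
    "en": "Can you please check the invoice today?",
}

register_visible_in_source = (
    uses_informal_german(informal["de"]) and not uses_informal_german(formal["de"])
)
register_visible_after_translation = formal["en"] != informal["en"]
```

In the German originals the two messages differ in register, which a recipient would notice if the sender normally writes formally. After translation the texts are identical, so any check that runs only on the English output has nothing left to compare.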

Multilingual embeddings and neural classifiers in email security

Multilingual embeddings map words and phrases from different languages into a shared semantic space, allowing a model to compare meaning rather than exact wording. A neural classifier then learns patterns from authentic and malicious regional data, which helps distinguish legitimate business correspondence from impersonation or BEC attempts. This is different from rule-based text tagging, which depends on predefined patterns and is brittle when attackers vary phrasing. In practice, this architecture is better suited to languages where formality, relationship markers, and conversational style carry security significance.

Practical implication: evaluate whether your email security stack uses semantic modelling for non-English traffic or still depends on fixed linguistic rules. Favour semantic models that learn from local-language examples over static rules tied to English phrasing.
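The idea of comparing meaning in a shared space can be sketched without any machine-learning library. The example below is an assumption-heavy toy: the three-dimensional vectors are hand-written placeholders for what a real multilingual embedding model would produce, and a nearest-centroid rule stands in for the neural classifier described above. None of it reflects Abnormal AI's implementation.

```python
import math

# Hypothetical, hand-written vectors standing in for a real multilingual
# embedding model. Assumed axes: urgency, payment request, routine tone.
TOY_EMBEDDINGS = {
    "Please wire the payment today": (0.9, 0.9, 0.1),
    "Bitte überweise das Geld noch heute": (0.85, 0.9, 0.15),
    "本日中に送金してください": (0.9, 0.85, 0.1),
    "Minutes from Tuesday's meeting attached": (0.1, 0.0, 0.9),
    "Anbei das Protokoll der Besprechung": (0.1, 0.05, 0.9),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def train(labelled):
    """Build one centroid per label from (text, label) pairs."""
    by_label = {}
    for text, label in labelled:
        by_label.setdefault(label, []).append(TOY_EMBEDDINGS[text])
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def classify(text, centroids):
    """Assign the label whose centroid is most similar to the text's vector."""
    vector = TOY_EMBEDDINGS[text]
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))


# Train on English and German only, then classify unseen Japanese and German text.
centroids = train([
    ("Please wire the payment today", "suspicious"),
    ("Bitte überweise das Geld noch heute", "suspicious"),
    ("Minutes from Tuesday's meeting attached", "benign"),
])
label_ja = classify("本日中に送金してください", centroids)
label_de = classify("Anbei das Protokoll der Besprechung", centroids)
```

Because the Japanese payment request sits near the English and German ones in the toy space, it is labelled `suspicious` even though no Japanese example was used in training. A keyword rule written for English would have no way to make that connection.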

Native-language phishing simulations and user reporting

Detection alone is not enough if users are trained only with templated English examples. Native-language simulations extend the same linguistic nuance used in detection into awareness programmes, which matters because employees recognise fraud more quickly when examples match their real communication environment. This is especially relevant in global organisations where finance, executive support, procurement, and regional teams receive different forms of impersonation pressure. The training layer should therefore mirror local phrasing, tone, and common scam motifs rather than generic phishing tropes.

Practical implication: align awareness content with the languages and business contexts in which employees actually work, and localise phishing training alongside detection so employees see the same language patterns attackers use.

Threat narrative

Attacker objective: The attacker wants to use language-specific credibility to bypass email security, move the conversation or login flow into a controlled channel, and ultimately steal credentials or trigger fraudulent action.

Entry begins with a multilingual phishing or BEC message that uses local language cues, such as Japanese honorifics or informal German phrasing, to appear credible to the recipient.
Escalation occurs when the message leverages relationship context, impersonation, or a redirect to a lookalike login page to win trust, draw clicks, or move the conversation off email.
Impact is achieved when the attacker converts the initial deception into credential capture, executive impersonation, or payment fraud that bypasses English-trained controls.


NHI Mgmt Group analysis

Language-specific trust cues are now part of the identity attack surface. Multilingual phishing does not succeed because attackers simply translate English templates. It succeeds because they reproduce the tone, formality, and cultural markers that people use to judge legitimacy. That makes language a governance issue for IAM-adjacent programmes, not only a mail-security feature. Practitioners should treat local linguistic context as part of identity trust evaluation.

English-first detection leaves a measurable control gap in global programmes. Systems trained primarily on English are structurally weaker against localised BEC and phishing because they are optimised for the wrong baseline. We call this the linguistic trust gap: a mismatch between how identity trust is signalled in a region and how security tools interpret it. Teams should assume the gap exists wherever business communication is multilingual.

Phishing awareness fails when training content is culturally generic. Awareness programmes that use templated examples create a false sense of coverage in regions where attackers exploit formality, honorifics, and regional business norms. The result is a control plane that detects less and teaches less in the same blind spots. Practitioners should align training language with the identity risk patterns their users actually face.

Multilingual identity risk sits at the intersection of email security, fraud prevention, and human IAM. Executive impersonation, supplier fraud, and phishing all depend on social credibility, but the controls that evaluate that credibility are often split across teams. That fragmentation hides the pattern until local-language attacks start to scale. Security leaders should treat multilingual protection as a cross-domain governance problem, not a point solution.

Better language coverage is a sign of maturity, but it is not a complete control model. Even if detection accuracy matches across languages, attackers will keep adapting tone, channel, and pretext. The deeper lesson is that identity confidence depends on behaviour, context, and user response, not words alone. Practitioners should keep multilingual detection tied to continuous tuning and local threat validation.

From our research:
Only 1.5 out of 10 organisations are highly confident in their ability to secure NHIs, compared to nearly 1 in 4 for securing human identities, according to The State of Non-Human Identity Security.
Our research also shows that 85% of organisations lack full visibility into third-party vendors connected via OAuth apps, which is a useful reminder that identity blind spots usually start with incomplete control over connected access paths.
That gap points to the next step: the NHI Lifecycle Management Guide, for teams that need to bring provisioning, rotation, and offboarding discipline to every identity type, not just human users.

What this signals

Linguistic trust gap: global identity programmes now need controls that recognise how legitimacy is expressed in each operating language. When attackers use native phrasing, the weakest point is often not the mailbox filter but the organisation’s assumption that a single English-trained model can cover every region.

The practical signal for security teams is to segment detection and training by business language, then validate those controls with local attack simulations. A single global average can hide serious regional exposure, especially where finance, executive support, and supplier communication are linguistically distinct.

Programmes that already struggle with machine identity visibility should treat multilingual phishing as the human analogue of the same governance problem: incomplete context produces false confidence. The control goal is not just better translation, but better identity judgement across all communication surfaces.

For practitioners

Test detection against local-language pretexts: Run red-team and vendor validation using authentic German, Japanese, and other regional business phrasing, including honorific errors, informal tone, and supplier impersonation. Measure whether the platform flags those cues without inflating false positives in normal business mail.
Localise phishing simulations by business region: Deliver awareness content in the employee’s native language and mirror the forms of impersonation they actually see, such as invoice scams, executive requests, and payment-update lures. Keep the examples aligned to local communication style rather than generic phishing templates.
Review the English bias in mail security tuning: Check whether your current rules, models, and exception workflows were calibrated on English-only corpora or translation output. If so, measure missed threats and analyst workload separately for each major business language before expanding coverage.
Connect email defence to fraud and IAM response paths: Route multilingual impersonation events to both the SOC and the teams that handle payment controls, executive protection, and identity verification. A message that looks like harmless conversation in one language may be the first stage of credential theft or business compromise.
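The routing step in the last item can be sketched as a simple lookup. This is only an illustration: the event categories, team names, and the `ImpersonationEvent` shape are hypothetical, and a real deployment would use whatever fields and queues its mail security and ticketing tools expose.

```python
from dataclasses import dataclass

# Hypothetical mapping from event category to the teams that should see it.
# Every route includes the SOC, plus the team that owns the business control.
ROUTES = {
    "payment_fraud": ["soc", "payment_controls"],
    "executive_impersonation": ["soc", "executive_protection"],
    "credential_phishing": ["soc", "identity_verification"],
}


@dataclass(frozen=True)
class ImpersonationEvent:
    message_id: str
    language: str   # operating language of the message, e.g. "ja" or "de"
    category: str   # assumed to be set by an upstream detection step


def route(event):
    """Return the list of teams to notify; unknown categories still reach the SOC."""
    return ROUTES.get(event.category, ["soc"])


event = ImpersonationEvent(message_id="msg-001", language="ja", category="payment_fraud")
teams = route(event)
```

Keeping the language on the event record means regional fraud and identity teams can filter for the messages they are best placed to judge, rather than leaving the SOC to assess tone in a language it may not read.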

Key takeaways

Multilingual phishing is a trust problem, not only a content-filtering problem, because attackers exploit tone and cultural cues that translation-based systems miss.
Regional language coverage has to be tested explicitly, since English-trained models can miss localised BEC while also generating false positives in legitimate business mail.
Security teams should pair semantic detection with native-language awareness training so that what users learn matches how attackers actually write.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST SP 800-63 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	DE.CM-01	Continuous monitoring must account for local-language phishing signals.
NIST SP 800-63		Phishing-resistant trust depends on how users recognise and verify identity cues.
OWASP Non-Human Identity Top 10	NHI-05	Identity abuse patterns inform how attackers exploit trust and communication channels.

Map impersonation and credential-theft paths to identity-risk controls across channels.

Key terms

Multilingual phishing: Phishing that is written to match the language, tone, and business style of the target audience. The attacker relies on local cues such as formality, honorifics, and phrasing to make malicious messages look routine, which makes simple translation-based detection less effective.
Business email compromise: A social engineering attack in which the attacker impersonates a trusted person or organisation to induce payment, credential sharing, or off-channel conversation. The fraud often succeeds because the message feels operationally normal, not because it contains obvious malware or links.
Linguistic trust gap: The gap between how legitimacy is expressed in a local language and how security tools are trained to interpret it. It emerges when models and awareness content are optimised for English or generic templates, leaving regional identity cues under-protected.
Multilingual embeddings: A machine learning representation that maps words and phrases from different languages into a shared semantic space. In security, it helps models compare meaning across languages instead of depending on exact wording, which improves detection of nuanced social engineering.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity lifecycle are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or NHI governance in your organisation, it is worth exploring.

This post draws on content published by Abnormal AI: multilingual phishing detection and localised business email compromise. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-11-06.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org