Subscribe to the Non-Human & AI Identity Journal
Home Glossary Threats, Abuse & Incident Response Multilingual prompt injection
Threats, Abuse & Incident Response

Multilingual prompt injection

← Back to Glossary
By NHI Mgmt Group Updated July 5, 2026 Domain: Threats, Abuse & Incident Response

A prompt attack that uses more than one language, translation, or transliteration to weaken an AI system’s safety checks. In practice, the attacker is not changing the goal, only the linguistic path, which can cause the model to apply policy inconsistently across otherwise equivalent requests.

Expanded Definition

Multilingual prompt injection is a language-based evasion technique used against LLMs and AI agents, where the harmful instruction is translated, partially translated, transliterated, or mixed across languages to bypass safety filtering. The attack does not depend on new intent; it depends on a mismatch between what the system can understand and what its policy layer can reliably inspect. This matters most in cross-border workflows, translated support channels, and agentic systems that accept user input, tool output, or retrieved content in multiple languages. Industry usage is still evolving, but the core risk is consistent: multilingual content can fragment token-level and semantic detection, causing policy enforcement to vary across equivalent requests. For a broader agentic risk context, NHI Management Group’s coverage of the OWASP Agentic Applications Top 10 is useful, while the OWASP community also tracks this class of issue in the OWASP Agentic AI Top 10. The most common misapplication is assuming a single-language safety test proves robustness when the same prompt later arrives through translation, mixed-script text, or locale-specific tooling.

Examples and Use Cases

Implementing multilingual prompt defenses rigorously often introduces latency and false-positive risk, requiring organisations to weigh broader linguistic coverage against more complex review and moderation logic.

  • A user asks for policy-violating instructions in another language, and the model responds because the safety classifier was tuned primarily on English examples.
  • An agent ingests translated customer messages and follows an embedded malicious instruction hidden inside a normal support request.
  • A prompt mixes English with transliterated phrases, weakening keyword filters while still preserving the attacker’s intent.
  • An enterprise routes content through machine translation before moderation, and the attack succeeds because the translated output no longer matches the original risk pattern.
  • A retrieval-augmented assistant summarizes multilingual documents and treats an instruction inside a foreign-language attachment as authoritative tool guidance.

For practitioners, this is best understood as a control-gap problem rather than a pure language problem. NHI Management Group’s analysis of the OWASP Agentic Applications Top 10 emphasizes how agent workflows expand the attack surface when input provenance is weak. The OWASP Agentic AI Top 10 likewise treats adversarial instruction handling as a core safety concern.

Why It Matters in NHI Security

Multilingual prompt injection becomes especially dangerous when an AI agent has tool access, delegated authority, or access to secrets, because a bypassed instruction can move from text manipulation into real-world execution. In NHI-heavy environments, that can expose API keys, alter workflows, or trigger unauthorized actions through service accounts and delegated identities. The risk is not theoretical: NHI Mgmt Group reports that 79% of organisations have experienced secrets leaks, and 77% of those incidents caused tangible damage, which shows how quickly an input-safety failure can become an identity and credential incident. That is why multilingual prompt handling should be treated as part of NHI governance, not just content moderation. The same operational logic appears in the Ultimate Guide to Non-Human Identities, where excessive privilege, weak visibility, and poor rotation all amplify downstream impact. NHI Mgmt Group’s agentic risk guidance is especially relevant when multilingual prompts can steer tools, connectors, or delegated automation. Organisations typically encounter the real consequence only after a model-driven action has already been executed, at which point multilingual prompt injection becomes operationally unavoidable to address.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST AI RMF set the governance and control requirements practitioners need to meet.

FrameworkControl / ReferenceRelevance
OWASP Agentic AI Top 10A2Covers prompt injection risks in agentic systems across languages and modalities.
OWASP Non-Human Identity Top 10NHI-02Prompt bypasses can expose secrets and delegated NHI credentials through agent actions.
NIST AI RMFAddresses harmful manipulation and robustness failures in AI systems, including prompt attacks.

Test agents against multilingual jailbreaks and add input normalization before tool execution.

NHIMG Editorial Note
Reviewed and updated by the NHIMG editorial team on July 5, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org