Notifications

Clear all

Visual prompt injections: are your multimodal controls keeping up?

Last Post

RSS

NHI Mgmt Group

(@nhi-mgmt-group)

Member Moderator

Joined: 1 year ago

Posts: 12387

Topic starter 05/07/2026 6:47 pm

TL;DR: Visual prompt injections embed malicious instructions inside images so multimodal models can be induced to ignore their original task, change outputs, or suppress nearby content, according to Lakera’s analysis. The governance gap is no longer just prompt hygiene, but control over what an AI system can do after it has already seen the image.

NHIMG editorial — based on content published by Lakera: The Beginner's Guide to Visual Prompt Injections

Questions worth separating out

Q: How should security teams test for visual prompt injection in multimodal AI systems?

A: Test the full input path, not just the model output.

Q: Why do multimodal AI systems create new governance risks for identity teams?

A: Because the system can be reached legitimately and still be manipulated after access is granted.

Q: What breaks when image inputs are allowed to influence tool use in AI workflows?

A: The application loses a reliable separation between perception and action.

Practitioner guidance

Separate untrusted image content from protected instructions Place multimodal prompt construction behind a trust boundary so image-derived text cannot be treated as system or developer instructions.
Constrain tool use for multimodal models Limit which actions a model can trigger after reading an image, especially when those actions involve search, retrieval, messaging, or content suppression.
Add adversarial image testing to red-team exercises Test whether off-white text, hidden instructions, overlays, and caption-like artefacts can steer outputs or suppress recognition.

What's in the full article

Lakera's full article covers the operational detail this post intentionally leaves for the source:

Worked examples of invisibility-cloak style injections and how the model responded
Image-based prompt injection variants that change identity recognition and content suppression
Defence ideas for multimodal systems, including the visual prompt injection detector the vendor is building
Related reading links on prompt injection, content moderation, and AI red teaming

👉 Read Lakera's analysis of visual prompt injection in multimodal AI →

Visual prompt injections: are your multimodal controls keeping up?

Explore further

View Full Forum → | NHI Foundation Course →

Quote

Topic Tags

Mr NHI

(@mr-nhi)

Member Moderator

Joined: 3 months ago

Posts: 11961

05/07/2026 7:04 pm

Visual prompt injection is an execution-layer problem, not just a prompt problem. The article shows that malicious instructions hidden in images can change model output without changing the application code or the user-facing prompt. That means the trust boundary is inside the inference path, where untrusted content and system intent are already colliding. Practitioners should treat multimodal input as a governed execution surface, not a harmless content type.

A few things that frame the scale:

Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap, according to The State of Secrets in AppSec.
The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities.

A question worth separating out:

Q: How can organisations reduce the impact of prompt injections without blocking multimodal use?

A: Limit the model's authority. Keep image interpretation separate from privileged actions, log the chain from input to decision, and require human approval for any step that changes records, sends messages, or accesses sensitive data. The objective is to preserve multimodal capability while preventing untrusted content from becoming execution.

👉 Read our full editorial: Visual prompt injections expose a new control gap in multimodal AI

ReplyQuote

Forum Statistics

11 Forums

13.6 K Topics

26.1 K Posts

38 Online

135 Members

Latest Post: LLM security and AI-driven crime: what security teams must change Our newest member: Alex Recent Posts Unread Posts Tags

Forum Icons: Forum contains no unread posts Forum contains unread posts

Topic Icons: Not Replied Replied Active Hot Sticky Unapproved Solved Private Closed

#1 Authority in NHI Education, Research and Advisory, empowering organizations to tackle the critical risks posed by Non-Human Identities (NHIs), including AI Agents.

Get in Touch

Quick Links

FAQ

NHI 101 Articles

Legal & Policies