GenAI data leakage exposes the limits of regex-based DLP

By NHI Mgmt Group Editorial TeamPublished 2025-08-27Domain: Best PracticesSource: Lakera

TL;DR: Traditional DLP was built for static artifacts like emails, files, and endpoints, but GenAI systems now paraphrase, summarize, translate, and trigger tool calls that expose sensitive data in real time, according to Lakera. Legacy pattern matching cannot keep pace with language-native leakage or agent workflows, so runtime, context-aware controls are now mandatory.

At a glance

What this is: This is an analysis of why traditional DLP fails in GenAI environments and what language-native, runtime controls must replace it.

Why it matters: It matters because IAM and security teams now have to govern not just who can access data, but what models and agents can infer, transform, and leak from it.

👉 Read Lakera's analysis of why traditional DLP fails for GenAI

Context

Data leakage prevention was designed for a world where sensitive information stayed in predictable forms such as files, emails, and database rows. GenAI changes that assumption by turning data into prompts, summaries, translations, and tool-driven outputs that no longer resemble the original source.

For IAM, NHI, and security teams, the problem is no longer limited to blocking access at the perimeter. The control question shifts to whether a model, agent, or user is allowed to see, reason over, and regenerate the data in the first place, especially when the output may expose information no regex rule would match.

Key questions

Q: How should security teams stop GenAI systems from leaking sensitive data?

A: Security teams should combine runtime policy enforcement, semantic detection, and identity-aware access checks. The goal is not to block every model response, but to prevent the model from seeing or transforming data the requester is not authorised to use. That means guarding prompts, retrieval, memory, outputs, and tool calls together.

Q: Why do traditional DLP tools struggle with GenAI and agents?

A: Traditional DLP tools depend on static patterns and predictable content, while GenAI rewrites information in real time. Once data is paraphrased, translated, or summarised, exact-match controls lose sight of it. Agents make this worse by chaining retrieval and actions, so the leak may occur outside the final output.

Q: What breaks when DLP only inspects prompts and outputs?

A: What breaks is the assumption that the risky event happens at the edge of the conversation. In practice, sensitive data can be exposed during retrieval, memory access, function calls, or intermediate reasoning steps. If you only inspect prompts and outputs, you miss the path where the leak actually occurs.

Q: How do organisations govern sensitive data in AI agents and LLM workflows?

A: Organisations should treat sensitive data governance as a runtime identity and context problem. That means authorising access based on who is asking, what the model can infer, and how the output will be used. The strongest controls sit inside the workflow, not around it.

Technical breakdown

Why regex-based DLP misses GenAI leakage

Traditional DLP works by matching patterns such as keywords, regular expressions, metadata, and known data formats. That works when sensitive content is static and predictable, but LLMs transform data by paraphrasing, translating, summarizing, and recombining it. Once the meaning is preserved but the surface form changes, syntax-based controls lose visibility. The failure is structural: the protection model assumes the output will still resemble the input. GenAI breaks that assumption by generating new language that can still carry confidential meaning.

Practical implication: move beyond exact-match detection and test whether your controls can recognise sensitive meaning after transformation.

How agent workflows expand the leakage surface

Agents widen the risk because they do more than answer prompts. They retrieve context, call APIs, store memory, and chain actions without requiring each step to look dangerous in isolation. A single request can therefore pull sensitive records into a summary, forward internal material into another system, or expose credentials through a tool call. This is why input and output scanning alone is insufficient. The risk lives in the full workflow, including intermediate state, tool selection, and message passing across steps.

Practical implication: inspect the full agent execution path, not just the final response, when designing guardrails and detections.

What language-native DLP changes in practice

Language-native DLP replaces static string matching with contextual interpretation. It evaluates meaning, intent, audience, and usage at runtime, which makes it better suited to GenAI systems that work through summarization and transformation rather than direct retrieval. That approach does not remove governance needs, but it aligns enforcement with how the model actually behaves. For identity teams, this is the difference between assuming access is the control and recognising that access plus transformation is the real security boundary.

Practical implication: adopt runtime policy enforcement that understands context and intent before sensitive content is exposed or reused.

Threat narrative

Attacker objective: The attacker wants sensitive data to leave protected systems through model output or agent action without a classic breach signal.

Entry occurs when a user supplies a benign-looking prompt or an agent receives a request that causes it to retrieve internal context, memory, or connected data sources.
Escalation happens when the model paraphrases, translates, or summarises sensitive material, or when an agent uses tool calls and function execution to surface data beyond the user’s intended scope.
Impact is the leakage of confidential information through generated text, forwarded content, or automated actions that bypass traditional pattern-based DLP.
The attacker objective is to extract sensitive information without triggering conventional file-based or regex-based controls.

MongoBleed breach — MongoBleed exposed secrets across 87K MongoDB servers.
ASP.NET machine keys RCE attack — 3,000+ exposed ASP.NET machine keys enabled remote code execution.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Regex-era DLP is an access-control model for a transformation problem. Traditional controls were designed to spot known strings in known places, but GenAI changes the unit of risk from file or field to meaning. Once a model can paraphrase, translate, and synthesise information, the control boundary is no longer the literal token. Practitioners should treat this as a governance mismatch, not a tuning problem.

Language-native detection is the named concept this category now needs. The article’s core insight is that sensitive data protection must understand intent and semantics, not just syntax. That shift matters across human IAM, NHI workflows, and agentic systems because the leakage event often happens after authorised access, inside the generation step. Security teams should stop measuring only blocked inputs and start measuring whether outputs preserve confidentiality.

Agentic leakage is a workflow failure, not an output failure. The model path now includes memory retrieval, API calls, tool use, and recursive summarisation, so the harmful event may occur several steps before the final text appears. This is why legacy DLP assumptions about single-point inspection fail. The practitioner takeaway is to govern the whole execution chain, not the last response.

Context-based authorisation has become the real control plane for GenAI data use. The article shows that who is asking, what they are allowed to infer, and how the data will be reused all matter as much as the raw content itself. That makes identity, purpose, and runtime context inseparable in modern DLP governance. Teams should reframe sensitive data protection as an identity decision with language-aware enforcement.

Zero Trust for data is now a runtime discipline, not a policy statement. Treating every request as untrusted only works if enforcement can evaluate the request in the moment the model responds. That requires runtime guardrails over prompts, outputs, memory, and tool activity, especially in environments where agents act faster than review cycles. Practitioners need controls that operate at the speed of generation, not the speed of incident response.

From our research:
88% of security professionals are concerned about secrets sprawl, with 49% of those in larger organisations described as "very concerned", according to The 2024 State of Secrets Management Survey.
54% of organisations are dissatisfied with their current secrets management solution because not all secrets are secured, and 43% cite lack of central management.
For lifecycle and rotation context, see NHI Lifecycle Management Guide for how access governance changes when credentials are ephemeral.

What this signals

Language-native DLP will become a baseline expectation for GenAI programmes. Teams that still rely on regex-based inspection will miss the majority of meaningful leakage paths once users and agents start paraphrasing, translating, and summarising internal content. The programme signal to watch is whether controls understand meaning at runtime rather than just matching strings after the fact.

Secret sprawl and data leakage are converging governance problems. With 88% of security professionals already concerned about secrets sprawl, per The 2024 State of Secrets Management Survey, the next failure mode is not just exposure but transformation. If models can reach too much data, DLP becomes the last line of defence instead of a control boundary.

Runtime visibility will matter more than policy volume. Teams should expect stronger demand for controls that inspect memory, retrieval, and tool activity across GenAI workflows, especially where agents can take actions on behalf of users. The organisations that can explain and constrain model behaviour in context will have a cleaner path to adoption.

For practitioners

Map data flows across the full GenAI stack Track prompts, embeddings, context windows, retrieval sources, memory, outputs, and downstream tool calls so you know where sensitive data can appear and reappear.
Enforce access at the moment of generation Allow models and agents to reach only the data the requesting identity is authorised to see at that exact interaction, not the broader repository by default.
Inspect agent reasoning and tool use Monitor memory access, function calls, and message passing, because many leaks happen in intermediate steps that never show up in a simple prompt or response log.
Deploy language-aware detection for sensitive content Use semantic detectors that can identify paraphrased, translated, or synthesised confidential material instead of relying only on regex and keywords.
Red-team for prompt injection and chained actions Simulate how a model or agent might be induced to summarise restricted data, forward hidden content, or escalate leakage through multi-step workflows.

Key takeaways

Traditional DLP fails in GenAI because it was built for static content, not meaning that is rewritten in real time.
Agents expand leakage risk by combining retrieval, reasoning, and tool use into workflows that legacy inspection never sees end to end.
The practical response is runtime, language-aware, identity-based enforcement across prompts, memory, outputs, and actions.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-03	GenAI leakage often exposes secrets and sensitive context through runtime workflows.
NIST CSF 2.0	PR.AC-4	Access control must account for context and authorised data use in GenAI workflows.
NIST Zero Trust (SP 800-207)	SC-7	Zero Trust principles fit runtime inspection and explicit trust decisions for model responses.

Treat prompts, memory, and tool calls as sensitive attack surfaces and enforce least privilege at runtime.

Key terms

Language-native DLP: A data protection approach that evaluates meaning, context, and intent rather than only matching fixed patterns. In GenAI environments, it is designed to catch sensitive content after paraphrasing, translation, or summarisation, when the original syntax has changed but the confidentiality risk remains.
Runtime guardrails: Controls that enforce policy while a model or agent is actively processing data, not after the fact. They inspect prompts, outputs, memory, and tool use in the moment, which makes them better suited to GenAI systems where leakage can happen during generation rather than at storage boundaries.
Agent workflow: The end-to-end sequence an AI agent follows when it retrieves context, reasons over it, calls tools, and produces output. For security teams, the workflow matters because the risky action may occur in an intermediate step, long before any final answer is displayed to a user.
Sensitive data transformation: The process by which protected information is rewritten into a new form such as a summary, translation, or paraphrase. The content may no longer match traditional detection rules, but it can still reveal confidential meaning and therefore needs policy controls that understand semantics.

What's in the full article

Lakera's full article covers the operational detail this post intentionally leaves for the source:

How the vendor frames language-native DLP detectors and what that means for implementation choices.
Examples of GenAI leakage patterns involving summarisation, translation, and tool-driven workflows.
Practical checklist items for teams building guardrails into LLM applications and agent pipelines.
The vendor's discussion of red-team exercises and where runtime monitoring fits into a broader GenAI security programme.

👉 Lakera's full post covers language-native detection, runtime guardrails, and agent workflow monitoring in more operational detail.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity security are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or operational governance in your organisation, it is worth exploring.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-08-27.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org