LangChain vulnerabilities expose how AI frameworks can drain enterprise data

By NHI Mgmt Group Editorial Team | Published 2026-04-07 | Domain: Breaches & Incidents | Source: Cyera

TL;DR: Cyera reports three LangChain and LangGraph vulnerabilities, including one critical flaw, that can expose filesystem files, environment secrets, and conversation history across widely deployed AI infrastructure with roughly 847 million combined PyPI downloads. The real governance issue is that AI frameworks behave like data pipelines, so existing IAM and data security controls must extend into the agent runtime, not stop at the app boundary.

At a glance

What this is: Cyera found three vulnerabilities in LangChain and LangGraph that can leak files, secrets, and conversation history from enterprise AI deployments.

Why it matters: For IAM and NHI teams, the case shows that AI frameworks can become hidden data conduits unless runtime access, secret handling, and lineage are governed explicitly.

By the numbers:

The LangChain family has reached roughly 847 million total downloads across langchain, langchain-core, and langchain-community.
The Trivy supply chain compromise spread across 5 ecosystems in March 2026.


Context

LangChain has become part of the hidden infrastructure behind enterprise AI applications, which means it sits in the path of prompts, retrieved documents, environment secrets, and conversation memory. That makes it an NHI governance problem as much as an application security problem, because the framework can move sensitive data even when teams do not treat it as a security boundary.

Cyera’s analysis matters because AI frameworks often inherit trust from the surrounding application even when they handle untrusted inputs, load files from disk, or deserialize state from a checkpoint store. That mismatch between assumed trust and actual data flow is exactly where NHI risk accumulates.

The Trivy supply chain compromise, covered in the same Cyera article, is a useful reminder that small control failures can cascade through developer tooling, AI middleware, and downstream ecosystems. The LangChain findings are not an isolated edge case; they are a typical outcome when framework visibility and runtime governance lag behind adoption.

Key questions

Q: How should security teams handle hidden AI framework dependencies in enterprise environments?

A: Treat AI frameworks as governed infrastructure, not incidental libraries. Build an inventory of every service that uses them directly or transitively, track versions, and map the data each instance can reach. If you cannot see the framework, you cannot scope its access, patch it reliably, or explain the blast radius after an incident. Visibility is the first control.

Q: Why do AI frameworks create new NHI governance risks?

A: AI frameworks often sit between identities, tools, secrets, and persistent memory, so they can amplify a small coding flaw into broad data exposure. They handle file reads, environment variables, and workflow state, which means the framework layer itself becomes part of the access path. That makes identity, secret, and data governance inseparable in agentic systems.

Q: What breaks when prompt loading or deserialisation is not constrained?

A: Unconstrained prompt loading can turn a harmless configuration reference into local file disclosure, while permissive deserialisation can reinterpret attacker-controlled data as trusted framework objects. In practice, that means secrets, configuration files, and memory content can leak through features developers thought were routine. The failure is trust boundary confusion, not just a single bug.

Q: How do teams reduce the blast radius of vulnerable AI middleware?

A: Patch affected packages, but do not stop there. Restrict where framework components can read files, disable secret resolution for untrusted inputs, validate checkpoint metadata, and review every downstream wrapper that may inherit the same flaw. The goal is to make one vulnerable dependency less able to expose the rest of the stack.

Technical breakdown

How path traversal in prompt loading exposes local files

LangChain’s prompt-loading helpers accept file paths from configuration and read template or example content directly from disk. When those paths are not canonicalised or restricted to a base directory, an attacker can replace a harmless template reference with traversal sequences and point the loader at sensitive local files. The risk goes beyond arbitrary file read in the classic sense: the files most likely to be exposed are configuration artifacts that routinely contain credentials, tokens, and infrastructure details. That makes the vulnerability especially relevant in AI systems that let users upload or share prompt configurations.

Practical implication: Restrict prompt file loading to approved directories and treat user-supplied prompt configs as untrusted input.
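
The base-directory check can be sketched in a few lines of Python. This is a minimal illustration, not LangChain's own fix: the names resolve_prompt_path, load_prompt_text, and UnsafePromptPath are hypothetical, and the single approved directory is an assumption made for the example. It needs Python 3.9 or later for Path.is_relative_to.

```python
from pathlib import Path


class UnsafePromptPath(ValueError):
    """Raised when a requested prompt file falls outside the approved directory."""


def resolve_prompt_path(base_dir: str, requested: str) -> Path:
    """Return the canonical path for `requested`, or raise if it escapes `base_dir`."""
    base = Path(base_dir).resolve()
    # Joining and then resolving collapses ".." segments and follows symlinks,
    # so the containment check runs against the real target. An absolute
    # `requested` value replaces the base entirely and is rejected below.
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):
        raise UnsafePromptPath(f"{requested!r} resolves outside {base}")
    return candidate


def load_prompt_text(base_dir: str, requested: str) -> str:
    """Read a template only after the path has passed the containment check."""
    return resolve_prompt_path(base_dir, requested).read_text(encoding="utf-8")
```

With this shape, a reference such as "greet.txt" loads normally, while "../secret.env" or an absolute path raises before any file is opened. The check is deliberately applied to the resolved path rather than the raw string, because filtering for ".." in the input alone misses symlinks and absolute paths.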

Why deserialisation bugs become secret extraction paths

LangChain’s serialisation model uses marker fields to distinguish framework objects from ordinary dictionaries. If attacker-controlled data survives a serialise-deserialise round trip with those markers intact, the reviver logic may reinterpret it as a trusted object and resolve embedded references such as secrets from environment variables. This is a classic trust-boundary failure: the framework assumes structure implies authenticity. In AI systems, that matters because model outputs, tool responses, and metadata fields can all carry attacker influence back into internal processing paths. Once deserialisation logic reaches for secrets automatically, the framework becomes a credential oracle.

Practical implication: Never deserialise untrusted AI outputs or metadata into structures that can resolve secrets or framework objects.
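
One way to enforce that boundary is to reject untrusted data that imitates a serialised framework object before it is stored or round-tripped. The sketch below is an assumption-laden illustration rather than the library's patch: it treats "lc" as the reserved marker key (based on LangChain's serialised object format; confirm against your installed version), and the names assert_no_markers and MarkerInUntrustedData are hypothetical.

```python
from typing import Any

# Assumption: "lc" is the reserved marker key in the serialised format.
# Add any other marker keys your framework version relies on.
RESERVED_MARKER_KEYS = frozenset({"lc"})


class MarkerInUntrustedData(ValueError):
    """Raised when untrusted data imitates a serialised framework object."""


def assert_no_markers(value: Any, path: str = "$") -> None:
    """Walk nested dicts and lists; raise if any dict carries a reserved marker key."""
    if isinstance(value, dict):
        hits = RESERVED_MARKER_KEYS.intersection(value)
        if hits:
            raise MarkerInUntrustedData(f"reserved key {sorted(hits)} at {path}")
        for key, item in value.items():
            assert_no_markers(item, f"{path}.{key}")
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            assert_no_markers(item, f"{path}[{index}]")


# Example: a tool response that smuggles a secret reference inside metadata.
forged = {"additional_kwargs": {"note": {"lc": 1, "type": "secret", "id": ["API_KEY"]}}}
try:
    assert_no_markers(forged)
except MarkerInUntrustedData as exc:
    print(f"rejected: {exc}")
```

Rejecting outright is a design choice; a softer variant could rename or escape the marker key instead. Either way the check belongs at the point where model output, tool responses, or metadata first enter internal processing, not at load time when the data already looks trusted.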

How SQL injection appears in checkpoint and memory stores

Stateful agent systems often persist chat history, tool results, and checkpoint metadata in databases so workflows can resume later. If user-controlled values are turned into SQL filter keys or query fragments without parameterisation and validation, the checkpoint layer becomes an injection point. In agentic AI, that is particularly dangerous because the memory store can hold the most sensitive conversation and workflow context in the system. The issue is not limited to one database engine. Any design that treats dynamic metadata as query structure can turn persistence into a control-plane vulnerability.

Practical implication: Parameterise checkpoint queries and validate metadata keys before they are used in persistence logic.
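
A minimal sketch of that pattern, using Python's built-in sqlite3 module, might look like the following. The checkpoints table, its columns, and the ALLOWED_FILTER_COLUMNS mapping are hypothetical and do not reflect LangGraph's actual schema; the point is only that filter keys are matched against an allow-list and filter values are always bound as parameters.

```python
import sqlite3

# Hypothetical allow-list: external filter names mapped to real column names.
ALLOWED_FILTER_COLUMNS = {"source": "source", "step": "step", "user_id": "user_id"}


def build_checkpoint_query(filters: dict) -> tuple[str, list]:
    """Build a SELECT whose structure comes only from the allow-list."""
    clauses, params = [], []
    for key, value in filters.items():
        column = ALLOWED_FILTER_COLUMNS.get(key)
        if column is None:
            # Unknown keys never reach the SQL text.
            raise ValueError(f"unsupported metadata filter key: {key!r}")
        clauses.append(f"{column} = ?")
        params.append(value)  # values are bound, never interpolated
    where = " AND ".join(clauses) if clauses else "1 = 1"
    return f"SELECT thread_id, state FROM checkpoints WHERE {where}", params


def find_checkpoints(conn: sqlite3.Connection, filters: dict) -> list:
    """Run the allow-listed, parameterised query and return matching rows."""
    sql, params = build_checkpoint_query(filters)
    return conn.execute(sql, params).fetchall()
```

A filter value such as "alice' OR '1'='1" is then compared as a literal string and matches nothing, while a crafted filter key is refused before a query is built. The same idea carries over to other database engines; only the placeholder syntax changes.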

Threat narrative

Attacker objective: The attacker wants to turn the AI framework into a data theft and credential harvesting layer that exposes secrets, memory, and internal files.

Entry occurs when attacker-controlled prompt configuration or framework metadata reaches LangChain through a shared template, tool output, or checkpoint workflow.
Escalation happens when path traversal, deserialisation, or SQL injection converts that input into file reads, secret resolution, or database manipulation.
Impact is the extraction of filesystem content, environment secrets, and conversation history from the AI application and its supporting infrastructure.


Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

AI frameworks are now data infrastructure, not just developer libraries. When a framework routes prompts, memory, files, and tools, it sits inside the enterprise trust boundary whether teams document it or not. That means NHI governance has to extend to framework runtime paths, not stop at account provisioning. Practitioners should treat framework code as an access path with its own controls.

Classic AppSec flaws are reappearing inside agentic AI stacks. Path traversal, deserialisation, and SQL injection are not new, but their impact changes when they sit underneath autonomous workflows that touch secrets and customer data. The category shift is important: AI systems amplify old bugs by giving them privileged context, persistence, and more places to land. Security teams should stop assuming AI risk only means model abuse.

Ephemeral trust in AI runtimes creates identity blast radius. A prompt loader, checkpoint store, or metadata parser can become a secondary identity boundary that silently inherits access to secrets and data. That is why the framework layer needs the same scrutiny as service accounts and tokens. Practitioners should map every framework action to the data and identities it can reach.

Visibility is the control gap that turns framework bugs into enterprise incidents. Many organisations cannot tell where LangChain is deployed, what version is running, or what data flows through it. Without that inventory, patching becomes reactive and data exposure becomes unmeasurable. The practical conclusion is simple: no NHI programme is complete until AI middleware is discoverable, scoped, and monitored.

From our research:
96% of technology professionals identify AI agents as a growing security threat, and 66% believe this risk is immediate, according to the AI Agents: The New Attack Surface report.
Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
That visibility gap is already shaping the response path: the AI Agents: The New Attack Surface report shows that governance still trails deployment.

What this signals

LangChain visibility debt: the practical risk is not just a vulnerable package, but a framework layer that many organisations cannot inventory at all. With 98% of companies planning to deploy even more AI agents within the next 12 months, per the AI Agents: The New Attack Surface report, the governance gap will widen unless teams can map AI middleware to data classes and responsible owners.

Security teams should expect agentic workflows to pull more of the identity stack into scope, including memory stores, shared prompt repositories, and tool execution paths. That means the programme has to connect runtime discovery with NHI controls, secret handling, and data classification rather than treating them as separate projects.

The operational signal to watch is not whether an AI model is safe in isolation, but whether the surrounding framework can read, retain, or replay data that should have remained ephemeral. Align this work with the NIST AI Risk Management Framework and the OWASP Agentic AI Top 10 so engineering and governance share the same control language.

For practitioners

Inventory AI framework dependencies: Identify every service, pipeline, and internal tool that imports LangChain or LangGraph directly or transitively. Record versions, deployment locations, and the data classes each instance can touch, including prompts, memory stores, and environment secrets.
Restrict prompt loading to trusted paths: Block user-controlled file paths and enforce base-directory checks for any prompt or template loader. Review shared prompt marketplaces, configuration APIs, and chain-loading features for traversal risk before they are exposed to production users.
Harden secret handling in deserialisation flows: Disable secret resolution for untrusted objects and audit every code path that serialises model output, tool responses, or metadata. Treat additional_kwargs and response_metadata as attacker-influenced until proven otherwise.
Parameterise agent memory queries: Validate every metadata key before it reaches SQL and use parameterised queries in checkpoint stores. Review persistence layers for any place where user-controlled strings can become query structure or filter names.
Patch and verify transitive exposure: Upgrade affected packages, then re-scan downstream libraries and wrappers that may bundle vulnerable versions indirectly. Confirm the fix across build artifacts, container images, and deployed environments, not just source repositories.
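
As a starting point for the inventory and transitive-exposure steps, the sketch below uses Python's standard importlib.metadata to list which framework packages are installed in the current environment and which other installed distributions declare a direct dependency on them. The package set comes from the names mentioned in this article; the inventory and requirement_name helpers are hypothetical, and the script makes no claim about which versions are affected.

```python
import re
from importlib import metadata

# Package names mentioned in the article; extend for your own estate.
FRAMEWORK_PACKAGES = {"langchain", "langchain-core", "langchain-community", "langgraph"}


def _normalise(name: str) -> str:
    """Normalise names so 'LangChain_Core' matches 'langchain-core'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement: str) -> str:
    """Extract the project name from a requirement string such as 'langchain-core>=0.1'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return _normalise(match.group(1)) if match else ""


def inventory(distributions=None) -> dict:
    """Return installed framework versions and the packages that depend on them."""
    dists = list(distributions if distributions is not None else metadata.distributions())
    installed, dependents = {}, {}
    for dist in dists:
        name = _normalise(dist.metadata["Name"] or "")
        if name in FRAMEWORK_PACKAGES:
            installed[name] = dist.version
        # `requires` lists declared dependencies, or None when there are none.
        for req in dist.requires or []:
            target = requirement_name(req)
            if target in FRAMEWORK_PACKAGES and target != name:
                dependents.setdefault(target, set()).add(name)
    return {"installed": installed, "dependents": dependents}


if __name__ == "__main__":
    report = inventory()
    print("installed:", report["installed"])
    print("dependents:", report["dependents"])
```

This only sees one Python environment and only direct declared dependencies, so it would need to run inside each container image, virtual environment, and build artifact to give a usable picture. Mapping each hit to an owner and a data class remains a manual step.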

Key takeaways

AI frameworks can function as hidden data pipes, which means routine coding flaws may expose files, secrets, and conversation history.
The article’s 847 million combined downloads figure shows that the attack surface is not niche, and the visibility gap is already a governance problem.
Teams need inventory, path restriction, secret-hardening, and checkpoint query validation before agentic workflows become harder to contain.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A2	Prompt loading and tool use expose agentic input trust failures.
NIST AI RMF	GOVERN 1	Agentic frameworks need accountable governance for data and secret handling.
NIST CSF 2.0	PR.AA-05	Framework access to files, memory, and secrets maps to least-privilege enforcement.

Validate agent inputs and constrain tool and prompt paths before they reach runtime execution.

Key terms

AI Framework Runtime: The runtime is the execution layer where an AI framework loads prompts, stores memory, calls tools, and moves data between components. In practice, it becomes part of the trust boundary because it can read files, secrets, and persisted state while orchestrating agent behavior.
Prompt Loading Path: A prompt loading path is the mechanism an AI framework uses to fetch templates, examples, or configuration from files or external sources. When that path accepts untrusted input without restriction, it can become a file disclosure channel rather than a harmless convenience feature.
Checkpoint Store: A checkpoint store is the persistence layer that saves agent state, conversation history, and workflow context so execution can resume later. Because it often contains sensitive data and metadata, weak query handling or access control in this layer can turn memory into an attack surface.
Deserialisation Trust Boundary: A deserialisation trust boundary is the point where stored or received data is converted back into executable application objects. In AI systems, crossing that boundary unsafely can let attacker-controlled structures trigger secret lookup, object reconstruction, or other privileged behavior.

Deepen your knowledge

LangChain and LangGraph risk management are covered in our NHI Foundation Level course, the industry's only accredited NHI security programme. If your team is trying to govern AI middleware with existing IAM habits, the course helps reset the control model.

This post draws on content published by Cyera: LangDrained, three paths to your data through LangChain, the world's most popular AI framework. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-04-07.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org