Why do legacy network controls fall short for data security in AI environments?

Why This Matters for Security Teams

Legacy network controls assume that risk is concentrated at the boundary: inspect the packet, trust the segment, and block what looks suspicious. AI environments break that assumption because data can be copied into prompts, retrieved from embeddings, written to caches, and re-exposed through outputs without ever crossing a perimeter that classic monitoring watches. That is why data-centric controls matter more than network-only enforcement, especially when sensitive material is reused across agents, tools, and shared model workflows.

Current guidance from NIST SP 800-207 Zero Trust Architecture emphasizes continuous verification over implicit trust, which maps closely to AI data flows. NHIMG research on The State of Secrets in AppSec also shows how often sensitive material survives outside intended controls, with remediation lag and fragmented oversight making network visibility an incomplete signal. In practice, many security teams discover data exposure only after prompts, retrieval layers, or model outputs have already propagated it across systems.

How It Works in Practice

The practical shift is to govern data where it is used, not only where it moves. That means classifying sensitive inputs before they enter an LLM, restricting which datasets a model or agent can retrieve, and enforcing policy at the application, orchestration, and storage layers. Network segmentation still has value, but it becomes one control among several rather than the primary guardrail.

Security teams usually combine the following controls:

Data classification and labeling so prompts, training data, and retrieved context inherit policy.

Tokenization, masking, or redaction before content reaches models or agents.

Runtime policy checks for retrieval, tool calls, and output handling.

Short retention windows for prompts, logs, and caches that may otherwise become shadow stores.

Access controls tied to data sensitivity, not just network location or application tier.
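As a rough illustration of the masking and redaction control listed above, the sketch below scrubs a prompt before it is handed to a model. The `PATTERNS` table and the `redact` helper are hypothetical names introduced for this example, and the two regular expressions are illustrative assumptions only; a production deployment would more likely rely on a maintained classifier or secret scanner.

```python
import re

# Hypothetical, illustrative patterns; not an exhaustive or authoritative set.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "AWS_ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def redact(text: str) -> tuple[str, list[str]]:
    """Replace each match with a typed placeholder and report which labels fired."""
    found = []
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            found.append(label)
    return text, found


prompt = "Contact alice@example.com, key AKIAABCDEFGHIJKLMNOP"
clean, labels = redact(prompt)
print(clean)   # the redacted prompt that would be sent onward
print(labels)  # labels that fired, usable for audit or policy decisions
```

Returning the labels alongside the cleaned text lets the caller log what was removed without logging the sensitive value itself.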

This is where identity and policy governance intersect with AI. A model or agent may be authenticated, but authentication does not answer what data it should see, remember, or reproduce. That is why the State of Non-Human Identity Security matters: control gaps often begin with visibility and entitlement drift, then show up later as data overexposure. The DeepSeek breach illustrates the broader lesson that sensitive material can surface through AI pathways that are operationally legitimate but still unsafe from a data governance standpoint. These controls tend to break down when AI tools are given broad retrieval access to fragmented data estates: the system can faithfully enforce network policy while still leaking sensitive meaning through prompts and outputs.
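The gap between authentication and entitlement can be sketched as a filter applied to retrieved context before it reaches the model. Everything named here (`Sensitivity`, `Document`, `AgentIdentity`, `filter_context`) is a hypothetical construct for illustration, and the three-level labeling scheme is an assumption rather than a prescribed standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    # Assumed three-level scheme; real classification schemes vary.
    PUBLIC = 0
    INTERNAL = 1
    RESTRICTED = 2


@dataclass(frozen=True)
class Document:
    doc_id: str
    sensitivity: Sensitivity
    text: str


@dataclass(frozen=True)
class AgentIdentity:
    name: str
    max_sensitivity: Sensitivity  # entitlement, separate from authentication


def filter_context(agent: AgentIdentity, candidates: list[Document]):
    """Split retrieved documents into those the agent may see and denied IDs."""
    allowed = [d for d in candidates if d.sensitivity <= agent.max_sensitivity]
    denied = [d.doc_id for d in candidates if d.sensitivity > agent.max_sensitivity]
    return allowed, denied


agent = AgentIdentity("support-bot", Sensitivity.INTERNAL)
docs = [
    Document("faq-1", Sensitivity.PUBLIC, "Reset steps"),
    Document("hr-7", Sensitivity.RESTRICTED, "Salary bands"),
]
allowed, denied = filter_context(agent, docs)
print([d.doc_id for d in allowed], denied)
```

The decision depends on the data label and the agent's entitlement, not on which network segment the request came from, and the denied IDs give reviewers a signal for spotting entitlement drift.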

Common Variations and Edge Cases

Tighter data controls often increase operational overhead, requiring organisations to balance stronger confidentiality against developer speed, model usefulness, and user experience. That tradeoff becomes sharper in retrieval-augmented generation, multi-agent workflows, and analytics environments where context is intentionally reused.

Best practice is evolving, but current guidance suggests a few edge cases need special handling. First, encrypted transport alone does not help if the model can still see the plaintext after decryption. Second, content filtering is not the same as data governance: a prompt filter may block obvious secrets, yet still allow a model to reconstruct sensitive patterns from nearby context. Third, logs and telemetry often become a hidden copy of the same data, so retention and access rules must extend beyond the live inference path.
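The third point, that logs and caches become hidden copies, can be illustrated with a store that enforces a short retention window. This is a minimal sketch under stated assumptions: `ExpiringStore` is a hypothetical class, the in-memory dictionary stands in for whatever cache or log backend is actually used, and the `now` parameter exists only to make expiry easy to demonstrate.

```python
import time


class ExpiringStore:
    """In-memory store that refuses to return entries older than ttl_seconds."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._items = {}  # key -> (stored_at, value)

    def put(self, key, value, now=None):
        stored_at = time.monotonic() if now is None else now
        self._items[key] = (stored_at, value)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._items.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if now - stored_at >= self.ttl:
            del self._items[key]  # expired: drop it instead of serving it
            return None
        return value

    def purge(self, now=None):
        """Delete every expired entry and return how many were removed."""
        now = time.monotonic() if now is None else now
        expired = [k for k, (t, _) in self._items.items() if now - t >= self.ttl]
        for k in expired:
            del self._items[k]
        return len(expired)


store = ExpiringStore(ttl_seconds=60)
store.put("prompt-1", "[EMAIL] asked about invoice", now=0)
print(store.get("prompt-1", now=30))  # still inside the retention window
print(store.get("prompt-1", now=61))  # past the window, so nothing is returned
```

A periodic call to `purge` covers entries that are never read again, which is the case most likely to turn a cache into a shadow store.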

For organisations following Ultimate Guide to NHIs — Standards, the main lesson is to align data controls with identity-aware enforcement and short-lived access rather than relying on network zones alone. That is especially important when AI systems interact with third-party services or shared caches, where no universal standard exists yet for how much context should be retained versus discarded.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

NIST CSF 2.0, NIST AI RMF and NIST Zero Trust (SP 800-207) provide the governance and control outcomes practitioners can map their AI data protections to.

Framework	Control / Reference	Relevance
NIST CSF 2.0	PR.DS	Data security outcomes are the core issue in AI environments.
NIST AI RMF	GOVERN	AI governance must define how sensitive data is handled across model use.
NIST Zero Trust (SP 800-207)	ID	Zero Trust supports continuous verification instead of perimeter-only trust.

Map AI data flows and enforce protection on prompts, outputs, caches, and logs under PR.DS.


