
Data-centric zero trust

A zero trust model that treats the data itself as the primary control boundary. Rather than relying mainly on network location or device trust, it asks whether a subject, workload, or AI system should access specific data for a specific purpose under current policy.

Expanded Definition

Data-centric zero trust shifts the policy boundary from the network perimeter to the data object, record, or dataset. In NHI and Agentic AI environments, that means access decisions are evaluated against current purpose, identity, context, and policy, not simply whether a workload sits inside a trusted subnet. This aligns with the core model in NIST SP 800-207 Zero Trust Architecture, but the data-centric variant adds stronger emphasis on protecting the information itself through classification, encryption, tokenisation, purpose limitation, and fine-grained authorisation.

Definitions vary across vendors because some products describe any policy-based access layer as data-centric zero trust, while others require explicit binding between data labels, subject identity, and allowed use. In NHI security, that distinction matters: a service account, API key, or AI agent may be authenticated yet still be inappropriate for a particular dataset. The model becomes especially important when secrets, prompts, training data, and retrieval sources are all governed differently.
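The stricter reading, where data labels, subject identity, and allowed use are explicitly bound together, can be illustrated with a minimal Python sketch. Everything in it is hypothetical (the `Subject` and `DataPolicy` types, the `decide` function, the label and purpose names, and the numeric risk threshold); it is not drawn from any particular product or from NIST SP 800-207. It only shows how an authenticated caller can still be refused for a given dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    identity: str        # hypothetical name of a service account or AI agent
    authenticated: bool  # result of an upstream authentication step (assumed)


@dataclass(frozen=True)
class DataPolicy:
    label: str                   # classification label attached to the data
    allowed_subjects: frozenset  # identities explicitly bound to this label
    allowed_purposes: frozenset  # uses the data owner has approved
    max_risk: float              # illustrative session-risk ceiling (0.0 to 1.0)


def decide(subject, policy, purpose, session_risk):
    """Return (allowed, reason). Authentication alone is never sufficient."""
    if not subject.authenticated:
        return False, "not authenticated"
    if subject.identity not in policy.allowed_subjects:
        return False, "identity not bound to this data label"
    if purpose not in policy.allowed_purposes:
        return False, "purpose not permitted for this data label"
    if session_risk > policy.max_risk:
        return False, "session risk above policy threshold"
    return True, "ok"


# Illustrative policy and caller (all values are made up for the example).
customer_pii = DataPolicy(
    label="customer-pii",
    allowed_subjects=frozenset({"support-agent"}),
    allowed_purposes=frozenset({"ticket-resolution"}),
    max_risk=0.5,
)
agent = Subject(identity="support-agent", authenticated=True)

print(decide(agent, customer_pii, "ticket-resolution", 0.2))
print(decide(agent, customer_pii, "model-training", 0.2))
```

In this sketch the second call is refused even though the agent is authenticated and bound to the label, because the stated purpose is not one the policy allows.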

The most common misapplication is to treat network segmentation as if it were data control: teams assume that a workload inside the internal network automatically deserves broad access to sensitive data.

Examples and Use Cases

Implementing data-centric zero trust rigorously demands more policy design and metadata management, so organisations must weigh tighter data protection against higher operational complexity and slower onboarding.

  • An AI agent is allowed to retrieve customer records only after policy verifies the request purpose, tenant scope, and current session risk, rather than merely checking that the agent is running inside a trusted cluster.
  • A build pipeline can read configuration secrets from a controlled vault, but it cannot export them to logs, artifacts, or downstream tools unless the data policy explicitly permits that transformation.
  • A service account may query payment data for reconciliation, yet row-level or field-level controls block full-table reads and prevent reuse of that data outside the approved workflow.
  • SPIFFE-based workload identity is used to establish who the caller is, while a separate data policy determines whether the caller may touch the requested dataset; see Guide to SPIFFE and SPIRE.
  • Policy teams classify documents and embeddings so retrieval-augmented generation only returns approved content, using the governance guidance in Ultimate Guide to NHIs – Standards alongside NIST SP 800-207 Zero Trust Architecture.
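Two of the cases above, the separation of workload identity from data policy and the field-level limits on payment data, can be sketched together in Python. This is an illustration under stated assumptions: the SPIFFE ID string is assumed to have been verified already by the workload identity layer, and `DatasetRule`, `RULES`, `read_rows`, the trust domain `example.org`, and the field names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRule:
    allowed_fields: frozenset    # fields this caller may read
    allowed_purposes: frozenset  # workflows the data may be used for
    max_rows: int                # cap that blocks full-table reads


# Hypothetical data policy, keyed by (verified SPIFFE ID, dataset name).
RULES = {
    ("spiffe://example.org/billing/reconciler", "payments"): DatasetRule(
        allowed_fields=frozenset({"payment_id", "amount", "status"}),
        allowed_purposes=frozenset({"reconciliation"}),
        max_rows=1000,
    ),
}


def read_rows(spiffe_id, dataset, purpose, rows, requested_fields):
    """Apply the data policy to a query result before returning it.

    `spiffe_id` answers "who is calling"; the rule lookup answers
    "may this caller touch this dataset, for this purpose, at this scope".
    """
    rule = RULES.get((spiffe_id, dataset))
    if rule is None:
        raise PermissionError("no data policy binds this caller to the dataset")
    if purpose not in rule.allowed_purposes:
        raise PermissionError(f"purpose not permitted: {purpose}")
    denied = set(requested_fields) - rule.allowed_fields
    if denied:
        raise PermissionError(f"fields not permitted: {sorted(denied)}")
    if len(rows) > rule.max_rows:
        raise PermissionError("result exceeds row cap for this caller")
    # Project each row down to the requested (and permitted) fields only.
    return [{field: row[field] for field in requested_fields} for row in rows]
```

Keeping the identity check and the data rule as separate steps means a caller with a valid identity still receives nothing unless a rule names it for that dataset, purpose, and field set.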

Why It Matters in NHI Security

Data-centric zero trust closes a major gap in NHI governance: machine identities are often over-privileged, long-lived, and too broadly trusted once authenticated. NHIMG research shows that 97% of NHIs carry excessive privileges and 80% of identity breaches involved compromised non-human identities such as service accounts and API keys, which makes broad data access especially dangerous. The principle is reinforced by NHIMG guidance in the Ultimate Guide to NHIs – Key Research and Survey Results, where the operational problem is not just identity issuance but durable control over what those identities can actually read, write, or exfiltrate.

For Agentic AI, the risk is even sharper because the system may chain tool calls, retrieve data repeatedly, and propagate sensitive information into outputs, memory, or downstream workflows. A data-centric model gives security teams a way to constrain exposure at the point of use rather than relying on perimeter assumptions that fail when credentials are stolen or workloads are repurposed. Organisations typically recognise the need for this control only after a service account, pipeline, or agent has already accessed sensitive data beyond its intended purpose, at which point adopting data-centric zero trust is no longer optional.
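One way to picture control at the point of use is to carry data labels along with an agent session and check them before anything is written to an output, memory store, or downstream tool. The Python sketch below assumes a deliberately coarse scheme in which every label the session has read applies to every later write. The `AgentSession` class, the sink names, and the labels are hypothetical.

```python
class AgentSession:
    """Tracks classification labels of everything an agent has read so far."""

    def __init__(self, sink_clearance):
        # Hypothetical policy: sink name -> set of labels that sink may receive.
        self.sink_clearance = sink_clearance
        self.seen_labels = set()

    def record_read(self, labels):
        """Call after each tool call or retrieval with the labels of the data returned."""
        self.seen_labels |= set(labels)

    def may_write(self, sink):
        """A sink is allowed only if it is cleared for every label seen so far."""
        return self.seen_labels <= self.sink_clearance.get(sink, set())

    def write(self, sink, payload):
        if not self.may_write(sink):
            blocked = sorted(self.seen_labels - self.sink_clearance.get(sink, set()))
            raise PermissionError(f"sink '{sink}' is not cleared for labels {blocked}")
        return payload  # a real system would hand the payload to the sink here


# Illustrative clearances (assumed values, not a recommended policy).
session = AgentSession({
    "chat_output": {"public", "internal"},
    "long_term_memory": {"public"},
})
session.record_read(["public"])
session.record_read(["internal"])
print(session.may_write("chat_output"))       # cleared for both labels seen
print(session.may_write("long_term_memory"))  # not cleared for "internal"
```

The conservative rule over-blocks, since a write may not actually contain the sensitive data that was read, but it shows how exposure can be constrained across chained tool calls without relying on where the agent runs.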

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
NIST CSF 2.0 | PR.AA-05 | Access enforcement should be limited by context and least privilege for data use.
NIST Zero Trust (SP 800-207) | 0 | NIST zero trust defines continuous verification and policy-driven access decisions.
OWASP Non-Human Identity Top 10 | NHI5:2025 | Over-privileged non-human identities directly undermine data-centric zero trust.

Tie data access to verified identity, context, and least privilege before allowing retrieval or processing.