Sensitive data discovery in the cloud is shifting toward DSPM

By NHI Mgmt Group Editorial Team | Published 2026-02-02 | Domain: Governance & Risk | Source: Cyera

TL;DR: Sensitive data discovery and classification is expanding quickly, with 39% of surveyed organisations already using it, 22% in pilot or proof of concept, 22% planning deployment in the next 12 months, and 71% expecting to increase spending, according to Cyera. Manual implementation remains a bottleneck as cloud deployments accelerate.

At a glance

What this is: A cloud DSPM report showing that sensitive data discovery and classification is moving from early adoption into broader investment, while manual implementation still slows operational rollout.

Why it matters: IAM, NHI, and cloud teams need data visibility that keeps pace with workload sprawl, privilege decisions, and identity-driven access paths across environments.

By the numbers:

39% of those surveyed are using sensitive data discovery and classification.
22% are in pilot/proof of concept.
71% say they will increase their spending on it in the next 12 months.
Only 13% of organisations feel extremely prepared for the reality of agentic AI despite the majority racing toward autonomous adoption.

Context

Sensitive data discovery and classification is the process of finding where sensitive information lives, identifying what it is, and mapping how it is exposed across cloud estates. In practice, many programmes still rely on manual tagging and ad hoc policy application, which does not scale as cloud usage grows and identity-driven access paths multiply.

Cyera's report frames that implementation gap as the central blocker: demand is rising, but organisations struggle to operationalise discovery fast enough to keep pace with deployment velocity. For IAM and governance teams, that means data security posture cannot sit apart from identity controls, because access decisions become harder to defend when data location and sensitivity are not visible.

The primary issue is not whether organisations value DSPM. It is whether they can make discovery, classification, and governance repeatable enough to support cloud-scale operations without relying on labour-intensive workflows.

Key questions

Q: How should security teams use sensitive data discovery to reduce cloud risk?

A: Security teams should use discovery output to identify which identities can actually reach sensitive datasets, then narrow access based on business need. The value is not in cataloguing data alone. It is in connecting classification to entitlement review, so overexposed storage, shared accounts, and broad workload permissions can be prioritised for remediation.

Q: Why does sensitive data classification often fail in cloud environments?

A: It often fails because cloud estates change faster than manual review cycles can keep up. Data is duplicated across services, copied into backups, and accessed through multiple identities, which makes one-time tagging incomplete. When classification is not continuous, organisations end up with stale labels, blind spots, and weak policy enforcement.

Q: What do teams get wrong about deploying DSPM?

A: Teams often treat DSPM as a data cataloguing project instead of a governance control. That misses the point. Classification only becomes useful when it informs access scope, recertification priorities, and response decisions. Without those links, the programme produces visibility without reduction in exposure.

Q: How do organisations decide which datasets to govern first?

A: They should start with datasets that are both sensitive and reachable by broad or persistent identity grants. That means crown-jewel records, shared cloud storage, and replicated copies that are accessible by service accounts or workloads. Prioritising by exposure and sensitivity gives the fastest risk reduction.
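As a minimal sketch of that ordering, the snippet below ranks datasets by sensitivity multiplied by exposure. The 1 to 5 sensitivity scale, the double weighting of persistent grants, and all dataset names are illustrative assumptions, not figures or methods from Cyera's report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    sensitivity: int             # assumed scale: 1 (public) to 5 (crown jewel)
    identities_with_access: int  # distinct identities holding any grant
    persistent_grants: int       # grants with no expiry


def exposure_score(ds: Dataset) -> int:
    # Assumed weighting: a persistent grant counts double.
    return ds.identities_with_access + 2 * ds.persistent_grants


def governance_queue(datasets):
    # Highest sensitivity-times-exposure first.
    return sorted(
        datasets,
        key=lambda d: d.sensitivity * exposure_score(d),
        reverse=True,
    )


# Hypothetical inventory for illustration only.
datasets = [
    Dataset("marketing-assets", 1, 40, 10),
    Dataset("customer-records", 5, 30, 12),
    Dataset("payroll-backup-copy", 5, 8, 8),
    Dataset("build-logs", 2, 15, 2),
]

queue = governance_queue(datasets)
for d in queue:
    print(d.name, d.sensitivity * exposure_score(d))
```

A multiplicative score is only one option. A team that never wants low-sensitivity data to outrank crown-jewel records could filter on a sensitivity threshold first and sort by exposure afterwards.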

Technical breakdown

Why manual data discovery breaks down in cloud environments

Cloud environments change too quickly for periodic, human-led discovery to keep a reliable picture of sensitive data. New storage, ephemeral workloads, shared services, and cross-account access create a moving target, so static inventories become stale almost as soon as they are created. Sensitive data classification also depends on context, because the same dataset can be low risk in one system and highly exposed in another. DSPM exists to make discovery and classification continuous rather than episodic, but the workflow still depends on coverage, policy quality, and integration across cloud services.

Practical implication: replace one-time discovery projects with continuous scanning and classification across all cloud storage and workload layers.
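One way to picture the difference between episodic and continuous discovery is a staleness check that decides which assets need rescanning. The sketch below assumes each asset record carries a last-modified and a last-classified timestamp; the record shape, the seven-day maximum age, and the storage URIs are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple, Optional


class Asset(NamedTuple):
    uri: str
    last_modified: datetime
    last_classified: Optional[datetime]  # None means never classified


def needs_rescan(asset: Asset, now: datetime,
                 max_age: timedelta = timedelta(days=7)) -> bool:
    if asset.last_classified is None:
        return True  # new asset that no scan has seen yet
    if asset.last_modified > asset.last_classified:
        return True  # content changed after the label was assigned
    return now - asset.last_classified > max_age  # label is simply old


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


now = utc(2026, 2, 2)
assets = [
    Asset("s3://reports/q4.csv", utc(2026, 1, 10), utc(2026, 1, 30)),
    Asset("s3://exports/customers.parquet", utc(2026, 2, 1), utc(2026, 1, 20)),
    Asset("gs://backups/db-snapshot", utc(2025, 12, 1), utc(2025, 12, 15)),
    Asset("s3://new-bucket/upload.json", utc(2026, 2, 1), None),
]

stale = [a.uri for a in assets if needs_rescan(a, now)]
print(stale)
```

In this example only the first asset keeps its label; the other three are queued because they changed, aged out, or were never classified.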

How DSPM changes the relationship between data visibility and identity

DSPM is most useful when it links data sensitivity to who and what can reach it. That includes human users, service accounts, workload identities, and increasingly AI-driven actors that can access cloud data through API-based paths. Without identity context, classification tells you what exists but not who can act on it, which limits risk reduction. When data discovery is paired with access intelligence, teams can identify overexposed datasets, privilege creep, and orphaned access paths that do not show up in traditional inventory exercises.

Practical implication: connect discovery outputs to entitlement and access review processes so exposure can be acted on, not just reported.
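The join between classification and identity described above can be sketched in a few lines. The label names, identity kinds, and grant tuples here are assumptions for illustration; a real programme would pull them from its DSPM and IAM tooling.

```python
from collections import defaultdict

# Hypothetical classification output: dataset -> label.
classification = {
    "customer-records": "restricted",
    "payroll-backup": "restricted",
    "build-logs": "internal",
}

# Hypothetical entitlement export: (identity, kind, dataset).
grants = [
    ("alice", "human", "customer-records"),
    ("svc-etl", "service_account", "customer-records"),
    ("svc-etl", "service_account", "payroll-backup"),
    ("agent-summariser", "ai_agent", "customer-records"),
    ("bob", "human", "build-logs"),
    ("svc-legacy", "service_account", "old-exports"),
]

SENSITIVE_LABELS = {"restricted"}  # assumed policy choice


def sensitive_reach(classification, grants):
    """Map each sensitive dataset to the identities that can reach it."""
    reach = defaultdict(set)
    for identity, kind, dataset in grants:
        if classification.get(dataset) in SENSITIVE_LABELS:
            reach[dataset].add((identity, kind))
    return dict(reach)


def non_human_reach(reach):
    """Keep only service accounts, workloads, and AI-driven actors."""
    return {
        dataset: sorted(i for i, kind in ids if kind != "human")
        for dataset, ids in reach.items()
    }


reach = sensitive_reach(classification, grants)
machine_paths = non_human_reach(reach)

# Grants that point at data with no label are blind spots.
unclassified = sorted({ds for _, _, ds in grants if ds not in classification})

print(machine_paths)
print(unclassified)
```

The `unclassified` list matters as much as the reach map: a grant on a dataset that has no label is an access path whose risk cannot yet be judged.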

Why classification quality matters more than raw coverage

Coverage alone is not enough if the classification engine cannot distinguish truly sensitive data from operational noise. Cloud estates often contain mixed datasets, duplicated objects, backup copies, and replicated records across multiple services, so poor classification produces both false positives and blind spots. That creates governance fatigue and weakens trust in the control. Effective DSPM needs repeatable rules, exception handling, and integration with policy enforcement so the result is usable by security, compliance, and cloud operations teams.

Practical implication: validate classification accuracy on high-value datasets first, then expand policy enforcement after the false-positive rate is understood.
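Validating accuracy usually means comparing the engine's verdicts against a manually reviewed sample. The following sketch computes precision, recall, and false-positive rate from such a sample; the sample counts are invented for illustration.

```python
def classification_metrics(samples):
    """samples: iterable of (predicted_sensitive, actually_sensitive) pairs."""
    tp = fp = fn = tn = 0
    for predicted, actual in samples:
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    return {
        # Share of flagged objects that were truly sensitive.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Share of truly sensitive objects that were flagged.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Share of non-sensitive objects that were wrongly flagged.
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Hypothetical reviewed sample: 8 true positives, 2 false positives,
# 1 false negative, 9 true negatives.
sample = (
    [(True, True)] * 8
    + [(True, False)] * 2
    + [(False, True)] * 1
    + [(False, False)] * 9
)

metrics = classification_metrics(sample)
print(metrics)
```

Low recall corresponds to the blind spots described above, and a high false-positive rate corresponds to governance fatigue, so both are worth tracking before enforcement is widened.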

NHI Mgmt Group analysis

Data visibility is becoming an identity governance problem, not just a storage problem. Sensitive data discovery only matters when organisations can connect classification outcomes to access paths, service accounts, and workload permissions. In cloud environments, the risk is not simply unknown data. It is unknown data that remains reachable through persistent identity grants. Practitioners should treat DSPM as part of entitlement governance, not a separate reporting layer.

Manual classification does not scale against cloud sprawl. The report's adoption numbers show rising interest, but the implementation model still depends heavily on human effort. That creates a structural gap between how fast cloud assets appear and how slowly governance teams can confirm what they contain. The implication is that data security programmes built on periodic review will lag behind operational reality.

Sensitive data discovery becomes far more valuable when it reduces identity blast radius. Once sensitive data is mapped, teams can prioritise which accounts, service principals, and workloads should lose broad access first. That is the practical value of DSPM for IAM leaders: it turns abstract data classification into a concrete reduction in standing exposure.

Multi-domain governance is now the baseline expectation. Cloud data, identity entitlements, and operational controls can no longer be managed as separate workstreams if the organisation wants a defensible control picture. The market signal here is not just more DSPM adoption. It is that the next stage of governance will be measured by how well identity and data signals are joined together.

NHI governance will increasingly depend on data sensitivity context. Service accounts and workload identities often have broader reach than human users, and that makes hidden sensitive data especially consequential. As more organisations formalise cloud data classification, NHI programmes will need to use that context to narrow access scope and reduce uncontrolled machine-to-data exposure.

From our research:
70% of organisations grant AI systems more access than they would give a human employee performing the exact same job, according to The 2026 Infrastructure Identity Survey.
52% of respondents see AI security decision-making power shifting toward platform and infrastructure teams rather than the executive suite.
For adjacent guidance on lifecycle discipline, see the NHI Lifecycle Management Guide, which covers how visibility, provisioning, and offboarding fit together.

What this signals

Identity context is becoming the missing layer in cloud data governance. Once classification tells you where sensitive data sits, the next question is who can reach it and under what conditions. That means cloud security teams should expect DSPM findings to feed directly into entitlement review, not remain isolated in a data programme dashboard.

With 70% of organisations already granting AI systems more access than they would give a human employee performing the exact same job, per the 2026 Infrastructure Identity Survey, exposure management is no longer only about human users. The same data visibility issues now affect AI-driven access paths, which makes joint governance across data, workload identity, and emerging agentic systems a practical requirement.

Cloud programmes should expect governance work to shift from periodic review to continuous prioritisation. The organisations that will manage this well are the ones that can connect discovery outputs to access decisions, then keep refining classification quality as their estate changes. That is where data security becomes operational rather than descriptive.

For practitioners

Map sensitive data to effective identity reach: Link classification results to human, service account, and workload entitlements so teams can see which identities can reach the highest-value datasets. Prioritise accounts with broad cross-cloud access and privilege that exceeds their data need.
Replace manual discovery with continuous coverage: Move from one-time classification exercises to recurring scans across cloud storage, backup locations, and shared services. Focus on the systems where data duplication and replication make manual inventories unreliable.
Use exposure to drive access review order: Start access recertification with datasets that are both sensitive and broadly reachable, then work outward to lower-risk data. This turns classification into a governance queue instead of a static report.
Validate classification quality on high-value data first: Test the accuracy of rules against crown-jewel datasets before extending the policy set across the full estate. Track false positives, false negatives, and unresolved exceptions so the control remains operationally useful.

Key takeaways

Sensitive data discovery is maturing, but manual implementation remains the main barrier to scale in cloud environments.
Classification is only operationally useful when it is tied to identity reach, access review, and exposure reduction.
Cloud teams should treat DSPM as part of governance execution, not as a standalone inventory or reporting exercise.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	PR.DS	Data security outcomes depend on knowing where sensitive data resides and who can access it.
OWASP Non-Human Identity Top 10	NHI-03	Non-human credentials often widen data reach beyond intended scope.
NIST Zero Trust (SP 800-207)	AC-4 (SP 800-53 control identifier)	Zero trust requires access decisions based on explicit context, including data sensitivity.

Tie data classification to conditional access so sensitive datasets are only reachable by justified identities.
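A minimal sketch of that idea, assuming a hypothetical justification record with an expiry date: access to a sensitive dataset is allowed only while a matching, unexpired justification exists. The label set, field names, and identities are illustrative and do not represent any specific product's policy model.

```python
from dataclasses import dataclass
from datetime import date

SENSITIVE_LABELS = {"restricted", "confidential"}  # assumed label set


@dataclass(frozen=True)
class Justification:
    identity: str
    dataset: str
    expires: date  # access lapses after this date


def is_access_allowed(identity, dataset, label, justifications, today):
    # Non-sensitive data is outside the scope of this check.
    if label not in SENSITIVE_LABELS:
        return True
    # Sensitive data requires a current justification for this exact pair.
    return any(
        j.identity == identity and j.dataset == dataset and j.expires >= today
        for j in justifications
    )


justifications = [
    Justification("svc-etl", "customer-records", date(2026, 3, 1)),
    Justification("alice", "customer-records", date(2026, 1, 1)),  # expired
]
today = date(2026, 2, 2)

print(is_access_allowed("svc-etl", "customer-records", "restricted",
                        justifications, today))   # True
print(is_access_allowed("alice", "customer-records", "restricted",
                        justifications, today))   # False
print(is_access_allowed("bob", "build-logs", "internal",
                        justifications, today))   # True
```

Putting an expiry on every justification keeps a one-off approval from turning into a persistent grant.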

Key terms

Sensitive Data Discovery: Sensitive data discovery is the process of locating where protected or regulated information exists across systems, storage, and workflows. In cloud environments, it must be continuous because assets appear, move, and replicate quickly, making one-off inventories unreliable for governance or incident response.
Sensitive Data Classification: Sensitive data classification is the act of assigning sensitivity labels or policy categories to data so organisations can apply the right controls. Effective classification is not just tagging. It has to be accurate enough to inform access decisions, retention handling, and remediation priorities.
Data Security Posture Management: Data Security Posture Management, or DSPM, is the control discipline for finding sensitive data, assessing exposure, and tracking whether protections match the data's risk. It becomes most useful when paired with identity context, because data risk is shaped by who and what can reach the data.
Identity Blast Radius: Identity blast radius is the amount of data, systems, or operations an identity can affect if it is over-privileged or misused. In cloud governance, reducing blast radius means tying sensitive data visibility to entitlement scope so broad access is easier to spot and remove.
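To make the blast radius term concrete, the sketch below scores each identity by the sensitivity-weighted sum of the datasets it can reach. The weights, labels, and identities are assumptions chosen for illustration.

```python
# Assumed weights per label; a real programme would define its own.
SENSITIVITY_WEIGHT = {
    "public": 0,
    "internal": 1,
    "confidential": 3,
    "restricted": 5,
}

# Hypothetical classification output: dataset -> label.
labels = {
    "crm": "restricted",
    "payroll": "restricted",
    "logs": "internal",
    "docs": "public",
}

# Hypothetical effective entitlements: identity -> reachable datasets.
entitlements = {
    "alice": {"crm"},
    "svc-backup": {"crm", "payroll", "logs"},
    "ci-runner": {"logs", "docs"},
}


def blast_radius(entitlements, labels):
    """Sensitivity-weighted reach for each identity."""
    return {
        identity: sum(SENSITIVITY_WEIGHT[labels[ds]] for ds in datasets)
        for identity, datasets in entitlements.items()
    }


radius = blast_radius(entitlements, labels)
widest = max(radius, key=radius.get)

print(radius)   # {'alice': 5, 'svc-backup': 11, 'ci-runner': 1}
print(widest)   # svc-backup
```

Here the service account, not the human user, carries the widest radius, which is the pattern the analysis above warns about for non-human identities.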

Deepen your knowledge

Sensitive data discovery and classification in cloud environments is a core topic in our NHI Foundation Level course, the industry's only accredited NHI security programme. If your team is trying to connect data visibility with identity governance, it is a relevant starting point.

This post draws on content published by Cyera: Securing More Data in More Places With Sensitive Data Discovery and Classification in the Cloud. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-02-02.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org