What do security teams get wrong about data discovery programs?

Why This Matters for Security Teams

Data discovery programs are often positioned as a visibility win, but visibility is not risk reduction by itself. If teams can locate sensitive data yet cannot change who can reach it, discovery simply reveals a larger exposure surface. That is especially dangerous when data is spread across SaaS, cloud storage, analytics platforms, and shared workflows where entitlement sprawl outpaces review cycles. The NIST Cybersecurity Framework 2.0 frames this as a governance and control problem, not just a classification exercise.

NHIMG research shows the same pattern in non-human access: only 5.7% of organisations have full visibility into their service accounts, and 97% of NHIs carry excessive privileges, which means discovery without enforcement leaves the hardest part untouched. The same lesson applies to data programs. Top 10 NHI Issues and the Ultimate Guide to NHIs both emphasize that finding exposure without reducing reachable paths creates backlog, not resilience. In practice, many security teams learn this only after a report shows that the data's location was known but the overbroad access was never removed.

How It Works in Practice

Effective discovery programs treat findings as inputs to entitlement change, not as an end state. The operational sequence is simple: classify the asset, identify the owner, determine who and what can access it, and then reduce those pathways through policy, access reviews, and automation. That matters because sensitive data is rarely accessed only by humans. Service accounts, API keys, integrations, and batch jobs often move data faster than people do, which is why NHI controls and data controls need to be linked.

A practical program usually combines three layers:

Discovery: identify where sensitive data exists across cloud buckets, databases, SaaS apps, and pipelines.

Ownership: assign a business and technical owner who can approve remediation, exceptions, and retention decisions.

Enforcement: remove stale access, narrow roles, rotate secrets, and apply policy so the data remains reachable only by approved identities.
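The three layers above can be sketched as a small planning step that turns discovery findings into proposed entitlement changes. This is a minimal illustration, not a reference implementation: the names `Finding`, `Grant`, and `plan_revocations`, the "sensitive" label, and the 90-day staleness threshold are all assumptions made for the example and do not come from NIST or NHIMG guidance.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """Discovery output: where sensitive data lives and who owns it (hypothetical shape)."""
    asset: str
    classification: str
    owner: Optional[str]  # None means no owner has been assigned yet


@dataclass(frozen=True)
class Grant:
    """One identity's access to one asset; identity may be human or non-human."""
    identity: str
    asset: str
    last_used: Optional[date]  # None means the grant has never been used
    approved: bool             # True if the asset owner has approved this access


def plan_revocations(findings, grants, today, stale_after_days=90):
    """Return (revocations, unowned_assets) for assets classified as sensitive.

    A grant is proposed for removal if it is stale or not owner-approved.
    The 90-day default is an assumed threshold, not a standard.
    """
    cutoff = today - timedelta(days=stale_after_days)
    sensitive = {f.asset: f for f in findings if f.classification == "sensitive"}

    # Ownership layer: assets nobody can approve changes for are surfaced separately.
    unowned = sorted(a for a, f in sensitive.items() if f.owner is None)

    # Enforcement layer: propose removal of stale or unapproved access.
    revocations = []
    for g in grants:
        if g.asset not in sensitive:
            continue
        stale = g.last_used is None or g.last_used < cutoff
        if stale or not g.approved:
            reason = "stale" if stale else "unapproved"
            revocations.append((g.identity, g.asset, reason))
    return revocations, unowned
```

In this sketch, unowned assets are reported rather than silently remediated, since someone must be able to approve the change. Service accounts and human users pass through the same check, which reflects the point that NHI controls and data controls need to be linked.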

This is where broader identity discipline matters. Data risk frequently persists because privileged and machine access are not in the same review process. The Ultimate Guide to NHIs — Key Research and Survey Results highlights how often secrets remain valid long after notification, which mirrors the common failure in discovery programs: the issue is documented, but entitlements stay unchanged. Best practice is evolving toward continuous controls, not quarterly clean-up campaigns. NIST CSF 2.0 aligns with this approach by tying visibility to protection and governance outcomes rather than to inventory alone. These controls tend to break down when data ownership is unclear across shared platforms, because no single team can approve or execute access removal quickly.

Common Variations and Edge Cases

Tighter discovery and remediation often increase operational overhead, requiring organisations to balance faster exposure reduction against change-management friction. That tradeoff is real, especially in environments where analytics teams, developers, and vendors all touch the same dataset. Current guidance suggests that the right answer is not to freeze access, but to make changes more targeted and more frequent.

Several edge cases create blind spots:

Shadow copies and exports: discovered in one system, but duplicated into notebooks, BI tools, tickets, or email attachments.

Service-to-service access: the dataset is technically controlled, but a workload identity can still pull it indirectly through an integration.

Shared ownership: teams know the data is sensitive, but no one owns cleanup across platforms.

Exception-heavy environments: temporary access becomes permanent because expiry and review are not enforced.
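The last edge case, temporary access that quietly becomes permanent, is one that a simple recurring check can surface. The following is a minimal sketch under stated assumptions: `AccessException` and `exceptions_to_review` are hypothetical names, and the rule that an exception without an expiry date is itself a finding is an illustrative policy choice, not a requirement from any cited framework.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class AccessException:
    """A temporary access grant recorded as an exception (hypothetical shape)."""
    identity: str
    asset: str
    expires_on: Optional[date]  # None models an exception granted with no expiry


def exceptions_to_review(exceptions, today):
    """Return (exception, reason) pairs that should be removed or re-approved."""
    flagged = []
    for e in exceptions:
        if e.expires_on is None:
            # Assumed policy: an open-ended "temporary" grant is always flagged.
            flagged.append((e, "no expiry set"))
        elif e.expires_on <= today:
            flagged.append((e, "expired"))
    return flagged
```

Running a check like this on a schedule, rather than during a quarterly campaign, is one way to make changes more targeted and more frequent.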

Where these programs fail most often is in organisations that equate classification with control. That is guidance, not consensus: there is no universal standard yet for exactly how fast discovery findings must translate into entitlement changes. Still, the direction is clear in NIST guidance and NHIMG research. Discovery should feed lifecycle management, access reduction, and secret hygiene, as outlined in the NHI Lifecycle Management Guide. Without that follow-through, teams end up with a very accurate map of where they are exposed and very little reduction in who can actually reach the data.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	GV.OV-01	Discovery must feed governance and measurable risk reduction, not just inventory.
OWASP Non-Human Identity Top 10	NHI-03	Excessive and stale machine access often keeps discovered data exposed.
NIST AI RMF	MAP	Discovery programs need mapped ownership, context, and downstream control actions.

Tie discovery findings to governance outcomes and track whether access actually shrinks.
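One way to check whether access actually shrinks is to compare the set of (identity, asset) pairs that can reach sensitive data before and after a remediation cycle. The sketch below is illustrative only: `access_reduction` and the shape of its inputs are assumptions, and no cited framework prescribes this particular metric.

```python
def access_reduction(before, after):
    """Compare two snapshots of (identity, asset) access pairs.

    Returns the pairs removed, the pairs newly added, and the net
    percentage reduction relative to the earlier snapshot.
    """
    before_set, after_set = set(before), set(after)
    removed = before_set - after_set
    added = after_set - before_set
    if before_set:
        net_pct = 100.0 * (len(before_set) - len(after_set)) / len(before_set)
    else:
        net_pct = 0.0  # nothing to reduce in an empty baseline
    return {
        "removed": sorted(removed),
        "added": sorted(added),
        "net_reduction_pct": net_pct,
    }
```

Reporting added pairs alongside removed ones matters because new grants can offset clean-up work, leaving the net exposure unchanged even when many findings were closed.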
