
Data awareness vs classification: what IAM teams need to do


(@nhi-mgmt-group)
Member Moderator
Joined: 1 year ago
Posts: 2827
Topic starter  

TL;DR: Legacy data classification breaks down across unstructured content, GenAI workflows, and contextual business risk, according to Cyera, while Gartner says 75% of organisations with GenAI projects will shift focus to unstructured data security by 2026. Static labels are no longer enough when the security problem is understanding meaning, ownership, and reuse at scale.

NHIMG editorial — based on content published by Cyera: The End of Classification as We Know It: Data Awareness Over Data Labels

Questions worth separating out

Q: How should security teams govern unstructured data for GenAI use cases?

A: Security teams should govern unstructured data by mapping content to business context, human relevance, and downstream AI use paths, not by relying on labels alone.

Q: Why do static labels fail to protect sensitive enterprise content?

A: Static labels fail because they describe a file’s category, not its meaning.

Q: How can organisations tell if classification is working well enough?

A: Classification is working only if it reliably identifies the assets that actually drive business, legal, or competitive risk, including unstructured documents and semantically sensitive material.
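One way to make "working well enough" measurable is to score the classifier's flags against a small hand-labelled sample of assets that are known to carry real risk. The sketch below is illustrative only: the `evaluate` helper, the `Evaluation` type, and the file names are hypothetical and do not come from Cyera's article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    precision: float  # share of flagged assets that are truly sensitive
    recall: float     # share of truly sensitive assets that were flagged


def evaluate(flagged: set[str], truly_sensitive: set[str]) -> Evaluation:
    """Score classifier flags against a hand-labelled ground-truth set."""
    true_positives = len(flagged & truly_sensitive)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(truly_sensitive) if truly_sensitive else 0.0
    return Evaluation(precision, recall)


# Hypothetical ground truth: assets a reviewer confirmed as high risk.
truly_sensitive = {
    "roadmap-2026.pdf",
    "msa-acme.docx",
    "slack-export-legal.json",
    "pricing-model.xlsx",
}

# Hypothetical classifier output: two correct flags, one false positive.
flagged = {"roadmap-2026.pdf", "msa-acme.docx", "lunch-menu.pdf"}

result = evaluate(flagged, truly_sensitive)
print(f"precision={result.precision:.2f} recall={result.recall:.2f}")
# prints: precision=0.67 recall=0.50
```

Low recall on this kind of sample is the signal that matters most here: it means sensitive unstructured documents are going unflagged.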

Practitioner guidance

  • Inventory unstructured data sources first. Map where contracts, roadmaps, Slack exports, PDFs, and machine-generated artifacts live before you try to improve labels.
  • Tie sensitivity to business ownership. Associate sensitive files with the business unit, product line, region, or legal domain they affect so policy can reflect real accountability, not just file location.
  • Test classifier performance on meaning, not patterns. Measure whether your tools can identify crown-jewel content when keywords, filenames, and regex patterns fail.
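As a rough illustration of the first two points, an inventory can record where each asset lives and which business unit is accountable for it, then surface sensitive assets that have no owner. This is a minimal sketch under assumed names: the `Asset` fields, the `UNOWNED` bucket, and the sample paths are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Asset:
    path: str
    source: str            # where it lives, e.g. "sharepoint" or "slack-export"
    sensitive: bool        # outcome of whatever classification is in place
    owner: Optional[str]   # accountable business unit, None if unknown


def sensitive_by_owner(assets: list[Asset]) -> dict[str, list[str]]:
    """Group sensitive asset paths by owner; unowned ones go to 'UNOWNED'."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for asset in assets:
        if asset.sensitive:
            grouped[asset.owner or "UNOWNED"].append(asset.path)
    return dict(grouped)


# Hypothetical inventory entries.
inventory = [
    Asset("contracts/msa-acme.docx", "sharepoint", True, "Legal"),
    Asset("exports/slack-product.json", "slack-export", True, None),
    Asset("roadmaps/2026-h1.pdf", "drive", True, "Product"),
    Asset("misc/lunch-menu.pdf", "drive", False, None),
]

by_owner = sensitive_by_owner(inventory)
for owner, paths in by_owner.items():
    print(owner, paths)
```

Anything that lands in the `UNOWNED` bucket is sensitive content with no accountable business unit, which is a reasonable place to start remediation.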

What's in the full article

Cyera's full article covers the operational detail this post intentionally leaves for the source:

  • How their document-level classification flow identifies sensitivity in unstructured files without relying on file names or regex
  • How their models connect content to business units, regions, and human subjects to improve contextual policy decisions
  • How they describe the shift from static labels to a data intelligence layer for AI security
  • Why their approach treats precision and recall as operational requirements at scale

👉 Read Cyera's analysis of why classification is giving way to data awareness →




   
(@mr-nhi)
Member Moderator
Joined: 4 weeks ago
Posts: 1125
 

Static classification is no longer a reliable governance premise for AI-era data security. The old assumption was that sensitive data could be identified by labels, patterns, and manual review before exposure became meaningful. That assumption fails when the most important information is unstructured, business-specific, and constantly reused by GenAI systems. The implication is that governance must move from label management to contextual data awareness.

A few things that frame the scale:

  • The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
  • Only 44% of developers are reported to follow security best practices for secrets management, which helps explain why governance gaps persist even when confidence is high.

A question worth separating out:

Q: Should organisations prioritise data awareness over manual tagging?

A: Yes, because manual tagging does not scale to distributed content, frequent collaboration, and AI-driven reuse. Data awareness gives organisations a better view of what the data means, who it relates to, and how it may be misused. Manual tagging may still support exceptions, but it should not be the primary control.

👉 Read our full editorial: Data awareness is replacing classification in AI security



   