TL;DR: AI tools are already moving 84 percent of enterprise data, and nearly 72 percent of those tools are classified as high or critical risk, according to Cyera's AI Security for Dummies special edition. The governance gap is not the model itself but the data and access paths AI can reach, which makes data-first controls the practical starting point.
NHIMG editorial — based on content published by Cyera: The AI Worked Perfectly. That Was the Problem
By the numbers:
- 84 percent of enterprise data is already flowing through AI tools.
- 72 percent of those tools are classified as high or critical risk.
Questions worth separating out
Q: What breaks when AI systems can reach too many data sources?
A: The main failure is not that the model becomes inaccurate, but that authorised access turns into unintended disclosure.
Q: Why do AI tools complicate IAM governance?
A: They complicate IAM because the access subject may be a human user, an embedded service, or an AI system pulling data on behalf of a workflow.
Q: How do security teams know whether AI access is actually working safely?
A: Look for three signals: complete discovery of the AI estate, clear mapping of source data to each system, and logs that prove what was accessed and why.
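The three signals in that answer can be turned into a simple readiness check. The following is a minimal sketch, not a method from Cyera's article: the record format and the field names `discovered`, `sources`, `access_logs`, `source`, and `purpose` are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical fields each log entry must carry to show what was accessed and why.
REQUIRED_LOG_FIELDS = {"source", "purpose"}


def missing_signals(record: Dict) -> List[str]:
    """Return the safety signals that an AI system record fails to show."""
    missing = []
    # Signal 1: the system is part of the discovered AI estate.
    if not record.get("discovered"):
        missing.append("discovery")
    # Signal 2: its source data has been mapped.
    if not record.get("sources"):
        missing.append("source mapping")
    # Signal 3: logs exist and every entry names a source and a purpose.
    logs = record.get("access_logs", [])
    if not logs or any(not REQUIRED_LOG_FIELDS <= set(entry) for entry in logs):
        missing.append("access logs")
    return missing


# Illustrative records only; the system names are invented.
crm_copilot = {
    "discovered": True,
    "sources": ["crm_api", "email"],
    "access_logs": [{"source": "crm_api", "purpose": "account summary"}],
}
shadow_tool = {
    "discovered": False,
    "sources": [],
    "access_logs": [{"source": "email"}],  # no purpose recorded
}

print(missing_signals(crm_copilot))
print(missing_signals(shadow_tool))
```

An empty result means all three signals are present for that system. Anything else names the gap to close before the system is treated as safely governed.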
Practitioner guidance
- Inventory all AI systems and their access paths. Classify public AI, embedded AI, and homegrown AI separately, then map the documents, APIs, databases, and email sources each one can reach.
- Review compound access paths before production use. Test whether an AI response can combine data from multiple sources into a disclosure that would not be obvious from any single entitlement review.
- Require audit logs and data lineage for AI retrieval. Make evidence of access mandatory for systems that touch sensitive data, including who approved the setup, what source was queried, and how the response was assembled.
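The three steps above can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions, not Cyera's tooling: the `AISystem` record, the `SENSITIVE_COMBINATIONS` list, the source names, and the audit record fields are all hypothetical. A real deployment would draw them from its own discovery and classification data.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class AISystem:
    name: str
    kind: str                # "public", "embedded", or "homegrown"
    sources: FrozenSet[str]  # documents, APIs, databases, mailboxes it can reach
    approved_by: str         # who approved the setup


# Hypothetical source combinations that would be sensitive if merged in one response.
SENSITIVE_COMBINATIONS = [
    frozenset({"hr_database", "email"}),
    frozenset({"finance_api", "contracts_share"}),
]


def compound_paths(system: AISystem) -> List[Tuple[str, ...]]:
    """List the sensitive combinations this system can reach all at once."""
    return sorted(
        tuple(sorted(combo))
        for combo in SENSITIVE_COMBINATIONS
        if combo <= system.sources
    )


def audit_record(system: AISystem, source: str, fragments: List[str]) -> Dict:
    """Build evidence of one retrieval: approver, source queried, response lineage."""
    if source not in system.sources:
        raise ValueError(f"{system.name} has no mapped access to {source}")
    return {
        "system": system.name,
        "kind": system.kind,
        "approved_by": system.approved_by,
        "source": source,
        "lineage": list(fragments),  # which pieces went into the response
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Invented inventory, one entry per AI class.
inventory = [
    AISystem("chat-assistant", "public", frozenset({"email"}), "it-ops"),
    AISystem("crm-copilot", "embedded",
             frozenset({"email", "hr_database", "crm_api"}), "sales-ops"),
    AISystem("rag-search", "homegrown",
             frozenset({"finance_api", "contracts_share", "wiki"}), "platform"),
]

flagged = {s.name: compound_paths(s) for s in inventory if compound_paths(s)}
print(flagged)
```

In this sketch a system is flagged only when a single deployment can reach every source in a sensitive combination. That is the case a per-entitlement review tends to miss, because each grant looks acceptable on its own.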
What's in the full article
Cyera's full article covers the operational detail this post intentionally leaves for the source:
- How Cyera distinguishes public AI, embedded AI, and homegrown AI risk in practice
- The article's operational order for discovery, access control, incident response, and proof
- Why audit logs, data lineage, and accountability are treated as deployment prerequisites
- The source's own framing of what organisations should secure first when AI reaches sensitive data
👉 Read Cyera's analysis of why AI security starts with data access →
AI data access risk: what IAM teams need to change now
Explore further
AI security is an access-governance problem before it is a model-governance problem. The article is right to centre the data the system can reach, because that is where most enterprise exposure actually lives. When an AI assistant can aggregate from multiple repositories, the risk comes from authorised breadth, not just malicious input. Practitioners should treat AI access as a governance boundary that must be defined before the model is trusted with real work.
A few things that frame the scale:
- The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to the State of Secrets in AppSec.
- Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant behaviour gap in day-to-day application control.
A question worth separating out:
Q: Should organisations treat embedded AI and homegrown AI the same way?
A: No. Embedded AI often inherits risk from SaaS defaults, while homegrown AI introduces custom retrieval paths and local accountability gaps. Both need governance, but the control points differ. Teams should standardise the review model while still tracking the specific identity and data path for each class.
👉 Read our full editorial: AI security starts with data access, not model control