AI governance gaps widen as policy outpaces guardrails

By NHI Mgmt Group Editorial Team | Published 2025-09-03 | Domain: General NHI | Source: Cyera

TL;DR: According to Cyera, the White House’s AI Action Plan pushes faster AI adoption, infrastructure build-out, and secure-by-design expectations, but governance will increasingly fall to industry as federal guardrails loosen and questions about training data, bias, and sensitive-data use remain unresolved. The practical issue is not ambition but whether organisations can prove that their AI data, model, and agent controls are trustworthy enough to absorb the policy shift.

At a glance

What this is: This analysis argues that the White House’s AI Action Plan accelerates AI adoption but leaves trust, data governance, and accountability to industry.

Why it matters: IAM, data security, and AI governance teams now have to connect human, NHI, and emerging agent controls around training data, access, and oversight.

By the numbers:

The Plan cuts federal science funding by 34 percent, including math and physics at $289 million, engineering at $127 million, computer science at $85 million, and technology at $18 million.
Cyera says its AI-native data security platform helps Global 2000 companies discover and classify sensitive data in AI training sets, identify AI applications and agents, and prevent them from ingesting or sharing PII, intellectual property, or other sensitive data.


Context

The core problem is simple: faster AI adoption does not automatically produce trustworthy AI. When policy prioritises acceleration over guardrails, organisations inherit responsibility for proving what data models were trained on, where that data came from, and whether sensitive material is being reused or exposed in downstream workflows. For identity and access teams, that shifts AI from an innovation topic into a governance problem that spans human access, non-human identities, and agent-driven data use.

This is especially important because AI systems do not just consume data; they create new access paths to it. Training sets, model pipelines, application integrations, and AI agents can all become enforcement points or leakage points, depending on how identity, entitlement, and data controls are designed. In practice, the question is whether current governance models can keep up with systems that learn, infer, and act across shared data and tool boundaries.

Key questions

Q: How should organisations govern AI systems that can access sensitive training data?

A: Organisations should treat sensitive training data as a governed input, not a loose repository. That means classifying the data before ingestion, approving who can use it, recording where it is consumed, and blocking unnecessary reuse in downstream AI workflows. Governance has to happen at the intake point, because once sensitive content enters the model pipeline, later controls are much weaker.

Q: Why do AI applications and agents create new access risks for IAM teams?

A: AI applications and agents can request data, call tools, and move information across systems faster than traditional review cycles were designed to track. That creates new non-human identity surfaces that need inventory, entitlement control, and evidence of what each identity can touch. If IAM only governs human users, it misses the identities actually executing the AI workflow.

Q: What do security teams get wrong about secure-by-design AI governance?

A: They often treat secure-by-design as a policy label instead of an enforceable operating model. Real security requires least privilege, logging, data minimisation, and output controls that can be tested and audited. Without those controls, secure-by-design becomes a statement of intent rather than proof that the AI stays inside approved boundaries.

Q: How do AI data controls differ from traditional access control?

A: Traditional access control focuses on who can open a system or file. AI data control must also govern what gets ingested, what gets retained in model behaviour, and what can be reproduced in outputs. That makes the control surface broader, because risk can appear both before training and after deployment.

Technical breakdown

Training data governance and sensitive-data provenance

AI trust depends on knowing what entered the training corpus, who approved it, and whether the resulting model retains hidden traces of sensitive information. In governance terms, provenance is not a documentation exercise. It is an access-control issue because data that should never be ingested can later be reproduced, inferred, or exposed by the model. For security teams, classification alone is not enough. The control problem is verifying that training inputs, fine-tuning sets, and retrieval sources are bounded by policy before model behaviour is set.

Practical implication: tie sensitive-data review to model ingestion gates, not only to post-training audits.
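A minimal sketch of what an ingestion gate could look like, assuming records already carry a classification label. The label names, the `Record` type, and the `gate_for_ingestion` function are hypothetical and are not taken from Cyera's platform or from the Action Plan; they only illustrate checking policy before data reaches a training or fine-tuning set.

```python
from dataclasses import dataclass

# Hypothetical classification labels; a real scheme would come from the
# organisation's own data classification policy.
BLOCKED_LABELS = {"restricted", "pii"}


@dataclass(frozen=True)
class Record:
    record_id: str
    label: str  # classification assigned before ingestion, e.g. "public"


def gate_for_ingestion(records, approved_exceptions=frozenset()):
    """Split records into (allowed, blocked) before they reach training.

    A record with a blocked label passes only if its id appears in
    approved_exceptions, which stands in for a documented, approved exception.
    Unlabelled records are blocked too, since they cannot be shown to be safe.
    """
    allowed, blocked = [], []
    for rec in records:
        unlabelled = not rec.label
        sensitive = rec.label.lower() in BLOCKED_LABELS
        if (unlabelled or sensitive) and rec.record_id not in approved_exceptions:
            blocked.append(rec)
        else:
            allowed.append(rec)
    return allowed, blocked


batch = [
    Record("doc-1", "public"),
    Record("doc-2", "pii"),
    Record("doc-3", "restricted"),
    Record("doc-4", ""),
]
allowed, blocked = gate_for_ingestion(batch, approved_exceptions={"doc-3"})
print([r.record_id for r in allowed])  # ['doc-1', 'doc-3']
print([r.record_id for r in blocked])  # ['doc-2', 'doc-4']
```

The design choice worth noting is that the gate fails closed: anything unlabelled is treated as blocked, so the pipeline cannot ingest data simply because nobody classified it.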

AI applications, agents, and identity boundaries

The article points to AI applications and agents as distinct objects that need discovery and control. That matters because an AI agent is not just a workload label; it can become a non-human identity that requests data, invokes tools, and shares outputs across systems. Once that happens, identity boundaries matter as much as model boundaries. If the organisation cannot inventory which AI entities exist, which data sources they can reach, and what they are allowed to emit, the governance model is already behind the architecture.

Practical implication: maintain a live inventory of AI applications, agents, and the data sources each one can reach.
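As an illustration only, the inventory described above could start as a small in-memory registry. The `AIIdentity` and `Inventory` names, the fields, and the example entries are all assumptions made for this sketch; a production inventory would be fed by discovery tooling rather than manual registration.

```python
from dataclasses import dataclass, field


@dataclass
class AIIdentity:
    name: str
    kind: str   # e.g. "application", "agent", "service_account"
    owner: str  # accountable team; empty string means nobody owns it
    data_sources: set = field(default_factory=set)


class Inventory:
    """Hypothetical registry of AI-connected identities and their data reach."""

    def __init__(self):
        self._items = {}

    def register(self, identity):
        # Re-registering the same name overwrites the earlier entry.
        self._items[identity.name] = identity

    def who_can_reach(self, source):
        """Names of every identity that can reach the given data source."""
        return sorted(
            i.name for i in self._items.values() if source in i.data_sources
        )

    def unowned(self):
        """Names of identities with no accountable owner recorded."""
        return sorted(i.name for i in self._items.values() if not i.owner)


inventory = Inventory()
inventory.register(
    AIIdentity("support-agent", "agent", "cx-team", {"ticket-db", "crm-api"})
)
inventory.register(
    AIIdentity("rag-indexer", "service_account", "", {"wiki", "crm-api"})
)
print(inventory.who_can_reach("crm-api"))  # ['rag-indexer', 'support-agent']
print(inventory.unowned())                 # ['rag-indexer']
```

Even this small structure answers the two questions the section raises: which AI entities can reach a given source, and which of them have no accountable owner.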

Secure-by-design for AI needs enforceable controls, not slogans

Secure-by-design only has value when it is translated into concrete controls such as least privilege, data minimisation, logging, and policy enforcement around model inputs and outputs. The article highlights the tension between innovation goals and oversight gaps, which is where many AI programmes fail. They treat trust as a policy statement instead of an operational state that can be measured, reviewed, and constrained. In modern AI environments, governance has to cover data, identity, and runtime behaviour together.

Practical implication: require measurable controls for AI data access, output filtering, and runtime logging before broad adoption.
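One way to make an output control measurable is to pair each filter with an audit record, so reviewers can count what was redacted and for which identity. The sketch below is an assumption-laden example: the email pattern is deliberately simplistic, and `filter_output` and the audit record fields are hypothetical names, not part of any product mentioned in the article.

```python
import re

# Simplistic email pattern used for illustration; real output controls would
# cover many more sensitive-data types and use a tested detection library.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def filter_output(identity, text, audit_log):
    """Redact email addresses from model output and record what happened.

    audit_log is any list-like sink; each call appends one entry so the
    control can be measured (how often it fired, for which identity).
    """
    redacted, count = EMAIL.subn("[REDACTED_EMAIL]", text)
    audit_log.append(
        {"identity": identity, "redactions": count, "chars_out": len(redacted)}
    )
    return redacted


log = []
safe = filter_output("support-agent", "Contact alice@example.com now", log)
print(safe)                  # Contact [REDACTED_EMAIL] now
print(log[0]["redactions"])  # 1
```

The point is less the redaction itself than the evidence trail: a control that emits a countable record can be tested and audited, which is what separates an enforceable control from a slogan.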

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

AI governance is becoming an identity problem, not just a policy problem. Once AI systems are allowed to train on sensitive material and act across enterprise tools, the key question is no longer whether the model is clever. It is which identities can reach which data, under what policy, and with what proof of restraint. That shifts the centre of gravity from model quality to access governance, classification, and runtime enforcement.

Training data provenance is now part of the security control plane. A model cannot be trusted if the organisation cannot account for what was ingested, who approved it, and whether sensitive content was filtered before training or fine-tuning. This is where AI governance intersects directly with NHI and data security, because ingestion pipelines and AI agents can quietly extend access beyond the original data owner’s intent. Practitioners need to treat the training set as governed content, not a passive asset.

Secure-by-design becomes hollow when it is not backed by enforceable data boundaries. The article’s argument is that industry will carry more of the burden as policy loosens, but burden without controls is just exposure. The discipline now is to prove least privilege for AI-connected identities, constrain sensitive-data flows, and maintain evidence that AI usage stays within defined boundaries. Without that proof, governance is aspirational rather than operational.

Representative AI requires representative control, not just representative training data. The bias examples in the article show that data curation matters, but governance failure is broader than fairness. If organisations only curate inputs while leaving access paths open, they create systems that are both untrustworthy and hard to defend. The practical conclusion is that AI programmes must align data quality, identity governance, and policy enforcement or accept residual risk as a design choice.

From our research:
The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap.
That gap matters because secret leakage, data provenance, and AI workflow access are converging, as explored in Top 10 NHI Issues.

What this signals

Identity blast radius: AI programmes should now be measured by how far a model, agent, or integration can move sensitive data once it has legitimate access. That is a governance signal, not a model-performance metric, and it should be tracked alongside classification coverage and approval evidence.
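Identity blast radius can be approximated as reachability over a graph of access grants. The following sketch assumes grants are available as a simple adjacency mapping; the `blast_radius` function and every node name in the example are hypothetical.

```python
from collections import deque


def blast_radius(grants, start):
    """Return every node reachable from start by following grant edges.

    grants maps an identity or system to the things it can directly reach.
    The result includes indirect reach, e.g. data behind an API the
    identity is allowed to call. The start node itself is excluded.
    """
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in grants.get(node, ()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


grants = {
    "support-agent": ["crm-api", "ticket-db"],
    "crm-api": ["customer-pii-store"],
    "ticket-db": [],
    "reporting-agent": ["warehouse"],
}
print(sorted(blast_radius(grants, "support-agent")))
# ['crm-api', 'customer-pii-store', 'ticket-db']
```

In this example the agent was never granted the PII store directly, yet it sits inside the blast radius through the CRM API. Tracking the size of that set over time gives the governance signal described above, independent of any model-performance metric.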

The policy environment is moving faster than most control planes, which means organisations need internal evidence for where AI data lives, who can touch it, and how access is revoked. The practical benchmark is whether security can reconstruct the full path from data source to model output without relying on manual recollection.

As AI adoption grows, the most exposed programmes will be the ones that separated data governance from identity governance. Bringing those controls together is now a prerequisite for trustworthy scale, especially where AI agents and service identities share the same back-end systems.

For practitioners

Inventory AI-connected identities and data paths: Map every AI application, agent, service account, and API key that can touch training, retrieval, or output workflows. Record the data sources each one can access, the systems it can write to, and where human approval is still required.
Gate sensitive data before model ingestion: Require classification and policy checks at the point where training data, fine-tuning data, or retrieval content enters the AI pipeline. Block ingestion of restricted data unless there is a documented business case and approved exception.
Apply least privilege to AI tool use: Limit every AI-connected identity to the smallest set of tools, prompts, datasets, and export paths needed for its task. Review whether the AI can move data between systems that were never intended to be joined.
Log AI data access and outputs together: Correlate identity activity with the exact data objects read, transformed, or emitted by the system. Keep model telemetry distinct from business audit logs, but make the two joinable, so reviewers can reconstruct both who acted and what data was exposed.
Create an exception process for AI trust boundaries: Define who can approve departures from standard data controls, how exceptions expire, and what evidence proves the AI returned to policy. Make the exception register visible to security, privacy, and model owners.
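The exception process in the last item can be sketched as a register in which every entry carries an approver and an expiry date. The `PolicyException` type, its field names, and the example entries are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PolicyException:
    identity: str   # AI-connected identity the exception applies to
    control: str    # standard control being relaxed
    approver: str   # who signed off on the departure from policy
    expires: date   # last day the exception is valid


def active_exceptions(register, today):
    """Exceptions that are still valid on the given day."""
    return [e for e in register if e.expires >= today]


def expired_exceptions(register, today):
    """Exceptions past their expiry that need evidence of return to policy."""
    return [e for e in register if e.expires < today]


register = [
    PolicyException(
        "rag-indexer", "block-restricted-ingestion", "privacy-lead", date(2025, 9, 30)
    ),
    PolicyException(
        "support-agent", "no-cross-system-export", "ciso", date(2025, 8, 1)
    ),
]
today = date(2025, 9, 3)
print([e.identity for e in active_exceptions(register, today)])   # ['rag-indexer']
print([e.identity for e in expired_exceptions(register, today)])  # ['support-agent']
```

Making expiry a required field means an exception cannot silently become permanent; the expired list is the queue of identities for which someone still owes proof that the AI is back inside policy.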

Key takeaways

The article’s central warning is that faster AI adoption without stronger guardrails shifts trust responsibility from policy makers to enterprises.
The pressure is concrete rather than theoretical: the piece pairs federal science funding cuts with AI systems that increasingly depend on sensitive training data and connected identities.
The decisive control is evidence-based governance at ingestion, identity, and runtime, not aspirational secure-by-design language.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST AI RMF		AI governance, trust, and accountability are the article’s core themes.
OWASP Non-Human Identity Top 10	NHI-01	AI applications and agents behave like non-human identities with data access.
NIST CSF 2.0	PR.AA-05	The article centres on access control and trust boundaries for AI workflows.

Apply AI RMF GOVERN and MAP functions to prove ownership of AI risk and data-use decisions.

Key terms

Training Data Provenance: Training data provenance is the record of where model inputs came from, who approved them, and whether sensitive content was filtered before use. In AI governance, provenance is evidence of control, not just documentation. It helps security teams prove that the model was trained within policy boundaries.
AI-connected Identity: An AI-connected identity is a non-human identity used by an AI application or agent to access data, tools, or services. It may be a service account, token, or API key. The governance challenge is that these identities can move data at machine speed and often outlive the review process built for humans.
Identity Blast Radius: Identity blast radius is the amount of data, tools, and systems an identity can affect once it is granted access. For AI systems, the blast radius includes both the data the identity can read and the outputs it can generate or share. Reducing it is a core governance objective.
Secure-by-Design: Secure-by-design means building controls into the system from the start rather than adding them after deployment. For AI, that includes classification, least privilege, logging, and boundary enforcement around data ingestion and output. Without those controls, the phrase describes intent, not verified security.

Deepen your knowledge

AI data provenance and access governance are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building controls for AI systems that touch sensitive data and non-human identities, it is worth exploring.

This post draws on content published by Cyera: It’s Up to Industry to Regulate AI: The White House’s AI Action Plan is long on ambition, but short on guardrails. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-09-03.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org