AI data security for federal agencies is becoming an access problem

By NHI Mgmt Group Editorial Team | Published 2025-08-26 | Domain: Governance & Risk | Source: Cyera

TL;DR: Federal agencies have more than doubled AI use since 2023, with HHS, VA, DHS, and DOI representing half of reported use cases, while cloud sprawl and legacy systems keep data visibility patchy, according to Cyera. The core issue is no longer just compliance, but proving continuous control over sensitive data as AI expands access paths.

At a glance

What this is: A Cyera commentary on federal AI adoption, arguing that data security must move from compliance posture to continuous visibility and control.

Why it matters: Federal AI programmes change how sensitive data is accessed, moved, and governed across NHI, autonomous, and human identity paths.

By the numbers:

Federal agencies have more than doubled their use of AI since 2023.

Context

Federal AI adoption is no longer experimental. The governance problem is that AI systems expand the number of access paths to sensitive data while many agencies still operate across legacy platforms, cloud services, and partial visibility.

In practice, that turns data security into an identity problem as much as a data problem. Federal programmes need to know who or what can reach sensitive records, how access is granted, and whether those access paths can be proven continuously under oversight pressure.

Key questions

Q: How should agencies govern sensitive data used by AI systems?

A: They should govern it as a combined data and identity problem. That means discovering where sensitive data resides, mapping every identity that can reach it, and enforcing policy continuously across service accounts, APIs, and human workflows. If the organisation cannot prove access paths in operation, it does not have enough control for federal AI use.

Q: Why do AI programmes create more risk around sensitive federal data?

A: AI programmes increase risk because they multiply the number of access paths to the same information. Data may move through cloud services, automation, and delegated identities before a human ever sees the result. The real governance challenge is not only exposure, but whether the agency can explain and verify that exposure continuously.

Q: What do security teams get wrong about compliance in federal AI environments?

A: They often treat compliance as proof of security, when it is only evidence that controls existed at a point in time. Federal AI environments need continuous visibility, because the question is whether data protection holds while systems are actively querying, transforming, and sharing information across multiple identity layers.

Q: What should organisations do before expanding AI access to sensitive records?

A: They should validate classification, entitlement scope, and logging for every identity that can touch the records. If service accounts or delegated workflows already have broad access, reduce that reach first. Otherwise the AI programme inherits pre-existing overexposure and turns it into a higher-frequency governance problem.

Technical breakdown

Why data visibility breaks down in federal AI environments

Federal environments combine legacy infrastructure, cloud services, and distributed data stores, so sensitive information no longer sits behind a single perimeter. Data security posture depends on discovering where sensitive data lives, where it moves, and which identities can reach it. When AI systems are added without corresponding governance, those access paths multiply faster than teams can map them. The technical failure is not just exposure, but incomplete context: security teams may see a dataset, a connector, or an API call, yet still not know whether the underlying identity is approved to use that data for AI processing.

Practical implication: inventory sensitive datasets and the identities touching them before expanding AI access.
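One way to picture that inventory is a simple join between a list of sensitive datasets and a list of access grants. The sketch below is illustrative only: the dataset names, identity names, and the `Grant` structure are hypothetical and are not taken from Cyera's commentary or any specific product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    identity: str  # hypothetical identity name
    kind: str      # "human", "service_account", "api", or "connector"
    dataset: str   # dataset the identity can reach


# Hypothetical inventory inputs, for illustration only.
SENSITIVE = {"benefits_claims", "patient_records"}
GRANTS = [
    Grant("analyst.jane", "human", "patient_records"),
    Grant("svc-rag-indexer", "service_account", "patient_records"),
    Grant("svc-rag-indexer", "service_account", "benefits_claims"),
    Grant("etl-connector", "connector", "public_reports"),
]


def identities_by_sensitive_dataset(grants, sensitive):
    """Group (identity, kind) pairs under each sensitive dataset they can reach."""
    result = defaultdict(set)
    for grant in grants:
        if grant.dataset in sensitive:
            result[grant.dataset].add((grant.identity, grant.kind))
    return dict(result)


inventory = identities_by_sensitive_dataset(GRANTS, SENSITIVE)
for dataset, identities in sorted(inventory.items()):
    print(dataset, sorted(identities))
```

In a real programme the grants would come from IAM exports and data discovery tooling rather than a hard-coded list; the point is that the inventory is keyed by dataset, so every identity type appears side by side.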

How AI changes access governance for sensitive federal data

AI systems do not just consume data; they often mediate decisions about what to retrieve, summarise, and pass along. That shifts governance from static access control toward continuous authorisation, because the risk is not only who can log in, but what the system can do once access is granted. For NHI and agentic workflows, the identity boundary becomes the tool boundary: APIs, service accounts, and downstream automations can all inherit the same data exposure. This is why traditional compliance evidence is necessary but insufficient for operational trust.

Practical implication: bind AI access to narrowly scoped identities and review the downstream tool chain, not just the initial login.
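Reviewing the downstream tool chain amounts to a reachability question: starting from the AI system's identity, which datasets can be reached through any chain of tools, service accounts, and automations? The following is a minimal sketch assuming a hand-built graph; every node name and the `APPROVED` set are hypothetical.

```python
from collections import deque

# Hypothetical "X can invoke or read Y" edges, for illustration only.
EDGES = {
    "ai-assistant": ["search-api", "summariser"],
    "search-api": ["svc-search"],
    "svc-search": ["case_files"],
    "summariser": ["export-job"],
    "export-job": ["shared_drive"],
}
DATASETS = {"case_files", "shared_drive"}
APPROVED = {"case_files"}  # datasets the AI use case was approved to touch


def reachable_datasets(start, edges, datasets):
    """Breadth-first walk of the tool chain, collecting every dataset reached."""
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            if nxt in datasets:
                found.add(nxt)
            else:
                queue.append(nxt)
    return found


reachable = reachable_datasets("ai-assistant", EDGES, DATASETS)
unexpected = reachable - APPROVED
print("reachable:", sorted(reachable))
print("not approved:", sorted(unexpected))
```

Here the initial login only reveals `search-api` and `summariser`; the unapproved path to `shared_drive` appears only when the downstream export job is included in the walk.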

What continuous proof of control means in practice

Continuous proof means being able to show that data access, classification, and policy enforcement are active throughout the lifecycle of a request, not only at audit time. In federal settings, that is especially important because oversight expectations are high and tolerance for delayed response is low. The architecture needs data discovery, policy enforcement, monitoring, and reporting to work together; otherwise agencies end up with security controls that exist on paper but not in real workflows. This is a governance gap, not a tooling checkbox.

Practical implication: require evidence that data controls are observable, testable, and reportable across the full AI request lifecycle.
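A testable version of that requirement is a completeness check: for a given request, does evidence exist for each of the four functions named above (discovery, policy enforcement, monitoring, reporting)? The sketch below assumes a flat list of evidence records; the `Evidence` structure, request IDs, and detail strings are hypothetical.

```python
from dataclasses import dataclass

# Stage names follow the four functions named in the text; treating them
# as discrete, ordered stages is an assumption made for this sketch.
REQUIRED_STAGES = ("discovery", "policy_enforcement", "monitoring", "reporting")


@dataclass(frozen=True)
class Evidence:
    request_id: str
    stage: str
    identity: str
    detail: str


# Hypothetical evidence records, for illustration only.
RECORDS = [
    Evidence("req-1", "discovery", "svc-rag", "dataset labelled mission_sensitive"),
    Evidence("req-1", "policy_enforcement", "svc-rag", "allow: purpose=claims_triage"),
    Evidence("req-1", "monitoring", "svc-rag", "read 12 rows"),
    Evidence("req-1", "reporting", "svc-rag", "included in oversight report"),
    Evidence("req-2", "discovery", "etl-connector", "dataset labelled internal"),
    Evidence("req-2", "policy_enforcement", "etl-connector", "allow"),
]


def missing_stages(request_id, records):
    """Return the required stages that have no evidence for this request."""
    seen = {r.stage for r in records if r.request_id == request_id}
    return [stage for stage in REQUIRED_STAGES if stage not in seen]


for rid in ("req-1", "req-2"):
    gaps = missing_stages(rid, RECORDS)
    print(rid, "complete" if not gaps else f"missing: {gaps}")
```

A request with gaps, like `req-2` here, is the "exists on paper" case: access was decided, but nothing shows it was observed or reported.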

NHI Mgmt Group analysis

Continuous proof of control is now the real federal trust boundary. Cyera’s framing is correct that compliance alone does not answer the operational question federal agencies now face: can they prove data is protected while AI systems are actively using it? In federal missions, the risk is not simply exposure, but the inability to demonstrate control across a distributed data path. Practitioners should treat evidence of control as a runtime requirement, not an audit artifact.

AI expands the identity surface around sensitive data even when the data model has not changed. Once AI is introduced, service accounts, APIs, connectors, and delegated workflows become part of the effective access path. That means data security, NHI governance, and human oversight have to be analysed together, because the same record may be reachable through multiple identity types. The practitioner implication is to govern the access chain, not just the application front end.

Visibility gaps become governance failures when agencies cannot explain who accessed what for AI use. The article points to a familiar federal pattern: legacy systems and modern cloud platforms produce partial visibility, and AI amplifies the consequence of that blindness. This is not a generic security concern. It is a traceability problem that affects accountability, policy enforcement, and public trust. Teams should assume that missing context is itself a control failure.

Data security posture management needs to be treated as an identity control plane for AI workloads. When sensitive data is dispersed across cloud and on-premises systems, the practical security question becomes whether the organisation can continuously map access rights to data sensitivity and mission purpose. That is where data security, IAM, and governance converge. The implication for practitioners is to stop separating data visibility from identity governance in programme design.

Federal AI governance will increasingly be judged by operational evidence, not policy statements. Agencies are under higher scrutiny than most enterprises, and the article reflects that reality. Saying a programme is secure is not enough if the organisation cannot prove who or what touched the data, when, and under which controls. Practitioners should expect continuous evidence collection to become a baseline expectation, not a maturity differentiator.

From our research:
1 in 4 organisations are already investing in dedicated NHI security capabilities, with an additional 60% planning to do so within the next twelve months, according to The State of Non-Human Identity Security.
85% of organisations lack full visibility into third-party vendors connected via OAuth apps, showing how quickly governance gaps can widen around delegated access.
That visibility problem is why the lifecycle guide matters: Ultimate Guide to NHIs, Lifecycle Processes for Managing NHIs helps teams connect access, offboarding, and ongoing control.

What this signals

Data security for AI is converging with identity governance, not replacing it. Federal programmes that keep data classification separate from access governance will struggle to prove control once AI systems become normal consumers of sensitive records. The useful response is to treat data discovery, policy enforcement, and identity review as one control plane.

Visibility is becoming the first test of trust in AI-enabled government. When agencies cannot see where sensitive data moves, they cannot explain mission risk, and they cannot defend their controls under oversight. Teams should prepare for reporting requirements that demand evidence across human, NHI, and delegated access paths rather than static policy statements.

Governance programmes should expect the NHI problem to surface inside AI use cases. Service accounts, tokens, and connectors are often the hidden layer that determines whether AI can touch mission data at all. If those identities are over-scoped today, AI expansion will simply expose and amplify the existing entitlement problem.

For practitioners

Map AI data access paths end to end: Document every identity, service account, API, and connector that can reach sensitive datasets used in AI workflows. Include downstream transfers and summarisation steps so the access chain is visible, not just the application endpoint.
Tie data classification to identity policy: Require sensitive federal datasets to carry policy labels that drive access decisions, logging, and review scope. Use those labels to distinguish mission data from lower-risk content and to narrow entitlement sprawl.
Make continuous proof part of governance evidence: Collect runtime evidence for who accessed what, through which identity, and under which control set. Use that evidence in oversight reporting so controls can be demonstrated continuously rather than reconstructed after the fact.
Review NHI and delegated access before AI expansion: Check whether service accounts, tokens, and delegated workflows already have broad data reach. If they do, reduce their scope before AI systems inherit those entitlements and turn them into higher-volume exposure paths.
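
The labelling and scope-review steps above can be sketched together: classification labels drive the access decision, and comparing what an identity can reach against what its workflow needs surfaces over-scoped entitlements. This is a minimal illustration, not a reference design; the label names, their ranking, and the identities are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical classification labels, ordered from least to most sensitive.
LABEL_RANK = {"public": 0, "internal": 1, "mission_sensitive": 2}


@dataclass(frozen=True)
class Identity:
    name: str
    max_label: str     # highest classification the entitlement reaches
    needed_label: str  # highest classification the workflow actually needs


# Hypothetical identities, for illustration only.
IDENTITIES = [
    Identity("svc-rag-indexer", "mission_sensitive", "internal"),
    Identity("svc-report-bot", "internal", "internal"),
    Identity("analyst.jane", "mission_sensitive", "mission_sensitive"),
]


def is_allowed(identity, dataset_label):
    """Label-driven decision: allow only up to the identity's entitlement."""
    return LABEL_RANK[dataset_label] <= LABEL_RANK[identity.max_label]


def over_scoped(identities):
    """Names of identities whose entitlement exceeds what the workflow needs."""
    return [
        i.name
        for i in identities
        if LABEL_RANK[i.max_label] > LABEL_RANK[i.needed_label]
    ]


print("reduce scope first:", over_scoped(IDENTITIES))
print("report bot on mission data:", is_allowed(IDENTITIES[1], "mission_sensitive"))
```

Running the over-scope check before connecting an AI workload is the ordering the list recommends: `svc-rag-indexer` would be narrowed to `internal` first, so the AI system does not inherit its wider reach.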

Key takeaways

Federal AI adoption is turning data security into a continuous identity governance problem rather than a perimeter problem.
The scale of the shift is already material, with federal agencies having more than doubled AI use since 2023.
Practitioners need runtime proof of control across data, NHI, and delegated workflows before expanding AI access.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	PR.DS	AI data security depends on protecting data across movement and storage in federal environments.
NIST Zero Trust (SP 800-207)	Section 2.1 (Tenets of Zero Trust)	AI access paths rely on continuous authorisation across identities and workloads.
OWASP Non-Human Identity Top 10	NHI-03	Delegated and machine identities often hold the data access that AI workflows inherit.

Review non-human credentials and reduce scope before AI systems consume those entitlements.

Key terms

Data security posture management: Data security posture management is the discipline of finding, classifying, and protecting sensitive data across cloud and on-premises environments. In AI programmes, it also has to show which identities can reach that data and whether access is continuously governed, not just inventoried.
Delegated access: Delegated access is access exercised by one identity on behalf of another system, workflow, or user. In AI environments, that can include service accounts, APIs, and automation that inherit permissions and move data without direct human interaction at each step.
Continuous authorisation: Continuous authorisation is the practice of re-evaluating access throughout a session or workflow rather than only at the point of login. For AI and NHI use cases, it is the difference between granting access once and proving that access remains appropriate as data moves.
Identity surface: Identity surface is the full set of human, machine, and delegated identities that can influence access to a system or dataset. The broader the surface, the harder it is to prove who or what touched sensitive data and under which controls.
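
The difference between login-time and continuous authorisation, as defined above, can be shown in a few lines. In this sketch the policy is re-checked before every step of a workflow, so a revocation that arrives mid-run takes effect immediately. The `PolicyState` class, the `revoke_after` parameter (which simulates an out-of-band revocation), and all names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyState:
    """Hypothetical policy store: tracks identities whose access was revoked."""
    revoked: set = field(default_factory=set)

    def allows(self, identity, dataset):
        return identity not in self.revoked


def run_workflow(identity, steps, policy, revoke_after=None):
    """Re-evaluate policy before each step instead of once at the start.

    revoke_after simulates a revocation event arriving after the step
    with that index has completed.
    """
    outcomes = []
    for index, dataset in enumerate(steps):
        decision = "allowed" if policy.allows(identity, dataset) else "denied"
        outcomes.append((dataset, decision))
        if revoke_after == index:
            policy.revoked.add(identity)
    return outcomes


steps = ["case_files", "benefits_claims", "patient_records"]
result = run_workflow("svc-agent", steps, PolicyState(), revoke_after=0)
print(result)
```

With a single check at login, all three steps would have run; with per-step evaluation, only the first does.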

This post draws on content published by Cyera: Trust in the Age of AI: Why Cyera Is Bringing Data Security to the Federal Frontlines. Read the original.