
What should IAM teams ask about AI products that handle sensitive data?

Ask where data is stored, whether it is retained after inference, whether it is used for training, and whether deletion is enforceable across the full service stack. Also ask how access is provisioned and revoked, because data controls without identity controls leave a governance gap.

Why This Matters for Security Teams

AI products that process sensitive data raise more than storage and privacy questions. They also change who can reach the data, when access exists, and whether the service can prove deletion across logs, caches, embeddings, and downstream replicas. IAM teams should treat these products as identity-controlled systems first, because data protections without identity enforcement leave gaps that attackers and internal users can exploit. NIST Cybersecurity Framework 2.0 is useful here because it frames governance, access control, and monitoring as connected duties rather than separate checkboxes.

This is especially important after incidents like the DeepSeek breach, which show how quickly sensitive content can spill beyond its intended boundary once retention and access rules are unclear. NHI Management Group’s research also shows that most organisations still feel behind on non-human access governance, and that gap matters directly when AI services are granted broad service credentials. The practical issue is not whether a product has a privacy policy, but whether its identity model can support enforceable control over data flow, retention, and revocation. In practice, many security teams discover the failure only after a model output, backup copy, or exposed integration token has already made the data reachable elsewhere.

How It Works in Practice

For IAM teams, the right review starts with the service identity, not just the vendor contract. Ask whether the AI product authenticates via workload identity, scoped API credentials, or shared tenant-level access. If the service uses long-lived secrets, the blast radius is usually larger than the provider admits. If it supports short-lived credentials, confirm whether those credentials are issued per request, per session, or per environment, and whether revocation actually propagates across inference, logging, storage, and support tooling.
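
The review questions above can be sketched as a small classification routine. This is a minimal illustration, not a vendor assessment tool: the `ServiceCredential` fields, the `review_findings` helper, and the one-hour threshold for "short-lived" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional

# Assumption: anything valid for more than one hour counts as long-lived.
SHORT_LIVED_MAX = timedelta(hours=1)


@dataclass(frozen=True)
class ServiceCredential:
    name: str
    auth_method: str  # "workload_identity", "scoped_api_key", or "shared_tenant"
    ttl: Optional[timedelta]  # None means the credential never expires
    issued_per: Optional[str]  # "request", "session", "environment", or None if unknown


def review_findings(cred: ServiceCredential) -> List[str]:
    """Return the review concerns raised by one service credential."""
    findings = []
    if cred.auth_method == "shared_tenant":
        findings.append("shared tenant-level access")
    if cred.ttl is None or cred.ttl > SHORT_LIVED_MAX:
        findings.append("long-lived secret")
    elif cred.issued_per is None:
        findings.append("short-lived, but issuance scope unconfirmed")
    return findings


# Hypothetical credentials for illustration only.
legacy = ServiceCredential("vendor-api-key", "shared_tenant", None, None)
scoped = ServiceCredential(
    "inference-token", "workload_identity", timedelta(minutes=15), "session"
)

print(review_findings(legacy))  # ['shared tenant-level access', 'long-lived secret']
print(review_findings(scoped))  # []
```

An empty result only means the credential passed these three checks; whether revocation propagates across inference, logging, storage, and support tooling still has to be tested against the live service.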

Current guidance suggests mapping the AI product to the same access lifecycle used for other high-risk NHIs: provision narrowly, bind access to a known workload, and revoke automatically when the task, environment, or contract ends. This is where identity controls and data controls must be tested together. A product may claim that data is not used for training, but IAM teams still need to ask how that promise is enforced technically, who can override it, and whether deletion requests reach every copy. NHI Management Group’s Ultimate Guide to NHIs — Key Research and Survey Results is a useful reminder that inconsistent access management remains common even before AI-specific complexity is added.

  • Confirm where data is stored, including transient caches, vector stores, and audit logs.
  • Ask whether the product retains prompts, outputs, or embeddings after inference, and for how long.
  • Require a clear answer on whether customer data is used to train models, fine-tune systems, or improve vendor services.
  • Verify whether deletion is enforceable across the full service stack, not only the primary application database.
  • Review how access is provisioned, approved, monitored, and revoked for both human and non-human identities.
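
One way to keep these five questions from collapsing into a yes/no questionnaire is to record each vendor answer alongside the evidence that supports it. The sketch below assumes a hypothetical `VendorAnswer` record and `open_items` helper; the question keys are invented labels for the bullets above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

REVIEW_QUESTIONS = {
    "storage": "Where is data stored, including caches, vector stores, and audit logs?",
    "retention": "Are prompts, outputs, or embeddings retained after inference, and for how long?",
    "training": "Is customer data used to train, fine-tune, or improve vendor services?",
    "deletion": "Is deletion enforceable across the full service stack?",
    "access_lifecycle": "How is access provisioned, approved, monitored, and revoked?",
}


@dataclass
class VendorAnswer:
    answer: str
    evidence: str = ""  # an observable control, e.g. a test result or audit artefact


def open_items(answers: Dict[str, VendorAnswer]) -> List[Tuple[str, str]]:
    """List questions that are unanswered or answered without evidence."""
    gaps = []
    for key in REVIEW_QUESTIONS:
        given = answers.get(key)
        if given is None or not given.answer.strip():
            gaps.append((key, "unanswered"))
        elif not given.evidence.strip():
            gaps.append((key, "no evidence"))
    return gaps


# Hypothetical vendor responses for illustration only.
answers = {
    "storage": VendorAnswer("EU region only", evidence="data residency attestation"),
    "training": VendorAnswer("Not used for training"),
}

for key, reason in open_items(answers):
    print(f"{key}: {reason}")
```

Here the training answer is flagged because it is a claim without evidence, which is the distinction the review is meant to surface.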

Where possible, align the review to the NIST AI Risk Management Framework and compare vendor answers against observable controls, not marketing language. These controls tend to break down when the AI product is embedded in a multi-tenant platform with shared back-end services, because deletion and revocation often stop at the customer-facing layer.
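
Checking deletion as an observable control can be as simple as asking, for each layer of the stack, whether a deleted record is still present. The following sketch uses in-memory sets as stand-ins for real systems; the layer names and the `residual_copies` helper are assumptions, and a real check would query each system through its own interface.

```python
from typing import Dict, List, Set


def residual_copies(record_id: str, layers: Dict[str, Set[str]]) -> List[str]:
    """Return the names of layers that still hold the record."""
    return sorted(name for name, ids in layers.items() if record_id in ids)


# Hypothetical snapshot of record IDs held in each layer.
layers = {
    "primary_db": {"rec-1", "rec-2"},
    "cache": {"rec-1"},
    "vector_store": {"rec-1", "rec-2"},
    "audit_log": {"rec-2"},
    "replica": {"rec-1"},
}

# Simulate a deletion that only reaches the customer-facing layer.
layers["primary_db"].discard("rec-1")

print(residual_copies("rec-1", layers))  # ['cache', 'replica', 'vector_store']
```

A non-empty result is the multi-tenant failure mode described above: the primary database reports the record gone while shared back-end layers still hold it.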

Common Variations and Edge Cases

Tighter data controls often increase operational overhead, requiring organisations to balance stronger governance against integration speed and vendor complexity. That tradeoff becomes sharper when the AI product is used by many business units, because each team may want different retention periods, access scopes, and exception handling.

There is no universal standard for this yet, but best practice is evolving toward context-based questions: what data class is involved, what identity is calling the service, what action is being taken, and what evidence exists that deletion is complete. For agentic or automated AI products, this is even more important because the service may chain tool calls or trigger secondary storage systems without a human in the loop. In those environments, identity governance should also consider whether the workload is backed by short-lived credentials, whether access decisions are evaluated at runtime, and whether exceptions are time-bound rather than permanent.

The 2024 Non-Human Identity Security Report is relevant here because it shows that organisations still struggle with dynamic ephemeral credentials and consistent access across hybrid environments, which is exactly where AI products often operate. The research report LLMjacking: How Attackers Hijack AI Using Compromised NHIs is a reminder that exposed credentials can become an AI attack path very quickly. In practice, IAM teams are usually forced to resolve these edge cases only after a vendor integration, retention dispute, or credential leak has already surfaced.
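
The context-based questions above (data class, calling identity, action) and the preference for runtime evaluation with time-bound exceptions can be illustrated with a small decision function. This is a sketch under stated assumptions: the `AccessRequest` and `PolicyException` types, the `BASELINE` policy, and the decision strings are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass(frozen=True)
class AccessRequest:
    workload_id: str
    data_class: str  # e.g. "public", "internal", "restricted"
    action: str  # e.g. "read", "write"
    credential_expires_at: datetime


@dataclass(frozen=True)
class PolicyException:
    workload_id: str
    data_class: str
    action: str
    expires_at: datetime  # every exception carries an end date


# Assumption: a hypothetical baseline of (data_class, action) pairs allowed by default.
BASELINE = {("public", "read"), ("public", "write"), ("internal", "read")}


def decide(req: AccessRequest, exceptions: List[PolicyException], now: datetime) -> str:
    """Evaluate one request at call time rather than at provisioning time."""
    if req.credential_expires_at <= now:
        return "deny: credential expired"
    if (req.data_class, req.action) in BASELINE:
        return "allow: baseline policy"
    matching = [
        ex for ex in exceptions
        if (ex.workload_id, ex.data_class, ex.action)
        == (req.workload_id, req.data_class, req.action)
    ]
    if any(ex.expires_at > now for ex in matching):
        return "allow: time-bound exception"
    if matching:
        return "deny: exception expired"
    return "deny: no matching policy"


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
req = AccessRequest("agent-7", "restricted", "read", now + timedelta(minutes=10))
live = PolicyException("agent-7", "restricted", "read", now + timedelta(hours=4))
stale = PolicyException("agent-7", "restricted", "read", now - timedelta(days=1))

print(decide(req, [live], now))   # allow: time-bound exception
print(decide(req, [stale], now))  # deny: exception expired
print(decide(req, [], now))       # deny: no matching policy
```

Because the exception is checked against `now` on every call, an exception that has lapsed stops granting access without anyone having to remember to remove it.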

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Non-Human Identity Top 10 | NHI-03 | Retention and revocation depend on controlling NHI credential lifecycle.
NIST CSF 2.0 | PR.AA-05 (PR.AC-4 in CSF 1.1) | AI products need access governance tied to identity and privilege scope.
NIST AI RMF | (none listed) | Sensitive-data AI products require governance, mapping, and ongoing risk review.

Use short-lived NHI credentials and verify revocation reaches every AI service dependency.
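
The first half of that recommendation, short-lived access that ends automatically, can be sketched as a grant bound to one workload with a fixed lifetime and a set of events that revoke it. The `Grant` type, the event names, and the 15-minute default are assumptions for illustration; the sketch does not cover verifying that revocation reaches each downstream dependency, which has to be tested against the real services.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, Iterable

# Assumption: illustrative event names, not taken from any product.
END_EVENTS = {"task_completed", "environment_retired", "contract_ended"}


@dataclass
class Grant:
    workload_id: str  # access is bound to one known workload
    scopes: FrozenSet[str]  # provisioned narrowly
    expires_at: datetime
    revoked: bool = False

    def is_active(self, now: datetime) -> bool:
        return not self.revoked and now < self.expires_at


def issue_grant(
    workload_id: str,
    scopes: Iterable[str],
    now: datetime,
    ttl: timedelta = timedelta(minutes=15),
) -> Grant:
    """Issue a short-lived grant; the default lifetime is an assumption."""
    return Grant(workload_id, frozenset(scopes), now + ttl)


def on_event(grant: Grant, event: str) -> None:
    """Revoke automatically when the task, environment, or contract ends."""
    if event in END_EVENTS:
        grant.revoked = True


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
grant = issue_grant("summariser-prod", {"documents:read"}, now)
print(grant.is_active(now))  # True

on_event(grant, "task_completed")
print(grant.is_active(now))  # False
```

Even without a revoking event, the grant stops being active once `expires_at` passes, so a missed revocation is bounded by the lifetime.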