How do platform teams and IAM teams split responsibility for AI compute governance?

By NHI Mgmt Group Editorial Team | Updated June 7, 2026 | Domain: Governance, Ownership & Risk

Platform teams should own scheduling, isolation, and runtime telemetry, while IAM teams should own entitlement scope, revocation, and credential provenance. The split only works if both groups share the same view of compute as an identity event, not just an infrastructure event. That is the practical way to keep governance close to execution.

Why This Matters for Security Teams

AI compute governance fails when platform engineering and IAM treat the same workload as two separate problems. Platform teams usually control where agents run, what they can touch at runtime, and how noisy or risky workloads are isolated. IAM teams control whether the workload should exist at all, what it is allowed to do, and how credentials are issued and revoked. If that split is unclear, either side can create blind spots that look like normal operations until an incident surfaces.

For AI systems, this matters because compute is not just infrastructure; it is the execution surface for an identity event. An agent can request tools, chain actions, and reuse context faster than a human review cycle can respond. Current guidance increasingly aligns with NIST Cybersecurity Framework 2.0, which emphasizes governance and shared accountability across operational domains, but organisations still have to define who owns which control. NHIMG’s research on Top 10 NHI Issues shows how quickly governance breaks down when identity, secrets, and runtime control are managed as separate tracks.

In practice, many security teams discover this only after a workload has already overreached its intended scope and the investigation reveals that neither team owned the full path from entitlement to execution.

How It Works in Practice

The cleanest split is functional, not organisational. Platform teams own the runtime environment: cluster scheduling, node isolation, pod or container boundaries, network segmentation, telemetry, and enforcement hooks that prevent one workload from interfering with another. IAM teams own the identity plane: workload registration, entitlement design, approval workflow, secret issuance, revocation, and proof of credential provenance. The handoff between them should be explicit enough that every running AI workload can be traced back to an approved identity, a defined purpose, and a current authorization state.
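The traceability requirement above can be expressed as a simple reconciliation between what the platform is running and what IAM has registered. The following is a minimal sketch, not a reference implementation: the record schemas (IdentityRecord, RunningWorkload), the field names, and the trace_gaps helper are all hypothetical and assume each team can export its view as plain records.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class IdentityRecord:
    """IAM-side view of a workload identity (hypothetical schema)."""
    identity_id: str
    approved: bool
    purpose: str
    authorization_active: bool


@dataclass(frozen=True)
class RunningWorkload:
    """Platform-side view of a scheduled workload (hypothetical schema)."""
    name: str
    identity_id: Optional[str]


def trace_gaps(
    workloads: Iterable[RunningWorkload],
    registry: Dict[str, IdentityRecord],
) -> Dict[str, str]:
    """Map each untraceable workload name to the first reason it fails."""
    gaps: Dict[str, str] = {}
    for w in workloads:
        record = registry.get(w.identity_id) if w.identity_id else None
        if record is None:
            gaps[w.name] = "no registered identity"
        elif not record.approved:
            gaps[w.name] = "identity not approved"
        elif not record.purpose.strip():
            gaps[w.name] = "no defined purpose"
        elif not record.authorization_active:
            gaps[w.name] = "authorization not current"
    return gaps
```

A non-empty result is the kind of finding both teams would need to review together, since the platform supplies the workload list and IAM supplies the registry.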

For agentic workloads, this works best when compute is treated as a short-lived identity lifecycle. A platform team may start a job, but IAM should define whether that job receives a static service account, a short-lived token, or a just-in-time credential. Best practice is evolving toward workload identity and ephemeral credentials, because static access is too coarse for autonomous systems. In practical terms, that means binding runtime instances to cryptographic identity, using request-time policy checks, and revoking access as soon as the task completes. Standards-oriented approaches such as SPIFFE and SPIRE are often used to prove what the workload is, while IAM policy engines determine what it may do at that moment.
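To make the short-lived lifecycle concrete, here is a minimal in-memory sketch of issuing a credential for one task, checking it at request time, and revoking it when the task ends. It is an illustration only: CredentialBroker, Grant, and run_task are hypothetical names, the scope strings are invented, and the opaque random token stands in for the cryptographic workload identity that a system such as SPIFFE/SPIRE would provide in practice.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable


@dataclass
class Grant:
    workload_id: str
    scopes: FrozenSet[str]
    expires_at: float
    revoked: bool = False


class CredentialBroker:
    """Hypothetical IAM-side broker holding grants in memory."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._grants: Dict[str, Grant] = {}

    def issue(self, workload_id: str, scopes: Iterable[str], ttl_seconds: float) -> str:
        """Create a short-lived grant and return its opaque token."""
        token = secrets.token_urlsafe(32)
        expires_at = self._clock() + ttl_seconds
        self._grants[token] = Grant(workload_id, frozenset(scopes), expires_at)
        return token

    def authorize(self, token: str, action: str) -> bool:
        """Request-time check: unknown, revoked, or expired tokens are denied."""
        grant = self._grants.get(token)
        if grant is None or grant.revoked:
            return False
        if self._clock() >= grant.expires_at:
            return False
        return action in grant.scopes

    def revoke(self, token: str) -> None:
        grant = self._grants.get(token)
        if grant is not None:
            grant.revoked = True


def run_task(broker, workload_id, scopes, ttl_seconds, task):
    """Issue a credential for one task and revoke it when the task ends."""
    token = broker.issue(workload_id, scopes, ttl_seconds)
    try:
        return task(token)
    finally:
        # Revocation runs even if the task raises.
        broker.revoke(token)
```

The design choice worth noting is that authorize is evaluated on every request rather than once at deployment, so expiry and revocation take effect without restarting the workload.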

Useful operating patterns include:

Platform owns isolation, quotas, network paths, and runtime logging.
IAM owns identity issuance, approval, rotation, and deprovisioning.
Both teams share a common asset inventory for compute identities and secrets.
Policy decisions happen at request time, not only during deployment.
Every privileged workload has an owner, an expiry, and a revocation path.
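The last pattern in this list lends itself to an automated check against the shared inventory. The sketch below is one possible shape for it, under stated assumptions: InventoryEntry and its fields are a hypothetical schema, and "revocation path" is modelled simply as a non-empty reference to whatever runbook or API the organisation uses.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class InventoryEntry:
    """One row in a shared compute-identity inventory (hypothetical schema)."""
    workload: str
    privileged: bool
    owner: Optional[str] = None
    expires_at: Optional[datetime] = None
    revocation_path: Optional[str] = None


def missing_controls(entry: InventoryEntry, now: datetime) -> List[str]:
    """List which of owner, expiry, revocation path a privileged entry lacks."""
    if not entry.privileged:
        return []
    problems: List[str] = []
    if not entry.owner:
        problems.append("owner")
    # A missing or already-passed expiry both count as a gap.
    if entry.expires_at is None or entry.expires_at <= now:
        problems.append("expiry")
    if not entry.revocation_path:
        problems.append("revocation_path")
    return problems
```

Running a check like this on a schedule gives both teams the same list of exceptions to work from, rather than each team auditing its own half.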

NHIMG’s Ultimate Guide to NHIs — Lifecycle Processes for Managing NHIs is a useful reference for aligning identity lifecycle controls with operational ownership, especially where runtime telemetry and credential governance need to meet. These controls tend to break down when shared clusters host mixed-trust workloads and no team owns the full chain from admission to revocation, because exceptions become the normal path.

Common Variations and Edge Cases

Tighter compute governance often increases operational overhead, so organisations have to balance speed against control coverage. That tradeoff becomes sharper in environments with ephemeral training jobs, multi-tenant GPU pools, or fast-moving agent pipelines where tasks are spun up and torn down continuously. There is no universal standard for exactly where the platform boundary ends and the IAM boundary begins, but current guidance suggests the split should follow the control that can most reliably reduce blast radius.

A few edge cases deserve special handling. In serverless or managed AI services, the platform layer may be partially abstracted, so IAM must rely more heavily on service-linked roles, API-level entitlements, and downstream data access policies. In multi-agent systems, one agent may launch another, which means platform telemetry alone is not enough unless IAM also governs delegation and credential chaining. In high-assurance settings, the organisation may also need separate approval for model access, data access, and compute access, because a single runtime permission can hide three distinct risks.
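For the multi-agent case, one way IAM can govern delegation is to require that a child agent's scopes are a subset of its parent's and to record the full launch chain. The following is a sketch under assumptions, not a prescribed mechanism: Delegation, delegate, MAX_CHAIN_DEPTH, and the "model:", "data:", and "compute:" scope prefixes are hypothetical and chosen only to mirror the three access types named above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

MAX_CHAIN_DEPTH = 3  # hypothetical policy limit on agent-launches-agent depth


@dataclass(frozen=True)
class Delegation:
    agent: str
    scopes: FrozenSet[str]
    chain: Tuple[str, ...]  # agents from the root launcher down to this one


def delegate(parent: Delegation, child_agent: str, requested: Iterable[str]) -> Delegation:
    """Grant a child agent a subset of the parent's scopes, or refuse."""
    requested_scopes = frozenset(requested)
    extra = requested_scopes - parent.scopes
    if extra:
        # A child may never hold a scope its parent lacks.
        raise PermissionError(f"scopes not held by parent: {sorted(extra)}")
    if len(parent.chain) >= MAX_CHAIN_DEPTH:
        raise PermissionError("delegation chain too deep")
    return Delegation(child_agent, requested_scopes, parent.chain + (child_agent,))
```

For example, a root agent holding "model:invoke" and "data:read" could delegate only "data:read" to a retrieval agent, which could not then pass "model:invoke" to anything it launches. Keeping the chain on each delegation also gives platform telemetry an identity-side record to correlate against.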

NHIMG’s 2024 ESG Report: Managing Non-Human Identities reported that 72% of organisations have experienced or suspect a breach of non-human identities, which is a useful reminder that governance gaps are not theoretical. The practical answer is to treat platform and IAM as joint owners of one control surface, even when their task lists differ. The split is least stable in shared AI platforms where one team provisions compute and another delegates credentials, with no single revocation authority between them.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10		Agentic workloads need runtime controls across identity, tools, and execution.
CSA MAESTRO		MAESTRO frames shared governance for autonomous AI systems and their control planes.
NIST AI RMF		AI RMF supports governance, accountability, and operational risk management for AI systems.

Map platform and IAM controls to agent runtime boundaries, then enforce per-task authorization and revocation.

NHIMG Editorial Note
Reviewed and updated by the NHIMG editorial team on June 7, 2026.