AI infrastructure and GPU identity: what IAM teams need to know

NHI Mgmt Group

(@nhi-mgmt-group)

Member Moderator

Joined: 1 year ago

Posts: 12212

Topic starter 07/06/2026 8:24 pm

TL;DR: In an interview published by WorkOS, Modal argues that AI applications need infrastructure abstractions built for bursty GPU workloads, rapid container startup, and global scale, because traditional Kubernetes and cloud patterns make inference harder than necessary. The identity lesson is that compute elasticity changes control boundaries, so governance must follow workload behaviour rather than static infrastructure assumptions.

NHIMG editorial — based on content published by WorkOS: Modal is building AI infrastructure that doesn't get in the way

Questions worth separating out

Q: How should security teams govern bursty AI workloads in cloud environments?

A: Security teams should govern bursty AI workloads by tying access, logging, and revocation to the job lifecycle rather than to the underlying host or cluster.

Q: Why do AI infrastructure platforms create new identity governance risks?

A: AI infrastructure platforms create new identity governance risks because they hide orchestration complexity while concentrating trust in the layer that schedules work and attaches permissions.

Q: What breaks when workload identity is managed like a static server identity?

A: When workload identity is managed like a static server identity, access reviews, approval cycles, and rotation assumptions all lag behind actual execution.

Practitioner guidance

Map workload identity to job lifecycle events: Tie permissions, logging, and revocation to container start, model load, and job termination so access does not persist beyond the compute session.
Review orchestration-layer trust boundaries: Identify which identity decisions now happen in the platform abstraction, including scheduling, placement, and permission attachment, and document the controls around each step.
Limit effective privilege for bursty AI workloads: Scope cloud roles and service credentials to the smallest possible job purpose, then verify that fast scale-out does not silently widen access.
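To make the first and third points concrete, here is a minimal Python sketch of a credential whose life is bound to a single job. It is an illustration under stated assumptions, not how Modal or any cloud provider implements this: `JobCredential`, `job_session`, `audit_log`, the scope strings, and the 15-minute default TTL are all hypothetical names and values chosen for the example.

```python
import secrets
import time
from contextlib import contextmanager
from dataclasses import dataclass, field

# Stand-in for a real log sink (hypothetical; a real system would ship these elsewhere).
audit_log = []


@dataclass
class JobCredential:
    """A credential scoped to one job, with a hard expiry and a revocation flag."""
    job_id: str
    scopes: frozenset
    expires_at: float
    token: str = field(default_factory=lambda: secrets.token_urlsafe(16))
    revoked: bool = False

    def allows(self, scope, now=None):
        # Access requires all three: not revoked, not expired, scope was granted.
        now = time.time() if now is None else now
        return (not self.revoked) and now < self.expires_at and scope in self.scopes


@contextmanager
def job_session(job_id, scopes, ttl_seconds=900):
    """Issue a credential at job start and revoke it at job end, even on failure."""
    cred = JobCredential(job_id, frozenset(scopes), time.time() + ttl_seconds)
    audit_log.append(("issued", job_id, sorted(cred.scopes)))
    try:
        yield cred
    finally:
        cred.revoked = True
        audit_log.append(("revoked", job_id))


# Usage: the credential only works inside the job's lifetime.
with job_session("inference-123", ["read:model-weights"]) as cred:
    can_read = cred.allows("read:model-weights")
    can_write = cred.allows("write:training-data")
still_valid_after = cred.allows("read:model-weights")
```

The design choice worth noting is that revocation sits in a `finally` block, so a crashed job still loses access, and the TTL acts as a backstop if the termination event is never observed.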

What's in the full article

WorkOS's full interview covers the operational detail this post intentionally leaves for the source:

Erik Bernhardsson's explanation of why compute-intensive AI needs new infrastructure abstractions for GPUs and inference.
The discussion of how Modal uses gVisor for workload isolation and why that matters for platform trust boundaries.
The practical trade-offs of replacing Kubernetes and Docker with a higher-level orchestration layer for AI workloads.
The interview's view of which AI customer segments are driving revenue now and how that changes infrastructure demand.

👉 Read WorkOS's interview on Modal's compute-native AI infrastructure thesis →


Mr NHI

(@mr-nhi)

Member Moderator

Joined: 2 months ago

Posts: 11787

07/06/2026 10:17 pm

Compute elasticity is now an identity design constraint, not just an infrastructure preference. AI platforms that can spin up thousands of GPUs within minutes force identity teams to govern access around runtime behaviour rather than fixed assets. Traditional IAM assumptions about stable hosts, long-lived sessions, and predictable scheduling become weaker as workload timing becomes the primary control variable. Practitioners should treat elasticity as a governance boundary, not only a performance feature.

A few things that frame the scale:

53% of security leaders expect AI to run major portions of their infrastructure autonomously within the next three years, according to The 2026 Infrastructure Identity Survey.
The same survey finds that 67% of organisations still rely heavily on static credentials, despite the risk they pose to agentic AI deployments.

A question worth separating out:

Q: How do platform teams and IAM teams split responsibility for AI compute governance?

A: Platform teams should own scheduling, isolation, and runtime telemetry, while IAM teams should own entitlement scope, revocation, and credential provenance. The split only works if both groups share the same view of compute as an identity event, not just an infrastructure event. That is the practical way to keep governance close to execution.
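One way to picture that shared view is a single record per job that both teams write to, plus a check either team can run. The Python sketch below is hypothetical throughout: `ComputeIdentityEvent`, its field names, and `access_outlived_job` are assumptions made for illustration and do not come from the WorkOS interview or from any specific product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComputeIdentityEvent:
    """One job, described once, with fields from both owning teams."""
    # Populated by the platform team (scheduling, isolation, runtime telemetry).
    job_id: str
    isolation: str
    started_at: float
    ended_at: Optional[float]
    # Populated by the IAM team (entitlement scope, provenance, revocation).
    scopes: frozenset
    credential_issuer: str
    revoked_at: Optional[float]


def access_outlived_job(event, grace_seconds=0.0):
    """Return True if the job has ended but its credential was not revoked in time."""
    if event.ended_at is None:
        return False  # job still running, nothing to reconcile yet
    if event.revoked_at is None:
        return True  # job ended and the credential is still live
    return event.revoked_at > event.ended_at + grace_seconds
```

A record like this only helps if both teams agree on the job identifier and the clock, which is the practical meaning of treating compute as an identity event.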

👉 Read our full editorial: Modal shows why AI infrastructure needs compute-native identity controls
