TL;DR: Open source models are closing the capability gap with closed systems and pushing more organisations toward in-house inference as monthly spend rises into five figures, according to WorkOS's interview with Baseten. The governance question is no longer whether AI workloads will scale, but which identity, access, and infrastructure controls will govern them when they do.
NHIMG editorial — based on content published by WorkOS: Baseten is betting big on open source models
Questions worth separating out
Q: How should security teams govern in-house AI inference workloads?
A: Security teams should govern in-house AI inference workloads as non-human identities with scoped permissions, named ownership, and lifecycle controls.
Q: Why do open source models increase identity governance pressure?
A: Open source models increase identity governance pressure because they make it easier to bring AI execution inside the enterprise boundary.
Q: What breaks when AI workloads scale without lifecycle controls?
A: When AI workloads scale without lifecycle controls, old credentials and broad privileges tend to remain in place after the system changes.
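As a rough illustration of the lifecycle point above, the sketch below flags credentials that have outlived their workload, lost their owner, or kept a broad scope. It is a minimal sketch under stated assumptions: the `WorkloadCredential` record, its field names, the wildcard scope `"*"`, and the 90-day rotation window are all hypothetical and do not come from the WorkOS article or any specific product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class WorkloadCredential:
    """Hypothetical record for one credential used by an AI workload."""
    name: str
    owner: Optional[str]     # named person or team accountable for it
    scopes: frozenset        # permissions granted to the credential
    issued: date             # date of creation or last rotation
    workload_active: bool    # whether the workload it serves still exists


def lifecycle_findings(cred, today, max_age=timedelta(days=90)):
    """Return a list of lifecycle problems found for one credential.

    The 90-day default is an assumed policy value, not a standard.
    """
    findings = []
    if cred.owner is None:
        findings.append("no named owner")
    if today - cred.issued > max_age:
        findings.append("not rotated within max_age")
    if not cred.workload_active:
        findings.append("workload retired but credential still exists")
    if "*" in cred.scopes:
        findings.append("wildcard scope")
    return findings


if __name__ == "__main__":
    # Illustrative data only.
    cred = WorkloadCredential(
        name="old-deploy-token",
        owner=None,
        scopes=frozenset({"*"}),
        issued=date(2024, 6, 1),
        workload_active=False,
    )
    for problem in lifecycle_findings(cred, today=date(2025, 1, 1)):
        print(f"{cred.name}: {problem}")
```

Running a check like this on a schedule, rather than only at deployment time, is what catches the credentials left behind when a model or serving stack is replaced.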
Practitioner guidance
- Map AI inference workloads to workload identities. Inventory every model-serving environment, the service accounts that operate it, and the credentials used for deployment, inference, logging, and telemetry.
- Separate model access from platform administration. Split permissions so the teams that tune prompts, models, or routing cannot also grant themselves broad infrastructure access.
- Review in-house AI workloads as part of NHI lifecycle governance. Include AI inference environments in joiner-mover-leaver, access review, and offboarding processes.
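The first two items in the guidance above can be sketched as two small checks: one that finds model-serving environments with no service account mapped to them, and one that finds principals holding both model-tuning and platform-administration permissions. Everything named here is an assumption for illustration: the permission strings, the `TUNING` and `PLATFORM_ADMIN` groupings, and the dictionary layout are hypothetical and would need to be replaced with whatever your IAM system actually exports.

```python
# Hypothetical permission names; substitute the ones your platform uses.
TUNING = {"model:tune", "prompt:edit", "routing:edit"}
PLATFORM_ADMIN = {"iam:grant", "cluster:admin", "secrets:write"}


def unmapped_environments(environments, service_accounts):
    """Return environments that have no service account recorded against them."""
    mapped = {sa["environment"] for sa in service_accounts}
    return sorted(set(environments) - mapped)


def separation_violations(principals):
    """Return principals that hold both tuning and platform-admin permissions.

    `principals` maps a principal name to the set of permissions it holds.
    """
    return sorted(
        name
        for name, perms in principals.items()
        if perms & TUNING and perms & PLATFORM_ADMIN
    )


if __name__ == "__main__":
    # Illustrative inventory only.
    environments = ["prod-inference", "staging-inference", "eval-sandbox"]
    service_accounts = [
        {"name": "sa-prod-serving", "environment": "prod-inference"},
        {"name": "sa-staging-serving", "environment": "staging-inference"},
    ]
    principals = {
        "ml-team": {"model:tune", "prompt:edit"},
        "platform-team": {"cluster:admin", "iam:grant"},
        "legacy-bot": {"routing:edit", "iam:grant"},
    }
    print("Unmapped environments:", unmapped_environments(environments, service_accounts))
    print("Separation violations:", separation_violations(principals))
```

The set-intersection test is deliberately strict: holding even one permission from each group counts as a violation, on the assumption that any overlap lets a tuning role escalate its own access.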
What's in the full article
WorkOS's full article covers the operational detail this post intentionally leaves for the source:
- The interview context behind Baseten's infrastructure strategy and how the company describes its inference stack.
- Specific examples of why teams shift from closed foundation models to open source deployments as workloads scale.
- The voice and multimodal use cases Baseten says are becoming important in enterprise AI environments.
- The article's discussion of international model development and GPU optimisation challenges.
👉 Read WorkOS's interview on Baseten's open source model and inference shift →
Open source models and inference infrastructure: what changes now?
Explore further
Open source model adoption is turning AI infrastructure into an NHI governance problem. Once organisations run models on their own GPUs and internal platforms, the relevant question is no longer which model they buy but which non-human identities can deploy, call, and modify those systems. That shifts AI from a usage decision into an access and accountability problem, and practitioners need to govern the runtime estate accordingly.
A few things that frame the scale:
- The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
- Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap, according to The State of Secrets in AppSec.
A question worth separating out:
Q: Should teams treat model-serving platforms like privileged infrastructure?
A: Yes. Model-serving platforms often sit on top of GPU clusters, cloud services, and internal data paths, which means they can reach sensitive systems even when the model itself looks isolated. Treating them as privileged infrastructure forces change control, least privilege, and monitoring to apply to the full execution path, not just the model endpoint.
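To make the "full execution path" point concrete, the sketch below walks a hypothetical access graph outward from a model endpoint and reports which sensitive systems it can reach through intermediate identities and infrastructure. It is a sketch under stated assumptions: the graph, the node names, and the list of sensitive systems are invented for illustration, and a real estate would need this data exported from IAM and network policy.

```python
from collections import deque


def reachable(access_graph, start):
    """Breadth-first search over 'can access' edges from a starting node.

    `access_graph` maps each node to the nodes it can directly access.
    Returns every node reachable from `start`, excluding `start` itself.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in access_graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


def sensitive_exposure(access_graph, start, sensitive):
    """Return the sensitive systems transitively reachable from `start`."""
    return sorted(reachable(access_graph, start) & set(sensitive))


if __name__ == "__main__":
    # Hypothetical access graph: the endpoint looks isolated, but its
    # service account and the cluster underneath it are not.
    access_graph = {
        "model-endpoint": ["serving-sa"],
        "serving-sa": ["gpu-cluster", "log-bucket"],
        "gpu-cluster": ["customer-db"],
    }
    sensitive = ["customer-db", "hr-db"]
    print(sensitive_exposure(access_graph, "model-endpoint", sensitive))
```

A review that only inspects the endpoint's direct permissions would miss the path through the cluster; traversing the whole graph is one way to apply least privilege and monitoring to the path rather than the endpoint.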
👉 Read our full editorial: Open source model adoption is shifting AI workloads in-house