TL;DR: According to Aqua Security, AI security in production still lacks clear visibility, policy enforcement, and runtime control where models actually run, even though over 70% of AI applications are deployed in containers and 97% of security organisations plan to increase spending on securing AI use cases. The governance problem is no longer model experimentation; it is securing AI workloads as operational assets with enforceable controls.
At a glance
What this is: Aqua Security’s case for shifting AI security from point controls to runtime governance where AI workloads execute.
Why it matters: IAM, NHI, and security teams now have to govern AI systems as production identities and workloads, not just as applications with a user interface.
By the numbers:
- Over 70% of AI applications are deployed in containers, running on Kubernetes and cloud native infrastructure.
Context
AI security for production workloads is the discipline of controlling models, prompts, and execution paths where AI actually runs. The article argues that existing tools are too fragmented to give security leaders centralized visibility or policy enforcement across the workload layer, which is where AI governance starts to matter for identity and access control.
That gap matters because AI systems in production sit inside cloud-native environments that already rely on identity, policy, and runtime enforcement. When the security model stops at the app edge or the developer workflow, teams lose the ability to govern what the workload does once it is live, which is why AI security now intersects directly with NHI, workload identity, and operational IAM.
For teams building a broader identity programme, the question is no longer whether AI has a place in security architecture. The question is whether current governance models can still answer who or what is acting, what it is allowed to do, and how those permissions are enforced at runtime.
Key questions
Q: How should security teams govern AI workloads running in production?
A: They should govern AI workloads at the runtime layer, where prompts, model outputs, and data access actually occur. That means combining visibility, policy enforcement, and response controls inside the execution environment, not relying only on perimeter tools, SDK hooks, or pre-deployment review. Production AI should be treated as an operational workload with identity and access boundaries.
Q: Why do perimeter tools fall short for AI security?
A: Perimeter tools can see traffic, but they usually cannot see the model’s internal execution context or how policies are applied after the request enters the workload. That creates a blind spot for prompt handling, unsafe outputs, and data exposure. If enforcement stops at the edge, security leaders lose control of the most important decisions.
Q: What breaks when AI governance is limited to developer workflows?
A: What breaks is enforceability. Developers can follow secure patterns, but production AI still needs controls that inspect and govern live behaviour. If security only exists in code review or pipeline checks, teams cannot stop policy violations once the workload is running and interacting with sensitive data.
Q: How do runtime AI controls fit with workload identity programmes?
A: They fit as an extension of workload governance, because AI systems in production behave like machine workloads that need identity, access, and policy boundaries. The same discipline used for service accounts and cloud-native workloads should define what AI can do, where it can do it, and how violations are detected.
Technical breakdown
Runtime AI workload governance in cloud-native environments
The article’s core technical point is that AI security has to operate inside the workload boundary, not only around it. In cloud-native systems, models often run in containers and Kubernetes, where policy enforcement can inspect behaviour at execution time rather than relying on static configuration or developer discipline. That matters because prompts, model outputs, and downstream actions are all runtime events. If security cannot observe and enforce at that layer, governance becomes advisory rather than operational.
Practical implication: Security teams need controls that can inspect and enforce policy inside AI workloads rather than only at ingress or code review.
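To make the idea of execution-time enforcement concrete, here is a minimal sketch of a default-deny policy check applied to events observed inside a workload. It is not Aqua Security's implementation: the `RuntimeEvent` fields, the `POLICY` table, the workload names, and the data-class labels are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"


@dataclass(frozen=True)
class RuntimeEvent:
    """One observed action inside an AI workload (all fields are hypothetical)."""
    workload: str    # name of the running workload
    kind: str        # "prompt", "model_output", or "data_access"
    data_class: str  # sensitivity label of the data involved


# Hypothetical policy: data classes each workload may touch, per event kind.
POLICY = {
    ("support-bot", "prompt"): {"public", "internal"},
    ("support-bot", "model_output"): {"public"},
    ("support-bot", "data_access"): {"public", "internal"},
}


def evaluate(event: RuntimeEvent) -> Decision:
    """Default-deny: anything not explicitly permitted is blocked."""
    allowed = POLICY.get((event.workload, event.kind), set())
    return Decision.ALLOW if event.data_class in allowed else Decision.BLOCK


events = [
    RuntimeEvent("support-bot", "prompt", "internal"),        # permitted
    RuntimeEvent("support-bot", "model_output", "internal"),  # output too sensitive
    RuntimeEvent("unknown-job", "data_access", "public"),     # workload has no policy
]
decisions = [evaluate(e) for e in events]

for e, d in zip(events, decisions):
    print(f"{e.workload:12} {e.kind:13} {e.data_class:9} -> {d.value}")
```

The point of the sketch is the default-deny shape: a workload or event kind with no policy entry is blocked rather than silently allowed, which is what separates enforcement from advisory monitoring.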
Why AI firewall and SDK-only models leave a visibility gap
The article contrasts three control layers: AI firewalls, SDK-level controls, and workload-layer protection. Firewalls can see external interactions, and SDKs can embed controls into applications, but neither gives full visibility into the infrastructure where inference runs or where sensitive data is handled. That creates a governance gap because the decisive behaviour happens after the request passes the perimeter. Runtime protection inside the workload closes that gap by tying policy to execution, not just to traffic.
Practical implication: Do not treat perimeter filtering or application SDK controls as complete AI governance if the workload layer remains opaque.
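One way to reason about this gap is to write down what each control layer can observe and compute what remains unseen. The sketch below does that with plain sets. The event names and the coverage assigned to each layer are assumptions that loosely follow the contrast described above; they are not measurements of any product.

```python
# Event types a governance programme may want to observe (illustrative list).
REQUIRED = {
    "external_request",
    "prompt_handling",
    "model_output",
    "data_access",
    "inference_infrastructure",
}

# Assumed coverage per control layer; adjust to match your own environment.
COVERAGE = {
    "ai_firewall": {"external_request"},
    "sdk": {"prompt_handling", "model_output"},
    "workload_runtime": {
        "prompt_handling",
        "model_output",
        "data_access",
        "inference_infrastructure",
    },
}


def blind_spots(deployed_layers):
    """Return the required event types that no deployed layer can observe."""
    seen = set().union(*(COVERAGE[layer] for layer in deployed_layers))
    return sorted(REQUIRED - seen)


edge_only = blind_spots(["ai_firewall", "sdk"])
with_runtime = blind_spots(["ai_firewall", "sdk", "workload_runtime"])

print("Edge and SDK only:", edge_only)
print("With workload-layer protection:", with_runtime)
```

Under these assumed mappings, the firewall and SDK combination leaves data access and the inference infrastructure unobserved, and adding the workload layer closes both.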
Policy enforcement without slowing development pipelines
A central challenge in AI security is enforcing controls without forcing major code changes or development friction. The article points to an eBPF-based agent approach that monitors workload behaviour from inside the runtime environment, allowing teams to observe activity and apply policy without changing application code. Technically, that shifts security from pre-deployment review into live operational control, which is critical for AI systems that change quickly and often.
Practical implication: Prefer runtime enforcement patterns that preserve developer velocity while still making policy enforceable in production.
Threat narrative
Attacker objective: Exploit blind spots in production AI runtime governance so that model behaviour or data handling escapes policy control.
- Entry occurs when AI workloads are deployed in cloud-native infrastructure and begin interacting with sensitive data in production.
- Escalation happens when the organisation lacks centralized runtime visibility, allowing prompt handling, model usage, or output behaviour to drift outside policy.
- Impact is the inability to detect unsafe model use quickly enough to prevent data exposure, policy violations, or uncontrolled AI behaviour at execution time.
Breaches seen in the wild
- Cisco DevHub NHI breach — IntelBroker exploited exposed Cisco credentials, API tokens and keys in DevHub.
- ASP.NET machine keys RCE attack — 3,000+ exposed ASP.NET machine keys enabled remote code execution.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
AI security is becoming a workload governance problem, not a model-safety side project. The article is describing a shift from perimeter and developer controls to enforcement at the point of execution. That is the right framing because production AI systems behave like governed workloads, not isolated tools. For identity teams, the practical conclusion is that AI security now belongs in the same operational conversation as workload identity and runtime access control.
Runtime visibility is the named concept this market keeps circling but rarely operationalises cleanly. The article shows why: organisations can see the app, the SDK, or the network edge, yet still miss what the model is doing inside the workload. That blind spot is where policy breaks down and where identity governance loses enforcement power. Practitioners should treat runtime visibility as a control plane requirement, not a monitoring preference.
The old assumption that AI can be governed after the fact is no longer defensible. That assumption holds that teams can retrofit visibility and policy once AI moves into production, but retrofitting only works if the workload layer remains stable and observable. In practice, AI behaviour is runtime behaviour, so governance that waits for periodic post-deployment review is already behind the event. The implication is that security architecture must account for live execution, not just approved design.
Cloud-native AI security will converge with NHI and workload identity governance. The article’s emphasis on containers, Kubernetes, and runtime policy shows that AI systems now inherit the same identity and access problems as other machine workloads. That convergence matters because the controls that govern service identities, workload permissions, and runtime policy are increasingly the same controls that govern AI execution. Practitioners should align AI security with existing NHI and cloud-native identity models rather than building a disconnected programme.
The market is moving from detection to enforceable control, and that changes procurement criteria. Security leaders are no longer satisfied with products that only report AI usage or wrap a policy statement around the edge. They need controls that can act where the workload runs and where the data is handled. The implication is that AI security evaluations should now test for runtime enforcement depth, not just visibility claims.
From our research:
- 1 in 4 organisations are already investing in dedicated NHI security capabilities, with an additional 60% planning to do so within the next twelve months, according to The State of Non-Human Identity Security.
- Lack of credential rotation is cited as the top cause of NHI-related attacks by 45% of organisations, followed by inadequate monitoring and logging at 37% and over-privileged accounts at 37%.
- For the broader control model behind production identity governance, see Ultimate Guide to NHIs, Standards for the frameworks that anchor workload and machine identity decisions.
What this signals
Runtime control will become the differentiator for AI security programmes. Teams that only detect AI usage will continue to miss the point, because the control failure happens when the workload executes. With 1 in 4 organisations already investing in dedicated NHI security capabilities, according to our research, the market is moving toward enforceable runtime governance, not visibility alone.
AI workloads should be folded into the same identity governance model used for machine identities. The operational boundary is converging, and security teams that keep AI separate from workload identity will create parallel control stacks with inconsistent policy. For practical alignment, the NIST Cybersecurity Framework 2.0 remains a useful structure for mapping govern, identify, protect, detect, respond, and recover around AI runtime control.
For practitioners
- Map where AI actually runs: Inventory every production location where inference, prompt handling, or model orchestration occurs, including containers and Kubernetes clusters. Treat these environments as governance targets, not just hosting targets.
- Separate edge controls from workload controls: Document which policies are enforced by AI firewalls or SDKs and which are enforced inside the runtime environment. Close the gap where the workload can act beyond what perimeter controls can observe.
- Align AI security with workload identity: Use the same identity and access governance model you apply to machine workloads to define who or what can invoke models, pass prompts, or consume outputs in production.
- Test for policy enforcement at execution time: Require evidence that controls can inspect and stop unsafe behaviour without changes to application code or developer workflows. If enforcement only exists before deployment, it is not enough for live AI systems.
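The last two steps can be sketched together: define which workload identities may perform which AI actions, then gather evidence that a forbidden action is actually stopped when it is attempted. This is a simplified model under stated assumptions. The identity strings, the action names, and the `GRANTS` table are hypothetical, and in a real deployment the check would sit in the runtime layer rather than in application code.

```python
# Hypothetical grants: workload identity -> actions it may perform at runtime.
GRANTS = {
    "prod/checkout-api": frozenset({"invoke_model"}),
    "prod/rag-indexer": frozenset({"invoke_model", "read_embeddings"}),
}


def authorize(identity: str, action: str) -> bool:
    """Default-deny lookup: unknown identities hold no permissions."""
    return action in GRANTS.get(identity, frozenset())


def enforce(identity: str, action: str, operation):
    """Run `operation` only if the identity holds the grant; otherwise stop it."""
    if not authorize(identity, action):
        raise PermissionError(f"{identity} may not {action}")
    return operation()


def enforcement_evidence() -> dict:
    """Exercise one permitted and one forbidden call and record what happened."""
    results = {}
    results["permitted_ran"] = (
        enforce("prod/checkout-api", "invoke_model", lambda: "ok") == "ok"
    )
    try:
        enforce("prod/checkout-api", "read_embeddings", lambda: "leaked")
        results["forbidden_blocked"] = False  # the operation ran: enforcement failed
    except PermissionError:
        results["forbidden_blocked"] = True   # the operation was stopped in time
    return results


print(enforcement_evidence())
```

The evidence check matters as much as the grant table: it confirms that a denied action never executes, instead of only confirming that a policy document exists.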
Key takeaways
- AI security becomes a workload governance problem once models move into production and interact with sensitive data at runtime.
- Edge controls and SDK hooks can help, but they do not replace enforceable visibility inside the runtime environment where decisions are actually made.
- Identity and access teams should treat AI workloads like governed machine identities and test whether policy can still act after deployment.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | | Runtime policy and safe execution are core to agentic AI governance. |
| NIST CSF 2.0 | PR.AA-05 (PR.AC-4 in CSF 1.1) | Access control and policy enforcement must apply inside AI workloads. |
| OWASP Non-Human Identity Top 10 | NHI-03 | AI workloads in production behave like machine identities needing lifecycle control. |
Treat AI workloads as governed non-human identities and define their runtime permissions explicitly.
Key terms
- Runtime AI security: Runtime AI security is the practice of controlling model behaviour, prompt handling, and data access while the system is actively executing. It focuses on live enforcement in production, where the real risk is what the workload does after deployment, not only what it was allowed to do in design.
- Workload-layer visibility: Workload-layer visibility is the ability to observe activity inside the running environment where AI inference or orchestration occurs. It gives security teams evidence about model use, prompt flow, and policy behaviour at execution time, which is stronger than monitoring only network traffic or application logs.
- Policy enforcement at execution time: Policy enforcement at execution time means security controls can inspect and stop behaviour while the workload is running. For AI systems, that is essential because prompt processing and output generation are live actions that cannot be governed reliably by pre-deployment checks alone.
- Cloud-native AI workload: A cloud-native AI workload is an AI system deployed in containers, Kubernetes, or similar infrastructure where inference and orchestration happen in production environments. These workloads inherit the same identity, access, and runtime governance issues as other machine-operated services.
Deepen your knowledge
NHI governance, agentic AI identity, and machine identity security are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or programme maturity, it is worth exploring.
This post draws on content published by Aqua Security: Operationalizing AI Security: Protecting Workloads Where AI Runs. Read the original.
Published by the NHIMG editorial team on 2025-07-22.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org