How should organisations respond when AI compute is being used as delivery infrastructure?

They should contain the cluster, revoke any exposed access paths, inspect repository-driven update channels, and separate build and runtime authority immediately. The goal is to cut off the delivery mechanism before the attacker can refresh payloads or spread laterally to additional clusters.

Why This Matters for Security Teams

When AI compute is used as delivery infrastructure, the cluster is no longer just a place where workloads run. It becomes the attacker’s distribution point for payload refresh, lateral movement, and repeatable access. That shifts the response from “contain the app” to “contain the identity, build path, and update path.” Current guidance from the NIST Cybersecurity Framework 2.0 still applies, but the response has to move faster because autonomous systems can reintroduce malicious artefacts without waiting for a human operator.

NHIMG’s reporting on the DeepSeek breach shows why delivery channels matter: once a system can reach repositories, models, or orchestration layers, the compromise can persist even after a single host is remediated. The practical mistake is treating AI infrastructure like a conventional compute node rather than an execution environment with embedded trust paths. In practice, many security teams detect a repeat compromise only after the delivery channel has already been reused to seed the next payload.

How It Works in Practice

The immediate response is to break the attacker’s ability to use the cluster as a distribution mechanism. That means isolating affected nodes, invalidating credentials, and cutting repository or registry access until the trust chain is verified. For AI workloads, the main question is not only what ran, but what was allowed to publish new artefacts, pull dependencies, or trigger downstream jobs. The response should include build system review, runtime segmentation, and separation of duties between update authority and execution authority.
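The separation of duties described above can be checked by auditing role bindings for identities that hold both update and execution authority. The following is a minimal sketch under stated assumptions: the permission names, the `find_dual_authority` helper, and the example service identities are all hypothetical and would need to be mapped to whatever the platform in use actually exposes.

```python
# Hypothetical permission names; map these to the platform's real ones.
UPDATE_PERMISSIONS = {"artefact:publish", "artefact:promote", "pipeline:trigger"}
EXECUTION_PERMISSIONS = {"workload:run", "job:spawn"}


def find_dual_authority(bindings: dict[str, set[str]]) -> list[str]:
    """Return identities that hold both update and execution permissions."""
    flagged = []
    for identity, permissions in bindings.items():
        can_update = bool(permissions & UPDATE_PERMISSIONS)
        can_execute = bool(permissions & EXECUTION_PERMISSIONS)
        if can_update and can_execute:
            flagged.append(identity)
    return sorted(flagged)


# Illustrative bindings only; these identities are invented for the example.
example_bindings = {
    "svc-trainer": {"workload:run", "artefact:publish"},
    "svc-inference": {"workload:run"},
    "svc-release": {"artefact:promote"},
}
dual = find_dual_authority(example_bindings)
print(dual)  # ['svc-trainer']
```

Any identity the check flags is a candidate for splitting into two identities, one that can publish or promote and one that can only run workloads.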

Practitioners should treat access paths as dynamic delivery controls, not static admin settings. A useful sequence is:

1. Quarantine the compute plane and any connected orchestration layer.
2. Revoke secrets, tokens, and service credentials that can publish, pull, or promote code or model artefacts.
3. Inspect CI/CD hooks, repository-driven update channels, and container registry permissions.
4. Confirm whether the AI system can call tools, spawn jobs, or write to shared storage.
5. Rebuild from known-good images and rotate all secrets before returning the cluster to service.
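
The revocation step in the sequence above depends on knowing which credentials can actually move code or model artefacts. As a rough sketch, a credential inventory could be filtered and ordered as follows. The `Credential` type, the scope names, and the inventory entries are assumptions made for illustration, not the schema of any particular secrets manager.

```python
from dataclasses import dataclass

# Hypothetical scope names for actions that move code or model artefacts.
DELIVERY_SCOPES = {"publish", "pull", "promote"}


@dataclass(frozen=True)
class Credential:
    name: str
    kind: str  # e.g. "token", "secret", "service-account"
    scopes: frozenset[str]


def revocation_order(credentials: list[Credential]) -> list[Credential]:
    """Select delivery-capable credentials, widest delivery reach first."""
    in_scope = [c for c in credentials if c.scopes & DELIVERY_SCOPES]
    # Sort by number of delivery scopes (descending), then by name for stability.
    return sorted(in_scope, key=lambda c: (-len(c.scopes & DELIVERY_SCOPES), c.name))


# Illustrative inventory only.
inventory = [
    Credential("ci-deployer", "token", frozenset({"publish", "promote"})),
    Credential("metrics-reader", "token", frozenset({"read-metrics"})),
    Credential("registry-pull", "secret", frozenset({"pull"})),
]
to_revoke = [c.name for c in revocation_order(inventory)]
print(to_revoke)  # ['ci-deployer', 'registry-pull']
```

Ordering by delivery reach is one possible prioritisation; a team might equally sort by credential age or by how recently each credential was used.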

Identity and policy controls should be validated at request time. That aligns with the NIST Cybersecurity Framework 2.0 emphasis on protective and recovery outcomes, but in agentic environments the better model is ephemeral authority rather than standing privilege. NHIMG’s The State of Secrets in AppSec highlights the scale of the secrets problem, and that matters here because any long-lived token embedded in the delivery chain can be reused to refresh malicious workloads even after the first compromise is detected. These controls tend to break down when the cluster is tightly coupled to production release automation, because revocation can interrupt legitimate deployments faster than teams can separate trust domains.

Common Variations and Edge Cases

Tighter containment often increases outage risk, requiring organisations to balance rapid interruption of attacker activity against the availability of legitimate AI services. Best practice is evolving for autonomous workloads, and there is no universal standard for this yet. Some environments can afford a full cluster freeze; others need a staged response that preserves evidence while disabling outbound publish rights and tool access first.

Edge cases usually involve shared infrastructure. If the same cluster supports training, inference, and deployment, separating build and runtime authority becomes harder but more important. If the AI system uses repository-based model updates, signed artefact verification and policy-as-code checks should be applied before anything is promoted. For AI-heavy environments, the DeepSeek breach is a reminder that delivery infrastructure can be as sensitive as the application itself, while identity governance remains the deciding factor in whether the attacker can persist.
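
A promotion gate of the kind mentioned above can be sketched as a signature check combined with a simple policy rule. This example uses HMAC-SHA256 from the Python standard library purely as a stand-in; a real deployment would more likely use asymmetric signatures so that the verifier never holds signing material. The key, the source name `registry.internal`, and the helper names are hypothetical.

```python
import hashlib
import hmac


def sign_artefact(content: bytes, key: bytes) -> str:
    """Produce an HMAC-SHA256 tag over the artefact's SHA-256 digest."""
    digest = hashlib.sha256(content).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest()


def may_promote(content: bytes, signature: str, key: bytes,
                source: str, approved_sources: set[str]) -> bool:
    """Allow promotion only for approved sources with a valid signature."""
    if source not in approved_sources:
        return False  # policy check: unknown update channel
    expected = sign_artefact(content, key)
    # Constant-time comparison avoids leaking how much of the tag matched.
    return hmac.compare_digest(expected, signature)


# Illustrative values only; never hard-code a real key.
key = b"demo-key-not-for-production"
approved = {"registry.internal"}
model = b"model-weights-v1"
signature = sign_artefact(model, key)

ok = may_promote(model, signature, key, "registry.internal", approved)
tampered = may_promote(b"model-weights-v1-modified", signature, key,
                       "registry.internal", approved)
wrong_source = may_promote(model, signature, key, "mirror.example", approved)
print(ok, tampered, wrong_source)  # True False False
```

The gate fails closed: an artefact from an unapproved channel is rejected before its signature is even evaluated.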

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A03	Agent delivery abuse is a core agentic AI attack path.
CSA MAESTRO	IAM	MAESTRO covers identity and authority separation for AI systems.
NIST AI RMF		AI RMF governance supports runtime risk control for autonomous systems.

Apply AI RMF governance to define containment, revocation, and recovery decisions for AI delivery infrastructure.
