What breaks when AML pipeline scripts can be modified in linked storage?

The execution boundary breaks. If storage-backed invoker scripts are writable, an attacker can change the code that AML later runs on compute, turning a storage permission into code execution authority. The control failure is not just weak access control on blobs. It is allowing runtime artefacts to remain mutable after they have become part of the job’s trusted execution path.

Why This Matters for Security Teams

When AML pipeline scripts live in linked storage, the trust boundary shifts from “data at rest” to “code at runtime.” That means a storage write permission can become direct execution authority if the script is later mounted, copied, or invoked by compute. This is the same structural problem seen in secret sprawl and CI/CD compromise: once a mutable artefact is treated as trusted input, an attacker does not need to defeat the job runner first. NHI Management Group has documented how often sensitive material ends up in exposed operational paths in the Ultimate Guide to Non-Human Identities, and NIST CSF 2.0 frames this as a governance and integrity issue, not only an access issue. See NIST Cybersecurity Framework 2.0 for the broader control lens.

Practitioners often focus on who can read the storage account, but the real risk is who can alter what the compute node will later trust. In practice, many security teams discover pipeline script tampering only after a job has already executed the modified code, not through change control.

How It Works in Practice

The failure mode begins when a pipeline treats storage as both source repository and runtime dependency. If an AML job loads a script from linked storage at execution time, then the script is no longer just content. It is a live control surface. An attacker who can write to that location can inject data exfiltration, alter model parameters, disable logging, or redirect outputs without needing shell access to the compute environment.

The safer pattern is to make the runtime artefact immutable before execution. That usually means one of three approaches: promote scripts into a controlled build artefact, pin execution to a signed version, or copy the script into a read-only execution bundle before the job starts. Current guidance suggests combining this with short-lived credentials, tightly scoped workload identity, and policy checks at request time rather than relying on static RBAC alone. The reason is simple: a pipeline script is not a human user with a stable role. It is an operational workload whose risk changes by job, dataset, and time.
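The third approach, copying the script into a read-only execution bundle, can be sketched as follows. This is a minimal illustration using only the Python standard library, not an AML API: the function names `sha256_of` and `freeze_script`, the bundle directory, and the idea that an approved digest was recorded at review time are all assumptions made for the example.

```python
import hashlib
import os
import shutil
import stat
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def freeze_script(source: Path, approved_sha256: str, bundle_dir: Path) -> Path:
    """Copy an approved script into a bundle directory and mark it read-only.

    Hypothetical helper: raises ValueError if the copied content does not
    match the digest recorded at approval time.
    """
    bundle_dir.mkdir(parents=True, exist_ok=True)
    frozen = bundle_dir / source.name
    shutil.copyfile(source, frozen)
    # Verify the copy rather than the source, so a change made between
    # the check and the copy cannot slip through.
    actual = sha256_of(frozen)
    if actual != approved_sha256:
        frozen.unlink()
        raise ValueError(f"digest mismatch for {source.name}: {actual}")
    # Drop write bits on the copy (best effort on Windows).
    os.chmod(frozen, stat.S_IRUSR | stat.S_IRGRP)
    return frozen
```

Clearing the write bits only guards against accidental edits by the same account. The real boundary still has to come from the execution storage being unwritable by the identities that can write to authoring storage.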

For teams designing controls, the most useful questions are whether the storage path is writable after approval, whether changes are versioned and reviewed, and whether the runner verifies integrity before execution. A practical hardening sequence is:

  • Separate authoring storage from execution storage.
  • Require signed artefacts or content hashing before job launch.
  • Use least-privilege workload identities for the AML runner.
  • Log and alert on any post-approval modification to pipeline scripts.
  • Revoke or rotate any secrets exposed to the job if tampering is detected.
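
The hashing and alerting steps in the list above could be combined into a single pre-launch gate, sketched below. This is an illustration under stated assumptions, not an AML feature: `sign_manifest` and `gate_launch` are hypothetical names, the manifest is assumed to map script names to SHA-256 digests recorded at approval, and HMAC-SHA256 with a shared key stands in for whatever signing scheme a team actually uses (an asymmetric signature would normally be preferable).

```python
import hashlib
import hmac
import json


def sign_manifest(manifest: dict, key: bytes) -> str:
    """Return an HMAC-SHA256 tag over a canonical JSON encoding of the manifest."""
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def gate_launch(manifest: dict, tag: str, key: bytes, current: dict) -> list:
    """Return a list of findings; an empty list means the job may launch.

    `manifest` maps script name -> approved SHA-256 (assumed format).
    `current` maps script name -> SHA-256 observed in execution storage
    immediately before launch.
    """
    # Reject the whole manifest if its tag does not verify.
    if not hmac.compare_digest(sign_manifest(manifest, key), tag):
        return ["manifest signature invalid"]
    findings = []
    for name, approved in manifest.items():
        observed = current.get(name)
        if observed is None:
            findings.append(f"{name}: missing from execution storage")
        elif observed != approved:
            findings.append(f"{name}: modified after approval")
    # Files that were never approved are also treated as findings.
    for name in sorted(current.keys() - manifest.keys()):
        findings.append(f"{name}: unapproved file present")
    return findings
```

Any non-empty result would block the launch and feed the alerting and secret-rotation steps. How those findings are routed is left out because it depends on the team's tooling.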

NHIMG’s research on the CI/CD pipeline exploitation case study shows how trusted build paths are frequently abused once mutable artefacts are reachable. These controls tend to break down when linked storage doubles as a collaboration workspace because multiple teams, automation jobs, and service principals share the same writable path.

Common Variations and Edge Cases

Tighter script immutability often increases delivery friction, requiring organisations to balance release speed against the risk of runtime tampering. That tradeoff is especially visible in experimentation-heavy AML environments where notebooks, scripts, and data drops are constantly changing. There is no universal standard for this yet, but best practice is evolving toward separating exploratory work from production execution and enforcing a controlled promotion step.

Edge cases appear when scripts are generated dynamically, chained from multiple storage locations, or partially templated by upstream jobs. In those environments, integrity controls need to cover the whole chain, not just the final file. A hash check on the last script is not enough if a preceding template, environment variable, or dependency package can still be modified after approval. The same concern applies when temporary access is granted for debugging: short-lived access is safer than standing access, but it still needs explicit revocation and auditability.
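
One way to cover the whole chain rather than only the final file is to fold every input into a single digest that is recorded at approval and recomputed before launch. The sketch below assumes the chain can be enumerated as named byte blobs plus environment variables. `chain_digest` and its input layout are hypothetical, chosen only to illustrate the idea.

```python
import hashlib


def _feed(digest, kind: str, name: str, data: bytes) -> None:
    """Hash one entry with its kind and lengths so entries cannot blur together."""
    name_bytes = name.encode()
    digest.update(f"{kind}:{len(name_bytes)}:{len(data)}:".encode())
    digest.update(name_bytes)
    digest.update(data)


def chain_digest(artefacts: dict, env: dict) -> str:
    """Fold every input in the execution chain into one SHA-256 digest.

    `artefacts` maps a logical name (template, dependency lock file,
    final script) to its bytes; `env` maps environment variable names
    to string values. Entries are processed in sorted order so the
    result does not depend on dictionary ordering.
    """
    digest = hashlib.sha256()
    for name in sorted(artefacts):
        _feed(digest, "file", name, artefacts[name])
    for name in sorted(env):
        _feed(digest, "env", name, env[name].encode())
    return digest.hexdigest()
```

With this shape, a change to an upstream template or to a variable that redirects output changes the digest just as a change to the final script would, so a single comparison at launch covers every listed input. Inputs that are not enumerated remain unprotected.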

This is also where “storage permission” language becomes misleading. If a storage object is executable, the permission model is closer to code deployment than file sharing. The practical response is to treat mutable pipeline artefacts like release candidates, not documents. For broader governance context, the Guide to the Secret Sprawl Challenge is a useful reminder that operational convenience often outlives the original trust decision.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Non-Human Identity Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Non-Human Identity Top 10 | NHI-03 | Mutable pipeline scripts create secret and identity exposure paths.
CSA MAESTRO | - | Agentic workloads need runtime trust boundaries and integrity checks.
NIST AI RMF | - | This is a runtime integrity and governance risk for AI workloads.

Apply AI RMF governance to require approval, traceability, and integrity checks before execution.