TL;DR: Azure Databricks pipelines commonly rely on static service principal secrets to reach Azure APIs and SaaS targets, creating long-lived credential exposure and difficult-to-audit access, according to Aembit. Ephemeral, policy-based workload identity changes the control model by replacing stored secrets with short-lived tokens tied to verified runtime identity.
NHIMG editorial — based on content published by Aembit: Azure Databricks pipeline secrets are the real identity risk
By the numbers:
- 97% of NHIs carry excessive privileges, increasing unauthorised access and broadening the attack surface.
- 91.6% of secrets remain valid five days after the targeted organisation is notified, showing a critical gap in remediation procedures.
- 96% of organisations store secrets outside of secrets managers in vulnerable locations including code, config files, and CI/CD tools.
Questions worth separating out
Q: How should security teams replace static secrets in Databricks pipelines?
A: Replace stored service principal secrets with workload attestation and short-lived token exchange.
Q: Why do static credentials in data pipelines create NHI risk?
A: Static credentials create NHI risk because they persist beyond the business need that created them.
Q: What breaks when pipeline identity is not scoped tightly enough?
A: When identity is too broad, one job can inherit access intended for another job on the same cluster.
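The answers above can be illustrated with a toy, in-memory model of short-lived token exchange. This is only a sketch under stated assumptions, not Aembit's or Azure's implementation: the names `BROKER_KEY`, `POLICY`, `issue_token`, and `verify_token` are hypothetical, and the job identity is assumed to have already been attested by the platform before the broker is called.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical signing key held only by the token broker, never by the pipeline.
BROKER_KEY = b"demo-only-key"

# Hypothetical policy: which (already attested) job identity may reach which target.
POLICY = {
    "job:graph-sync": {"microsoft-graph"},
    "job:crm-export": {"salesforce"},
}


def _sign(payload: bytes) -> str:
    """HMAC-SHA256 signature over the encoded claims."""
    return hmac.new(BROKER_KEY, payload, hashlib.sha256).hexdigest()


def issue_token(job_id: str, target: str, ttl_seconds: int = 300,
                now: Optional[float] = None) -> str:
    """Issue a short-lived token only if policy allows this job to reach the target."""
    if target not in POLICY.get(job_id, set()):
        raise PermissionError(f"{job_id} is not allowed to access {target}")
    issued_at = time.time() if now is None else now
    claims = {"sub": job_id, "aud": target, "exp": issued_at + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    return payload.decode() + "." + _sign(payload)


def verify_token(token: str, target: str, now: Optional[float] = None) -> dict:
    """Check signature, expiry, and audience; return the claims if all pass."""
    payload_b64, signature = token.rsplit(".", 1)
    if not hmac.compare_digest(signature, _sign(payload_b64.encode())):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    if claims["aud"] != target:
        raise ValueError("wrong audience")
    return claims
```

In this model the pipeline never holds a reusable secret: a token for `job:graph-sync` stops working once its TTL passes, and the same job cannot obtain a token for `salesforce` even if it shares a cluster with `job:crm-export`.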
Practitioner guidance
- Remove embedded service principal secrets from pipeline configuration: Replace hardcoded credentials in Databricks jobs with verified workload identity and short-lived token exchange so the pipeline never stores reusable secrets in code, variables, or job settings.
- Scope access at the pipeline level where workloads share a cluster: Use process identifiers or job-specific identity claims so a narrow pipeline cannot inherit the broader access profile of another workload running on the same cluster.
- Audit every downstream service touched by Databricks pipelines: Map which pipelines call Microsoft Graph, Azure resources, Salesforce, or other APIs, then verify whether each connection still depends on a standing secret rather than runtime attestation.
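The audit step in the guidance above could start from a simple inventory check like the one below. This is a minimal sketch: the `Connection` record, the `auth_method` labels, and the pipeline names are all hypothetical, and a real inventory would have to be built from the team's own job definitions and configuration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Connection:
    pipeline: str      # hypothetical pipeline name
    target: str        # downstream service the pipeline calls
    auth_method: str   # hypothetical label, e.g. "client_secret" or "workload_identity"


# Assumed labels for authentication methods that rely on a reusable, stored secret.
STANDING_SECRET_METHODS = {"client_secret", "api_key", "password"}


def find_standing_secrets(connections: Iterable[Connection]) -> Dict[str, List[str]]:
    """Group, by pipeline, the targets still reached with a standing secret."""
    flagged: Dict[str, List[str]] = {}
    for conn in connections:
        if conn.auth_method in STANDING_SECRET_METHODS:
            flagged.setdefault(conn.pipeline, []).append(conn.target)
    return flagged


# Illustrative inventory; the entries are made up for the example.
inventory = [
    Connection("graph-sync", "microsoft-graph", "workload_identity"),
    Connection("crm-export", "salesforce", "client_secret"),
    Connection("crm-export", "azure-storage", "workload_identity"),
    Connection("billing-load", "azure-sql", "password"),
]

for pipeline, targets in find_standing_secrets(inventory).items():
    print(f"{pipeline}: still uses a standing secret for {', '.join(targets)}")
```

Keeping the inventory as plain records makes it easy to rerun the check whenever a pipeline is added or changed, rather than treating the audit as a one-off exercise.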
What's in the full article
Aembit's full blog post covers the operational detail this post intentionally leaves for the source:
- Step-by-step Databricks attestation flow using Azure Managed Identity and Databricks-issued OIDC tokens
- Policy examples for scoping one pipeline to Microsoft Graph while separating another from Salesforce access
- Configuration paths through the console, API, and Terraform provider for teams managing identity as code
- SIEM export details for attestation events, policy decisions, and downstream credential requests
👉 Read Aembit's analysis of secretless Azure Databricks pipeline access →
Azure Databricks pipeline secrets: what IAM teams need to know
Explore further
Static pipeline secrets are a standing NHI exposure problem, not a Databricks convenience issue. The article describes a familiar pattern in which a pipeline authenticates with a reusable service principal secret that remains valid for months or years. That is not just poor hygiene; it is a governance model that assumes the credential's lifecycle will match the workload's lifecycle. In reality, data pipelines change faster than their secrets do. The conclusion for practitioners is that any design relying on embedded client secrets creates persistent NHI risk by default.
A few things that frame the scale:
- 91.6% of secrets remain valid five days after the targeted organisation is notified, showing a critical gap in remediation procedures, according to the Ultimate Guide to NHIs.
- Only 5.7% of organisations have full visibility into their service accounts, which is why pipeline-linked secrets are so often missed in inventory and offboarding.
A question worth separating out:
Q: Who is accountable when a pipeline secret is leaked or reused?
A: Accountability sits with the team that allowed standing access to persist after the original operational context changed. If the secret remains valid after offboarding, project completion, or environment reuse, the control failure is governance, not simply exposure detection.
👉 Read our full editorial: Azure Databricks pipeline secrets are the real identity risk