Vertex AI misconfigurations expose privilege escalation in cloud AI

By NHI Mgmt Group Editorial Team | Published 2026-06-04 | Domain: Agentic AI & NHIs | Source: Unosecur

TL;DR: Misconfigured Vertex AI service agents can be over-permissioned, allowing attackers to create malicious jobs, escalate privileges, and exfiltrate sensitive data, according to Unosecur. The governance failure is not AI itself but cloud identity design that grants platform accounts more access than their job requires.

At a glance

What this is: An independent analysis of how Vertex AI service agent misconfigurations can create privilege escalation paths and expose sensitive data.

Why it matters: AI platform identities now sit inside the same governance problem space as service accounts, workload identity, and least-privilege access models across modern IAM programmes.

👉 Read Unosecur's analysis of Vertex AI misconfigurations and privilege escalation

Context

Vertex AI misconfiguration becomes an identity governance problem the moment a platform service agent receives broader access than the job actually needs. In that state, the issue is not model quality or deployment speed but whether cloud IAM has made the service account too powerful to contain.

For IAM, IGA, and cloud security teams, the practical concern is that AI platform accounts often inherit permissions that were designed for convenience rather than task scope. Once those identities can create jobs, reach metadata, or touch storage and BigQuery paths they should not own, privilege escalation becomes a direct governance outcome rather than an edge case.

Key questions

Q: How should security teams restrict Vertex AI service agents without breaking workloads?

A: Start by separating platform convenience from business necessity. Give each Vertex AI service agent only the permissions required for one job class, one data path, and one environment. Then test whether the workload still functions when storage, metadata, or job creation rights are removed. If it still functions, the role was too broad. If it fails, restore only the specific permission the job needs.

Q: Why do over-permissioned AI platform identities increase breach risk?

A: Because they turn a single workload compromise into an identity pivot. When an AI service account can create jobs, access metadata, and reach data stores, attackers can move from execution to credential reuse and exfiltration without needing to break the AI model itself. That is a privilege design failure, not an AI model failure.

Q: What signals show that an AI workload identity is operating beyond its intended scope?

A: Look for unexpected job creation, unusual container image sources, metadata reads from training environments, and access to storage or analytics services that the workload does not normally use. A workload identity that starts touching adjacent resources is usually the earliest sign that scope has drifted.
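The scope-drift signal described above can be sketched as a baseline comparison. This is a minimal illustration, not a detection product: the event shape (`principal`, `service`), the `EXPECTED_SERVICES` baseline, and the example service account are all hypothetical, and real audit log entries would need to be flattened into this shape first.

```python
# Hypothetical baseline: the services each workload identity is expected to call.
EXPECTED_SERVICES = {
    "trainer@example-project.iam.gserviceaccount.com": {
        "aiplatform.googleapis.com",
        "storage.googleapis.com",
    },
}


def find_scope_drift(events, expected=EXPECTED_SERVICES):
    """Return events where an identity calls a service outside its baseline."""
    drift = []
    for event in events:
        allowed = expected.get(event["principal"])
        # Identities without a baseline are skipped in this sketch; a real
        # review would likely treat them as findings in their own right.
        if allowed is not None and event["service"] not in allowed:
            drift.append(event)
    return drift


# Simplified, made-up events for illustration only.
events = [
    {"principal": "trainer@example-project.iam.gserviceaccount.com",
     "service": "storage.googleapis.com"},
    {"principal": "trainer@example-project.iam.gserviceaccount.com",
     "service": "bigquery.googleapis.com"},
]
drift = find_scope_drift(events)
```

Here the BigQuery call is flagged because the training identity's assumed baseline covers only Vertex AI and Cloud Storage.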

Q: Who should own governance when AI platform accounts are over-privileged?

A: Ownership should sit jointly with cloud IAM, platform engineering, and the security team, because the failure spans role design, workload execution, and monitoring. Vertex AI identities are not just application accounts. They are non-human identities with cloud-wide consequences when lifecycle and access reviews are weak.

Technical breakdown

How Vertex AI service agents become an escalation path

Vertex AI relies on service agents to perform platform actions on behalf of the environment, which makes them a form of non-human identity. The security problem appears when those identities are granted project-wide or platform-wide permissions that exceed the specific training or deployment task. Attackers do not need to break the AI model to abuse this layer. They only need a path into a misconfigured job or container, after which the over-privileged service agent can be used to reach additional resources, metadata services, and downstream data stores.

Practical implication: inventory every Vertex AI-related service agent and reduce each one to the narrowest task-scoped role.
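One way to start that inventory is to scan an exported project IAM policy for Vertex AI service agents that hold broad roles. The sketch below assumes the policy has been exported as JSON with the usual `bindings` list of `role` and `members`, and that service agents follow the `service-<PROJECT_NUMBER>@gcp-sa-aiplatform...` naming pattern. The `BROAD_ROLES` set and the sample policy are illustrative assumptions, not a recommended baseline.

```python
import re

# Assumption: Vertex AI service agents are named
# service-<PROJECT_NUMBER>@gcp-sa-aiplatform[-suffix].iam.gserviceaccount.com
SERVICE_AGENT = re.compile(
    r"^serviceAccount:service-\d+@gcp-sa-aiplatform(-[a-z0-9-]+)?"
    r"\.iam\.gserviceaccount\.com$"
)

# Illustrative set of broad roles to flag; tune this for your environment.
BROAD_ROLES = {
    "roles/owner",
    "roles/editor",
    "roles/storage.admin",
    "roles/bigquery.admin",
}


def flag_broad_service_agent_roles(policy):
    """Return sorted (member, role) pairs where a service agent holds a broad role."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        if role not in BROAD_ROLES:
            continue
        for member in binding.get("members", []):
            if SERVICE_AGENT.match(member):
                findings.append((member, role))
    return sorted(findings)


# Made-up policy for illustration only.
sample_policy = {
    "bindings": [
        {"role": "roles/editor", "members": [
            "serviceAccount:service-123456789@gcp-sa-aiplatform.iam.gserviceaccount.com",
            "user:alice@example.com",
        ]},
        {"role": "roles/aiplatform.serviceAgent", "members": [
            "serviceAccount:service-123456789@gcp-sa-aiplatform.iam.gserviceaccount.com",
        ]},
        {"role": "roles/storage.admin", "members": [
            "serviceAccount:service-123456789@gcp-sa-aiplatform-cc.iam.gserviceaccount.com",
        ]},
    ]
}
findings = flag_broad_service_agent_roles(sample_policy)
```

The check only flags roles by name. It does not expand roles into permissions, so a custom role that is broad in practice would need to be added to the flagged set by hand.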

Custom training jobs and container injection as the entry point

Custom training jobs extend flexibility by letting users run code and containers in managed environments, but that flexibility also expands the attack surface. If permissions allow unauthorized job creation, an attacker can inject harmful commands or swap in a compromised custom container to execute arbitrary actions inside the platform context. That is not a model compromise. It is a control-plane abuse problem in which identity permissions are the real enforcement boundary. Once the job starts with the wrong authority attached, every subsequent action inherits that mistake.

Practical implication: restrict custom job creation to approved identities and validate container provenance before execution.
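A provenance gate can be as simple as refusing any image that is not from an approved registry path and pinned by digest. The following is a minimal sketch under stated assumptions: the `TRUSTED_PREFIXES` allowlist and the example repository are hypothetical, and the check covers only source and digest pinning. It does not verify signatures, which would need separate tooling.

```python
# Hypothetical allowlist of trusted registry and repository prefixes.
TRUSTED_PREFIXES = (
    "us-docker.pkg.dev/example-project/ml-images/",
)

HEX_DIGITS = set("0123456789abcdef")


def is_image_allowed(image_uri, trusted_prefixes=TRUSTED_PREFIXES):
    """Allow only images from a trusted prefix that are pinned by sha256 digest."""
    if not image_uri.startswith(trusted_prefixes):
        return False
    _, separator, digest = image_uri.partition("@sha256:")
    # A pinned reference carries a 64-character lowercase hex digest.
    return bool(separator) and len(digest) == 64 and set(digest) <= HEX_DIGITS


pinned = "us-docker.pkg.dev/example-project/ml-images/trainer@sha256:" + "a" * 64
tagged = "us-docker.pkg.dev/example-project/ml-images/trainer:latest"
foreign = "docker.io/someone/trainer@sha256:" + "a" * 64

results = [is_image_allowed(uri) for uri in (pinned, tagged, foreign)]
```

Rejecting mutable tags such as `:latest` matters because a tag can be repointed at a different image after review, while a digest cannot.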

Why metadata access and data exfiltration follow privilege escalation

Cloud metadata services and attached storage paths are high-value targets because they often contain credentials, tokens, and data access paths that extend far beyond the original workload. In a misconfigured Vertex AI deployment, a successful escalation can turn a training job into a springboard for credential theft and exfiltration from Cloud Storage or BigQuery. The architecture failure is cumulative. A single over-broad role may not look dangerous in isolation, but combined with code execution inside a managed job it becomes a route to sensitive data and broader cloud compromise.

Practical implication: treat metadata exposure, storage access, and dataset permissions as part of the same identity boundary.

Threat narrative

Attacker objective: The attacker aims to turn a platform-level AI workload into a cloud identity pivot that exposes credentials and sensitive business data.

Entry occurs when an attacker gains a foothold through a misconfigured Vertex AI job path, such as unauthorized custom job creation or container-based execution.
Credential access follows when the over-permissioned service agent or attached environment exposes metadata, tokens, or adjacent cloud permissions that the attacker can reuse.
Impact occurs when those credentials are used to escalate privileges, exfiltrate sensitive data, and extend access into Cloud Storage, BigQuery, or other cloud resources.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
Azure Key Vault privilege escalation exposure — Azure Key Vault Contributor role misconfiguration enabled privilege escalation.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Over-permissioned AI platform identities are now a first-order cloud governance problem. Vertex AI service agents are not just operational plumbing. When they are granted access beyond the job boundary, they become a reusable control-plane identity that can be driven into privilege escalation and data access paths. That makes AI platform governance part of IAM, not a separate AI security conversation. Practitioners should treat every platform-managed agent as a scoped identity with measurable blast radius.

Least privilege fails in AI platforms when job execution authority is broader than job intent. The permissions model was designed for scheduled or operator-driven actions where intent is known in advance. That assumption breaks when custom jobs can be created, altered, or chained into unintended execution paths inside a managed AI environment. The implication is that access scope must be judged against runtime behaviour, not just declared workflow purpose.

Control-plane abuse through custom training jobs is a named failure mode, not a generic cloud risk. This article shows how the aiplatform.customJobs.create path can become an escalation primitive when role design is too loose. The problem is not merely that permissions exist. It is that identity and execution authority are fused in a way that lets a low-trust action trigger high-trust infrastructure behaviour. Practitioners should map every AI job path to its privilege boundary.
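Mapping a job path to its privilege boundary starts with knowing who can call it. The sketch below lists every member whose role grants `aiplatform.customJobs.create`. It assumes you supply a role-to-permissions map yourself; the custom role names, their permission sets, and the sample policy here are hypothetical and do not describe any predefined Google Cloud role.

```python
TARGET_PERMISSION = "aiplatform.customJobs.create"


def principals_with_permission(policy, role_permissions, permission=TARGET_PERMISSION):
    """Map each member to the sorted list of roles that grant it `permission`."""
    granted = {}
    for binding in policy.get("bindings", []):
        role = binding["role"]
        if permission in role_permissions.get(role, ()):
            for member in binding.get("members", []):
                granted.setdefault(member, set()).add(role)
    return {member: sorted(roles) for member, roles in granted.items()}


# Hypothetical custom roles and their permissions, for illustration only.
role_permissions = {
    "projects/example-project/roles/mlTrainer": {
        "aiplatform.customJobs.create",
        "aiplatform.customJobs.get",
    },
    "projects/example-project/roles/mlViewer": {
        "aiplatform.customJobs.get",
    },
}

# Made-up policy bindings.
policy = {
    "bindings": [
        {"role": "projects/example-project/roles/mlTrainer",
         "members": ["serviceAccount:pipeline@example-project.iam.gserviceaccount.com"]},
        {"role": "projects/example-project/roles/mlViewer",
         "members": ["user:analyst@example.com"]},
    ]
}
creators = principals_with_permission(policy, role_permissions)
```

Only the pipeline account appears in the result, since the viewer role in this example can read jobs but not create them.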

Vertex AI misconfiguration exposes identity blast radius across storage, metadata, and downstream data platforms. Once a service agent can touch adjacent resources, the compromise path expands far beyond the initial job. That is why AI workload governance must be evaluated as an identity chain, not as a single application control. The practical conclusion is that cloud AI security and workload identity governance must be reviewed together.

OWASP NHI guidance applies directly to AI service agents because the breach pattern is fundamentally non-human identity abuse. Over-privileged service accounts, secret exposure, and weak lifecycle governance are the same structural issues that appear across machine identity incidents. Vertex AI simply makes the failure more visible because the platform can combine code execution with broad cloud permissions. Practitioners should align AI platform identities to the same governance standards used for other NHIs.

From our research:
19% of organisations give AI systems dramatically more access than human employees, nearly one in five granting unrestricted privilege, according to The 2026 Infrastructure Identity Survey.
Only 44% of organisations have implemented any policies to manage their AI agents, despite 92% agreeing that governing AI agents is critical to enterprise security.
For adjacent context, the 52 NHI Breaches report shows how identity mistakes become breach paths long before defenders notice.

What this signals

Vertex AI-style misconfiguration will keep showing up wherever AI workloads inherit cloud permissions by default. The programme implication is simple: identity teams need to review AI platform accounts with the same scrutiny they apply to service accounts, API keys, and other workload identities. When a platform agent can create jobs and touch data stores, the blast radius is already larger than most teams assume.

Identity blast radius: the useful metric here is not how many AI workloads exist, but how far each one can move once compromised. Teams should measure whether a job identity can reach metadata, storage, analytics, and orchestration layers without additional approval. That is the difference between a contained workload and a reusable compromise path.
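As a rough illustration of that metric, the sketch below counts how many layers an identity's permissions touch. The prefix-to-layer mapping is an assumption made for this example, and it models only IAM permissions. Metadata reachability from inside a running job is not an IAM grant, so it would need to be assessed separately.

```python
# Assumed mapping from permission prefix to layer, for illustration only.
LAYER_BY_PREFIX = {
    "storage.": "storage",
    "bigquery.": "analytics",
    "aiplatform.": "orchestration",
    "iam.": "credentials",
}


def blast_radius(permissions):
    """Return the sorted list of layers that the given permissions touch."""
    layers = set()
    for permission in permissions:
        for prefix, layer in LAYER_BY_PREFIX.items():
            if permission.startswith(prefix):
                layers.add(layer)
    return sorted(layers)


# A narrowly scoped identity versus one that has drifted.
scoped = blast_radius(["aiplatform.customJobs.get"])
drifted = blast_radius([
    "aiplatform.customJobs.create",
    "storage.objects.get",
    "bigquery.tables.getData",
])
```

Comparing the two lists over time gives a simple way to see whether an identity's reach is growing, even when the number of AI workloads stays the same.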

With 67% of organisations still relying heavily on static credentials despite the risks they pose to agentic AI deployments, per The 2026 Infrastructure Identity Survey, the governance gap is structural. Vertex AI misconfigurations are one expression of a broader problem: cloud AI systems are being onboarded faster than identity controls are being redesigned.

For practitioners

Scope every Vertex AI service agent to task-level permissions: Replace broad platform-wide roles with minimum required permissions for each training, deployment, and orchestration path. Review attached roles for customJobs, storage, and metadata access together so one identity cannot move laterally across the environment.
Block unauthorized custom job creation: Limit who can create or modify custom training jobs, and require approval for workloads that can run arbitrary code or custom containers. Treat job creation as a privileged action because it can become the first step in privilege escalation.
Validate container provenance before execution: Require trusted image sources, signed artifacts, and controlled registries for any container used in Vertex AI workflows. A compromised container turns a managed AI job into a code execution path with inherited identity privileges.
Monitor metadata and storage access as one control surface: Alert on unusual reads from metadata services, Cloud Storage, and BigQuery from AI workload identities. These signals often appear after the initial job abuse and can indicate credential theft or data exfiltration in progress.
Map AI platform identities to your NHI governance model: Include service agents, workload identities, and job-specific credentials in the same inventory, review cycle, and offboarding process you use for other non-human identities. This prevents AI platform accounts from becoming unmanaged standing privileges.

Key takeaways

Vertex AI misconfiguration is an identity problem first and an AI problem second, because over-privileged service agents create the escalation path.
The evidence points to a familiar pattern across cloud AI: job creation, container abuse, metadata access, and data exfiltration can all chain through one weak identity boundary.
Teams should narrow AI platform roles, control custom job creation, and treat workload identities as part of the same governance model as the rest of NHI.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST Zero Trust (SP 800-207) and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10		Covers over-privileged service agents and secret-adjacent identity abuse in AI workloads.
NIST Zero Trust (SP 800-207)	PR.AC-4	Directly addresses least-privilege access and trust boundaries for AI workload identities.
NIST CSF 2.0	PR.AA-01 (formerly PR.AC-1)	Access control governance is central to preventing misuse of AI platform identities.

Map AI platform accounts into access governance reviews and remove unnecessary entitlements.

Key terms

Vertex AI service agent: A platform-managed identity that Vertex AI uses to act on behalf of the environment. In practice, it is a non-human identity with permissions that can reach jobs, data stores, and metadata services. If those permissions are too broad, the agent becomes a governance liability rather than a convenience feature.
Custom training job: An AI workload that runs user-defined code or containers inside a managed environment. It is useful because it provides flexibility, but it also creates a privileged execution path. When creation rights are too broad, it can become the entry point for code execution and identity abuse.
Identity blast radius: The amount of damage a compromised identity can cause before it is contained. For AI workloads, it includes not only the original job but also adjacent storage, metadata, and analytics paths. The smaller the blast radius, the less a single misconfiguration can turn into cloud-wide exposure.
Control-plane abuse: Misuse of management-layer permissions to create, alter, or extend infrastructure behaviour without legitimate authorization. In AI platforms, it happens when job creation or orchestration rights are strong enough to drive privileged actions. The attack is about identity and execution authority, not model performance.

What's in the full article

Unosecur's full blog covers the operational detail this post intentionally leaves for the source:

Specific IAM Analyzer examples showing how excessive Vertex AI permissions are detected in practice
Identity Threat Detection and Response patterns for spotting unauthorized job creation and abnormal service-agent activity
Dynamic policy enforcement detail for tightening roles without interrupting AI workflows
Configuration guidance for Bring Your Own Service Account setups in Vertex AI environments

👉 Unosecur's full post covers the misconfiguration paths, service-agent risks, and mitigation steps in more detail

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity security are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or NHI governance in your organisation, it is worth exploring.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-06-04.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org