Google Vertex AI agents still need cloud-neutral authorization

By NHI Mgmt Group Editorial Team | Published 2026-06-12 | Domain: Agentic AI & NHIs | Source: Descope

TL;DR: Google Vertex AI Agent Engine provides strong workload identity and Google Cloud authorization, but stops short of per-request application authorization, fleet-wide credential management, and resource authorization for MCP and backend APIs, according to Descope. The gap is that agent identity alone does not decide what each request may do, for which user, and with whose approval.

At a glance

What this is: An analysis of what Google Vertex AI Agent Engine covers for agent identity and where application authorization, credential vaulting, and MCP resource access still need another layer.

Why it matters: IAM teams need to separate workload attestation from request-level authorization when agents call multiple services, handle user context, or operate across clouds.


Context

Google Vertex AI Agent Engine gives agents an attested SPIFFE-based identity and binds Google Cloud access tokens to that runtime identity, but that only solves part of the governance problem. The remaining question is not whether the workload is real, but what that workload should be allowed to do on each request across application resources, MCP servers, and third-party services.

That distinction matters for NHI governance because agent identity, token issuance, credential storage, and approval all sit in different control layers. If those layers are collapsed into a single cloud IAM model, teams end up with strong runtime identity inside one platform and weak authorization everywhere else. For teams building agentic systems, the control boundary has to follow the token flow, not the vendor boundary.

Key questions

Q: How should security teams govern AI agents that call multiple APIs and MCP servers?

A: They should separate runtime identity, application authorisation, and credential retrieval into distinct controls. The agent can be attested by the platform, but the scopes it receives for each request should be decided at issuance against the invoking user, tenant, and target resource. That prevents broad platform identity from becoming broad application privilege.

Q: Why is cloud IAM alone not enough for agentic workloads?

A: Cloud IAM governs only what a workload may do inside a cloud control plane. Agentic workloads also need request-level decisions for user context, consent, and resource scopes across MCP servers and external APIs. Without that extra layer, the application has to re-implement authorisation logic in code, which is harder to audit and more prone to drift.

Q: What breaks when agent credentials are stored only in the runtime environment?

A: Long-lived secrets become available to any tool path, prompt leak, or misconfigured connector that can reach the runtime. A credential broker pattern reduces that exposure by keeping raw secrets in a vault and exchanging them only when policy approves the request. The key failure is uncontrolled secret persistence, not just weak storage.

Q: Who is accountable for delegated access decisions in agent workflows?

A: Accountability should rest with the system that issues the scope, the approver if one is involved, and the team operating the policy. Logging the agent’s action alone is not enough because it does not explain why the privilege existed. Teams need records that link the user, tenant, approval, and granted scope.

Technical breakdown

SPIFFE-based agent identity in Vertex AI

Google’s agent identity model uses SPIFFE-style cryptographic attestation so an agent running in Vertex AI can be tied to a workload identity rather than a shared secret or generic service account. The key security property is that the identity is provisioned by the runtime and bound to certificates, which makes token theft less useful outside that runtime. But this solves workload authenticity, not request authorisation. The agent is known to the platform, yet the platform still does not decide what that agent may do in an application context for a given user.

Practical implication: treat runtime attestation as a baseline control, not as permission to skip per-request authorisation.
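
To make the distinction concrete, here is a minimal Python sketch that keeps the two checks separate. It only parses a SPIFFE ID string of the form spiffe://<trust-domain>/<path>; real attestation also verifies the certificate (SVID) that carries the ID, which is omitted here. The trust domain, the function names, and the scope strings are all hypothetical and are not taken from Vertex AI or Descope.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Hypothetical trust domain, used for illustration only.
TRUSTED_DOMAIN = "agents.example.internal"


@dataclass(frozen=True)
class SpiffeId:
    trust_domain: str
    path: str


def parse_spiffe_id(uri: str) -> SpiffeId:
    """Parse a SPIFFE ID string. This does NOT verify any certificate."""
    parts = urlsplit(uri)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be 'spiffe'")
    if not parts.netloc:
        raise ValueError("missing trust domain")
    if parts.query or parts.fragment:
        raise ValueError("SPIFFE IDs carry no query or fragment")
    return SpiffeId(trust_domain=parts.netloc, path=parts.path)


def is_attested_workload(uri: str) -> bool:
    """Baseline check: is this workload from the trust domain we expect?"""
    try:
        return parse_spiffe_id(uri).trust_domain == TRUSTED_DOMAIN
    except ValueError:
        return False


def may_act(uri: str, user_scopes: set, requested_scope: str) -> bool:
    """Per-request check: a known workload AND a scope the user actually holds."""
    return is_attested_workload(uri) and requested_scope in user_scopes
```

In this sketch an authentic agent is still refused when the invoking user does not hold the requested scope, which is the separation the section describes.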

Cloud IAM is not application authorization

Google Cloud IAM answers which Google Cloud resources an identity can access, but it is not a full resource authorization server for your own APIs or MCP servers. In this model, roles are assigned ahead of time and the per-user question is pushed into application code. That creates a governance split: infrastructure access is centrally managed, while user-scoped token decisions, scope issuance, and consent remain bespoke. The consequence is that application authorization becomes inconsistent whenever agents move outside one cloud or one control plane.

Practical implication: separate cloud roles from application scopes and evaluate both at issuance.
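
One way to picture an issuance-time decision is as a set intersection: the agent can receive only scopes that both the invoking user (within a tenant) and the agent registration allow, and an over-scoped request is denied outright. The sketch below assumes in-memory policy tables; every identifier, scope string, and table in it is hypothetical, and a cloud role check would sit alongside it rather than inside it.

```python
from dataclasses import dataclass

# Hypothetical policy data, for illustration only.
USER_SCOPES = {
    ("alice", "tenant-a"): frozenset({"invoices:read", "invoices:write"}),
}
AGENT_SCOPES = {
    "billing-agent": frozenset({"invoices:read", "reports:read"}),
}


@dataclass(frozen=True)
class IssuanceRequest:
    user_id: str
    tenant_id: str
    agent_id: str
    requested_scopes: frozenset


@dataclass(frozen=True)
class IssuanceDecision:
    allowed: bool
    granted: frozenset
    denied: frozenset


def decide(req: IssuanceRequest) -> IssuanceDecision:
    """Grant only what the user and the agent are both allowed to hold."""
    user = USER_SCOPES.get((req.user_id, req.tenant_id), frozenset())
    agent = AGENT_SCOPES.get(req.agent_id, frozenset())
    ceiling = user & agent
    denied = req.requested_scopes - ceiling
    if denied:
        # Deny the whole exchange instead of silently narrowing it.
        return IssuanceDecision(allowed=False, granted=frozenset(), denied=denied)
    return IssuanceDecision(allowed=True, granted=req.requested_scopes, denied=frozenset())
```

Returning the denied scopes alongside the outcome is a design choice made here so that the refusal itself can be logged.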

Credential vaulting and token exchange for agent tools

Agentic systems need a credential broker because models and tools should not hold long-lived secrets in their environment. A vault-backed exchange pattern keeps raw API keys, OAuth client credentials, and delegated tokens out of the model context, while tools retrieve scoped credentials when needed. That reduces exposure, but the security value comes from the exchange policy, not just from storage. If retrieval is not authorised at issuance, the vault becomes a secret dispenser rather than a governance control.

Practical implication: enforce policy at token exchange time and remove static secrets from agent environments.
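
A rough sketch of the broker pattern, assuming an in-memory vault and a policy expressed as a set of approved (agent, connector, scope) triples. The class and field names are hypothetical, and the short-lived token is a random stand-in: a real broker would use the stored secret to mint a downstream credential, which is outside the scope of this example.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedCredential:
    token: str
    scope: str
    expires_at: float


class CredentialBroker:
    """Hypothetical broker: raw secrets stay inside, policy gates every exchange."""

    def __init__(self, vault: dict, policy: set, ttl_seconds: int = 300):
        self._vault = dict(vault)   # connector -> raw secret, never returned
        self._policy = set(policy)  # approved (agent_id, connector, scope) triples
        self._ttl = ttl_seconds

    def exchange(self, agent_id: str, connector: str, scope: str, now: float = None) -> ScopedCredential:
        # Policy is checked first, so an unapproved caller learns nothing about the vault.
        if (agent_id, connector, scope) not in self._policy:
            raise PermissionError(f"{agent_id} is not approved for {scope} on {connector}")
        if connector not in self._vault:
            raise KeyError(f"no secret stored for {connector}")
        issued = time.time() if now is None else now
        # Stand-in for minting a downstream token from the stored secret.
        return ScopedCredential(
            token=secrets.token_urlsafe(32),
            scope=scope,
            expires_at=issued + self._ttl,
        )
```

The tool receives only the short-lived credential, so a leaked prompt or log in this sketch exposes a token that expires rather than the stored secret.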

NHI Mgmt Group analysis

Cloud-attested agent identity does not equal governed agent authority. Vertex AI’s SPIFFE-based attestation answers the question of whether the agent is authentic inside Google Cloud, but not the question of whether the agent should be allowed to act for a specific user or resource. That matters because identity proof and authorisation are not interchangeable control layers. Practitioners should resist treating runtime attestation as a complete governance model for agentic access.

The missing control is issuance-time decisioning, not another login flow. The article shows that Google Identity Platform authenticates users, but it does not function as an authorisation server for the resources the agent will call. That leaves a policy gap between sign-in and token use, especially for MCP and backend APIs. The practical conclusion is that request-level scope control must sit where the token is born, not where the user signs in.

Cloud-neutral agent governance is becoming the real control plane for NHI fleets. The strongest design pattern here is not vendor integration, but separation of runtime identity from fleet-wide authority. One system proves the workload, another governs what the workload may do across clouds, tenants, and applications. That reflects where NHI governance is heading: identity is increasingly distributed, but policy still has to be consistent.

Delegated access needs an audit trail that explains why a scope existed. Logging that an agent called a resource is not enough if the organisation cannot show who authorised the scope, under what tenant, and with which approval. The article’s value is in highlighting that denied scopes, CIBA approvals, and token issuance inputs are the real governance artefacts. Practitioners should demand logs that capture authorisation rationale, not only execution traces.
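
As an illustration of what such a log entry might contain, the sketch below serialises one issuance decision as a JSON line that records the inputs, the granted and denied scopes, and any linked approval. The field names and the policy_version value are assumptions made for this example, not a schema from Descope or Google.

```python
import json
from datetime import datetime, timezone


def issuance_audit_record(*, user_id, tenant_id, agent_id, resource,
                          requested, granted, policy_version,
                          approval_id=None, issued_at=None):
    """Build one JSON log line that explains why a scope was (or was not) issued."""
    requested, granted = set(requested), set(granted)
    if not granted <= requested:
        raise ValueError("granted scopes must be a subset of requested scopes")
    timestamp = issued_at or datetime.now(timezone.utc)
    record = {
        "issued_at": timestamp.isoformat(),
        "user_id": user_id,
        "tenant_id": tenant_id,
        "agent_id": agent_id,
        "resource": resource,
        "requested_scopes": sorted(requested),
        "granted_scopes": sorted(granted),
        "denied_scopes": sorted(requested - granted),
        "approval_id": approval_id,        # None when no approval was required
        "policy_version": policy_version,  # which policy produced the decision
    }
    return json.dumps(record, sort_keys=True)
```

Recording denied scopes next to granted ones means a later investigation can see what the agent asked for, not only what it was given.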

Named concept: application authorization drift. This is the gap that appears when cloud IAM is mistaken for end-to-end agent governance and application-specific scopes are left to ad hoc code paths. The drift is structural because the control plane that knows the workload is not the same plane that decides user-scoped access across APIs and MCP servers. The implication is that teams need a single policy model for issuance, exchange, and approval across the fleet.

From our research:
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%), according to the AI Agents: The New Attack Surface report.
Only 52% of companies can track and audit the data their AI agents access, leaving the other 48% with a blind spot for compliance and breach investigation.
That is why practitioners should pair runtime identity controls with a governance model for scope, approval, and audit, as explored in the OWASP NHI Top 10.

What this signals

Cloud-neutral agent governance is becoming the practical requirement for teams that run agents across more than one runtime. If the policy layer only exists in one cloud, the organisation ends up with uneven control quality by deployment location, not by risk. That is why the control plane has to follow the token flow, including issuance, exchange, and approval, rather than stopping at the workload boundary.

Application authorization drift is the failure mode to watch. Once per-request scope decisions are pushed into application code, teams lose consistency across MCP servers, internal APIs, and external connections. The more agentic the workflow becomes, the more that drift shows up as inconsistent consent handling, incomplete logs, and privileges that are broader than the user or tenant intended.

With 80% of organisations already seeing AI agents act beyond intended scope, per the AI Agents: The New Attack Surface report, this is no longer a theoretical architecture discussion. Security teams should assume that unattended agent actions will surface governance gaps faster than traditional application testing does.

For practitioners

Map the control boundary before deploying agents: Document which decisions are handled by cloud IAM, which are handled by application scopes, and which are handled by approval workflows before any production rollout. Keep the token flow explicit from user sign-in to agent session to tool credential retrieval.
Remove static secrets from agent tool paths: Replace hardcoded API keys and long-lived tokens with a vault-backed exchange pattern so tools fetch scoped credentials only when policy allows. Ensure the raw secret never enters model context or environment variables.
Issue scopes at the moment of use: Evaluate user, tenant, agent, and requested scope together at token issuance so the agent cannot request broader access than the invoking identity already holds. Deny over-scoped exchanges rather than trying to police them later in downstream APIs.
Require approval artefacts for sensitive actions: Use out-of-band approval for elevated operations and keep the approver, binding message, and granted token in one record. That makes delegated authority auditable instead of implied by session activity.
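
The approval record in the last item could look something like the following sketch. It stores a SHA-256 fingerprint of the granted token rather than the token itself, which is a choice made for this example so that the record does not become another place where a usable credential lives. The class name and fields are hypothetical; binding_message and auth_req_id follow CIBA's terminology.

```python
import hashlib
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ApprovalRecord:
    approver: str
    binding_message: str  # the text the approver saw on their device
    auth_req_id: str      # identifier of the out-of-band approval request
    granted_scope: str
    token_sha256: str     # fingerprint only; the raw token is not stored


def record_approval(approver: str, binding_message: str, auth_req_id: str,
                    granted_scope: str, access_token: str) -> ApprovalRecord:
    """Link approver, binding message, and issued token in a single record."""
    digest = hashlib.sha256(access_token.encode("utf-8")).hexdigest()
    return ApprovalRecord(approver, binding_message, auth_req_id, granted_scope, digest)


def matches(record: ApprovalRecord, presented_token: str) -> bool:
    """Check whether a token seen later is the one this approval produced."""
    digest = hashlib.sha256(presented_token.encode("utf-8")).hexdigest()
    return digest == record.token_sha256


if __name__ == "__main__":
    rec = record_approval("carol", "Refund order 1042", "req-123",
                          "refunds:create", "example-token")
    print(asdict(rec))
```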

Key takeaways

Vertex AI provides strong runtime identity, but runtime identity alone does not solve request-level authorisation for agentic workflows.
The most material control gap is issuance-time scope decisioning across users, tenants, agents, and resources.
Teams should govern agent fleets with a cloud-neutral policy layer that covers approval, vaulting, and audit across every runtime.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST Zero Trust (SP 800-207) and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-02	The article centers on credential handling and scoped access for non-human identities.
NIST Zero Trust (SP 800-207)	PR.AC-4 (NIST CSF subcategory)	The post argues for continuous, request-level authorisation rather than broad standing roles.
NIST AI RMF		Agentic governance requires explicit accountability for autonomous-like decision paths.

Apply per-request access decisions to agent sessions and keep cloud roles separate from application scopes.

Key terms

Agentic Identity: Agentic identity is the governance layer that determines how an AI agent proves who it is, what it may access, and when it may act. In practice, it combines runtime attestation, scoped authorisation, and credential exchange so the agent can operate without carrying broad standing secrets.
Credential Vault: A credential vault stores secrets, tokens, and client credentials outside the application or model runtime. For agentic systems, the vault matters because it prevents long-lived credentials from being exposed to prompts, logs, or tool code, while still allowing policy-controlled retrieval when a request is valid.
Issuance-time Policy: Issuance-time policy is the decision made before a token or credential exists. Instead of checking access only after a request reaches an API, the policy engine evaluates the user, agent, tenant, and scope first, which is how request-level authorisation stays auditable and bounded.
CIBA: CIBA, or Client-Initiated Backchannel Authentication, is a flow for out-of-band approval when an action needs human confirmation without a browser session. For agentic workflows, it is useful when the agent triggers a sensitive task and the organisation needs a separate approval record linked to the issued token.
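
For readers unfamiliar with CIBA, the sketch below builds the two form-encoded request bodies a client would send in the poll mode of the flow: one to start the backchannel authentication and one to poll the token endpoint. The parameter names (scope, login_hint, binding_message, auth_req_id) and the grant type URN come from the OpenID CIBA Core specification; the helper names are hypothetical, and client authentication and the HTTP calls themselves are deliberately left out.

```python
from urllib.parse import parse_qs, urlencode

# Grant type defined by OpenID Connect CIBA Core for the token request.
CIBA_GRANT_TYPE = "urn:openid:params:grant-type:ciba"


def backchannel_auth_body(login_hint: str, scopes: list, binding_message: str) -> str:
    """Form body for the backchannel authentication request (no client auth shown)."""
    # CIBA requests must include the 'openid' scope; add it if the caller left it out.
    scope = " ".join(["openid", *[s for s in scopes if s != "openid"]])
    return urlencode({
        "scope": scope,
        "login_hint": login_hint,
        "binding_message": binding_message,
    })


def token_poll_body(auth_req_id: str) -> str:
    """Form body for polling the token endpoint with the returned auth_req_id."""
    return urlencode({"grant_type": CIBA_GRANT_TYPE, "auth_req_id": auth_req_id})


if __name__ == "__main__":
    body = backchannel_auth_body("alice@example.com", ["refunds:create"], "Refund order 1042")
    print(parse_qs(body))
    print(parse_qs(token_poll_body("req-123")))
```

The binding message is what lets the approver confirm on their own device that the request they see is the one the agent triggered.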


This post draws on content published by Descope: Build identity-aware agents with Google Vertex AI, ADK and Descope. Read the original.
