Gemini CLI prompt injection shows the new AI tool attack surface

By NHI Mgmt Group Editorial Team
Published 2025-11-17
Domain: Breaches & Incidents
Source: Cyera

TL;DR: Command injection and prompt injection flaws in Google’s Gemini CLI showed that AI development tools can turn model interaction into system-level compromise; the issues were fixed through Google’s Vulnerability Rewards Program, according to Cyera. Access review processes assume access persists long enough to be reviewed, but AI tools can translate prompt content into privileged execution in a single session.

At a glance

What this is: Cyera’s research shows that Gemini CLI could be driven from prompt content into arbitrary command execution, exposing development systems, credentials, and model data.

Why it matters: IAM, NHI, and security teams need to treat AI development tools as privileged execution paths, not just interfaces, because the trust boundary now spans prompts, local shells, and sensitive identities.

By the numbers:

Lack of credential rotation is cited as the top cause of NHI-related attacks by 45% of organisations, followed by inadequate monitoring and logging (37%) and over-privileged accounts (37%).
When AWS credentials are exposed publicly, attackers attempt access within an average of 17 minutes and as quickly as 9 minutes in some cases.


Context

Gemini CLI sits at the point where natural language, developer workflow, and local command execution meet. That makes it a security boundary, not just a convenience layer. In practice, any tool that can turn user-controlled input into shell activity must be governed as a privileged identity surface, especially when it can reach credentials, configuration files, and model artifacts.

The risk here is not limited to classic command injection. Prompt injection expands the attack surface by letting untrusted content influence downstream execution decisions, which is why AI development tools need the same identity discipline applied to NHI-controlled automation. The first question for practitioners is not whether the model is clever enough, but whether the execution path is bounded enough to contain misuse.

Key questions

Q: How should security teams govern AI development tools that can execute shell commands?

A: Security teams should govern AI development tools as privileged execution surfaces, not as simple interfaces. That means separating prompt handling from command execution, removing shell interpolation, constraining filesystem access, and logging tool-driven actions as identity events. If the tool can reach credentials or repositories, it needs the same containment discipline used for sensitive NHI workloads.

Q: Why do prompt injection flaws become more dangerous when a CLI can access local secrets?

A: Prompt injection becomes materially more dangerous when the tool can act with inherited local privileges. In that case, the attacker is not only influencing text output. They are steering a process that may already have access to secrets, code, and deployment artefacts. That turns a model-level issue into a compromise path for adjacent systems.

Q: What do teams get wrong about command injection in AI tooling?

A: Teams often assume command injection is only a code-quality issue inside the application. In AI tooling, it is also an identity issue because the process runs with delegated authority from the user or environment. If a filename or prompt can alter execution, the real failure is the trust boundary, not just the syntax check.

Q: Who is accountable when an AI CLI tool turns a prompt into system-level access?

A: Accountability sits with the organisation that granted the process its privileges and approved the workflow architecture. The human may type the prompt, but the tool is the actor that executed the command under inherited authority. That is why AI CLI governance needs ownership, review, and containment rules at the platform level.

Technical breakdown

Command injection in CLI wrappers

A CLI wrapper becomes dangerous when it interpolates user-controlled paths or arguments into shell commands. In this case, the vulnerable pattern is shell execution with string concatenation, where metacharacters in a filename can change the command the process runs. That is a classic command injection condition, but the identity angle matters: the process often inherits the developer’s privileges, environment variables, and filesystem reach. Once the shell executes, the attacker is no longer probing the model. They are using the tool as a privileged execution bridge.

Practical implication: avoid shell interpolation for tool installation and execution paths, and treat path handling as an authorization boundary.
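
As a minimal sketch of the argument-array approach (not Gemini CLI's actual code), the Python below passes a hostile filename to a child process as a single argv element. The function names and the stand-in "installer", a child Python process that echoes its argument, are assumptions made for illustration.

```python
import subprocess
import sys


def build_install_argv(extension_path: str) -> list[str]:
    """Return an argument vector; the path is one element and no shell parses it."""
    # Hypothetical stand-in for an installer: a child Python process that
    # prints the single argument it received.
    child_code = "import sys; print(sys.argv[1])"
    return [sys.executable, "-c", child_code, extension_path]


def run_install(extension_path: str) -> str:
    """Run the stand-in installer and return what it received as its argument."""
    # shell=False is the default for a list argv, so ';', '$()' and
    # backticks in the path are delivered as literal characters.
    result = subprocess.run(
        build_install_argv(extension_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\r\n")


if __name__ == "__main__":
    hostile = "plugin.tgz; echo INJECTED"
    # The child sees the whole string as one filename, not as two commands.
    print(run_install(hostile))
```

The design choice is that the filename never passes through a shell parser, so there is no grammar for an attacker to break out of. The target program can still misread an argument that starts with "-", so path checks remain necessary.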

Prompt injection and command substitution

Prompt injection becomes operationally meaningful when model output is consumed by a system component that can act on it. If validation only blocks one command substitution form, such as $(), but leaves backticks or other equivalent syntaxes open, the policy is incomplete. The problem is not simply that the model can be tricked. The problem is that the tool accepts model-influenced text and turns it into execution context without a full policy model for dangerous tokens, side effects, and downstream command composition.

Practical implication: validate the full execution grammar, not just one syntax variant, before model-influenced text reaches a shell.
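
The gap between a single-token blocklist and a full-grammar check can be sketched as follows. This is an illustrative example only: the function names, the sample inputs, and the exact allowlist are assumptions, not the validation logic from the Cyera report.

```python
import re

# Conservative allowlist for one argument: letters, digits, and a few
# punctuation characters that have no special meaning to POSIX shells.
# The first character may not be '-', so the value cannot pose as an option.
SAFE_ARGUMENT = re.compile(r"[A-Za-z0-9_./][A-Za-z0-9_./:=@+-]*")


def naive_check(text: str) -> bool:
    """Incomplete blocklist: only looks for the $() substitution form."""
    return "$(" not in text


def strict_check(text: str) -> bool:
    """Allowlist: accept only if every character is in the safe set."""
    return SAFE_ARGUMENT.fullmatch(text) is not None


# Hypothetical bypass corpus: equivalent ways to reach command execution.
SAMPLES = [
    "README.md",
    "$(id)",
    "`id`",
    "a.txt; id",
    "a.txt && id",
    "a.txt | id",
    "a.txt\nid",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), "naive:", naive_check(sample), "strict:", strict_check(sample))
```

The blocklist accepts every sample except the one form it was written for, while the allowlist accepts only the plain filename. Rejecting by default is what makes the check cover the grammar rather than a token.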

AI development tools as privileged identity surfaces

AI command-line tools often inherit access to local credentials, configuration, repositories, and deployment assets. That makes them closer to a workload identity than a simple application frontend. Once the tool can read secrets or invoke code paths, compromise reaches beyond the immediate session into adjacent systems and supply-chain artefacts. The broader lesson is that AI development environments collapse multiple trust boundaries into one workflow, so identity, endpoint, and application controls must be designed together rather than layered after deployment.

Practical implication: segment AI dev tooling from production secrets and apply least-privilege execution to the surrounding workstation and CI environment.
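
One small piece of least-privilege execution is refusing to hand the parent environment to child processes. The sketch below assumes a hypothetical allowlist of variable names and a placeholder secret; a real deployment would pair this with filesystem and network isolation.

```python
import os
import subprocess
import sys

# Hypothetical allowlist: the only variables a tool-launched child may inherit.
ALLOWED_ENV = ("PATH", "HOME", "LANG", "SYSTEMROOT")


def scrubbed_env(base: dict[str, str]) -> dict[str, str]:
    """Copy only allowlisted variables, dropping tokens, keys, and cloud credentials."""
    return {name: value for name, value in base.items() if name in ALLOWED_ENV}


def child_sees(var_name: str, env: dict[str, str]) -> bool:
    """Start a child Python process with `env` and report whether it can see var_name."""
    code = "import os, sys; print(sys.argv[1] in os.environ)"
    result = subprocess.run(
        [sys.executable, "-c", code, var_name],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    # Placeholder value for illustration; not a real credential.
    parent = dict(os.environ, AWS_SECRET_ACCESS_KEY="example-not-a-real-key")
    print("inherited env:", child_sees("AWS_SECRET_ACCESS_KEY", parent))
    print("scrubbed env:", child_sees("AWS_SECRET_ACCESS_KEY", scrubbed_env(parent)))
```

An allowlist is used rather than a list of known secret names because new credentials appear in developer environments faster than a blocklist can track them.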

Threat narrative

Attacker objective: The attacker aims to convert an AI development workflow into privileged command execution that exposes credentials, code, and model data.

Entry occurs when an attacker supplies a malicious filename or prompt content that reaches the Gemini CLI execution path. Credential access follows when the compromised tool runs with the developer’s local privileges and can touch environment variables, files, or model artifacts. Impact occurs when the attacker uses that execution path to read secrets, modify code, or pivot into development and deployment systems.


Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Prompt injection is now an execution problem, not just a model problem: Cyera’s findings show that natural language inputs become security events when the toolchain converts them into shell activity. That means the control objective is no longer only model safety, but the integrity of the execution boundary around the model. Practitioners should treat AI development tools as identity-bearing systems with explicit trust boundaries, not as passive interfaces.

Shell interpolation is the new identity exposure window: The vulnerable pattern here is not merely unsafe code, but the assumption that developer-controlled commands remain developer-controlled after model mediation. Once a CLI process can transform filenames or prompts into execution, the attack surface shifts to the runtime privilege of the process itself. The implication is that workflow convenience and privilege containment must be designed as one problem, not separate ones.

Least privilege for AI tooling must include the workstation and execution chain: A model-facing tool that can reach credentials, repositories, and deployment assets behaves like a high-trust workload identity. If it inherits broad local permissions, compromise of the tool becomes compromise of everything adjacent to it. That makes workstation hardening, secret isolation, and execution policy part of AI governance, not just endpoint hygiene.

Runtime validation gaps are now part of NHI governance: The broader category issue is incomplete input-to-execution validation in AI developer tools. This is a governance failure because the identity that executed the action was not the human at the keyboard alone, but the tool acting under inherited authority. Practitioners should classify AI CLI processes as governed non-human actors wherever they can invoke privileged actions.

Attack surface compression is the central concept here: Gemini CLI compresses prompt, code, and shell into one trust zone, which means a single flaw can cross from text handling into command execution and data exposure. That compression increases blast radius because model interaction, validation, and privilege use happen in the same session. The implication is that teams must separate inference, prompting, and execution wherever they can.

From our research:
When AWS credentials are exposed publicly, attackers attempt access within an average of 17 minutes and as quickly as 9 minutes in some cases, according to LLMjacking: How Attackers Hijack AI Using Compromised NHIs.
Our survey data also shows: 85% of organisations lack full visibility into third-party vendors connected via OAuth apps, with 38% reporting no or low visibility and 47% reporting partial visibility.
For the wider control picture: Review the 52 NHI Breaches Analysis for recurring exposure patterns that turn identity trust into breach opportunity.

What this signals

Attack surface compression: AI developer tools are collapsing prompt handling, code execution, and secret access into one workflow, which makes containment harder to reason about. Teams should expect more scrutiny of CLI tools, local agents, and extension-based workflows because the privilege boundary is moving closer to the prompt itself.

The practical response is to define which AI-assisted actions are allowed to touch shells, repositories, and deployment systems, then isolate everything else. That means keeping model interaction separate from execution authority and aligning control ownership across IAM, endpoint, and application security.

For practitioners already mapping NHI exposure, the warning is simple: if a tool can use a secret, it behaves like a governed identity. The next step is to decide whether the process has explicit lifecycle controls, runtime logging, and revocation paths before it reaches production workflows.

For practitioners

Separate prompt handling from command execution: Run AI developer tools in constrained environments where model output cannot directly reach a shell. Use explicit allowlists for any command that the tool is permitted to invoke, and keep sensitive workspace paths outside the execution context.
Remove shell interpolation from installation and wrapper logic: Rewrite CLI operations to call processes with argument arrays rather than constructed strings. Review every path that handles filenames, plugins, or extensions for metacharacter handling before it reaches execution.
Treat AI dev tools as privileged NHI workloads: Apply least privilege to the process, workstation, and surrounding automation. Isolate development credentials, restrict file access to installation directories, and monitor command execution from AI-assisted workflows as a distinct identity event.
Audit prompt-to-action pathways for bypasses: Test backticks, chained commands, subshells, and any alternate syntax that can reintroduce command execution after validation. Focus on whether the validator blocks the whole grammar of dangerous execution, not just one token pattern.
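
The command allowlist and the "distinct identity event" from the list above can be sketched together. Everything named here is hypothetical: the permitted commands, the actor label, and the record fields are illustrative choices, not a schema from any standard or from the tool itself.

```python
import json
import time

# Hypothetical policy: the only executables the assistant may launch.
ALLOWED_COMMANDS = frozenset({"git", "ls"})


def authorize(argv: list[str], actor: str) -> dict:
    """Decide whether argv may run and return a structured audit record."""
    allowed = bool(argv) and argv[0] in ALLOWED_COMMANDS
    return {
        "timestamp": time.time(),
        "actor": actor,  # the tool's own identity, not just the human user
        "argv": argv,
        "decision": "allow" if allowed else "deny",
    }


def emit(record: dict) -> str:
    """Serialize the record as one JSON line, suitable for a log pipeline."""
    line = json.dumps(record, sort_keys=True)
    print(line)
    return line


if __name__ == "__main__":
    emit(authorize(["git", "status"], actor="ai-cli@dev-workstation"))
    emit(authorize(["bash", "-c", "id"], actor="ai-cli@dev-workstation"))
```

Both allowed and denied requests are recorded, so a reviewer can see what the tool attempted as well as what it ran. Matching on the command name alone does not pin the binary's location, so a stricter version would compare resolved absolute paths.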

Key takeaways

Gemini CLI showed how prompt injection and command injection can turn an AI developer tool into an execution path with inherited privileges.
The practical risk is not limited to the model layer, because the process can reach credentials, repositories, and configuration files.
Teams need to treat AI tooling as a governed non-human identity surface and remove shell interpolation wherever commands are constructed.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Prompt injection and tool misuse are core agentic application risks.
OWASP Non-Human Identity Top 10	NHI-03	Shell interpolation and secret exposure are NHI credential risk patterns.
NIST CSF 2.0	PR.AA-05	The tool inherits privileged access that must be constrained and monitored.

Remove inherited privilege from AI tooling and rotate any exposed secrets immediately.

Key terms

Prompt injection: Prompt injection is a way of supplying instructions that change how an AI system behaves at runtime. In an execution-capable tool, the risk is not just altered output. The injected content can influence actions, tool use, or downstream commands if the system fails to separate instructions from untrusted input.
Command injection: Command injection occurs when attacker-controlled data is inserted into a shell command and changes what the process executes. In AI tooling, that often happens through wrappers, plugins, or installation flows that turn paths or prompts into shell strings. The impact is privilege abuse through the process’s inherited authority.
AI development tool: An AI development tool is a local or integrated system that lets developers interact with models from within their workflow. When it can read files, invoke commands, or access credentials, it becomes a privileged non-human identity surface and needs governance like any other high-trust automation.
Execution boundary: An execution boundary is the point where input becomes action. In secure AI systems, that boundary must stop untrusted text from turning into shell commands, file writes, or privileged API calls. If the boundary is weak, model interaction and system compromise can happen in the same session.

Deepen your knowledge

AI developer tooling security is covered in the NHI Foundation Level course, the industry's only accredited NHI security programme. If you are governing prompt-driven execution paths or local AI workflows, it is worth exploring.

This post draws on content published by Cyera: From Prompt to Exploit, which discloses command and prompt injection vulnerabilities in Gemini CLI.
