
GitHub Action tj-actions/changed-files Compromised in Supply-Chain Attack

Overview

In March 2025, a major supply-chain attack compromised the popular GitHub Action tj-actions/changed-files, used by roughly 23,000 repositories.

The malicious update dumped CI/CD secrets, including API keys, tokens, AWS credentials, and npm/Docker credentials, into build logs. For projects whose logs were publicly accessible (or visible to others), this meant secrets were exposed to outsiders.

A subsequent review found that at least 218 repositories had confirmed secret leaks.

What Happened

On March 14, 2025, attackers pushed a malicious commit to the tj-actions/changed-files repository. They retroactively updated several version tags so that all versions (even older ones) referenced the malicious commit, meaning users did not need to explicitly update to a vulnerable version: their existing workflows were already poisoned. The compromised action included code that, when run during CI workflows, dumped the GitHub Actions runner process memory into workflow logs, exposing secrets, tokens, environment variables, and other sensitive data. On March 15, 2025, after the breach was detected, GitHub removed the compromised version, and the action was restored once the malicious code was stripped out.

Because many projects rely on GitHub Actions and many developers use default tags (like @v1) rather than pinning to commit SHAs, the attack had a broad reach, potentially impacting any workflow that ran the action during the infection window.
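To make the pinning point concrete, here is a minimal Python sketch that scans a repository's workflow files for references to the action and flags any that use a mutable tag instead of a full commit SHA. The paths and regex are illustrative assumptions, not official tooling.

```python
import re
from pathlib import Path

# Matches "uses: tj-actions/changed-files@<ref>" in workflow YAML.
USES_RE = re.compile(r"uses:\s*tj-actions/changed-files@(\S+)")
# A full 40-character hex string is an immutable commit SHA.
SHA_RE = re.compile(r"^[0-9a-f]{40}$")

def audit_workflows(repo_root: str) -> None:
    """Flag workflow files that reference the action via a mutable tag."""
    for wf in Path(repo_root, ".github", "workflows").glob("*.y*ml"):
        for line_no, line in enumerate(wf.read_text().splitlines(), start=1):
            match = USES_RE.search(line)
            if match:
                ref = match.group(1)
                status = "OK (pinned to SHA)" if SHA_RE.match(ref) else "RISK (mutable tag)"
                print(f"{wf}:{line_no}: @{ref} -> {status}")

if __name__ == "__main__":
    audit_workflows(".")  # run from the repository root
```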

How It Happened

  • Supply-chain compromise – Attackers first compromised a bot account used to maintain the action (the @tj-actions-bot), gaining push privileges.
  • Malicious commit & tag poisoning – They committed malicious code that dumps secrets, then retroactively updated version tags so all previous and current versions were tainted.
  • Execution during CI/CD workflows – When any affected repository ran its CI workflow using the action, the malicious code executed and printed secret data into build logs.
  • Secrets exposed – Logs could contain GitHub tokens, AWS credentials, npm/Docker secrets, environment variables, and other sensitive data — instantly readable by anyone with access to the logs.
  • Wide potential blast radius – Given the popularity of the action, thousands of repositories were at risk, even those unaware of the change, underscoring the danger of trusting widely-used dependencies without lock-down.

Possible Impact & Risks

  • Secret leakage – CI/CD secrets, including API keys, cloud credentials, and tokens, could be exposed publicly. Attackers could use these for cloud account access, code or package registry abuse, infrastructure compromise, or further supply-chain attacks.
  • Compromise of downstream systems – With stolen credentials, attackers could breach production environments, publish malicious packages, or manipulate deployment pipelines.
  • Widespread supply-chain distrust – The breach erodes trust in open-source automation tools and GitHub Actions. Projects relying on third-party actions must now treat them as potential risk vectors.
  • Developer & enterprise exposure – Both open-source and private organizations using the compromised action may be impacted, especially if they exposed logs publicly or reused leaked secrets across systems.

Recommendations

If you use GitHub Actions, especially third-party ones, here are essential steps to protect yourself:

  • Audit your workflows – Identify whether your repositories reference tj-actions/changed-files, especially via mutable tags (e.g. @v1). If so, treat credentials used in those workflows as potentially compromised.
  • Rotate all secrets – tokens, API keys, cloud credentials, and registry credentials that were used between March 14 and 15, 2025 (a rotation inventory sketch follows this list).
  • Pin actions to immutable commit SHAs – rather than mutable version tags, to avoid retroactive tag poisoning.
  • Review all third-party actions before adding them to workflows, and prefer actions from trusted authors with minimal permissions.
  • Use least-privilege tokens and ephemeral credentials in CI/CD – avoid granting broad access via long-lived secrets.
  • Restrict access to workflow logs, especially for public repositories, and avoid storing sensitive data in logs.
  • Enable secret scanning and auditing for CI/CD pipelines – watch for suspicious logs or leak indicators.
  • Treat automation agents and CI identities (bots, actions) as non-human identities (NHIs) – apply the same governance, monitoring, and security hygiene as for human user accounts.
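For the rotation bullet above, here is a hedged sketch of a rotation inventory: it lists the names of Actions secrets defined on each repository via GitHub's REST API (the endpoint returns secret names only, never values). The token and repository list are placeholders.

```python
import os
import requests

GITHUB_API = "https://api.github.com"
# Placeholder: a token permitted to read Actions secrets metadata.
TOKEN = os.environ["GITHUB_TOKEN"]

def list_action_secrets(owner: str, repo: str) -> list[str]:
    """Return the names of Actions secrets defined on a repository.

    The API exposes only secret names and timestamps, never values,
    so this is safe to run as part of a rotation inventory.
    """
    resp = requests.get(
        f"{GITHUB_API}/repos/{owner}/{repo}/actions/secrets",
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Accept": "application/vnd.github+json",
        },
        timeout=30,
    )
    resp.raise_for_status()
    return [s["name"] for s in resp.json().get("secrets", [])]

if __name__ == "__main__":
    # Placeholder repositories; substitute the ones that ran the action.
    for owner, repo in [("my-org", "my-repo")]:
        for name in list_action_secrets(owner, repo):
            print(f"{owner}/{repo}: rotation candidate -> {name}")
```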

Final Thoughts

The March 2025 breach of tj-actions/changed-files underlines a harsh but clear truth: software supply chains, including CI/CD tools and automation frameworks, are first-class attack surfaces. A single compromised action can leak secrets, expose credentials, and undermine trust in entire ecosystems.

For developers, organizations, and security teams, the lesson is urgent and unavoidable: never treat dependencies or automation tools as inherently safe. Always enforce inventory, least-privilege, version pinning, secret hygiene, and regular audits, especially for machine identities, third-party code, and CI/CD pipelines.

Amazon Q Coding Agent Weaponized via Malicious Pull Request

In July 2025, security researchers disclosed a troubling breach involving Amazon Q, Amazon’s AI-powered coding agent embedded in the Visual Studio Code extension used by nearly a million developers. A malicious actor successfully injected a covert “data-wiping” system prompt directly into the agent’s codebase, effectively weaponizing the AI assistant itself.

What Happened

The incident began on July 13, 2025, when a GitHub user operating under the alias lkmanka58 submitted a pull request to the Amazon Q extension repository. Although the contributor was untrusted, the PR was accepted and merged, a sign of misconfigured repository permissions or insufficient review controls within the workflow.

Inside the merged code was a malicious “system prompt” designed to manipulate the AI agent embedded in Amazon Q. The prompt instructed the agent to behave as a “system cleaner,” issuing explicit commands to delete local file-system data and wipe cloud resources using AWS CLI operations. In effect, the attacker attempted to weaponize the AI model into functioning as a destructive wiper tool.

Four days later, on July 17, 2025, Amazon published version 1.84.0 of the VS Code extension to the Marketplace, unknowingly distributing the compromised version to users worldwide. It wasn’t until July 23 that security researchers observed suspicious behavior inside the extension and alerted AWS. This triggered an internal investigation, and by the next day, AWS had removed the malicious code, revoked associated credentials, and shipped a clean update (v1.85.0).

According to AWS, the attacker’s prompt contained formatting mistakes that prevented the wiper logic from executing under normal conditions. As a result, the company states there is no evidence that any customer environment suffered data deletion or operational disruption.

The True Root Cause

What makes this breach uniquely alarming is not just the unauthorized code change; it’s the fact that the attacker weaponized the AI coding agent itself.

Unlike traditional malware, which executes code directly, this attack relied on manipulating the agent’s system-level instructions, repurposing Amazon Q’s AI behaviors into destructive actions. The breach demonstrated how:

  • AI agents can be social-engineered or artificially steered simply through malicious system prompts.
  • Developers increasingly trust AI-driven tools, giving them broad access to local machines and cloud environments.
  • A compromised AI agent becomes a powerful attacker multiplier, capable of interpreting and running harmful natural-language commands.

Even though AWS later clarified that the injected prompt was likely non-functional due to formatting issues, meaning no confirmed data loss occurred, the exposure risk alone was severe.

What Was at Risk

Had the malicious prompt executed as intended, affected users and organizations faced potentially severe consequences:

  • Local data destruction – The prompt aimed to wipe users’ home directories and local files, risking irreversible data loss.
  • Cloud infrastructure wiping – The injected commands included AWS CLI instructions to terminate EC2 instances, delete S3 buckets, remove IAM users, and otherwise destroy cloud resources tied to an AWS account.
  • Widespread distribution – With nearly one million installs, the compromised extension could have impacted a large developer population, especially those using Amazon Q for projects tied to critical infrastructure, production environments, or cloud assets.
  • Supply-chain confidence erosion – The breach undermines trust in AI-powered and open-source development tools: a single malicious commit can compromise thousands of users instantly.

Recommendations

If you use Amazon Q, or any AI-powered coding extension / agent, treat this incident as a wake-up call. Essential actions:

  • Update to the clean version (1.85.0) immediately. If you have 1.84.0 or earlier, remove it (a version-check sketch follows this list).
  • Audit extension use and permissions – treat extensions as potential non-human identities. Restrict permissions where possible; avoid granting unnecessary filesystem or cloud-access privileges.
  • Review and lock down CI/CD, dev workstations, and cloud credentials – never assume that an IDE or plugin is “safe.” Use vaults, environment isolation, and minimal permissions.
  • Vet open-source contributions carefully – apply stricter review and validation for pull requests in critical tools; avoid blindly trusting automated merges or simplified workflows.
  • Segment environments – avoid using AI extensions on machines or environments that store production data or credentials.
  • Monitor logs and cloud resource activity – watch for suspicious deletions, cloud resource termination, or unexpected CI jobs after tool updates.

Final Thoughts

The breach of Amazon Q reveals a troubling reality: as AI tools continue to integrate deeply into development workflows, they become part of the enterprise threat landscape, not just optional helpers. A single bad commit, merged without proper checks, can transform a widely trusted extension into a potential weapon against users.

This isn’t just about one extension; it’s about the broader risks of machine identities, AI-powered tools, supply-chain trust, and code governance in modern DevOps environments. As complexity grows, so must our security practices.

Copilot Studio Agents Exploited in New CoPhish OAuth Token Theft

A new phishing technique known as CoPhish was disclosed in October 2025. This attack abuses Copilot Studio agents to trick users into granting OAuth consents and thereby steals their OAuth tokens.

Because the agents are hosted on legitimate Microsoft domains and the phishing flow uses bona fide‑looking interfaces, victims are more likely to trust them.

Researchers from Datadog Security Labs disclosed the technique, warning that the flexibility of Copilot Studio introduces new, undocumented phishing risks that target OAuth-based identity flows. Microsoft has confirmed the issue and said it plans to fix the underlying causes in a future update.

What Happened

Attackers crafted malicious Copilot Studio agents that abuse the platform’s “demo website” functionality. Because these agents are hosted on copilotstudio.microsoft.com, the URLs appear legitimate, making it easy for victims to trust and interact with them.

When a user interacts with the agent, it can present a “Login” or “Consent” button. If the user clicks it, they are redirected to an OAuth consent screen, styled like a valid Microsoft or enterprise permission prompt, that requests access permissions on the attacker’s behalf.
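To make the flow concrete: the consent screen the victim lands on is served by the standard Microsoft Entra authorization endpoint. The sketch below shows how such a URL is assembled; the client_id, redirect_uri, and scopes are illustrative placeholders, not values observed in the attack.

```python
from urllib.parse import urlencode

# Illustrative placeholders only; not values observed in the actual attack.
params = {
    "client_id": "00000000-0000-0000-0000-000000000000",  # attacker-registered app
    "response_type": "code",
    "redirect_uri": "https://example.invalid/callback",   # placeholder callback
    "scope": "openid offline_access Mail.Read Files.Read.All",
}

# The consent prompt is rendered by this legitimate Microsoft endpoint,
# which is why the screen the victim sees looks entirely authentic.
consent_url = (
    "https://login.microsoftonline.com/common/oauth2/v2.0/authorize?"
    + urlencode(params)
)
print(consent_url)
```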

Once the user grants consent, their OAuth token is silently forwarded to the attacker’s infrastructure, often via an HTTP request configured inside the malicious agent. The exfiltration can use legitimate Microsoft infrastructure, making it harder to detect through conventional network monitoring.

Crucially: this is not a software vulnerability but a social‑engineering abuse of legitimate platform features (custom agents + OAuth consent flows) to steal credentials.

What’s at Risk

Because OAuth tokens grant access to platforms, services, and data, stolen tokens from a successful CoPhish attack can lead to:

  • Unauthorized access to corporate resources: emails, chats, calendar, cloud files, internal documents, etc.
  • Persistent access until the token is revoked or expires — enabling long-term espionage or data theft.
  • Lateral movement: attackers with a stolen token could impersonate users — possibly including privileged users — to access more resources or escalate privileges.
  • Evading detection: because the token is exfiltrated via legitimate Microsoft infrastructure, traffic may appear benign, bypassing many standard security controls.

Given how easy it is to deploy Copilot Studio agents and how many organizations use Microsoft cloud tools, the potential blast radius is wide — from small teams to large enterprises.

How the Attack Works

  1. An attacker (or compromised tenant) creates a malicious agent in Copilot Studio and enables the “demo website” sharing feature.
  2. The agent’s “Login” topic is configured to redirect users to an OAuth consent flow, often masquerading as a legitimate login / authorization prompt.
  3. The attacker distributes the agent link via phishing emails, chat, or internal messages, relying on the legitimate Microsoft domain to avoid suspicion.
  4. A victim clicks “Login” and consents, unknowingly granting permissions. The OAuth token returned is immediately forwarded, silently, to an attacker-controlled backend.
  5. With the stolen token, attackers can access mail, chat, calendars, files, and other resources via OAuth/Graph APIs, potentially granting full tenant compromise depending on permissions.
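As a sketch of step 5: once consent is granted, the bearer token can be replayed against Microsoft Graph with a plain HTTP request, and no password or MFA challenge occurs at that point. The token below is a placeholder, and the Mail.Read scope is an assumed example.

```python
import requests

# Placeholder: an OAuth access token captured via the malicious consent flow.
stolen_token = "<access-token>"

# With Mail.Read consent, reading the victim's mailbox is a single Graph call;
# no password or MFA challenge occurs, because consent was already granted.
resp = requests.get(
    "https://graph.microsoft.com/v1.0/me/messages?$top=5",
    headers={"Authorization": f"Bearer {stolen_token}"},
    timeout=30,
)
resp.raise_for_status()
for msg in resp.json()["value"]:
    print(msg["subject"])
```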

Because everything is hosted under Microsoft’s own domain and appears legitimate, the attack bypasses phishing filters, domain‑based detection, and many usual safety nets.

Why It Matters

  • The CoPhish attack is a major wake-up call: AI agent platforms like Copilot Studio can be weaponized, not just used for convenience.
  • It exposes a gap in current security: legitimate features (custom agents, OAuth consent) become attack vectors through social engineering and workflow abuse.
  • As more organizations adopt AI-based automation and assistants, the risk associated with misuse of OAuth tokens grows: token theft can lead to data breaches, compliance violations, and wide-scale compromise.
  • Traditional security measures (firewalls, network monitoring, email filters) are insufficient, because the malicious activity leverages trusted infrastructure and legitimate domains.

What Organizations Should Do Right Now

To defend against CoPhish and similar AI‑based token‑theft attacks:

  • Enforce strict consent policies: require admin approval for any new OAuth apps or Copilot agent consents.
  • Lock down Copilot Studio: disable public sharing and “demo website” features, or restrict agent creation to trusted users only.
  • Monitor enrollment, consent, and token issuance events in your identity platform (e.g. Microsoft Entra ID) for anomalous or unexpected entries (see the monitoring sketch after this list).
  • Revoke and rotate tokens periodically, especially after user role changes or if unexpected consents arise.
  • Educate teams about the risk: treat AI agents as high‑privilege identities, not just convenience tools. Automated features can carry real security consequences.
  • Implement least‑privilege principles and granular permission scopes: only grant what’s strictly needed.
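For the monitoring bullet above, here is a minimal sketch that pulls recent “Consent to application” events from the Entra ID audit log via Microsoft Graph. It assumes an app granted AuditLog.Read.All and a pre-acquired access token; token acquisition is omitted for brevity.

```python
import requests

# Placeholder: a Graph access token for an app granted AuditLog.Read.All.
graph_token = "<access-token>"

# Pull recent app-consent events from the Entra ID audit log.
resp = requests.get(
    "https://graph.microsoft.com/v1.0/auditLogs/directoryAudits",
    params={
        "$filter": "activityDisplayName eq 'Consent to application'",
        "$top": "50",
    },
    headers={"Authorization": f"Bearer {graph_token}"},
    timeout=30,
)
resp.raise_for_status()

for event in resp.json()["value"]:
    actor = (event.get("initiatedBy") or {}).get("user") or {}
    apps = [r.get("displayName") for r in event.get("targetResources", [])
            if r.get("displayName")]
    # Review each grant: when it happened, who consented, and to which app.
    print(event["activityDateTime"],
          actor.get("userPrincipalName", "<unknown>"), apps)
```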

How NHI Mgmt Group Can Help

Incidents like this underscore a critical truth: Non-Human Identities (NHIs) are now at the center of modern cyber risk. OAuth tokens, AWS credentials, service accounts, and AI-driven integrations act as trusted entities inside your environment, yet they’re often the weakest link when it comes to visibility and control.

At NHI Mgmt Group, we specialize in helping organizations understand, secure, and govern their non-human identities across cloud, SaaS, and hybrid environments. Our advisory services are grounded in a risk-based methodology that drives measurable improvements in security, operational alignment, and long-term program sustainability.

We also offer the NHI Foundation Level Training Course, the world’s first structured course dedicated to Non-Human Identity Security. This course gives you the knowledge to detect, prevent, and mitigate NHI risks.

If your organization uses third-party integrations, AI agents, or machine credentials, this training isn’t optional; it’s essential.

Final Thoughts

The CoPhish attack highlights a new and evolving threat: AI-powered agents themselves can become attack vectors. What makes this breach particularly concerning is that attackers exploited trusted Microsoft infrastructure and OAuth consent flows, meaning traditional defenses like phishing filters or domain verification are largely ineffective.

Organizations must recognize that AI assistants, Copilot agents, and other automated tools are effectively non-human identities with privileges. Just like service accounts or API tokens, they require strict governance, monitoring, and access control.