TL;DR: Pillar Security's analysis of the hackerbot-claw campaign shows an AI agent using natural-language instructions to identify vulnerable open-source projects, compromise CI/CD pipelines, and publish a malicious extension that turned developers' own AI tools into credential-stealing accomplices. The breach reveals that access review and workflow assumptions break when an actor can probe, pivot, and exfiltrate at machine speed without a stable human approval loop.
NHIMG editorial — based on content published by Pillar Security: Hackerbot-Claw adversarial agent targets top GitHub repos
By the numbers:
- Pillar Security found an 11-second gap between fork creation and first push, showing machine-speed exploitation across the campaign.
- Probes were retried every 59 seconds, indicating tightly looped reconnaissance rather than manual trial and error.
- The pivot from confirmed code execution to escalated payload exfiltration took 11 minutes.
Questions worth separating out
Q: What breaks when AI agents are allowed to act inside privileged CI/CD workflows?
A: What breaks is the assumption that repository inputs are still untrusted once a workflow starts running.
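One way to make that assumption visible is to look for places where a workflow interpolates fork-controlled values. The sketch below is a coarse heuristic, not Pillar Security's tooling: it scans GitHub Actions workflow text for `${{ ... }}` expressions that reference contexts an external contributor can influence. The context list and the function name `find_untrusted_interpolations` are illustrative assumptions, and the scan does not distinguish a risky `run:` line from a safer `env:` assignment.

```python
import re

# Contexts a fork contributor can influence. Illustrative, not exhaustive.
UNTRUSTED_CONTEXTS = (
    "github.head_ref",
    "github.event.pull_request.title",
    "github.event.pull_request.body",
    "github.event.pull_request.head.ref",
    "github.event.issue.title",
    "github.event.comment.body",
)

# Matches the body of a ${{ ... }} expression on a single line.
EXPRESSION = re.compile(r"\$\{\{\s*(.*?)\s*\}\}")


def find_untrusted_interpolations(workflow_text):
    """Return (line_number, context) pairs for each untrusted interpolation."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        for expression in EXPRESSION.findall(line):
            for context in UNTRUSTED_CONTEXTS:
                if context in expression:
                    findings.append((number, context))
    return findings
```

Each finding is a prompt for manual review rather than a verdict: what matters is whether the flagged value reaches a shell or script while the job holds repository-level credentials.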
Q: Why do AI agents complicate least-privilege governance?
A: AI agents complicate least privilege because their action sequence is not fully knowable at provisioning time.
Q: How do security teams know when an AI instruction file has become a security control?
A: You know it has when changing that file can alter review outcomes, commit behaviour, or data handling.
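In practice, that test can be approximated by routing any change to an instruction file into security review. The following is a minimal sketch under stated assumptions: the governed file names and paths are a hypothetical policy list (only CLAUDE.md is named in the source), and `governed_changes` is an illustrative helper that takes the list of paths changed in a pull request.

```python
from pathlib import PurePosixPath

# Hypothetical policy list. Only CLAUDE.md comes from the source article;
# the other entries are assumptions a team would replace with its own.
GOVERNED_NAMES = {"CLAUDE.md", "AGENTS.md"}
GOVERNED_PATHS = {".github/copilot-instructions.md"}


def governed_changes(changed_paths):
    """Return the changed paths that should trigger a security review."""
    flagged = []
    for raw in changed_paths:
        path = PurePosixPath(raw)
        # Match by file name anywhere in the tree, or by exact repository path.
        if path.name in GOVERNED_NAMES or str(path) in GOVERNED_PATHS:
            flagged.append(raw)
    return flagged
```

A non-empty result could be used to require an additional reviewer, in the same way a change to a CI workflow or a permissions file would.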
Practitioner guidance
- Remove elevated trust from forked CI/CD inputs. Audit workflows that process branch names, filenames, pull request metadata, or composite actions from forks while holding repository-level credentials.
- Classify AI instruction files as policy artifacts. Track files such as CLAUDE.md, repository-level prompts, and review instructions as governed security objects.
- Monitor AI CLI process spawning on developer endpoints. Detect claude, codex, gemini, copilot, and kiro-cli processes launched by IDEs or extensions rather than by a user shell.
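The process-spawning check in the last item can be sketched as a simple filter over endpoint telemetry. This is an illustration, not a detection rule from the source: the `ProcessEvent` record, the shell allow-list, and the function name `suspicious_spawns` are assumptions, and the check looks only at the immediate parent, so an extension that first launches a shell would evade it.

```python
from dataclasses import dataclass

# AI CLI process names listed in the guidance above.
AI_CLI_NAMES = {"claude", "codex", "gemini", "copilot", "kiro-cli"}

# Assumed allow-list of interactive user shells. Adjust per fleet.
USER_SHELLS = {"bash", "zsh", "fish", "sh", "pwsh"}


@dataclass(frozen=True)
class ProcessEvent:
    """Hypothetical telemetry record: a process and its immediate parent."""
    name: str
    parent_name: str


def suspicious_spawns(events):
    """Return AI CLI launches whose parent is not a known user shell."""
    return [
        event
        for event in events
        if event.name in AI_CLI_NAMES and event.parent_name not in USER_SHELLS
    ]
```

Feeding it events exported from an EDR or audit log would surface, for example, a `claude` process whose parent is an editor helper process instead of `zsh`.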
What's in the full article
Pillar Security's full research covers the operational detail this post intentionally leaves for the source:
- Step-by-step breakdown of each exploitation phase across Microsoft, DataDog, Trivy, and other targets.
- Code-level details of the malicious VSCode extension, including how it launched AI CLI tools with permissive flags.
- The full artifact timeline, including session identifiers, scoreboard infrastructure, and publish-chain evidence.
- Detection and response observations from the confirmed Claude Code defense event and the Trivy remediation path.
👉 Read Pillar Security's analysis of the hackerbot-claw AI agent campaign →