TL;DR: Cursor Bugbot reviews Claude Code pull requests with a separate model, so it catches different logic, security, and style issues than the generator would on its own. According to WorkOS, Bugbot's reported figures include 1 million beta PR reviews, 1.5 million issues found, a 76% resolution rate, and a 35% autofix merge rate. The governance lesson is that review diversity matters more than model sophistication when AI-generated output becomes the baseline.
NHIMG editorial — based on content published by WorkOS: Using Cursor Bugbot to autoreview and fix Claude Code PRs
By the numbers:
- The resolution rate has risen to 76%, and the average number of issues caught per run has nearly doubled.
Questions worth separating out
Q: How should teams prevent AI code reviewers from reproducing the same blind spots as the generator?
A: Use independent reviewers with different model behavior, different prompts, or different enforcement logic.
Q: When does automated code review become a governance risk instead of a productivity gain?
A: It becomes a governance risk when the reviewer can change code, not just comment on it, and when the organization cannot explain who authorised those changes.
Q: What do teams get wrong about policy files for AI review workflows?
A: They often treat policy files as documentation, when they are actually enforcement logic.
Practitioner guidance
- Separate generation from review ownership: Do not let the same model or the same policy source generate code and certify it without an independent review step.
- Treat autofix as privileged change execution: If automated fixes can commit to branches, place them behind branch protections, audit logs, and rollback procedures.
- Encode review standards in hierarchical rules: Use repository-level review policy files to define blocking patterns, required tests, and dependency-change checks.
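To make "policy files are enforcement logic" concrete, here is a minimal sketch of how layered review rules might be merged and applied to a pull request. Everything in it is an assumption for illustration: the `ReviewPolicy` shape, the `TEST_PREFIXES` constant, and the additive merge order are hypothetical and do not reflect Bugbot's actual rule format or precedence, which the WorkOS post documents.

```python
import re
from dataclasses import dataclass, field

# Hypothetical convention: any changed path under these prefixes counts as a test.
TEST_PREFIXES = ("tests/",)


@dataclass
class ReviewPolicy:
    """Hypothetical policy shape for illustration; not Bugbot's rule format."""
    blocking_patterns: list[str] = field(default_factory=list)  # regexes
    require_tests_for: tuple[str, ...] = ()   # source path prefixes
    dependency_files: tuple[str, ...] = ()    # manifests needing sign-off


def _union(a, b):
    """Order-preserving union without duplicates."""
    return list(dict.fromkeys([*a, *b]))


def merge_policies(*layers: ReviewPolicy) -> ReviewPolicy:
    """Combine layers, broadest first. Assumption: rules only accumulate,
    so a narrower layer can add checks but never remove a broader one."""
    merged = ReviewPolicy()
    for layer in layers:
        merged.blocking_patterns = _union(merged.blocking_patterns, layer.blocking_patterns)
        merged.require_tests_for = tuple(_union(merged.require_tests_for, layer.require_tests_for))
        merged.dependency_files = tuple(_union(merged.dependency_files, layer.dependency_files))
    return merged


def evaluate(policy: ReviewPolicy, changed_files: dict[str, str]) -> list[str]:
    """changed_files maps path -> added text. Returns blocking findings."""
    findings = []
    for path, added_text in changed_files.items():
        for pattern in policy.blocking_patterns:
            if re.search(pattern, added_text):
                findings.append(f"{path}: matches blocking pattern {pattern!r}")
    touches_source = any(p.startswith(policy.require_tests_for) for p in changed_files)
    touches_tests = any(p.startswith(TEST_PREFIXES) for p in changed_files)
    if touches_source and not touches_tests:
        findings.append("source changed without accompanying tests")
    for path in changed_files:
        if path in policy.dependency_files:
            findings.append(f"{path}: dependency change needs human approval")
    return findings
```

The additive merge is a deliberate design choice in this sketch: if a repository-level file could silently relax an organisation-level rule, the policy would stop behaving like a control and start behaving like a suggestion.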
What's in the full article
WorkOS's full post covers the operational detail this post intentionally leaves for the source:
- Step-by-step setup flow for enabling Bugbot across GitHub and GitLab repositories.
- Autofix configuration choices, including create-new-branch versus commit-to-existing-branch behaviour.
- Repository rule hierarchy examples that show how project, team, and user rules interact.
- Network and firewall guidance for environments that need inbound access restrictions or private connectivity.
👉 Read WorkOS's guide to cross-model AI code review with Cursor Bugbot →
Cross-model AI code review: are your PR reviews keeping up?
Explore further
Same-model review creates correlated blind spots: The article is not just about better code review; it is about the control failure that occurs when the generator and the reviewer share the same bias surface. That is a familiar weakness in identity governance too, where a single source of truth can still produce a single line of reasoning. The practitioner takeaway is that independence is a control property, not a staffing detail.
A few things that frame the scale:
- 44% of developers are reported to follow security best practices for secrets management, according to The State of Secrets in AppSec.
- The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities.
A question worth separating out:
Q: How do security teams know whether cross-model review is actually working?
A: Look for reduced repeat defects, fewer unreviewed edge cases, and a clear drop in low-value manual cleanup before human review starts. If the system only produces comments without changing the quality of the baseline PR, the control is adding noise rather than independent verification.
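The signals in that answer can be tracked with very little tooling. The sketch below assumes a hypothetical `Finding` record exported from whatever review tool is in use; the field names and the two-PR threshold are illustrative choices, not part of any Bugbot API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical record of one reviewer comment on one PR."""
    pr_id: int
    category: str                 # e.g. "null-check", "secret-in-code"
    resolved_before_merge: bool


def resolution_rate(findings: list[Finding]) -> float:
    """Share of findings that were fixed before the PR merged."""
    if not findings:
        return 0.0
    return sum(f.resolved_before_merge for f in findings) / len(findings)


def repeat_categories(findings: list[Finding], min_prs: int = 2) -> dict[str, int]:
    """Categories that recur across at least `min_prs` distinct PRs,
    mapped to the number of PRs they appeared in."""
    prs_by_category: dict[str, set[int]] = {}
    for f in findings:
        prs_by_category.setdefault(f.category, set()).add(f.pr_id)
    return {c: len(prs) for c, prs in prs_by_category.items() if len(prs) >= min_prs}
```

Comparing these two numbers across review periods is one way to tell verification from noise: a high comment count with a flat resolution rate and a growing list of repeat categories suggests the reviewer is commenting without changing the baseline.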
👉 Read our full editorial: Cross-model AI code review exposes the limits of single-model judgment