AI code review for security scales by splitting recall from proof

By NHI Mgmt Group Editorial Team
Published 2026-06-29
Domain: Best Practices
Source: 1Password

TL;DR: According to 1Password, its SAGE pipeline turns years of ProdSec review history into 343 rules across 16 vulnerability categories and applies them through a multi-model Finder, Critic, and Judge, cutting review time and hardening the process against prompt injection. The core lesson is that AI-assisted security review only works when discovery, verification, and adjudication are separated.

At a glance

What this is: 1Password outlines SAGE, a three-stage AI security review pipeline that separates finding, critique, and final judgment to scale code review.

Why it matters: IAM, NHI, and autonomous-programme teams all face the same governance problem: review quality collapses when one system is asked to discover, verify, and approve access-sensitive changes at once.

By the numbers:

1Password gathered nearly 9,000 pull requests spanning more than five years of ProdSec code reviews.
SAGE v1 now has access to 343 rules that cover 16 vulnerability categories.


Context

AI-assisted code review is becoming a governance problem, not just a productivity problem, because security teams are being asked to review more changes with less human time. In this case, the pressure came from product growth and AI coding assistants, which increased the volume of security-relevant pull requests faster than manual review could scale.

The important identity question is not whether AI can help with review, but how to separate discovery from approval when the system is analysing code that can change authentication logic, secrets handling, logging, and other security-critical paths. That is the same pattern IAM teams see whenever automation is given governance responsibilities without clear control boundaries.

Key questions

Q: How should security teams design AI review pipelines for code changes?

A: Security teams should separate finding, critique, and final approval into distinct stages with different inputs and decision thresholds. That structure lets the system stay broad during discovery, strict during adjudication, and auditable throughout. The goal is not to automate trust, but to preserve evidence quality as review moves toward a final security decision.

Q: Why do single-model security review workflows create governance risk?

A: Single-model workflows create governance risk because the same model is asked to discover issues, interpret evidence, and decide whether the issue is real. That collapses segregation of duties and makes both false positives and false negatives more likely. Security review works better when each stage has a narrower role and an explicit handoff.

Q: What do teams get wrong about using AI for security code review?

A: Teams often assume that a powerful model is enough, when the real control problem is workflow design. Without stage separation, scoped context, and a clean approval boundary, the model’s output becomes hard to trust and hard to audit. The issue is not AI assistance itself, but whether governance survives the handoff from detection to decision.

Q: How can organisations reduce prompt-injection risk in AI-assisted review?

A: Organisations should limit what each stage can see and do, and avoid giving one model unrestricted access to raw code, rules, and final judgment at the same time. Scoped inputs reduce the chance that malicious text inside a pull request can steer the whole process. Human review should remain available for edge cases and final escalation.

Technical breakdown

Why single-pass AI review fails for security code

Single-pass review asks one model to find issues and prove them at the same time, which creates a recall-versus-precision conflict. High recall wants broad, speculative detection, while high precision wants strict evidence and context. In security review, that tension matters because false positives waste engineer time and false negatives leave real defects unreviewed. 1Password’s design choice to separate these tasks reflects a basic control principle: the same identity or model should not be both the discoverer and the final adjudicator when security evidence is incomplete.

Practical implication: split finding from approval so the system can be noisy early and strict late.
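The split can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name, the stage functions, and both thresholds are hypothetical and do not come from 1Password's post. The point is only that discovery uses a permissive cut-off while the final decision requires independent evidence and a stricter one.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration; real values would be tuned locally.
FINDER_MIN_SCORE = 0.2   # permissive: favour recall during discovery
JUDGE_MIN_SCORE = 0.8    # strict: favour precision at the final decision


@dataclass(frozen=True)
class Candidate:
    rule_id: str
    score: float          # detector's own confidence, 0.0 to 1.0
    has_evidence: bool    # set by a separate critique step, not by the detector


def find(candidates: list[Candidate]) -> list[Candidate]:
    """Discovery stage: keep anything plausibly worth a second look."""
    return [c for c in candidates if c.score >= FINDER_MIN_SCORE]


def judge(candidates: list[Candidate]) -> list[Candidate]:
    """Adjudication stage: report only findings that are well supported."""
    return [c for c in candidates if c.has_evidence and c.score >= JUDGE_MIN_SCORE]


raw = [
    Candidate("hardcoded-secret", 0.9, True),
    Candidate("weak-hash", 0.5, False),
    Candidate("style-nit", 0.1, False),
]
shortlisted = find(raw)        # noisy early: two candidates survive
reported = judge(shortlisted)  # strict late: one finding is reported
```

Keeping the two thresholds as separate constants makes the trade-off explicit and lets each stage be tuned without touching the other.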

How progressive disclosure reduces prompt-injection risk

Progressive disclosure means each stage sees only the information it needs. In SAGE, the Finder works from a compact rule index, the Critic sees the structured finding plus relevant code hunks and the full rule body, and the Judge receives the finding and critique without chain-of-thought spillover. That architecture narrows the attack surface because a malicious prompt embedded in code has less opportunity to steer the whole pipeline. It also reduces model coupling, which makes verification more objective.

Practical implication: restrict each AI stage to the minimum context needed for its job.
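One way to picture stage-scoped context is an allow-list of fields per stage. The sketch below is hypothetical: the record shape, the field names, and the assumption that the Finder also reads the diff are illustrative choices, not details confirmed by 1Password.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewRecord:
    """Everything the pipeline holds for one finding (hypothetical shape)."""
    diff: str                    # assumed Finder input; not stated in the source
    rule_index_entry: str        # compact rule summary
    rule_body: str               # full rule text
    code_hunks: tuple[str, ...]  # hunks relevant to the finding
    finding: str                 # structured finding from the Finder
    critique: str                # structured critique from the Critic
    critic_reasoning: str        # free-form reasoning that must not leak downstream


# Fields each stage may read, following the split described above.
STAGE_FIELDS = {
    "finder": ("diff", "rule_index_entry"),
    "critic": ("finding", "code_hunks", "rule_body"),
    "judge": ("finding", "critique"),
}


def context_for(stage: str, record: ReviewRecord) -> dict[str, object]:
    """Return only the fields the named stage is allowed to see."""
    return {name: getattr(record, name) for name in STAGE_FIELDS[stage]}


record = ReviewRecord(
    diff="+ log.info(token)",
    rule_index_entry="secrets-in-logs: do not log credentials",
    rule_body="Full rule text goes here.",
    code_hunks=("+ log.info(token)",),
    finding="token written to application log",
    critique="confirmed: token is a bearer credential",
    critic_reasoning="free-form notes",
)
judge_view = context_for("judge", record)
```

Because the allow-list is data rather than prompt text, it can be reviewed and audited like any other access policy.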

Why model diversity matters in AI security governance

Using different providers for detection and critique is a governance decision as much as a technical one. It reduces the chance that one model’s blind spots, training bias, or failure mode will be repeated across the entire pipeline. This is similar to segregation of duties in IAM: the same control cannot be the sole source of discovery, challenge, and final approval. The design also makes vendor substitution easier, which matters when the tool is embedded in security operations rather than used as a demo.

Practical implication: design AI review pipelines so no single model controls every decision stage.
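A configuration check can enforce this before the pipeline runs. The following is a sketch under stated assumptions: the stage names mirror the Finder, Critic, and Judge roles, while the provider and model labels are placeholders rather than real products or SAGE's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    stage: str      # "finder", "critic", or "judge"
    provider: str   # placeholder provider label
    model: str      # placeholder model label


def check_diversity(stages: list[StageConfig]) -> list[str]:
    """Return human-readable problems; an empty list means the config passes."""
    by_stage = {s.stage: s for s in stages}
    problems = []
    finder, critic = by_stage.get("finder"), by_stage.get("critic")
    if finder is None or critic is None:
        problems.append("both a finder and a critic stage must be configured")
    elif finder.provider == critic.provider:
        problems.append(f"finder and critic share provider {finder.provider!r}")
    if len(stages) > 1 and len({s.provider for s in stages}) == 1:
        problems.append("one provider controls every stage")
    return problems


good = [
    StageConfig("finder", "provider-a", "model-x"),
    StageConfig("critic", "provider-b", "model-y"),
    StageConfig("judge", "provider-a", "model-z"),
]
bad = [StageConfig(s, "provider-a", "model-x") for s in ("finder", "critic", "judge")]

good_problems = check_diversity(good)
bad_problems = check_diversity(bad)
```

Running a check like this in CI would turn model diversity from a convention into an enforced property of the deployment.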

NHI Mgmt Group analysis

Security review pipelines fail when one model is forced to act as both detector and judge. That is the core governance lesson in this design. Security review needs broad signal capture first and narrow adjudication second, because evidence quality changes across the review flow. The implication is that AI review has to be treated as a control chain, not a single control.

Progressive disclosure is a stronger control pattern than blanket context sharing. When the Finder, Critic, and Judge each receive only the artefacts required for their stage, the pipeline becomes easier to trust and easier to audit. This is the same structural logic that underpins least privilege in identity systems. Practitioners should evaluate AI review tools by stage separation, not by model size alone.

Agentic and AI-assisted governance tools should not inherit human review assumptions by default. Manual ProdSec review relied on people remembering context, challenging findings, and knowing which code paths mattered. Automated review changes that operating model, so the programme has to make those assumptions explicit in process design. Teams should treat the review workflow itself as an identity-governed control plane.

Identity security teams should pay attention to how code-review AI handles sensitive paths, not just generic vulnerability categories. The article’s real differentiator is contextual security knowledge, including repository-specific rules and trust boundaries. That is a preview of where governance is heading: systems that understand local context will outperform generic scanners, but only if their approval logic remains separable and auditable.

Runtime governance gap: SAGE shows that the most useful AI security control is not broader automation, but a controlled split between discovery, challenge, and final decision. That distinction is what keeps security review from becoming a black box. Practitioners should look for workflows that preserve accountability at each stage.

From our research:
The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap.
For a broader identity lens, read Ultimate Guide to NHIs, Lifecycle Processes for Managing NHIs, to see how lifecycle governance changes when controls must scale across machine identities.

What this signals

Context-aware review is becoming the differentiator. As more engineering teams use AI assistance, the useful control is no longer generic scanning but repository-specific judgment that understands sensitive paths, trust boundaries, and local policy. That is a governance shift, not just a tooling upgrade, and teams should expect security review standards to move closer to programmable policy with human escalation for exceptions.

With 43% of security professionals already worried about AI systems learning and reproducing sensitive information patterns from codebases, the review problem now extends beyond vulnerability detection into knowledge containment. That concern aligns with the broader identity challenge of preventing systems from inheriting more access or context than they need to operate safely.

If you are maturing AI-assisted security review, the next step is to align it with identity governance patterns such as least privilege, staged approval, and auditable delegation. The same discipline that protects NHIs and machine workflows now needs to govern AI review pipelines, especially where code changes can affect secrets handling or authentication logic.

For practitioners

Separate detection from adjudication: Use different stages or services for issue discovery, technical challenge, and final verdict so one model never has unilateral approval authority over security findings.
Constrain context by review stage: Pass only the artefacts each stage needs, such as compact rules for finding and full code hunks only for critique, to reduce prompt-injection exposure and overreach.
Benchmark against human-reviewed history: Train and validate review rules on your own historical security comments and diffs so the system reflects local coding patterns, sensitive directories, and known failure modes.
Keep a separate false-positive record: Log rejected findings and the rationale for rejection so security engineers can inspect repeated patterns and tune the review pipeline without weakening the approval gate.
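The false-positive record in particular is simple to prototype. The sketch below assumes a JSON Lines log with hypothetical field names and example rule identifiers. It appends each rejection with its rationale, then counts rejections per rule so repeated patterns stand out.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass
from io import StringIO
from typing import TextIO


@dataclass(frozen=True)
class Rejection:
    rule_id: str
    file_path: str
    rationale: str   # why the finding was rejected


def log_rejection(log: TextIO, rejection: Rejection) -> None:
    """Append one rejection as a single JSON line."""
    log.write(json.dumps(asdict(rejection)) + "\n")


def rejections_per_rule(log_text: str) -> Counter:
    """Count rejected findings per rule to surface rules that may need tuning."""
    return Counter(
        json.loads(line)["rule_id"] for line in log_text.splitlines() if line
    )


log = StringIO()  # in-memory stand-in for an append-only log file
log_rejection(log, Rejection("weak-hash", "auth/legacy.py", "hash is not used for security"))
log_rejection(log, Rejection("weak-hash", "tools/cache.py", "used as a cache key only"))
log_rejection(log, Rejection("open-redirect", "web/login.py", "target is allow-listed"))
counts = rejections_per_rule(log.getvalue())
```

The log sits outside the approval path, so tuning a noisy rule from these counts does not change what the final stage is allowed to approve.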

Key takeaways

AI-assisted security review only scales when discovery and approval are separated into distinct stages with different evidence thresholds.
The real control gain is not model size, but governance design that preserves segregation of duties inside the review workflow.
Teams should treat AI review as an auditable control chain and benchmark it against local security history, not generic vulnerability knowledge.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A-03	Stage separation and prompt-injection handling map to agentic review controls.
NIST CSF 2.0	PR.AA-05	The pipeline applies least-privilege principles to model context and decision rights.
NIST Zero Trust (SP 800-207)	Least privilege (cf. NIST SP 800-53 AC-6)	Scoped access and delegated authority align with least-privilege zero trust patterns.

Separate discovery, critique, and judgment so no single model can approve its own findings.

Key terms

Progressive Disclosure Pipeline: A review design that reveals information in stages instead of giving one system full context at once. In security operations, it reduces overreach and makes each decision step easier to audit. For AI review, it also limits prompt injection because later stages see only the evidence they need.
False Positive Filter: A control that removes speculative findings before they reach final approval or remediation. In AI-assisted security review, it prevents noisy outputs from overwhelming engineers and keeps the approval queue focused on credible issues. It is only effective when the filter is independent from the initial detector.
Segregation of Duties: A governance pattern that splits discovery, challenge, and approval across separate roles or systems. In identity and security programmes, it reduces the chance that one actor can create, validate, and authorise its own access or finding. In AI review, it is the difference between assistance and unsupervised decision-making.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity lifecycle are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or NHI governance in your organisation, it is worth exploring.

This post draws on content published by 1Password: SAGE and the future of AI-assisted security review. Read the original.

NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org