31.3% of AI-assisted pull requests are now merged without any review, not because teams are reckless but because the queue moves faster than the process can. That figure comes from Faros AI's engineering report, drawn from 22,000 developers across 4,000+ teams over two years.
The Volume Problem Manual Review Can't Solve
The same report found that median time-to-first-review is up 156.6% at high-adoption organizations, while code churn has risen 861%. More PRs arrive faster, each carrying more defects than a human-authored equivalent. A 2025 analysis of 470 GitHub pull requests found AI-generated code averages 10.83 issues per PR versus 6.45 for human-written PRs, or 1.7x more findings overall. Critical issues are up 40%, major issues up 70%, and security-specific findings run 1.5x higher.
The outcome is predictable. Faros AI's data shows the incidents-to-PR ratio is up 242.7% at high-adoption organizations. Manual review doesn't scale to this reality. The teams managing it well aren't hiring more reviewers; they're routing work differently. The five metrics every engineering team should track for AI code risk point to the same conclusion: triage has to happen before review, not during it.
What Automated PR Risk Scoring Looks Like in Practice
A risk scoring workflow assigns a numeric value to each incoming pull request before a human sees it. The score is computed from weighted signals evaluated at CI time: file sensitivity, change volume, test coverage delta, AI authorship confidence, and historical defect rate for the changed paths.
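The weighted-signal computation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the weights, the field names, and the convention that every signal is pre-normalized to the range 0 to 1 (with 1 meaning highest risk) are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical weights, one per signal named in the text.
# They sum to 1.0 so the final score stays between 0 and 100.
WEIGHTS = {
    "file_sensitivity": 0.30,
    "change_volume": 0.15,
    "coverage_delta": 0.20,
    "ai_authorship": 0.15,
    "historical_defect_rate": 0.20,
}


@dataclass(frozen=True)
class PRSignals:
    """Signals for one pull request, each assumed normalized to [0, 1]."""

    file_sensitivity: float
    change_volume: float
    coverage_delta: float  # 1.0 = coverage dropped sharply, 0.0 = no drop
    ai_authorship: float  # confidence that the change was AI-authored
    historical_defect_rate: float


def risk_score(signals: PRSignals) -> float:
    """Return a 0-100 risk score as a weighted sum of the signals."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = getattr(signals, name)
        # Clamp out-of-range inputs so one bad signal cannot skew the score.
        clamped = min(1.0, max(0.0, value))
        total += weight * clamped
    return round(100 * total, 1)
```

A linear weighted sum is the simplest choice and is easy to explain to reviewers. In practice the weights would be calibrated against a team's own defect history rather than set by hand.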
Scored PRs route automatically. Low-risk changes (documentation updates, formatting fixes, test additions) clear to the review queue without friction. Medium-risk changes are flagged and routed to a senior reviewer. High-risk changes, those touching auth modules, secrets configuration, or infrastructure paths, are blocked until explicit sign-off is recorded against the PR.
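As a rough sketch of that routing logic, the function below maps a score and a list of changed paths to one of three outcomes. The thresholds, the glob patterns, and the outcome names are hypothetical placeholders; a real policy would define its own.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for sensitive paths (auth, secrets, infrastructure).
HIGH_RISK_PATTERNS = ("auth/*", "*/auth/*", "*secrets*", "infra/*", "*.tf")

# Hypothetical band boundaries on a 0-100 score.
MEDIUM_THRESHOLD = 30.0
HIGH_THRESHOLD = 70.0


def route(score: float, changed_paths: list[str]) -> str:
    """Return the routing outcome for a scored pull request."""
    touches_sensitive = any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in HIGH_RISK_PATTERNS
    )
    # Sensitive paths force the high band regardless of the numeric score.
    if touches_sensitive or score >= HIGH_THRESHOLD:
        return "block_until_signoff"
    if score >= MEDIUM_THRESHOLD:
        return "senior_review"
    return "standard_queue"
```

Letting a sensitive path override the numeric score is a design choice in this sketch: it means a small change to an auth module cannot slip into the low band just because the diff is short.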
This separation keeps the queue manageable. Reviewers stop triaging every incoming change and concentrate on the ones with real blast radius. Compared to standard code review, automated risk scoring consistently surfaces security patterns reviewers miss at high volume, especially findings that require pattern recognition across the full diff rather than individual lines.
Every score, the signals driving it, and the reviewer action taken are written to a permanent record: a continuous audit trail across every AI-authored PR your team merges.
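One simple way to keep such a record is an append-only JSON Lines log, one entry per scored PR. The sketch below assumes that storage format and invents its own field names; it is an illustration of the idea, not a description of any particular product's schema.

```python
import json
from datetime import datetime, timezone
from typing import Optional, TextIO


def write_audit_record(
    log: TextIO,
    pr_id: str,
    score: float,
    signals: dict,
    reviewer_action: str,
    recorded_at: Optional[datetime] = None,
) -> None:
    """Append one audit entry as a single JSON line.

    All field names here are hypothetical. The timestamp defaults to
    the current time in UTC.
    """
    entry = {
        "pr_id": pr_id,
        "score": score,
        "signals": signals,
        "reviewer_action": reviewer_action,
        "recorded_at": (recorded_at or datetime.now(timezone.utc)).isoformat(),
    }
    # sort_keys keeps the line layout stable, which makes diffs readable.
    log.write(json.dumps(entry, sort_keys=True) + "\n")
```

Opening the log file in append mode (`open(path, "a")`) ensures existing entries are never rewritten by this function. Real tamper resistance would need storage-level controls that are outside the scope of this sketch.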
What re-entry.ai does about this: re-entry.ai applies real-time PR risk scoring directly in your CI pipeline, evaluating each pull request against your organization's governance policy before it reaches the merge queue. Engineering leaders get a scored, reviewable record across all AI-authored PRs: not a snapshot at audit time, but a continuous signal that reflects every governance decision your team makes.
What to Do Now
Audit the last 30 merged PRs from AI coding agents and calculate the no-review rate. The 31.3% industry benchmark gives you a baseline for comparison.
Map your highest-sensitivity file paths: auth modules, secrets configuration, infrastructure-as-code, and payment flows. These become the initial high-weight signals in your scoring model.
Define three risk bands (low, medium, high) with explicit routing rules before writing any automation. Policy ambiguity is the primary reason scoring systems get bypassed in practice.
Instrument CI to block merges above your high-risk threshold. Start conservatively to minimize false blocks while you calibrate signal weights against your actual defect history.
Align your scoring factors with the defect categories most relevant to your stack using the five-factor PR risk scoring framework as your reference point.
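Two of the steps above reduce to small calculations: the no-review rate from the audit, and the merge gate in CI. The sketch below shows one way to express both. The `review_count` field and the exit-code convention (non-zero fails the CI job) are assumptions for illustration, and the threshold is left as a required argument because the right value depends on your own calibration.

```python
from typing import Iterable, Mapping


def no_review_rate(merged_prs: Iterable[Mapping[str, int]]) -> float:
    """Return the percentage of merged PRs that received zero reviews.

    Each PR is assumed to be a mapping with a hypothetical
    'review_count' key.
    """
    prs = list(merged_prs)
    if not prs:
        return 0.0
    unreviewed = sum(1 for pr in prs if pr["review_count"] == 0)
    return 100 * unreviewed / len(prs)


def ci_exit_code(score: float, high_risk_threshold: float) -> int:
    """Return 1 (fail the job) when the score is above the threshold."""
    return 1 if score > high_risk_threshold else 0
```

A CI step could pass the result of `ci_exit_code` to `sys.exit` so that a high-risk PR fails the job. Whether that failure actually prevents the merge depends on the repository's branch protection settings.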
The review queue problem compounds as AI coding agents get faster. Building risk-based triage now means governance infrastructure exists before volume makes it unavoidable. Request early access at re-entry.ai.