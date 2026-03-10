Reviewing AI-generated code has become a problem: it is produced so quickly, and in ways so different from how human engineers work, that spotting bugs is increasingly difficult.

Anthropic hopes to solve that with the launch of Code Review for Claude Code, a new multi-agent tool that spots bugs human reviewers might miss and aims for "depth, not speed".

The move by Anthropic comes after the company claimed code output for its own software engineers has leapt by 200% over the last year.

"Code review has become a bottleneck, and we hear the same from customers every week," the company said in a blog post. "They tell us developers are stretched thin, and many PRs [pull requests] get skims rather than deep reads."

That's backed by a study from CodeRabbit which found AI is indeed helping developers speed up code creation – but at the cost of more errors.

Notably, those flaws can be harder to spot thanks to the "illusion of correctness", in which mistakes made by AI aren't as obvious as those made by humans.

Indeed, developers spend so much time bogged down fixing flaws in AI code that they lose any productivity gains, according to a report from Harness.

Anthropic already has tools aimed at streamlining code reviews, such as Claude Code GitHub Action. However, the company admitted that this option is less thorough than the newly released feature.

Claude Code GitHub Action will remain available and continue to be open source, the company noted.

So far, Code Review is only available as a research preview in Claude for Teams and Claude for Enterprise. It is billed on token usage, with each review costing $15-$25 depending on the complexity of the pull request.

How Code Review in Claude Code works

Whenever a pull request is opened, Code Review uses multiple agents to look for bugs at the same time, ranking them by severity and filtering out false positives, the company said.

Larger changes are allocated more agents for a "deeper read", while basic changes get a quicker look. Anthropic said the average review takes about 20 minutes.

Software engineers are shown a single summary comment alongside in-line comments for specific bugs, making it easier to address spotted issues.
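
The flow described above (several agents reading the same change in parallel, duplicates and low-confidence findings dropped, the rest ranked by severity) can be illustrated with a rough sketch. This is not Anthropic's implementation: the `Finding` fields, the `agents_for` sizing rule, the confidence threshold and the `toy_agent` stand-in are all hypothetical, chosen only to make the idea concrete.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    line: int
    message: str
    severity: int      # hypothetical scale: higher means more serious
    confidence: float  # hypothetical 0.0-1.0 score used to drop false positives


def agents_for(lines_changed: int) -> int:
    """Hypothetical sizing rule: larger changes get more reviewing agents."""
    if lines_changed < 50:
        return 1
    if lines_changed <= 1000:
        return 3
    return 6


def review(diff_lines, agent, min_confidence=0.8):
    """Run agents in parallel, merge, filter and rank their findings."""
    n = agents_for(len(diff_lines))
    with ThreadPoolExecutor(max_workers=n) as pool:
        batches = list(pool.map(lambda i: agent(i, diff_lines), range(n)))
    # A set removes identical findings reported by more than one agent.
    merged = {finding for batch in batches for finding in batch}
    kept = [f for f in merged if f.confidence >= min_confidence]
    # Most severe first; ties are broken by line number.
    return sorted(kept, key=lambda f: (-f.severity, f.line))


def toy_agent(agent_id, diff_lines):
    """Stand-in for a real reviewing agent (purely illustrative)."""
    findings = []
    for number, text in enumerate(diff_lines, start=1):
        if "eval(" in text:
            findings.append(Finding(number, "eval on untrusted input", 3, 0.95))
        if agent_id == 0 and "TODO" in text:
            findings.append(Finding(number, "unfinished TODO", 1, 0.4))
    return findings
```

Calling `review(diff_lines, toy_agent)` on a list of changed lines returns the surviving findings in the order a reviewer would want to read them. In a real system the agents would be model calls rather than string checks, and the filtering step would presumably be far more involved than a single threshold.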

Security professionals need not worry, either. Anthropic isn't pitching Code Review as a replacement for their human work, but as an aid.

"It won't approve PRs – that's still a human call – but it closes the gap so reviewers can actually cover what's shipping," the company said.

Does it work?

Anthropic said it has been running Code Review on every pull request internally, and so far the signs have been positive. Previously, 16% of pull requests received "substantive" comments from human reviewers. Now, 54% receive substantive comments.

Naturally, larger code changes – those with more than 1,000 lines changed – are more likely to produce Code Review findings: 84% are flagged, with an average of 7.5 issues. Among smaller pull requests of under 50 lines, only 31% are flagged, with an average of 0.5 flaws.

"Engineers largely agree with what it surfaces: less than 1% of findings are marked incorrect," the company said.

That suggests the tool could help software engineers keep up with AI-generated code – and that includes Anthropic, with security researchers last month spotting critical flaws in Claude Code itself.

