AI Is Creating More Work Than It Solves
Developers on high AI-adoption teams merge 98% more pull requests per day. PR review time has increased 91%. AI is generating code faster than humans can review it. That is not a productivity improvement. It is a pipeline crisis being relabelled as efficiency.
Nirmal Nambiar
Author

The most misunderstood finding in AI productivity research is buried in the middle of Faros AI's 2025 report, drawn from telemetry across more than 10,000 developers and 1,255 teams. Developers on high AI-adoption teams complete 21% more tasks and merge 98% more pull requests per day. That is the headline most people share. What they skip is the line that follows: PR review time has increased 91%. As output at the code-writing stage roughly doubles, review time nearly doubles too. The bottleneck did not disappear. It moved downstream. In most organisations, the downstream capacity (the review, QA, security, integration, and deployment pipeline) has not grown at all. AI generated a flood upstream. The dam is the same size it always was.
Amdahl's Law Applied to AI Coding
Amdahl's Law states that the overall speedup of a system is limited by the share of the work that is not accelerated. Make one stage faster and the untouched stages come to dominate total time. In a delivery pipeline the effect is sharper still: accelerating one stage without expanding the capacity of downstream stages does not increase total throughput; it increases queue length at the constraint point. AI coding tools have dramatically accelerated one specific stage: writing implementation code. They have done nothing for the stages that follow: code review, security assessment, QA testing, integration, deployment, and post-deployment monitoring. When you double the volume going into a pipeline without expanding its capacity to process that volume, you do not get more output. You get a longer queue and a slower overall cycle time for any individual piece of work.
The Faros research found that developers on high AI-adoption teams touch 9% more tasks and 47% more pull requests per day. The word 'touch' matters here. More context switching, historically correlated with cognitive overload and reduced quality, is being reframed as a productivity indicator. A developer touching 47% more pull requests is not delivering 47% more reviewed, tested, merged, and deployed code. They are juggling 47% more items in a review queue that was already at capacity before AI doubled the inbound volume.
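The Amdahl's Law argument can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: the 30% share of cycle time spent writing code is a hypothetical figure chosen for the example, not a number from the Faros report.

```python
def amdahl_speedup(accelerated_fraction: float, stage_speedup: float) -> float:
    """Overall speedup when only part of the work gets faster (Amdahl's Law)."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if stage_speedup <= 0:
        raise ValueError("stage_speedup must be positive")
    untouched = 1.0 - accelerated_fraction
    return 1.0 / (untouched + accelerated_fraction / stage_speedup)


# Hypothetical assumption: writing code is 30% of end-to-end cycle time.
CODING_SHARE = 0.30

doubled = amdahl_speedup(CODING_SHARE, 2.0)
ceiling = 1.0 / (1.0 - CODING_SHARE)  # limit as coding time approaches zero

print(f"Coding twice as fast: {doubled:.2f}x overall")           # 1.18x
print(f"Coding infinitely fast: at most {ceiling:.2f}x overall")  # 1.43x
```

Under that assumed split, doubling coding speed buys about an 18% end-to-end improvement, and even instantaneous coding cannot exceed roughly 1.43x, because review, QA, and deployment still take as long as they did before.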
The Code Acceptance Problem
GitHub Copilot has a 46% code completion rate. Only 30% of that code is accepted by developers after review. In other words, AI suggests roughly three times as much code as gets used. The 70% that is rejected still had to be read, evaluated, and discarded, a form of work that has no prior equivalent in the software development process. Before AI coding assistants, a developer wrote code they intended to use. Every line was a deliberate act. With AI assistants, developers spend significant time reviewing code they did not ask for, in forms they did not choose, to determine whether it is usable.
The DORA 2025 report found that only 24% of developers fully trust AI-generated code. This is not scepticism about AI in general; it is a calibrated response to a specific failure pattern. AI-generated code tends to be syntactically correct and structurally plausible while containing subtle logical errors, security vulnerabilities, edge case failures, and architectural assumptions that conflict with the existing codebase. These failures are harder to catch than the obvious errors that a simple syntax checker would surface. They require the kind of contextual understanding that only a human reviewer with knowledge of the specific system can apply. Reviewing AI-generated code is therefore higher-stakes, not lower-stakes, than reviewing human-written code.
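Taking the 30% acceptance figure at face value, the evaluation overhead per unit of kept code is simple to work out. This is a back-of-the-envelope sketch; the function name is made up for illustration, and it assumes every suggestion costs the same effort to evaluate, which real suggestions will not.

```python
def suggested_per_accepted(acceptance_rate: float) -> float:
    """Units of suggested code evaluated for each unit that is kept."""
    if not 0.0 < acceptance_rate <= 1.0:
        raise ValueError("acceptance_rate must be in (0, 1]")
    return 1.0 / acceptance_rate


ACCEPTANCE_RATE = 0.30  # acceptance figure quoted in the article

ratio = suggested_per_accepted(ACCEPTANCE_RATE)
discarded = ratio - 1.0  # evaluated but thrown away, per unit kept

print(f"{ratio:.1f} units suggested per unit kept")        # 3.3
print(f"{discarded:.1f} units read and discarded per unit kept")  # 2.3
```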
Junior Developers: More Code, Lower Quality
The 2025 State of Engineering Management Report found that junior developers using AI tools generate significantly more code, and that the code is substantially buggier and lower-performing than code written by junior developers without AI tools. The mechanism is straightforward: AI coding tools accelerate the generation of code that looks syntactically correct but lacks architectural soundness. A junior developer without AI tools is constrained by their own typing speed and knowledge gaps, which limits the volume of problematic code they can produce. A junior developer with AI tools can generate problematic code at ten times the rate, creating a maintenance burden that compounds over time.
This effect is not evenly distributed across team seniority levels. Senior engineers on the same teams showed the opposite pattern: AI tools genuinely accelerated their output without a corresponding quality degradation, because they have the system knowledge and judgment to direct AI tools effectively and evaluate their outputs critically. The result is a widening quality gap between senior and junior contributions that is creating new team dynamics: senior engineers spend an increasing proportion of their time reviewing and correcting AI-assisted junior output, which reduces the time they can spend on the strategic and architectural work where their judgment creates the most value.
The Organisational Gap Nobody Budgeted For
When companies decided to invest in AI coding tools, they budgeted for the tools. Almost none of them budgeted for the organisational changes required to handle the volume those tools produce. Review processes designed for a team generating 100 pull requests per week are not equipped to handle 200 without either adding reviewers or degrading review quality. Most teams drifted into the second by default: they accepted faster delivery with less rigorous review and learned about the quality gap in production.
The DORA 2025 report is specific about the conditions under which AI coding tools generate genuine productivity benefits: teams with solid CI/CD practices, fast feedback loops, and strong existing engineering culture see meaningful gains. Teams without these foundations see little benefit and often see negative effects, as AI-accelerated code generation surfaces the weaknesses in their review and testing infrastructure faster than those weaknesses were previously visible. AI is a quality amplifier. It amplifies whatever system it is placed into. Strong systems get stronger. Weak systems get weaker, faster.
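The 100-versus-200 pull request example from this section can be sketched as a toy queue. The model is deliberately simplistic and rests on stated assumptions: arrivals and review capacity are constant each week, and reviewers never speed up by cutting corners. The function name and parameters are hypothetical.

```python
def simulate_review_backlog(arrivals_per_week: int,
                            capacity_per_week: int,
                            weeks: int) -> list[int]:
    """Return the number of unreviewed PRs left at the end of each week."""
    backlog = 0
    history = []
    for _ in range(weeks):
        backlog += arrivals_per_week
        # Reviewers clear as many PRs as capacity allows, never more than exist.
        backlog -= min(backlog, capacity_per_week)
        history.append(backlog)
    return history


before_ai = simulate_review_backlog(arrivals_per_week=100, capacity_per_week=100, weeks=4)
after_ai = simulate_review_backlog(arrivals_per_week=200, capacity_per_week=100, weeks=4)

print(before_ai)  # [0, 0, 0, 0]
print(after_ai)   # [100, 200, 300, 400]
```

With capacity fixed, the number of PRs reviewed per week is identical in both runs; the only thing that grows is the backlog. Holding the backlog flat means either adding review capacity or spending less time on each review.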
What the Numbers Show vs. What Is Claimed
| Claimed AI Benefit | What the Data Shows | What Is Hidden |
|---|---|---|
| 98% more PRs merged per day | True: inbound volume doubled | PR review time also up 91%; net cycle time unchanged or longer |
| 46% code completion rate | True: AI writes nearly half of keystrokes | Only 30% of AI suggestions accepted; 70% reviewed and discarded |
| 21% more tasks completed | True at individual level | Org-level delivery velocity did not scale proportionally; bottleneck moved to review |
| 30-60% time saved on coding | True for individual code-writing time | Time spent reviewing AI code, fixing AI errors, and debugging AI edge cases not included |
| AI boosts developer happiness | True for AI-fluent seniors on greenfield work | Developers spending their day reviewing AI-generated junior code report higher stress |
What Would Actually Help
- Invest in review capacity before deploying AI generation tools. The constraint is downstream, not upstream, and accelerating upstream without expanding downstream produces queue buildup, not throughput improvement.
- Measure PR review time and post-merge defect rate, not just PR volume. The metrics being optimised determine what the team actually optimises for, and volume metrics reward the wrong behaviour.
- Separate individual productivity gain from organisational delivery improvement. These are different claims requiring different evidence, and conflating them produces investment decisions based on the wrong unit of analysis.
- Treat AI-generated code with the security review standards applied to third-party libraries, because the failure modes (opaque logic, undocumented assumptions, subtle edge case errors) are more similar to imported code than to internally reasoned code.
- Build AI literacy into engineering culture before mandating adoption. The productivity gains are real for fluent users and negative for reluctant or unskilled ones, and mandating a tool before the team is ready to use it well produces the worst possible outcome.
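The second recommendation, measuring review time and post-merge defect rate rather than volume, might look like the following minimal sketch. Everything here is an assumption for illustration: the `PullRequest` record, its field names, and the sample data are hypothetical, and open-to-merge time is only a proxy for review time. A real team would pull these fields from its own source-control and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class PullRequest:
    opened_at: datetime
    merged_at: datetime
    caused_defect: bool  # hypothetical flag: a post-merge bug was traced to this PR


def median_review_hours(prs: list[PullRequest]) -> float:
    """Median open-to-merge time in hours, used here as a proxy for review time."""
    return median((pr.merged_at - pr.opened_at).total_seconds() / 3600 for pr in prs)


def post_merge_defect_rate(prs: list[PullRequest]) -> float:
    """Share of merged PRs later linked to a defect."""
    return sum(pr.caused_defect for pr in prs) / len(prs)


# Made-up sample data, not taken from any report.
sample = [
    PullRequest(datetime(2025, 1, 6, 9), datetime(2025, 1, 6, 15), False),   # 6 h
    PullRequest(datetime(2025, 1, 6, 10), datetime(2025, 1, 7, 10), True),   # 24 h
    PullRequest(datetime(2025, 1, 7, 9), datetime(2025, 1, 7, 21), False),   # 12 h
    PullRequest(datetime(2025, 1, 8, 9), datetime(2025, 1, 10, 9), False),   # 48 h
]

print(f"PRs merged: {len(sample)}")                                   # the volume metric
print(f"Median review time: {median_review_hours(sample):.1f} h")     # 18.0 h
print(f"Post-merge defect rate: {post_merge_defect_rate(sample):.0%}")  # 25%
```

Tracking all three numbers side by side makes it visible when PR volume rises while review time and defect rate rise with it.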
