You've seen the headlines: "AI makes developers 19% slower." "45% of AI code has security flaws." "Pricing skyrockets." But what do these findings actually mean? And what context is usually missing from the coverage?
There's legitimate research worth understanding here. There are also real limitations in that research. And there's a lot of sensationalism making it hard to think clearly about any of it.
Let's look at the three main areas of concern—productivity, security, and cost—and try to separate what we know from what we don't.
Research Summary
- Productivity: METR study found 19% slowdown for experienced devs on familiar code—but 69% kept using AI anyway.
- Security: 45% of AI code has vulnerabilities (Veracode)—but no human baseline comparison exists.
- Cost: Usage-based pricing replacing flat rates. Some power users face 5-10x cost increases.
- Perception gap: Developers believed AI made them 20%+ faster even when measurements showed the opposite.
- Missing data: Beginner productivity, unfamiliar codebases, human security baselines—all understudied.
- The honest take: Real effects, more modest than claimed. Context-dependent. Measure objectively.
The Numbers You've Seen (With Context)
The Productivity Question
The METR study is the most rigorous research we have on AI coding productivity. For a detailed breakdown, see our full METR study analysis. It's worth understanding what it actually found—and what it didn't.
What the Study Measured
16 experienced open-source developers worked on 246 real issues from their own repositories. Issues were randomly assigned: on half, developers were allowed to use AI tools (Cursor Pro with Claude 3.5/3.7 Sonnet); on the other half, they were not. On the AI-allowed issues, developers took 19% longer.
Important Context Often Missing
Sample size: 16 developers is small. The researchers acknowledge this limitation.
Confidence interval: The 95% CI ranged from -26% to +9%, meaning the true effect could be anywhere from a substantial slowdown to a modest speedup.
Specific conditions: These were experts working on codebases they already knew well. The study explicitly notes this is where human expertise advantages are largest.
What wasn't measured: Beginners, unfamiliar codebases, prototyping, learning scenarios—contexts where AI might perform differently.
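To make that confidence interval concrete, here is a small arithmetic sketch. The 60-minute baseline task is a hypothetical assumption for illustration, not a figure from the study:

```python
# Illustrative arithmetic only: translating the reported point estimate and
# 95% CI into concrete completion times for a hypothetical 60-minute task.

baseline_minutes = 60.0  # assumed baseline, not from the study

# Effects expressed as speedup fractions: negative means slower with AI.
point_estimate = -0.19   # the headline 19% slowdown
ci_low = -0.26           # lower bound: a 26% slowdown
ci_high = 0.09           # upper bound: a 9% speedup

def time_with_ai(baseline, effect):
    """Completion time under a given speedup (+) or slowdown (-) effect."""
    return baseline * (1 - effect)

print(f"Point estimate: {time_with_ai(baseline_minutes, point_estimate):.1f} min")
print(f"CI range: {time_with_ai(baseline_minutes, ci_low):.1f}"
      f" to {time_with_ai(baseline_minutes, ci_high):.1f} min")
```

The same task could plausibly take anywhere from about 55 minutes to about 76 minutes with AI, which is why the point estimate alone overstates what the study pins down.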
The Perception Gap Is Real
Here's the genuinely interesting finding: developers predicted AI would make them 24% faster. After finishing—with measurably slower results—they still believed they'd been 20% faster. And 69% kept using AI tools after the study ended.
This suggests two things that might both be true:
- Developers can't accurately self-assess AI's productivity impact
- There might be benefits beyond raw speed (reduced cognitive load, different kind of engagement, learning)
The study measured completion time. It didn't measure enjoyment, stress, or other factors that might explain why developers preferred the experience despite being slower.
The Harari Perspective
Yuval Noah Harari argues AI represents the first technology that makes autonomous decisions rather than just following instructions. This reframes the productivity question interestingly.
When you use an AI coding assistant, you're collaborating with a system that has its own "opinions." For experts who already know exactly what they want, this creates friction—time spent evaluating and correcting AI suggestions rather than just coding.
For developers in unfamiliar territory, the AI's "opinions" might fill knowledge gaps. The same property that slows experts might help beginners.
What We Can and Can't Conclude About Productivity
Reasonably supported: Experienced developers working on familiar, complex codebases may not benefit from current AI tools—and might be slower.
Not well-studied: Whether AI helps beginners, whether it helps with unfamiliar code, whether it helps with specific task types (prototyping, boilerplate, learning).
Overstated in both directions: "AI makes all developers slower" (not measured). "AI makes developers 10x faster" (not supported).
The Security Question
Security research is more consistently concerning than productivity research. Multiple studies find AI-generated code has significant security issues.
What the Studies Found
Veracode tested 100+ LLMs across 80 coding tasks and found 45% of AI-generated code contained security vulnerabilities. Other studies report even higher rates for specific vulnerability types (XSS, injection attacks). For a deeper analysis, see our article on the AI code quality crisis.
The Real Concern
AI optimizes for "code that looks correct" and "code that compiles." Security requires thinking about edge cases, attack vectors, and defensive patterns that may not be well-represented in training data.
This isn't a "wait for better models" problem. Research suggests larger models don't perform significantly better on security tasks. The issue appears to be architectural, not just scale-related.
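A minimal sketch of the failure mode described above: code that compiles, passes a happy-path check, and "looks correct," yet is injectable. The schema, data, and function names here are hypothetical, chosen only to illustrate the pattern:

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin'), ('bob', 'user')")

def find_user_unsafe(name):
    # The shape AI assistants often suggest: works on normal input,
    # but attacker-controlled `name` can rewrite the query itself.
    return conn.execute(
        f"SELECT role FROM users WHERE name = '{name}'"
    ).fetchall()

def find_user_safe(name):
    # Parameterized query: the driver treats `name` strictly as data,
    # so the same payload matches nothing instead of matching everything.
    return conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchall()

payload = "' OR '1'='1"
print(find_user_unsafe(payload))  # leaks every row
print(find_user_safe(payload))    # returns an empty list
```

Both functions run without errors on well-behaved input, which is exactly why "it compiles and the tests pass" is not a security argument.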
The Missing Baseline
Important caveat: Most of these studies don't compare AI-generated code to human-written code under similar conditions. "45% of AI code has security flaws" sounds alarming, but what's the rate for human-written code?
We don't have great data on this. Some studies suggest human code also has high vulnerability rates. The honest answer is: we don't know if AI is worse than humans, only that it's not as good as we'd hope.
This doesn't mean security concerns aren't valid—they are. But the framing of "AI is uniquely dangerous" may be overstated without baseline comparisons.
What's Reasonable to Conclude About Security
Well-supported: AI-generated code frequently contains security vulnerabilities. You should not trust AI code to be secure without review. Security review processes need to account for AI-generated code.
Less clear: Whether AI code is significantly worse than human code. Whether specific vulnerability types are more common in AI code. Whether security improves with better prompting or review processes.
The Cost Question
AI coding tool pricing has changed significantly in 2025. Several platforms moved from flat-rate to usage-based pricing, causing unpredictable bills for some users.
What Actually Happened
Cursor moved to a credit-based system. Claude Code introduced usage caps. Replit changed to "effort-based pricing." These changes surprised users who'd budgeted for flat-rate subscriptions.
Why Pricing Changed
AI coding platforms were subsidizing heavy users. Some power users were getting $1,000+ of API value for $200/month. That's not sustainable as a business model.
The shift to usage-based pricing reflects actual costs. GPU compute is expensive. Model inference isn't free. The "era of cheap unlimited AI" was always going to end.
The legitimate complaint: Changes were often sudden, poorly communicated, and broke budget expectations. The economics are understandable; the execution frustrated users.
What This Means Practically
If you're budget-constrained: Flat-rate options like GitHub Copilot ($10-19/month) offer more predictability than usage-based tools.
If you're a heavy user: Usage-based pricing might actually cost you more than before. Do the math before committing.
For teams: Enterprise costs can be significant. A 500-developer team on GitHub Copilot Business faces ~$114,000/year. Whether that's worth it depends on actual productivity gains—which, per the research above, are uncertain. For more on enterprise costs, see our hidden costs analysis.
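The team-level arithmetic is easy to sketch. The seat price and headcount mirror the example above; the break-even framing and the loaded hourly rate are hypothetical assumptions, not vendor figures:

```python
# Back-of-envelope enterprise cost, using the figures from the text above.
seats = 500
price_per_seat_month = 19  # GitHub Copilot Business list price cited above
annual_cost = seats * price_per_seat_month * 12
print(f"annual cost: ${annual_cost:,}")  # $114,000

# Hypothetical break-even framing: hours each developer must save per year
# for the tool to pay for itself, at an assumed loaded hourly rate.
loaded_hourly_rate = 100  # assumption; varies widely by organization
breakeven_hours_per_dev = annual_cost / (seats * loaded_hourly_rate)
print(f"break-even: {breakeven_hours_per_dev:.2f} hours/dev/year")
```

Framed this way the break-even bar is low, but that cuts both ways: if the tool makes developers even slightly slower, as the productivity research above suggests it can, the same arithmetic turns the subscription into a cost on top of a cost.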
The Honest Assessment
| Topic | What We Know | What We Don't Know |
|---|---|---|
| Productivity (experienced devs) | One rigorous study found 19% slowdown | Whether this generalizes; why devs prefer it anyway |
| Productivity (beginners) | Limited rigorous research | Whether AI helps those learning or in unfamiliar code |
| Security | AI code frequently has vulnerabilities | How it compares to human baseline |
| Cost | Pricing is shifting to usage-based models | Where pricing stabilizes long-term |
| Perception gap | Developers overestimate AI's help | What non-speed benefits might explain preference |
Transparency Note
Syntax.ai builds AI coding tools. We're not neutral observers of this research—we have commercial interests in the AI coding space. We've tried to present the research honestly, including findings that might reflect poorly on AI tools generally. But you should factor our perspective into how you evaluate this analysis.
What Makes Sense Given the Uncertainty
Given what we know and don't know, here's what seems reasonable:
For Individual Developers
- Don't trust your perception: The research is clear that developers can't accurately self-assess AI's productivity impact. If you want to know if AI helps you, measure objectively.
- Context matters: AI probably works differently for different tasks. Prototyping, learning, boilerplate might benefit more than complex work on familiar codebases.
- Security requires verification: Don't trust AI code to be secure. Run static analysis. Review carefully. This isn't optional.
For Teams and Organizations
- Question bold productivity claims: The "10x developer" narrative isn't supported by rigorous research. Self-reported surveys aren't reliable evidence.
- Measure actual outcomes: If you're deploying AI tools, track objective metrics—completion times, defect rates, security findings—not just adoption or satisfaction.
- Budget for uncertainty: Both productivity gains and cost savings are less certain than vendors claim. Plan conservatively.
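A minimal sketch of what "measure actual outcomes" could look like in practice. The per-task completion times below are invented for illustration, not measured data, and a real rollout would need far more tasks plus defect and security metrics:

```python
import statistics

# Hypothetical per-task completion times (minutes) from a task tracker;
# these numbers are invented for illustration only.
with_ai = [52, 61, 48, 70, 55, 66, 59, 63]
without_ai = [50, 57, 49, 62, 54, 60, 58, 61]

def summarize(label, times):
    """Print simple summary statistics and return the median."""
    med = statistics.median(times)
    mean = statistics.mean(times)
    print(f"{label}: median={med:.1f} min, mean={mean:.1f} min, n={len(times)}")
    return med

med_ai = summarize("with AI", with_ai)
med_no = summarize("without AI", without_ai)

# Directional signal only; with samples this small, no single comparison
# should drive a purchasing decision.
print(f"median difference: {med_ai - med_no:+.1f} min")
```

Even this crude comparison is more informative than a satisfaction survey, because it measures the thing the perception-gap research shows developers misjudge.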
For the Discourse Generally
- Be skeptical of extremes: "AI is useless" and "AI is transformative" are both overstated. Reality is context-dependent and uncertain.
- Demand baseline comparisons: "AI code has X% vulnerabilities" means little without human code baselines.
- Acknowledge what we don't know: The research is early. Sample sizes are small. Conditions are specific. Humility is appropriate.
The Bottom Line
AI coding tools are real technology with real effects. Those effects are more modest and context-dependent than either enthusiasts or critics suggest.
The productivity research suggests experienced developers on familiar codebases may not benefit—and might be slower. The security research suggests AI code needs careful review. The cost landscape is shifting toward usage-based pricing that may surprise users.
None of this means AI coding tools are useless. Some developers in some contexts probably benefit. But the evidence for transformative productivity gains isn't there yet, and the security concerns are real.
The Question Worth Asking
Instead of "Are AI coding tools good or bad?" try "In what specific contexts might AI help me, and how would I verify that it actually does?"
That's a harder question. It requires actual measurement rather than vibes. And honestly, the answer might be "I don't know yet"—which is a perfectly reasonable place to be given the current state of evidence.
Sources & Evidence Notes
- METR Study: "Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity" - Randomized controlled trial with 16 developers, 246 tasks. Published 2025.
- Veracode Security Research: Tested 100+ LLMs across 80 coding tasks. The 45% vulnerability rate is real, but no equivalent human baseline study was conducted.
- Pricing Information: Based on publicly announced pricing changes from Cursor, Anthropic (Claude Code), and Replit in 2025. Enterprise cost calculation assumes GitHub Copilot Business at $19/user/month × 500 developers × 12 months.
- Perception Gap Finding: From the METR study—developers predicted 24% faster, believed 20% faster after, but measured 19% slower.
Note: Research in this space is early. Sample sizes are small. We've tried to represent findings accurately while acknowledging their limitations.
Frequently Asked Questions
Do AI coding tools actually make developers slower?
The METR study found 16 experienced developers were 19% slower when using AI tools on codebases they knew well. However, the confidence interval was wide (-26% to +9%), meaning results could range from significant slowdown to modest speedup. Notably, 69% kept using AI anyway. The study specifically measured experts on familiar code—contexts where human expertise advantage is largest. Results for beginners or unfamiliar codebases weren't measured.
How secure is AI-generated code?
Veracode tested 100+ LLMs and found 45% of AI-generated code contained security vulnerabilities. However, most studies don't compare this to human-written code baselines. AI optimizes for "code that compiles" rather than "code that is secure." Security review processes must account for AI-generated code—don't trust it to be secure without verification. The missing baseline is important: we don't know if AI is worse than humans, only that it's not as good as we'd hope.
Why are AI coding tools getting more expensive?
AI coding platforms were subsidizing heavy users—some getting $1,000+ of API value for $200/month subscriptions. In 2025, Cursor, Claude Code, and Replit shifted to usage-based pricing reflecting actual compute costs. GPU compute is expensive; the "era of cheap unlimited AI" was unsustainable. For budget predictability, flat-rate options like GitHub Copilot ($10-19/month) may be better than usage-based tools. For teams, enterprise costs add up: a 500-developer team on GitHub Copilot Business faces ~$114,000/year.
Should developers use AI coding tools?
It depends on context. AI probably works differently for different tasks—prototyping, learning, and boilerplate may benefit more than complex work on familiar codebases. Key recommendations: Don't trust your perception (developers overestimate AI's help). Measure objectively. Never trust AI code to be secure without review. The evidence for transformative productivity gains isn't there yet, but some developers in some contexts probably benefit.