
The AI Velocity Report: Q4 2025


51% AI-authored code, prompt-first engineering, and why we dropped vibe coding tools. Our Q4 2025 quarterly report on AI adoption in production software delivery.

TL;DR

51% of our code is now AI-authored, and the way our engineers work has fundamentally changed. After over two years of structured AI adoption, the team has shifted to prompt-first engineering: more time on specifications, framing, and review, less on manual coding. We dropped Replit, all vibe coding tools, and Codex. We adopted OpenSpec for spec-driven development that scales. Cursor with claude-opus-4-5 is our primary environment, and every line of AI output ships through senior review and ISTQB-qualified QA.

  • 51% AI-authored code
  • Prompt-first engineering shift
  • 3 tools dropped
  • 2+ years of structured adoption

Our Stack This Quarter

Every tool, model, and methodology we have in production, what we are evaluating, and what we have moved on from.

AI Models

Using

  • claude-opus-4-5: primary reasoning model
  • Azure OpenAI: client-facing AI features

Evaluating

None this quarter

Dropped

  • Codex: accuracy issues; needs more context than alternatives

Development Environment

Using

  • Cursor: primary IDE

Evaluating

  • Claude Code CLI
  • Codex CLI

Dropped

  • Replit: replaced by Cursor with Opus 4.5
  • Vibe coding tools: replaced by spec-driven development with OpenSpec

Workflow & Automation

Using

  • GitHub Copilot: Enterprise tier, used via GitHub Coding Agent
  • MCP servers: approved integrations for agents

Evaluating

  • GitHub Coding Agent

Dropped

None this quarter

Engineering Practice

Using

  • OpenSpec: spec-driven development framework, adopted this quarter

Evaluating

None this quarter

Dropped

None this quarter

What does prompt-first engineering actually look like?

After over two years of structured AI adoption, our engineers have crossed an important threshold: they are now prompt-first rather than code-first.

This is not a tools story. It is a behaviour change. Engineers spend a higher proportion of their time on non-coding activity: clearer thinking, better framing, stronger specifications, improved review, and tighter feedback loops. The coding itself is often faster, but the engineering around it has become more important, not less.


In terms of output, 51% of our code is now AI-authored. That percentage is even higher in front-end systems where patterns are consistent and iteration cycles are rapid. Every line of AI-generated code goes through senior engineer review, ISTQB-qualified QA validation, and ships within our ISO 27001-certified security framework.

What models and tools are we using this quarter?

Opus 4.5 remains our default model. The token cost is higher than alternatives, but the increased accuracy reduces correction cycles and keeps it competitive on total cost. We find it far superior to Codex, which often needs more context and additional prompting to produce reliable results.

Cursor (with Opus 4.5) is firmly established as our primary IDE. Despite our engineers holding GitHub Copilot licences, local use of Copilot has been completely replaced by Cursor. We continue to use GitHub Copilot through GitHub Coding Agent within GitHub Enterprise for automated workflows.

We evaluated Claude Code CLI and Codex CLI. Both are capable tools, but Cursor remains our choice for day-to-day engineering work.

Why did we drop vibe coding tools?

We stopped using Replit and all dedicated vibe coding tools this quarter. Two reasons:

First, Cursor with Opus 4.5 produces superior results and gives more control. The trade-off is a steeper learning curve for non-engineers, but for our team that is the right trade-off.

Second, we now understand “the art of the start”: rapidly creating new projects with Cursor and Opus in a way that avoids throwing the prototype away and lets it evolve in a structured manner. This eliminates the main advantage vibe coding tools had: getting started fast.

If your organisation has already built something with vibe coding tools and needs help getting it to a state you can depend on, we offer a prototype-to-production service covering code audits, security hardening, and ongoing support.

How does OpenSpec make spec-driven development scale?

AI tools do a remarkable job of creating working software from minimal input, but iterating on that initial output becomes increasingly messy and often ends in a desire to start again.

A spec-driven approach solves this. We adopted OpenSpec this quarter as a lightweight framework for maintaining structured specifications that AI agents can work from consistently. Instead of iterating on a conversation, we iterate on a specification.
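To illustrate what “iterating on a specification” means in practice, here is a sketch of a structured, scenario-based spec that an AI agent could work from. This is a hypothetical excerpt in the general style of spec-driven frameworks; OpenSpec's exact file layout and conventions may differ.

```markdown
<!-- Hypothetical spec excerpt; not OpenSpec's exact format -->
## Requirement: Password reset

Users SHALL be able to reset their password via a time-limited email link.

#### Scenario: Expired link
- GIVEN a reset link older than 30 minutes
- WHEN the user opens it
- THEN the request is rejected and a fresh link is offered
```

When behaviour needs to change, the team edits the requirement or scenario and regenerates the implementation, rather than re-explaining the system in a new chat conversation.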

This has particular value for legacy modernisation, where onboarding AI into an existing codebase benefits enormously from a formal specification of the current system’s behaviour.

What We Dropped

Codex. Accuracy issues and higher context requirements made it uncompetitive against claude-opus-4-5. Dropped in favour of a single-model strategy built around Opus.

Replit and vibe coding tools. Cursor with Opus 4.5 produces better results with more control, and spec-driven development with OpenSpec eliminates the “fast start” advantage vibe tools offered.

Local GitHub Copilot. All engineers moved to Cursor. Copilot remains licensed for GitHub Coding Agent workflows.

How It Shows Up in Client Work

One of the most exciting applications this quarter has been AI-assisted legacy modernisation. We have provided legacy codebase reviews and modernisation roadmaps for years. With AI, we can dramatically reduce the effort and cost of the review process, and accelerate ongoing change.

OpenSpec is central to this: it provides a structured way to onboard AI into a codebase it has never seen, understand the existing system’s behaviour, and support modernisation decisions with real context rather than assumptions.

What’s Next: Q1 2026

Three areas we are exploring:

  1. Expanded MCP Server integration for approved engineering tools: work items, test execution, logging, Azure access, CI/CD, and GitHub. Fewer manual steps, faster feedback loops.
  2. Reusable agents and skills that codify proven delivery patterns into repeatable building blocks.
  3. Estimation that accounts for AI. We are increasingly confident about incorporating AI productivity gains into estimates, though this remains a work in progress where requirements are uncertain or systems are complex.
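For context on the MCP work in item 1: MCP servers are typically wired into a client such as Cursor through a small JSON configuration that names each server and the command used to launch it. The server names, paths, and environment variables below are hypothetical placeholders, not our actual integrations.

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "${GITHUB_TOKEN}" }
    },
    "ci-logs": {
      "command": "node",
      "args": ["./tools/ci-logs-mcp.js"]
    }
  }
}
```

Each entry exposes a set of tools to the agent, which is what replaces the manual steps mentioned above: the agent queries work items, test results, or CI logs directly instead of an engineer copying them into a prompt.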

Frequently asked questions

How much of Talk Think Do's code is AI-authored?
As of Q4 2025, 51% of code is AI-authored, with the figure even higher in front-end systems. Every line is reviewed by senior engineers and validated by ISTQB-qualified QA.
What is prompt-first engineering?
Prompt-first engineering is a behaviour shift where engineers spend more time on specification, framing, and review, and use AI to generate the implementation. The engineering around the code becomes more important, not less.
Why did Talk Think Do drop vibe coding tools?
Cursor with claude-opus-4-5 produces superior results with more control. Combined with spec-driven development using OpenSpec, the main advantage of vibe coding tools (getting started fast) was eliminated.
What is OpenSpec and how does it help AI development?
OpenSpec is a lightweight framework for maintaining structured specifications that AI agents work from consistently. Instead of iterating on a conversation, teams iterate on a specification, which scales far better for complex systems.
Is AI-generated code safe for production use?
Every line of AI-generated code goes through senior engineer review, ISTQB-qualified QA validation, and ships within an ISO 27001-certified security framework.

Want to talk about what we're seeing?

Book a free 30-minute consultation. We will give you an honest assessment of your options.