The AI Velocity Report: Q4 2025
51% AI-authored code, prompt-first engineering, and why we dropped vibe coding tools. Our Q4 2025 quarterly report on AI adoption in production software delivery.
51% of our code is now AI-authored, and the way our engineers work has fundamentally changed. After over two years of structured AI adoption, the team has shifted to prompt-first engineering: more time on specifications, framing, and review, less on manual coding. We dropped Replit, all vibe coding tools, and Codex. We adopted OpenSpec for spec-driven development that scales. Cursor with claude-opus-4-5 is our primary environment, and every line of AI output ships through senior review and ISTQB-qualified QA.
Our Stack This Quarter
Every tool, model, and methodology in production, what we are evaluating, and what we have moved on from.
AI Models
Using
- claude-opus-4-5: Primary reasoning model
- Azure OpenAI: Client-facing AI features
Evaluating
None this quarter
Dropped
- Codex: Accuracy issues; needs more context than alternatives
Development Environment
Using
- Cursor: Primary IDE
Evaluating
- Claude Code CLI
- Codex CLI
Dropped
- Replit: Replaced by Cursor with Opus 4.5
- Vibe coding tools: Replaced by spec-driven development with OpenSpec
Workflow & Automation
Using
- GitHub Copilot: Enterprise tier, used via GitHub Coding Agent
- MCP Servers: Approved integrations for agents
Evaluating
- GitHub Coding Agent
Dropped
None this quarter
Engineering Practice
Using
- OpenSpec: Spec-driven development framework, adopted this quarter
Evaluating
None this quarter
Dropped
None this quarter
What does prompt-first engineering actually look like?
After over two years of structured AI adoption, our engineers have crossed an important threshold: they are now prompt-first rather than code-first.
This is not a tools story. It is a behaviour change. Engineers spend a higher proportion of their time on non-coding activity: clearer thinking, better framing, stronger specifications, improved review, and tighter feedback loops. The coding itself is often faster, but the engineering around it has become more important, not less.
In terms of output, 51% of our code is now AI-authored. That percentage is even higher in front-end systems where patterns are consistent and iteration cycles are rapid. Every line of AI-generated code goes through senior engineer review, ISTQB-qualified QA validation, and ships within our ISO 27001-certified security framework.
What models and tools are we using this quarter?
Opus 4.5 remains our default model. The token cost is higher than alternatives, but the increased accuracy reduces correction cycles and keeps it competitive on total cost. We find it far superior to Codex, which often needs more context and additional prompting to produce reliable results.
Cursor (with Opus 4.5) is firmly established as our primary IDE. Despite our engineers holding GitHub Copilot licences, local use of Copilot has been completely replaced by Cursor. We continue to use GitHub Copilot through GitHub Coding Agent within GitHub Enterprise for automated workflows.
We evaluated Claude Code CLI and Codex CLI. Both are capable tools, but Cursor remains our choice for day-to-day engineering work.
Why did we drop vibe coding tools?
We stopped using Replit and all dedicated vibe coding tools this quarter. Two reasons:
First, Cursor with Opus 4.5 produces superior results and gives more control. The trade-off is a steeper learning curve for non-engineers, but for our team that is the right trade-off.
Second, we now understand “the art of the start”: creating new projects rapidly with Cursor and Opus in a way that avoids throwing the prototype away, so the codebase can evolve in a structured manner from day one. This eliminates the main advantage vibe coding tools had: getting started fast.
If your organisation has already built something with vibe coding tools and needs help getting it to a state you can depend on, we offer a prototype-to-production service covering code audits, security hardening, and ongoing support.
How does OpenSpec make spec-driven development scale?
AI tools do a remarkable job of creating working software from minimal input, but iterating on that initial effort becomes increasingly messy and often leads to a desire to start again.
A spec-driven approach solves this. We adopted OpenSpec this quarter as a lightweight framework for maintaining structured specifications that AI agents can work from consistently. Instead of iterating on a conversation, we iterate on a specification.
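To make the idea concrete, here is an illustrative sketch of what “iterating on a specification” looks like in practice. This is a simplified fragment in the general style of spec-driven frameworks, not OpenSpec’s exact file schema; the feature and scenario names are invented for illustration. The team edits a versioned requirement like this and asks the agent to implement the delta, rather than re-prompting a long conversation.

```markdown
<!-- Illustrative only: a simplified requirement fragment, not OpenSpec's exact schema -->
## Requirement: Password reset

- The system SHALL send a reset link that expires after 30 minutes.
- The system SHALL invalidate any earlier reset links when a new one is issued.

### Scenario: Expired link
- GIVEN a reset link older than 30 minutes
- WHEN the user opens it
- THEN the system shows an "expired link" message and offers to resend
```

Because the specification lives in the repository alongside the code, a change to a requirement is reviewed like any other change, and the agent always works from the current agreed behaviour rather than from a stale chat history.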
This has particular value for legacy modernisation, where onboarding AI into an existing codebase benefits enormously from a formal specification of the current system’s behaviour.
What We Dropped
Codex. Accuracy issues and higher context requirements made it uncompetitive against claude-opus-4-5. Dropped in favour of a single-model strategy built around Opus.
Replit and vibe coding tools. Cursor with Opus 4.5 produces better results with more control, and spec-driven development with OpenSpec eliminates the “fast start” advantage vibe tools offered.
Local GitHub Copilot. All engineers moved to Cursor. Copilot remains licensed for GitHub Coding Agent workflows.
How It Shows Up in Client Work
One of the most exciting applications this quarter has been AI-assisted legacy modernisation. We have provided legacy codebase reviews and modernisation roadmaps for years. With AI, we can dramatically reduce the effort and cost of the review process, and accelerate ongoing change.
OpenSpec is central to this: it provides a structured way to onboard AI into a codebase it has never seen, understand the existing system’s behaviour, and support modernisation decisions with real context rather than assumptions.
What’s Next: Q1 2026
Three areas we are exploring:
- Expanded MCP Server integration for approved engineering tools: work items, test execution, logging, Azure access, CI/CD, and GitHub. Fewer manual steps, faster feedback loops.
- Reusable agents and skills that codify proven delivery patterns into repeatable building blocks.
- Estimation with AI confidence built in. We are seeing increasing confidence in incorporating AI productivity gains into estimates, though this remains a work in progress where requirements are uncertain or systems are complex.
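For readers unfamiliar with MCP: servers built on the Model Context Protocol are typically wired into a client such as Cursor through a small JSON configuration that declares how each server is launched. The sketch below is a hedged example, assuming Cursor’s `mcp.json` format; the `azure-logs` entry and its script path are placeholders, not our actual integrations.

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "${GITHUB_TOKEN}" }
    },
    "azure-logs": {
      "command": "node",
      "args": ["./tools/azure-logs-mcp.js"]
    }
  }
}
```

Each entry exposes a set of tools to the agent (for example, querying work items or fetching logs), which is what removes the manual steps between “the agent noticed a problem” and “the agent has the data to fix it”.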
Frequently asked questions
How much of Talk Think Do's code is AI-authored?
What is prompt-first engineering?
Why did Talk Think Do drop vibe coding tools?
What is OpenSpec and how does it help AI development?
Is AI-generated code safe for production use?
Want to talk about what we're seeing?
Book a free 30-minute consultation. We will give you an honest assessment of your options.