The AI Velocity Report: Q1 2026
84% AI-authored code, up from 51% last quarter. Agent workflows, expanded MCP integration, and measurably faster delivery across every active project.
84% of our code is now AI-authored, up from 51% last quarter. The biggest drivers were greenfield projects built AI-natively from day one, Cursor Rules and Skills for consistency at scale, and six live MCP integrations closing the loop between code and engineering systems. Delivery is 40-50% faster across every active project, and that figure is now repeatable, not aspirational. A public API project, won in competitive tender at 55% of the conventional cost, proved the economics are real: estimates were accurate, quality was high, and the client is delighted.
Our Stack This Quarter
Every tool, model, and methodology in production, what we are evaluating, and what we have moved on from.
AI Models
Using
- Claude Opus 4.6: Deep thinking in plan mode for UX design, test authoring, and requirement analysis
- Claude Sonnet 4.6: Primary execution model for coding, test execution, and technical problem solving. Lower cost, highly capable
- Azure OpenAI: Client-facing AI features
- ChatGPT Team (Codex): Available but seeing limited use, primarily by non-engineers
Evaluating
- Claude Team: Evaluating as a ChatGPT Team replacement, with better integration with company data and knowledgebase
- Phi-3: Evaluating for high-volume client-facing scenarios
- Haiku: Evaluating for high-volume client-facing scenarios
Dropped
None this quarter
Development Environment
Using
- Cursor: Primary IDE, preferred by all engineers
- Cursor Rules: Project-specific standards, context, and constraints encoded for consistent AI behaviour across the team
- Reusable Cursor Skills: Delivery pattern library encoding proven workflows as repeatable building blocks
- Claude Code: Agent workflows alongside Cursor
- Cursor Background Agents: Cloud-based agent execution for longer-running tasks
Evaluating
None this quarter
Dropped
- GitHub Copilot: Still licensed but unused locally; Cursor preferred by all engineers
Workflow & Automation
Using
- Custom MCP Servers: Work items, test execution, logging, Azure, CI/CD, GitHub
- GitHub Coding Agent: Enterprise CI/CD workflows
Evaluating
- OpenClaw: Declined on security grounds (self-managed install, credentials shared with remote agents, young fast-moving open-source codebase); does not meet our ISO 27001 bar
Dropped
- Manual context switching for work items: Replaced by MCP work item integration
- Standalone test execution: Replaced by MCP-driven test runner
Engineering Practice
Using
- OpenSpec: Spec-driven development, second quarter in use
- AI-assisted estimation: Competitive tender won at 55% of conventional cost with accurate delivery. Reliable where requirements are clear; still maturing for complex systems
Evaluating
None this quarter
Dropped
None this quarter
How did AI-authored code jump from 51% to 84%?
Last quarter we reported 51% of our code as AI-authored. This quarter it is 84%. That is a 33-percentage-point jump in a single quarter.
The most significant driver is a mix effect: several new greenfield projects started this quarter were developed AI-natively from day one, and their AI-authored percentages are close to 100%. When you start a project with AI at the centre rather than introducing it into an existing codebase, the ceiling is much higher. These projects pull the overall average up considerably.
That context matters when reading the headline figure. The 84% is not an across-the-board improvement on legacy or mature projects, though those have improved too. It reflects what is achievable when a team with two to three years of structured AI adoption starts something new with no inherited constraints.
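The mix effect can be sketched as a simple weighted average. The segment splits below are invented for illustration; only the 51% and 84% headline figures come from the report:

```python
# Hypothetical illustration of the mix effect: blending mature projects with
# AI-native greenfield work lifts the overall average. All segment weights and
# per-segment percentages here are invented for illustration.

def blended_ai_share(segments):
    """Weighted average of AI-authored share across project segments.

    segments: list of (share_of_codebase, ai_authored_fraction) pairs,
    where the shares sum to 1.0.
    """
    assert abs(sum(w for w, _ in segments) - 1.0) < 1e-9
    return sum(w * frac for w, frac in segments)

# Last quarter: no greenfield segment, the portfolio averaged ~51%.
q4 = blended_ai_share([(1.0, 0.51)])

# This quarter (hypothetical split): mature projects improve modestly to ~60%,
# while near-100% AI-native greenfield projects supply ~65% of new code.
q1 = blended_ai_share([(0.35, 0.60), (0.65, 0.97)])

print(f"Q4: {q4:.0%}, Q1: {q1:.0%}")  # prints: Q4: 51%, Q1: 84%
```

The point of the sketch is that a large jump in the blended figure does not require a large jump in every segment; a growing high-percentage segment does most of the work.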
The rest of the increase came from four compounding factors, roughly in order of impact:
OpenSpec maturity. After adopting OpenSpec in Q4 2025, our team has now built the muscle memory for spec-driven development. Engineers write specifications first, then use AI to generate implementations that follow the spec. The specifications act as both the prompt and the acceptance criteria. This dramatically reduces the back-and-forth that previously limited AI’s contribution on complex tasks.
Cursor Rules and Skills. This has been the single biggest operational shift this quarter, and in practice more impactful than MCP integration. Cursor Rules encode project standards, architectural decisions, and team conventions directly into the AI’s working context, so every engineer is working with the same guardrails regardless of which part of the codebase they are in. Cursor Skills go further: they package proven delivery patterns into reusable building blocks that engineers invoke rather than re-prompt from scratch. Together, they have reduced variation, shortened onboarding, and made AI output consistently higher quality across projects.
MCP integration. Custom MCP Servers connect our AI agents directly to engineering systems: work items, test execution, logging, Azure resource visibility, CI/CD pipelines, and GitHub. Fewer manual context switches mean more sustained AI-assisted development within a single session.
Team confidence. This is harder to measure but real. After more than two years of structured adoption, every engineer on the team is comfortable working prompt-first. The early adopter phase is over. This is how we work now.
Every line of AI-generated code goes through senior engineer review, ISTQB-qualified QA, and ships within our ISO 27001-certified security framework.
What We Said We’d Do
Last quarter we set out three areas to explore in Q1 2026. Here is what happened.
Expanded MCP Server integration. Delivered. Six integrations are now live: work items, test execution, logging, Azure access, CI/CD, and GitHub. The practical outcome goes further than fewer manual steps. MCP-connected agents now check acceptance criteria and user story readiness before and after implementation, with results linked directly to pull requests and Azure DevOps work items. The loop between what was asked for and what was built is now closed automatically, not by a human remembering to check.
Reusable agents and skills. Delivered, and more significant than we anticipated. We have built a library of reusable Cursor Skills encoding proven delivery patterns as building blocks engineers invoke rather than re-prompt. Alongside this, Cursor Rules have become the foundation for consistent AI behaviour across the team: project standards, architectural constraints, and team conventions are encoded once and applied everywhere. In practice, Rules and Skills have had more impact on output quality and consistency than MCP integration.
Estimation with AI confidence built in. Real progress, with caveats. We won a competitive tender for a public API project at 55% of the conventional cost, and the estimates proved accurate. That is a meaningful milestone: pricing a project at roughly half the normal cost and delivering on time and on quality. Where requirements are clear and the codebase is well-understood, AI-adjusted estimates are proving reliable. The challenge is that productivity keeps increasing, so the baseline we estimate against is a moving target. We are not yet applying AI-first estimation across all projects, but the confidence gap is narrowing.
Is 40-50% faster delivery actually repeatable?
For the past two quarters we have reported 40-50% faster delivery. This quarter, we have the data to call it repeatable rather than aspirational, and we believe the gains are still increasing.
Across every active project, AI-augmented delivery consistently produces faster timelines without a corresponding increase in correction cycles. In several cases, correction cycles have decreased: AI-generated code reviewed by senior engineers is producing fewer defects than manually written code in the same systems.
The 40-50% figure is a conservative, measured floor. Productivity is compounding quarter on quarter as Cursor Rules, Skills, and MCP integrations mature. On newer projects where these practices are embedded from day one, the gains are already exceeding that range. We expect the headline number to move upward as the tooling and team confidence continue to improve.
Critically, the time saved is not coming from cutting corners. It is coming from reduced scaffolding, faster test generation, more consistent code patterns, and fewer manual context switches. Client expectations are rightly adjusting to this pace, and we see that as a positive signal: the speed is becoming the baseline, not the exception.
How do Cursor Rules and Skills deliver consistency at scale?
The most impactful change this quarter has not been the tools we connect to, but the knowledge we encode into the tools we already use.
Cursor Rules are project-specific files that sit inside the repository and load automatically as context when engineers work in Cursor. They encode the things that would otherwise live in someone’s head: architectural decisions, naming conventions, security constraints, accessibility standards, copywriting rules, and team-specific patterns. Every engineer working on a project gets the same guardrails. AI output that would previously require correction because it conflicted with an undocumented standard is caught before it is written.
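As a shape sketch, a rule file might look like the following. The frontmatter fields (`description`, `globs`, `alwaysApply`) follow Cursor's documented `.mdc` rule format; the project conventions in the body are hypothetical examples, not our actual rules:

```markdown
---
description: API error-handling conventions for this service
globs: ["src/api/**/*.ts"]
alwaysApply: false
---

- All handlers return problem+json (RFC 7807) on error; never leak stack traces.
- Use the shared logger wrapper, never console.log.
- New endpoints require an OpenAPI entry and a contract test before merge.
```

Because the file lives in the repository, it is versioned and reviewed like any other code, which is what makes the guardrails consistent across the team.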
Cursor Skills take this further. A Skill is a reusable, invokable workflow: a structured set of instructions that packages a proven delivery pattern so any engineer can execute it consistently. Rather than re-prompting from scratch for a common task, an engineer invokes the relevant Skill and gets consistent, high-quality output every time. Skills compound over time: as we refine delivery patterns on one project, those improvements propagate to all projects that use the same Skill.
Skills also offer a practical alternative to MCP integration for tool connectivity. Where an MCP Server requires setup, configuration, and authentication on every team member’s machine, a Skill that references a CLI the team already has installed is often faster to deploy and easier to maintain. If the tooling is already there, wrapping it in a Skill is frequently the better first step. MCP makes sense where deeper, persistent integration is worth the setup cost; a CLI-based Skill is the pragmatic choice when it is not.
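A CLI-wrapping Skill of the kind described above might look like this. The skill name, frontmatter, and steps are illustrative; the `gh` commands are standard GitHub CLI usage:

```markdown
---
name: release-notes
description: Draft release notes from recently merged PRs using the GitHub CLI
---

1. Run `gh pr list --state merged --base main --limit 50 --json title,number`.
2. Group the entries by label (feature, fix, chore).
3. Draft notes in the house changelog format and ask the engineer to review
   before anything is committed.
```

Nothing here needs per-machine setup beyond a CLI the team already has installed, which is the deployment advantage over an equivalent MCP server.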
The combination has fundamentally changed how we manage quality at scale. It is no longer dependent on individual engineers remembering to apply standards or knowing the right prompt. Standards are encoded. Patterns are repeatable. This is what has moved AI output from generally useful to consistently reliable.
How do MCP Servers reduce manual steps and speed up feedback?
MCP Servers connect our AI agents directly to engineering systems, reducing context-switching and closing the loop between code and the systems it deploys to.
Six integrations are now live in production workflows:
- Work items: Agents read and update tasks in Azure DevOps, checking acceptance criteria and user story readiness before implementation begins and again after, with results linked to the corresponding pull request.
- Test execution: Agents run and interpret test results without engineers switching context to a terminal.
- Logging access: Agents diagnose issues using real runtime signals from Application Insights and Azure Monitor.
- Azure access: Controlled environment and resource visibility for agents, so infrastructure context is available during development.
- CI/CD: Agents validate changes end-to-end through the pipeline before a human reviews.
- GitHub: Repository context, PR workflows, and automation are all agent-accessible.
The impact is fewer manual steps, faster feedback loops, and higher confidence in changes reaching production safely. The acceptance criteria integration is particularly significant: the gap between what a work item asked for and what the code actually delivers is now surfaced automatically at the point of the pull request, not caught later in QA.
One honest note: MCP setup and authentication can be fiddly to roll out consistently across a team. Configuration that works on one machine does not always transfer cleanly, and credential management adds friction. For integrations where the setup cost is high relative to the benefit, a CLI-based Cursor Skill is often a more practical option. We treat MCP as the right choice for high-value, persistent integrations, and Skills as the default for everything else.
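For reference, a Cursor MCP configuration follows the `mcpServers` shape below. The server names, commands, paths, and environment variables here are hypothetical placeholders, not our actual integrations:

```json
{
  "mcpServers": {
    "work-items": {
      "command": "node",
      "args": ["./tools/mcp/work-items/index.js"],
      "env": { "ADO_ORG_URL": "https://dev.azure.com/your-org" }
    },
    "test-runner": {
      "command": "dotnet",
      "args": ["run", "--project", "tools/mcp/TestRunner"]
    }
  }
}
```

Each entry spawns a local server process the agent can call; this per-machine configuration is exactly the rollout friction noted above.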
What models and tools changed this quarter?
We moved from Claude Opus 4.5 to Opus 4.6 as our primary reasoning model, and added Sonnet 4.6 as the default execution model. Opus 4.6 handles deep thinking in plan mode: UX design, test authoring, and requirement analysis. Sonnet 4.6 handles coding, test execution, and technical problem solving at lower cost with comparable quality for implementation tasks. This two-model approach lets us match capability to task type without overspending.
Claude Code and Cursor’s background agents are now in daily use alongside the Cursor IDE. GitHub Copilot, while still licensed, has gone unused: every engineer prefers Cursor. Codex remains available through our ChatGPT Team subscription but sees limited use, primarily by non-engineers. We are evaluating Claude Team as a potential replacement, largely because of its stronger integration with company data and internal knowledgebases. We are also evaluating Phi-3 and Haiku for high-volume client-facing scenarios where cost per request matters more than peak reasoning capability.
How It Shows Up in Client Work
Public API with complex integration (competitive tender). This quarter we won a competitive tender for a public API project involving complex third-party integration. We costed it at 55% of the conventional estimate, factoring in AI-augmented delivery from day one. Some would see pricing a project at roughly half the normal cost as a risk. We see it as confidence earned through two years of measured results.
The estimates proved accurate. Quality was high, the client is very happy, and the responsiveness that the API platform has enabled for their business is already delivering value. This is the clearest proof point yet that AI-augmented delivery translates into real commercial advantage for clients, not just internal efficiency for us.
One honest reflection: while this project’s estimates landed well, accurate estimation remains a genuine challenge. Productivity keeps increasing quarter on quarter, which means the baseline we estimate against is a moving target. An estimate that was conservative three months ago can look generous today. We are getting better at this, but it is not solved.
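One way to frame estimation against a moving baseline is to calibrate conventional estimates with recent actual-versus-estimate ratios. This is an illustrative sketch with invented figures, not our estimation model:

```python
# Hypothetical sketch of calibrating estimates against a moving productivity
# baseline. Function names and all figures are illustrative.

def calibration_factor(history):
    """Mean actual/estimate ratio over recent deliveries.

    history: list of (estimated_days, actual_days) pairs.
    """
    ratios = [actual / estimated for estimated, actual in history]
    return sum(ratios) / len(ratios)

def adjusted_estimate(raw_estimate_days, history):
    """Scale a conventional estimate by recent delivery performance."""
    return raw_estimate_days * calibration_factor(history)

# Invented recent projects that landed well below conventional estimates:
recent = [(20, 11), (30, 17), (10, 5.5)]
print(round(adjusted_estimate(100, recent), 1))  # → 55.6
```

The moving-target problem shows up directly: as each quarter's deliveries land faster, the calibration factor itself keeps shrinking, so the window of history used matters as much as the model.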
Legacy modernisation with AI-assisted discovery. On a separate project, the combination of OpenSpec and MCP integration allowed the team to onboard AI into a codebase it had never seen, generate a structured specification of the existing system’s behaviour, and begin modernisation work in a fraction of the time a manual discovery would have taken.
The client received a modernisation roadmap with code-level specificity within two weeks rather than the six to eight weeks this type of engagement has historically required. The quality of the output was higher because it was grounded in the actual codebase, not assumptions from documentation.
What We Dropped
Manual context switching for work items. Engineers previously switched between their IDE and Azure DevOps or GitHub Issues to check acceptance criteria and update task status. MCP integration has eliminated this. It sounds minor; the cumulative time saving across a day is significant.
Standalone test execution. Running tests in a separate terminal and interpreting results manually is now handled by the MCP-driven test runner. Engineers review test results within the same AI-assisted session where the code was written.
GitHub Copilot (local). Still licensed, but every engineer on the team has moved to Cursor. The Copilot IDE extension is no longer in use. GitHub Coding Agent remains active for CI/CD workflows.
What We Passed On
OpenClaw. Despite genuine interest in what OpenClaw offers, we are deliberately not using it. The reasons are straightforward: self-managed install on your own hardware and network, credentials shared with remote agents, and a young, fast-moving open-source codebase. That combination does not pass a basic security test, let alone the bar our ISO 27001 certification requires. Being interested in a tool is not a reason to ignore its risk profile. We will revisit it if the security posture matures.
What’s Next: Q2 2026
Three areas we are exploring:
- AI-assisted estimation. Our competitive tender win at 55% of conventional cost proved the model works when requirements are clear. The bigger challenge is that productivity gains keep compounding, so the baseline we estimate against shifts every quarter. We are investing in estimation approaches grounded in codebase analysis rather than gut feel, and calibrating them against actual delivery data as it accumulates.
- Accelerator library evolution. Our proprietary accelerator modules (booking engine, assessment platform, LMS, enterprise RAG, and others) are being refactored to work more effectively as AI-generated starting points. The goal: a new project starts with a spec and an accelerator, and AI generates the first working version in hours rather than days.
- Cross-project learning. Patterns that prove effective on one project should propagate to others. We are building internal tooling to capture and share proven AI workflows, prompts, and architectural patterns across the team.
Frequently asked questions
How much of Talk Think Do's code is AI-authored?
What AI models does Talk Think Do use for software development?
Is AI-generated code safe for production use?
How much faster is AI-augmented software delivery?
What are Cursor Rules and Skills?
Want to talk about what we're seeing?
Book a free 30-minute consultation. We will give you an honest assessment of your options.