
DevOps Maturity: Where Does Your Team Stand and What Should You Fix First?

Steve Proehl · 14 min read



DevOps maturity is not about adopting every tool and practice. It is about improving the practices that have the biggest impact on your delivery performance. This guide provides a five-level maturity model with AI-augmented practices at each level, a self-assessment framework, and a prioritised improvement path based on DORA metrics.

Why maturity matters

DevOps maturity directly predicts software delivery performance. Teams with higher maturity deploy more often, with lower failure rates, and recover faster when things go wrong. They spend more time building features and less time fighting fires.

The DORA metrics (deployment frequency, lead time for changes, change failure rate, time to restore service) quantify this. Elite teams deploy multiple times per day with a change failure rate under 5% and restore service in under an hour. Low-performing teams deploy monthly, fail on 30%+ of deployments, and take days to recover.

The gap is not about talent. It is about practices, automation, and the feedback loops that let teams improve continuously. In 2026, it is also about AI: teams that embed AI into their DevOps toolchain achieve better DORA metrics at every level.

The five maturity levels

Level 1: manual and reactive

Characteristics:

  • Deployments are manual (copy files, run scripts, click through a portal)
  • No CI/CD pipeline or a pipeline that is rarely used
  • Testing is manual or ad-hoc
  • Monitoring is checking the server when someone reports a problem
  • Infrastructure is configured by hand (click-ops)
  • Incidents are discovered by users, not monitoring

DORA profile: Deployments monthly or less. Lead time weeks to months. Change failure rate 30%+. Recovery time days.

AI opportunity at this level: AI-augmented teams can set up CI/CD pipelines, generate initial test suites, and configure monitoring in days rather than weeks. The jump from Level 1 to Level 2 is the highest-impact improvement, and AI compresses the effort required.

Level 2: basic automation

Characteristics:

  • CI/CD pipeline exists and runs on every commit (at least build and basic tests)
  • Source control is used consistently (pull requests, branch protection)
  • Some automated tests (unit tests, basic integration tests)
  • Basic monitoring exists (uptime checks, error logging)
  • Infrastructure is still mostly manual but documented
  • Deployments are semi-automated (pipeline deploys, but with manual gates)

DORA profile: Deployments weekly to monthly. Lead time days to weeks. Change failure rate 15-30%. Recovery time hours to days.

What to fix first: Increase test coverage and automate the deployment gate. The manual approval step is often the bottleneck at this level. AI-generated tests and AI-assisted code review improve quality enough to reduce the perceived need for manual gates.

Level 3: consistent and measured

Characteristics:

  • CI/CD deploys to all environments automatically (with approval gates only where required by policy)
  • Comprehensive automated testing (unit, integration, end-to-end for critical paths)
  • AI-assisted code review catches style, security, and logic issues before human review
  • AI-generated tests supplement manual test writing, improving coverage breadth
  • Monitoring with alerting (the team is notified of problems before users are)
  • Infrastructure as code for core infrastructure (Azure Bicep, Terraform)
  • DORA metrics are tracked and reviewed

DORA profile: Deployments weekly. Lead time days. Change failure rate 10-15%. Recovery time hours.

What to fix first: Standardise infrastructure as code across all environments. Invest in observability (structured logging, distributed tracing, dashboards) to reduce time-to-diagnose. Begin measuring deployment frequency and lead time to identify the next constraint.
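Structured logging, the first observability step mentioned above, simply means emitting machine-queryable records instead of free text. A minimal sketch using Python's standard `logging` module — the field names are illustrative:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line so logs can be queried by field."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": record.created,
            "level": record.levelname,
            "message": record.getMessage(),
            # Correlation id passed via `extra=`; ties a log line to a trace.
            "trace_id": getattr(record, "trace_id", None),
        })

def make_logger(name: str) -> logging.Logger:
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

With a `trace_id` on every line, "find all logs for this failed request" becomes a single query instead of an archaeology session — which is most of what reduces time-to-diagnose.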

Level 4: self-service and proactive

Characteristics:

  • Teams can provision environments, deploy services, and manage infrastructure without waiting for a central team
  • Platform engineering provides golden paths (templates, modules, pre-configured pipelines)
  • AI-powered observability detects anomalies and predicts issues before they cause incidents
  • Policy-as-code enforces security and compliance automatically (no manual audits)
  • Feature flags decouple deployment from release (deploy anytime, enable features when ready)
  • Incident response is documented, practised, and measured

DORA profile: Deployments daily or multiple times per week. Lead time hours to days. Change failure rate 5-10%. Recovery time under an hour.

What to fix first: Refine the developer experience. If teams can self-serve but the experience is clunky, adoption suffers. Invest in documentation, templates, and CI/CD pipeline optimisation. AI-powered pipeline analysis identifies bottlenecks and suggests optimisations.

Level 5: continuous optimisation

Characteristics:

  • Continuous deployment to production (every merged change goes live automatically)
  • AI-driven incident response (automated diagnosis, suggested remediation, auto-rollback)
  • Predictive scaling based on usage patterns
  • Chaos engineering and resilience testing as standard practice
  • Continuous improvement driven by data (DORA trends, platform adoption, developer satisfaction)
  • Security is fully integrated (shift-left, automated scanning, runtime protection)

DORA profile: Deployments multiple times per day. Lead time under an hour. Change failure rate under 5%. Recovery time minutes.

What to fix first: At this level, improvements are incremental. Focus on developer experience, platform adoption metrics, and reducing toil. Invest in the AI-powered capabilities (predictive scaling, automated incident response) that deliver the final efficiency gains.

Self-assessment

Answer these questions to identify your current level and biggest improvement opportunity.

Deployment

  1. How do you deploy to production? (Manual / Semi-automated / Fully automated / Continuous)
  2. How often do you deploy? (Monthly or less / Weekly / Daily / Multiple times per day)
  3. How long between a commit and production? (Weeks / Days / Hours / Minutes)

Testing

  1. What percentage of your code has automated tests? (Under 20% / 20-50% / 50-80% / Over 80%)
  2. Do tests run automatically on every change? (No / Sometimes / Always / Always, with AI-generated coverage)
  3. Do you have end-to-end tests for critical user journeys? (No / Some / Comprehensive)

Monitoring

  1. How do you find out about production issues? (Users report them / Basic uptime monitoring / Alerting on key metrics / AI-powered anomaly detection)
  2. How long does it take to diagnose a production issue? (Days / Hours / Minutes / Automated diagnosis)

Infrastructure

  1. How is infrastructure managed? (Manual / Partially scripted / Infrastructure as code / Self-service platform)
  2. Can a developer provision an environment without help? (No / With a request / Self-service / Automated on branch creation)

Security

  1. When does security scanning happen? (After deployment / In CI/CD / On every PR / Continuous, with AI-assisted review)
  2. How are secrets managed? (In code / In config files / In a vault / Automated rotation)

Scoring

Score each question by the position of your answer: the first option scores 1 point and the last scores 4. Your overall maturity level is based on the average across all 12 questions.

  • Level 5 (Continuous Optimisation): average 3.6 or above
  • Level 4 (Self-service and Proactive): average 3.0 to 3.5
  • Level 3 (Consistent and Measured): average 2.3 to 2.9
  • Level 2 (Basic Automation): average 1.5 to 2.2
  • Level 1 (Manual and Reactive): average below 1.5

The categories where you score lowest indicate your biggest improvement opportunities. Focus on the weakest link first.
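The scoring rules above can be sketched as a small function — assuming each answer is recorded as a score from 1 (first option) to 4 (last option), grouped by category:

```python
THRESHOLDS = [  # (minimum average score, maturity level)
    (3.6, "Level 5: Continuous Optimisation"),
    (3.0, "Level 4: Self-service and Proactive"),
    (2.3, "Level 3: Consistent and Measured"),
    (1.5, "Level 2: Basic Automation"),
    (0.0, "Level 1: Manual and Reactive"),
]

def maturity_level(answers: dict[str, list[int]]) -> tuple[str, str]:
    """Map per-category answer scores (1-4) to an overall maturity level
    and the weakest category (the biggest improvement opportunity)."""
    all_scores = [s for scores in answers.values() for s in scores]
    average = sum(all_scores) / len(all_scores)
    level = next(name for floor, name in THRESHOLDS if average >= floor)
    weakest = min(answers, key=lambda c: sum(answers[c]) / len(answers[c]))
    return level, weakest
```

For example, a team that deploys weekly with manual security practices might score `{"Deployment": [2, 2, 2], "Testing": [2, 2, 1], "Monitoring": [3, 2], "Infrastructure": [2, 2], "Security": [1, 2]}` — an average of about 1.9, landing at Level 2 with Security as the weakest link.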

The three highest-value improvements at each level

From Level 1 to Level 2:

  1. Set up a CI/CD pipeline that builds and tests on every commit
  2. Add basic monitoring and alerting (Application Insights, uptime checks)
  3. Move to pull request-based workflows with branch protection
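The second improvement can start as small as a scripted health check that alerts someone on failure. A minimal sketch using only the standard library — the alert sink is a placeholder for whatever channel the team actually uses (Slack, Teams, a pager):

```python
import urllib.error
import urllib.request

def check_uptime(url: str, timeout: float = 5.0) -> tuple[bool, str]:
    """Return (healthy, detail) for a single HTTP health check."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300, f"HTTP {resp.status}"
    except urllib.error.URLError as exc:
        return False, f"unreachable: {exc.reason}"

def alert(message: str) -> None:
    # Placeholder: in practice, post to a chat webhook or paging service.
    print(f"ALERT: {message}")

def run_check(url: str) -> None:
    healthy, detail = check_uptime(url)
    if not healthy:
        alert(f"{url} failed health check ({detail})")
```

Run it on a schedule (cron, a pipeline timer) and you have moved from "users report outages" to "the team hears first" — the core of the Level 1 to Level 2 jump in monitoring.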

From Level 2 to Level 3:

  1. Increase test coverage with AI-generated tests
  2. Adopt infrastructure as code (Azure Bicep) for all environments
  3. Add structured logging and distributed tracing

From Level 3 to Level 4:

  1. Build self-service platform capabilities (environment provisioning, pipeline templates)
  2. Implement policy-as-code for security and compliance
  3. Adopt feature flags to decouple deployment from release
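Feature flags (improvement 3) let code ship to production dark and be enabled later, per flag and per percentage of users. A minimal in-process sketch — real systems back this with a config service, and the flag names here are illustrative:

```python
import hashlib

def hash_bucket(user_id: str) -> int:
    """Map a user id deterministically to a bucket in 0-99."""
    return hashlib.sha256(user_id.encode()).digest()[0] * 100 // 256

class FeatureFlags:
    """Minimal in-process flag store with percentage rollout."""

    def __init__(self, rollouts: dict[str, int]):
        self._rollouts = rollouts  # flag name -> rollout percentage (0-100)

    def is_enabled(self, flag: str, user_id: str) -> bool:
        # Unknown flags default to off; the stable hash keeps each
        # user's on/off decision consistent across requests.
        return hash_bucket(user_id) < self._rollouts.get(flag, 0)
```

The deployment question ("is the code in production?") and the release question ("can users see it?") now have independent answers, which is exactly what lets teams deploy daily without coordinating launches.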

From Level 4 to Level 5:

  1. Enable continuous deployment with automated quality gates
  2. Implement AI-driven observability and incident response
  3. Establish chaos engineering practices

How AI compounds DevOps improvement

AI does not replace DevOps practices. It amplifies them at every level.

Code review: AI catches security vulnerabilities, performance issues, and logic errors that human reviewers miss amid the volume of daily changes. Human reviewers focus on architecture and business logic. The combination is stronger than either alone.

Test generation: AI produces tests from code, specifications, and existing test patterns. This does not replace thoughtful test design, but it dramatically increases coverage breadth, catching regressions in code paths that manual test writing never reaches.

Pipeline optimisation: AI analyses build logs and execution patterns to identify slow steps, unnecessary rebuilds, and parallelisation opportunities. Teams that run GitHub Actions with AI-optimised workflows see measurable improvements in build times.

Observability: AI-powered monitoring detects anomalous patterns (unusual error rates, latency spikes, traffic changes) before they become incidents. This shifts incident response from reactive to proactive.
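Much of this anomaly detection can be approximated before any AI is involved: a rolling statistical baseline already catches sudden spikes. A sketch that flags error-rate samples far above the recent mean — the window size and threshold are illustrative, not tuned values:

```python
from collections import deque
from statistics import mean, stdev

class AnomalyDetector:
    """Flag samples more than `threshold` standard deviations
    above a rolling baseline of recent values."""

    def __init__(self, window: int = 60, threshold: float = 3.0):
        self.history: deque[float] = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        """Record a new sample; return True if it is anomalous."""
        anomalous = False
        if len(self.history) >= 10:  # need a minimal baseline first
            mu, sigma = mean(self.history), stdev(self.history)
            anomalous = sigma > 0 and (value - mu) / sigma > self.threshold
        self.history.append(value)
        return anomalous
```

AI-powered tools go further (seasonality, correlated signals, suggested causes), but even this baseline shifts detection from "a user emailed us" to "the error rate tripled two minutes ago".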

Infrastructure management: AI generates Azure Bicep templates, suggests resource configurations, and identifies cost optimisation opportunities. Infrastructure changes that required deep platform knowledge become accessible to the broader team.

These capabilities are available through an AI-augmented DevOps approach. The benefit compounds: faster builds, fewer failures, faster recovery, more time for features.

Where to start

  1. Take the self-assessment. Identify your current level and the biggest gap.
  2. Pick one improvement. Choose the highest-value improvement for your level from the list above. Do not try to improve everything at once.
  3. Measure the result. Track the DORA metric most relevant to your improvement (deployment frequency, lead time, change failure rate, or time to restore). This proves the value and builds momentum.

See our DevOps consulting services for how we help teams improve, or book a consultation to discuss your specific situation. If you are evaluating whether to build DevOps capability in-house or work with a partner, see our guide on in-house DevOps vs DevOps-as-a-Service.

Frequently asked questions

What is DevOps maturity?
DevOps maturity measures how effectively an organisation delivers, operates, and improves software. It covers practices across version control, CI/CD, testing, monitoring, infrastructure management, and incident response. Higher maturity means faster delivery, fewer failures, quicker recovery, and more time spent building value instead of fighting fires.
What are DORA metrics?
DORA (DevOps Research and Assessment) metrics are four measures of software delivery performance: deployment frequency (how often you deploy to production), lead time for changes (time from commit to production), change failure rate (percentage of deployments causing failures), and time to restore service (how quickly you recover from failures). These metrics are the most widely used benchmark for DevOps maturity.
Where should I focus first to improve DevOps maturity?
Focus on the constraint. If deployments are manual and infrequent, automate CI/CD. If deployments are automated but fail often, invest in testing and code review. If failures take hours to resolve, invest in monitoring and alerting. Improve the weakest link first: it has the biggest impact on overall delivery performance.
How does AI change DevOps practices?
AI enhances DevOps at every maturity level: AI-assisted code review catches issues humans miss, AI-generated tests improve coverage, AI-powered pipeline optimisation reduces build times, and AI-driven observability detects anomalies before they become incidents. These are not theoretical: AI-augmented DevOps teams achieve measurably better DORA metrics.
Do I need a dedicated DevOps engineer?
Not necessarily. Small teams can embed DevOps practices into their development workflow. Medium teams benefit from a DevOps-as-a-Service partner who provides expertise without the overhead of a full-time hire. Large teams with complex infrastructure typically need dedicated DevOps or platform engineers. See our guide on in-house DevOps vs DevOps-as-a-Service for a cost comparison.
How long does it take to improve DevOps maturity?
Moving one maturity level takes 2-6 months depending on starting point and investment. Quick wins (automating deployment, adding basic monitoring) deliver results in weeks. Structural improvements (platform engineering, policy-as-code, AI-driven observability) take longer but deliver compounding returns.

Ready to transform your software?

Let's talk about your project. Contact us for a free consultation and see how we can deliver a business-critical solution at startup speed.