Metrics in Execution¶
Metrics should inform decisions, not create anxiety. This page covers how to use delivery metrics—DORA metrics, cycle time, throughput—as signals that guide action rather than performance theater that consumes time without improving outcomes.
The goal is to connect abstract numbers to concrete behavior: What do these metrics tell us? When should we act on them? What actions actually move them?
What Problem This Solves¶
Most teams either ignore metrics or drown in them. Neither works.
When metrics are ignored:
- You can't tell if things are getting better or worse
- Problems become visible only when they're crises
- Conversations about performance are based on vibes, not data
- Improvement efforts have no feedback loop
When metrics are theater:
- Teams optimize for the metric, not the outcome
- Measurement consumes energy that could go to building
- Numbers are reported but not acted upon
- People game the system to look good
The right approach is in between: a small set of meaningful metrics, reviewed regularly, connected to actions that actually help.
When to Focus on Metrics¶
Actively invest when:
- You're establishing a new team or process
- You suspect problems but don't have data to confirm
- Leadership is asking questions you can't answer
- You want to evaluate whether an improvement worked
Maintain when:
- Things are working—metrics are health checks, not deep dives
- Onboarding new people to understand how the team works
Investigate when:
- Metrics move unexpectedly (in either direction)
- Metrics and team sentiment diverge (numbers look good but team feels bad, or vice versa)
- External stakeholders are concerned about delivery
Ownership¶
| Role | Responsibility |
|---|---|
| Engineering Manager | Owns metric visibility and review rhythm; addresses systemic issues |
| Tech Lead | Interprets technical implications; proposes actions |
| Platform/DevOps | Provides metric infrastructure; ensures accuracy |
| Individual Contributors | Understand metrics; contribute to improvement |
Metrics are signals, not targets
The moment a metric becomes a target, it stops being a useful measure. People will optimize for the metric instead of the outcome it represents. Use metrics to understand, not to judge.
The DORA Metrics¶
The DORA (DevOps Research and Assessment) metrics are the most validated predictors of software delivery performance. They're a good starting point for any team.
The Four Key Metrics¶
| Metric | Definition | Why It Matters |
|---|---|---|
| Deployment Frequency | How often you deploy to production | Measures ability to ship work incrementally |
| Lead Time for Changes | Time from code commit to production | Measures delivery speed |
| Change Failure Rate | Percentage of deployments causing failures | Measures quality of releases |
| Mean Time to Recovery (MTTR) | Time from incident detection to resolution | Measures resilience and recovery capability |
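The four metrics above can all be derived from deploy records. The sketch below is a minimal illustration, assuming a hypothetical record shape (`commit`, `deploy`, `failed`, `restored`); your CI/CD and incident tooling will export different field names.

```python
# Hedged sketch: deriving the four DORA metrics from deploy records.
# The record fields below are assumptions, not a standard export format.
from datetime import datetime
from statistics import median

deploys = [  # hypothetical data
    {"commit": datetime(2024, 3, 1, 9), "deploy": datetime(2024, 3, 1, 15),
     "failed": False, "restored": None},
    {"commit": datetime(2024, 3, 2, 10), "deploy": datetime(2024, 3, 3, 11),
     "failed": True, "restored": datetime(2024, 3, 3, 13)},
    {"commit": datetime(2024, 3, 4, 8), "deploy": datetime(2024, 3, 4, 18),
     "failed": False, "restored": None},
]

# Window covered by the data, in days (at least 1 to avoid division by zero).
days = (max(d["deploy"] for d in deploys)
        - min(d["deploy"] for d in deploys)).days or 1

deploy_frequency = len(deploys) / days                          # deploys/day
lead_time = median(d["deploy"] - d["commit"] for d in deploys)  # commit -> prod
failures = [d for d in deploys if d["failed"]]
change_failure_rate = len(failures) / len(deploys)              # share of bad deploys
mttr = median(d["restored"] - d["deploy"] for d in failures)    # deploy -> restored

print(deploy_frequency, lead_time, change_failure_rate, mttr)
```

Using medians rather than means keeps one outlier deploy from distorting lead time and recovery figures.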
Performance Levels¶
DORA research identifies four performance levels:
| Level | Deploy Frequency | Lead Time | Change Failure Rate | MTTR |
|---|---|---|---|---|
| Elite | Multiple times/day | < 1 hour | 0-15% | < 1 hour |
| High | Weekly to daily | 1 day - 1 week | 16-30% | < 1 day |
| Medium | Monthly to weekly | 1 week - 1 month | 16-30% | 1 day - 1 week |
| Low | Monthly+ | 1-6 months | 16-30% | 1 week - 1 month |
What DORA Tells You¶
Elite/high performers are not just fast—they're also stable. The research shows that speed and stability go together, not trade off. Teams that deploy frequently have lower failure rates and recover faster.
If metrics diverge, investigate. Fast lead time with high failure rate suggests skipping quality gates. Low failure rate with slow lead time suggests over-engineering. Balance matters.
Context matters. A startup and a hospital system have different acceptable risk profiles. Use industry benchmarks as reference, not dogma.
Cycle Time and Flow Metrics¶
Beyond DORA, flow metrics help you understand how work moves through your system.
Cycle Time¶
Definition: Time from when work starts to when it's done (typically from "In Progress" to "Done" or "Deployed").
Why it matters: Long cycle time means feedback is slow, risk per change is high, and work stays "in flight" longer.
What affects cycle time:
- Work item size (smaller = faster)
- WIP limits (lower = faster)
- Handoffs and waiting time
- Review bottlenecks
- Deploy frequency
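Cycle time itself is simple to compute from tracker timestamps. A minimal sketch, assuming hypothetical `started`/`done` fields from an issue-tracker export:

```python
# Sketch: median cycle time from issue-tracker timestamps.
# Field names ("started", "done") are assumptions about your export.
from datetime import datetime
from statistics import median

items = [  # hypothetical work items
    {"started": datetime(2024, 3, 1), "done": datetime(2024, 3, 4)},  # 3 days
    {"started": datetime(2024, 3, 2), "done": datetime(2024, 3, 9)},  # 7 days
    {"started": datetime(2024, 3, 5), "done": datetime(2024, 3, 7)},  # 2 days
]

cycle_times = [i["done"] - i["started"] for i in items]
print(median(cycle_times).days)  # median of 2, 3, 7 days -> 3
```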
Throughput¶
Definition: Number of work items completed per unit time.
Why it matters: Throughput indicates team capacity and is useful for forecasting.
Caution: Don't optimize for throughput by making items smaller without actually delivering more value. Count value delivered, not tickets closed.
Work in Progress (WIP)¶
Definition: Number of items currently being worked on.
Why it matters: High WIP correlates with longer cycle time (Little's Law). It also correlates with context switching, which kills productivity.
The relationship: Cycle Time ≈ WIP / Throughput. To reduce cycle time, reduce WIP.
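The relationship is easy to sanity-check numerically (the numbers below are hypothetical):

```python
# Little's Law: average cycle time follows from WIP and throughput.

def avg_cycle_time_days(wip: float, throughput_per_day: float) -> float:
    """Estimate average cycle time in days via Little's Law."""
    return wip / throughput_per_day

# A team with 12 items in flight finishing 2 items/day averages
# ~6 days of cycle time; halving WIP halves it.
print(avg_cycle_time_days(12, 2))  # 6.0
print(avg_cycle_time_days(6, 2))   # 3.0
```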
Flow Efficiency¶
Definition: Active time / Total time. How much of cycle time is spent actually working vs. waiting.
Why it matters: Most work spends more time waiting than being worked on. Flow efficiency reveals this.
Typical numbers: 15-40% flow efficiency is common. If your flow efficiency is < 15%, you have a waiting problem.
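As a concrete illustration (hypothetical durations), the ratio can be computed directly from active and total time:

```python
# Sketch: flow efficiency = active time / total elapsed time.
from datetime import timedelta

def flow_efficiency(active: timedelta, total: timedelta) -> float:
    """Fraction of elapsed cycle time spent actively working (0..1)."""
    return active / total

# Hypothetical item: 2 active days within a 10-day cycle -> 20%.
eff = flow_efficiency(timedelta(days=2), timedelta(days=10))
print(f"{eff:.0%}")  # 20%
```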
Making Metrics Actionable¶
The Review Rhythm¶
Metrics need regular attention to be useful, but not constant attention.
Daily: Don't review metrics daily unless something seems off; then check the dashboards.
Weekly: Quick team check-in on flow metrics. Anything anomalous or trending the wrong way?
Sprint/Iteration: Review cycle time, throughput, any experiments. What did we learn?
Monthly/Quarterly: Deeper analysis. Trends over time. Strategic decisions about process.
From Metric to Action¶
| Metric Signal | What It Might Mean | Possible Actions |
|---|---|---|
| Cycle time increasing | Items too big, WIP too high, bottlenecks emerging | Slice smaller, reduce WIP, investigate queues |
| Deployment frequency dropping | Fear of releases, manual process, merge pain | Improve CI/CD, smaller batches, feature flags |
| Change failure rate rising | Skipping tests, rushing, inadequate review | Strengthen quality gates, slow down, investigate patterns |
| MTTR increasing | Poor observability, unclear runbooks, process confusion | Improve monitoring, update runbooks, practice incidents |
| Low flow efficiency | Waiting in queues, handoffs, slow feedback | Visualize queues, reduce handoffs, parallelize work |
What Actions Actually Help¶
For slow cycle time:
- Reduce WIP limits
- Slice work smaller
- Eliminate waiting (faster reviews, self-service deploys)
- Reduce handoffs
For low deployment frequency:
- Automate the deployment pipeline
- Use feature flags to decouple deploy from release
- Build confidence through better testing
- Practice deploying (if fear is the blocker)
For high change failure rate:
- Improve test coverage on failure-prone areas
- Add staging validation
- Use canary deployments
- Slow down and investigate root causes
For slow MTTR:
- Improve observability and alerting
- Create and maintain runbooks
- Practice incident response
- Reduce MTTD (Mean Time to Detection)
Avoiding Measurement Theater¶
Anti-patterns to Avoid¶
Gaming the metric. When the goal is "reduce cycle time," teams split work into tiny items that close fast but don't deliver value. Always pair metrics with outcomes.
Metric without action. Dashboards exist but nobody looks at them. Numbers are reported but not discussed. Measurement without follow-through is waste.
Individual metrics. Using delivery metrics to evaluate individual performance destroys collaboration. People will hoard work, avoid helping others, and optimize locally.
Too many metrics. When you track 20 things, you're not really tracking anything. Focus on 3-5 metrics that matter.
Vanity metrics. Lines of code, commits, PRs merged—these measure activity, not outcomes. They're worse than useless if they drive behavior.
Principles for Healthy Measurement¶
Start with the question. What do you want to know? Pick metrics that answer that question.
Fewer, not more. A small set of metrics you actually review beats a large set you ignore.
Team-level, not individual. Delivery metrics are about team performance and process, not individual productivity.
Trends over absolutes. Whether cycle time is "good" matters less than whether it's improving or declining.
Pair with qualitative data. Metrics tell you what; conversations tell you why. Don't skip the why.
What Good Looks Like¶
You'll know metrics are working when:
| Signal | What it looks like |
|---|---|
| Metrics inform decisions | "We're considering X because cycle time has increased" |
| Team understands the metrics | Anyone can explain what the metrics mean and why they matter |
| Actions follow signals | When a metric moves, the team investigates and responds |
| No gaming | People focus on outcomes, not manipulating numbers |
| Trends are visible | You can show how metrics have changed over months |
| Balance maintained | Speed and quality are both tracked; neither is sacrificed |
Failure Modes and Mitigations¶
The Dashboard Graveyard¶
Symptom: Dashboards exist but nobody looks at them. Metrics are technically available but effectively invisible.
Root cause: No review rhythm. Metrics aren't connected to decisions. Too many dashboards.
Mitigation: Pick 3-5 metrics. Review them regularly. Connect them to actions. Sunset unused dashboards.
The Metric Mandate¶
Symptom: Leadership demands metrics improve without providing support or context. Teams feel pressure to hit numbers.
Root cause: Metrics used as targets. No understanding of what affects them.
Mitigation: Educate leadership on what metrics mean and what affects them. Discuss barriers, not just results. Make it safe to surface problems.
The Local Optimization¶
Symptom: Team A improves their cycle time by pushing work to Team B. Overall system doesn't improve.
Root cause: Metrics measured at team level without system view.
Mitigation: Track end-to-end metrics that span teams. Look at value streams, not just team boundaries.
The Goodhart Trap¶
Symptom: The metric improves but outcomes don't. Cycle time is great, but customers aren't happier.
Root cause: Optimizing for the measure, not the outcome it represents.
Mitigation: Pair delivery metrics with outcome metrics (customer satisfaction, revenue, etc.). Ask whether better numbers mean better results.
Copy-Paste Artifact: Delivery Metrics Dashboard Spec¶
Use this to define what to track and how.
## Delivery Metrics Dashboard
**Team:** [Name]
**Last updated:** [Date]
### Core Metrics
| Metric | Definition | Source | Target | Current |
| --------------------- | ------------------------------------ | ----------------------- | ---------------------- | ------- |
| Deployment Frequency | Deploys to production per day/week | CI/CD system | [e.g., daily] | |
| Lead Time for Changes | Commit to production (median) | Version control + CI/CD | [e.g., < 2 days] | |
| Change Failure Rate | % of deploys causing rollback/hotfix | Incident tracking | [e.g., < 15%] | |
| MTTR | Detection to resolution (median) | Incident tracking | [e.g., < 4 hours] | |
| Cycle Time | In Progress to Done (median) | Issue tracker | [e.g., < 5 days] | |
| WIP | Items in progress | Issue tracker | [e.g., < 2x team size] | |
### Review Cadence
| Frequency | Who | What |
| --------- | ----------------- | ------------------------------- |
| Weekly | Team | Quick check on anomalies |
| Sprint | Team | Review metrics, discuss actions |
| Monthly | Team + leadership | Trends, strategic decisions |
### Data Sources
| Metric | Tool | Collection method |
| ------------------- | --------------- | ---------------------------- |
| Deploy frequency | [CI/CD tool] | [Automated/Manual] |
| Lead time | [Git + CI/CD] | [Automated] |
| Change failure rate | [Incident tool] | [Manual tag on incidents] |
| MTTR | [Incident tool] | [Calculated from timestamps] |
| Cycle time | [Issue tracker] | [Automated from workflow] |
Copy-Paste Artifact: Metrics Review Agenda¶
## Monthly Metrics Review
**Date:** [Date]
**Attendees:** [Team + stakeholders]
**Duration:** 45 minutes
### Metrics Snapshot
| Metric | Last Month | This Month | Trend | Target |
| -------------------- | ---------- | ---------- | ----- | ------ |
| Deployment Frequency | | | ↑/↓/→ | |
| Lead Time | | | ↑/↓/→ | |
| Change Failure Rate | | | ↑/↓/→ | |
| MTTR | | | ↑/↓/→ | |
| Cycle Time | | | ↑/↓/→ | |
### Discussion
1. **What moved?** Any significant changes in metrics?
2. **Why?** What caused the change? (Process, team, external factors)
3. **So what?** Do we need to act on this?
### Actions from Last Month
| Action | Status | Impact |
| -------- | ---------------------------- | ------------------- |
| [Action] | Done/In progress/Not started | [Effect on metrics] |
### New Actions
| Action | Owner | Due |
| ------ | ----- | --- |
| | | |
### Open Questions
-
Copy-Paste Artifact: Metric Investigation Template¶
Use when a metric moves unexpectedly.
## Metric Investigation: [Metric Name]
**Date:** [Date]
**Investigator:** [Name]
### The Signal
**Metric:** [Which metric]
**Expected:** [What we expected or the baseline]
**Actual:** [What we observed]
**Period:** [When this occurred]
### Potential Causes
| Hypothesis | Evidence For | Evidence Against |
| ---------------- | ----------------- | ---------------- |
| [Possible cause] | [Data supporting] | [Data refuting] |
| [Possible cause] | [Data supporting] | [Data refuting] |
### Root Cause
**Most likely cause:** [Description]
**Contributing factors:**
- [Factor]
- [Factor]
### Impact
**Who/what is affected:** [Description]
**Severity:** [Low / Medium / High]
### Recommended Actions
| Action | Priority | Owner | Due |
| ------ | -------- | ----- | --- |
| | | | |
### Monitoring
**How we'll know if it's fixed:** [Metric target or behavior]
**Check-in date:** [Date]
Further Reading¶
- Accelerate by Nicole Forsgren, Jez Humble, and Gene Kim – The research behind DORA metrics
- The Phoenix Project by Gene Kim, Kevin Behr, and George Spafford – DevOps principles in narrative form
- Measuring and Managing Performance in Organizations by Robert Austin – Why measurement often backfires
- The State of DevOps Report – Annual research on software delivery performance
Related¶
- Engineering Metrics – Comprehensive metrics guide
- Quality and CI – The practices that drive good metrics
- Continuous Improvement – Acting on what metrics reveal
- Planning and Slicing – How to improve cycle time through smaller work