Metrics in Execution¶
Metrics should inform decisions, not create anxiety. This page covers how to use delivery metrics—DORA metrics, cycle time, throughput—as signals that guide action rather than performance theater that consumes time without improving outcomes.
The goal is to connect abstract numbers to concrete behavior: What do these metrics tell us? When should we act on them? What actions actually move them?
What Problem This Solves¶
Most teams either ignore metrics or drown in them. Neither works.
When metrics are ignored:
- You can't tell if things are getting better or worse
- Problems become visible only when they're crises
- Conversations about performance are based on vibes, not data
- Improvement efforts have no feedback loop
When metrics are theater:
- Teams optimize for the metric, not the outcome
- Measurement consumes energy that could go to building
- Numbers are reported but not acted upon
- People game the system to look good
The right approach is in between: a small set of meaningful metrics, reviewed regularly, connected to actions that actually help.
When to Focus on Metrics¶
Actively invest when:
- You're establishing a new team or process
- You suspect problems but don't have data to confirm
- Leadership is asking questions you can't answer
- You want to evaluate whether an improvement worked
Maintain when:
- Things are working—metrics are health checks, not deep dives
- Onboarding new people to understand how the team works
Investigate when:
- Metrics move unexpectedly (in either direction)
- Metrics and team sentiment diverge (numbers look good but team feels bad, or vice versa)
- External stakeholders are concerned about delivery
Ownership¶
| Role | Responsibility |
|---|---|
| Engineering Manager | Owns metric visibility and review rhythm; addresses systemic issues |
| Tech Lead | Interprets technical implications; proposes actions |
| Platform/DevOps | Provides metric infrastructure; ensures accuracy |
| Individual Contributors | Understand metrics; contribute to improvement |
Metrics are signals, not targets
The moment a metric becomes a target, it stops being a useful measure. People will optimize for the metric instead of the outcome it represents. Use metrics to understand, not to judge.
The DORA Metrics¶
The DORA (DevOps Research and Assessment) metrics are the most validated predictors of software delivery performance. They're a good starting point for any team.
The Four Key Metrics¶
| Metric | Definition | Why It Matters |
|---|---|---|
| Deployment Frequency | How often you deploy to production | Measures ability to ship work incrementally |
| Lead Time for Changes | Time from code commit to production | Measures delivery speed |
| Change Failure Rate | Percentage of deployments causing failures | Measures quality of releases |
| Mean Time to Recovery (MTTR) | Time from incident detection to resolution | Measures resilience and recovery capability |
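The four metrics above can all be derived from deploy records. The sketch below is a minimal illustration, assuming a hypothetical record shape (`commit`, `deploy`, `failed`, `restored`); your CI/CD and incident tooling will export different field names.

```python
# Hedged sketch: deriving the four DORA metrics from deploy records.
# The record fields below are assumptions, not a standard export format.
from datetime import datetime
from statistics import median

deploys = [  # hypothetical data
    {"commit": datetime(2024, 3, 1, 9), "deploy": datetime(2024, 3, 1, 15),
     "failed": False, "restored": None},
    {"commit": datetime(2024, 3, 2, 10), "deploy": datetime(2024, 3, 3, 11),
     "failed": True, "restored": datetime(2024, 3, 3, 13)},
    {"commit": datetime(2024, 3, 4, 8), "deploy": datetime(2024, 3, 4, 18),
     "failed": False, "restored": None},
]

# Window covered by the data, in days (at least 1 to avoid division by zero).
days = (max(d["deploy"] for d in deploys)
        - min(d["deploy"] for d in deploys)).days or 1

deploy_frequency = len(deploys) / days                          # deploys/day
lead_time = median(d["deploy"] - d["commit"] for d in deploys)  # commit -> prod
failures = [d for d in deploys if d["failed"]]
change_failure_rate = len(failures) / len(deploys)              # share of bad deploys
mttr = median(d["restored"] - d["deploy"] for d in failures)    # deploy -> restored

print(deploy_frequency, lead_time, change_failure_rate, mttr)
```

Using medians rather than means keeps one outlier deploy from distorting lead time and recovery figures.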
Performance Levels¶
DORA research identifies four performance levels:
| Level | Deploy Frequency | Lead Time | Change Failure Rate | MTTR |
|---|---|---|---|---|
| Elite | Multiple times/day | < 1 hour | 0-15% | < 1 hour |
| High | Weekly to daily | 1 day - 1 week | 16-30% | < 1 day |
| Medium | Monthly to weekly | 1 week - 1 month | 16-30% | 1 day - 1 week |
| Low | Monthly+ | 1-6 months | 16-30% | 1 week - 1 month |
What DORA Tells You¶
Elite/high performers are not just fast—they're also stable. The research shows that speed and stability go together, not trade off. Teams that deploy frequently have lower failure rates and recover faster.
If metrics diverge, investigate. Fast lead time with high failure rate suggests skipping quality gates. Low failure rate with slow lead time suggests over-engineering. Balance matters.
Context matters. A startup and a hospital system have different acceptable risk profiles. Use industry benchmarks as reference, not dogma.
Cycle Time and Flow Metrics¶
Beyond DORA, flow metrics help you understand how work moves through your system.
Cycle Time¶
Definition: Time from when work starts to when it's done (typically from "In Progress" to "Done" or "Deployed").
Why it matters: Long cycle time means feedback is slow, risk per change is high, and work stays "in flight" longer.
What affects cycle time:
- Work item size (smaller = faster)
- WIP limits (lower = faster)
- Handoffs and waiting time
- Review bottlenecks
- Deploy frequency
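Cycle time itself is simple to compute from tracker timestamps. A minimal sketch, assuming hypothetical `started`/`done` fields from an issue-tracker export:

```python
# Sketch: median cycle time from issue-tracker timestamps.
# Field names ("started", "done") are assumptions about your export.
from datetime import datetime
from statistics import median

items = [  # hypothetical work items
    {"started": datetime(2024, 3, 1), "done": datetime(2024, 3, 4)},  # 3 days
    {"started": datetime(2024, 3, 2), "done": datetime(2024, 3, 9)},  # 7 days
    {"started": datetime(2024, 3, 5), "done": datetime(2024, 3, 7)},  # 2 days
]

cycle_times = [i["done"] - i["started"] for i in items]
print(median(cycle_times).days)  # median of 2, 3, 7 days -> 3
```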
Throughput¶
Definition: Number of work items completed per unit time.
Why it matters: Throughput indicates team capacity and is useful for forecasting.
Caution: Don't optimize for throughput by making items smaller without actually delivering more value. Count value delivered, not tickets closed.
Work in Progress (WIP)¶
Definition: Number of items currently being worked on.
Why it matters: High WIP correlates with longer cycle time (Little's Law). It also correlates with context switching, which kills productivity.
The relationship: Cycle Time ≈ WIP / Throughput. To reduce cycle time, reduce WIP.
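The relationship is easy to sanity-check numerically (the numbers below are hypothetical):

```python
# Little's Law: average cycle time follows from WIP and throughput.

def avg_cycle_time_days(wip: float, throughput_per_day: float) -> float:
    """Estimate average cycle time in days via Little's Law."""
    return wip / throughput_per_day

# A team with 12 items in flight finishing 2 items/day averages
# ~6 days of cycle time; halving WIP halves it.
print(avg_cycle_time_days(12, 2))  # 6.0
print(avg_cycle_time_days(6, 2))   # 3.0
```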
Flow Efficiency¶
Definition: Active time / Total time. How much of cycle time is spent actually working vs. waiting.
Why it matters: Most work spends more time waiting than being worked on. Flow efficiency reveals this.
Typical numbers: 15-40% flow efficiency is common. If your flow efficiency is < 15%, you have a waiting problem.
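As a concrete illustration (hypothetical durations), the ratio can be computed directly from active and total time:

```python
# Sketch: flow efficiency = active time / total elapsed time.
from datetime import timedelta

def flow_efficiency(active: timedelta, total: timedelta) -> float:
    """Fraction of elapsed cycle time spent actively working (0..1)."""
    return active / total

# Hypothetical item: 2 active days within a 10-day cycle -> 20%.
eff = flow_efficiency(timedelta(days=2), timedelta(days=10))
print(f"{eff:.0%}")  # 20%
```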
Making Metrics Actionable¶
The Review Rhythm¶
Metrics need regular attention to be useful, but not constant attention.
Daily: Don't review metrics daily unless something seems off; then check the dashboards.
Weekly: Quick team check-in on flow metrics. Anything anomalous or trending the wrong way?
Sprint/Iteration: Review cycle time, throughput, any experiments. What did we learn?
Monthly/Quarterly: Deeper analysis. Trends over time. Strategic decisions about process.
From Metric to Action¶
| Metric Signal | What It Might Mean | Possible Actions |
|---|---|---|
| Cycle time increasing | Items too big, WIP too high, bottlenecks emerging | Slice smaller, reduce WIP, investigate queues |
| Deployment frequency dropping | Fear of releases, manual process, merge pain | Improve CI/CD, smaller batches, feature flags |
| Change failure rate rising | Skipping tests, rushing, inadequate review | Strengthen quality gates, slow down, investigate patterns |
| MTTR increasing | Poor observability, unclear runbooks, process confusion | Improve monitoring, update runbooks, practice incidents |
| Low flow efficiency | Waiting in queues, handoffs, slow feedback | Visualize queues, reduce handoffs, parallelize work |
What Actions Actually Help¶
For slow cycle time:
- Reduce WIP limits
- Slice work smaller
- Eliminate waiting (faster reviews, self-service deploys)
- Reduce handoffs
For low deployment frequency:
- Automate the deployment pipeline
- Use feature flags to decouple deploy from release
- Build confidence through better testing
- Practice deploying (if fear is the blocker)
For high change failure rate:
- Improve test coverage on failure-prone areas
- Add staging validation
- Use canary deployments
- Slow down and investigate root causes
For slow MTTR:
- Improve observability and alerting
- Create and maintain runbooks
- Practice incident response
- Reduce MTTD (Mean Time to Detection)
Avoiding Measurement Theater¶
Anti-patterns to Avoid¶
Gaming the metric. When the goal is "reduce cycle time," teams split work into tiny items that close fast but don't deliver value. Always pair metrics with outcomes.
Metric without action. Dashboards exist but nobody looks at them. Numbers are reported but not discussed. Measurement without follow-through is waste.
Individual metrics. Using delivery metrics to evaluate individual performance destroys collaboration. People will hoard work, avoid helping others, and optimize locally.
Too many metrics. When you track 20 things, you're not really tracking anything. Focus on 3-5 metrics that matter.
Vanity metrics. Lines of code, commits, PRs merged—these measure activity, not outcomes. They're worse than useless if they drive behavior.
Principles for Healthy Measurement¶
Start with the question. What do you want to know? Pick metrics that answer that question.
Fewer, not more. A small set of metrics you actually review beats a large set you ignore.
Team-level, not individual. Delivery metrics are about team performance and process, not individual productivity.
Trends over absolutes. Whether cycle time is "good" matters less than whether it's improving or declining.
Pair with qualitative data. Metrics tell you what; conversations tell you why. Don't skip the why.
What Good Looks Like¶
You'll know metrics are working when:
| Signal | What it looks like |
|---|---|
| Metrics inform decisions | "We're considering X because cycle time has increased" |
| Team understands the metrics | Anyone can explain what the metrics mean and why they matter |
| Actions follow signals | When a metric moves, the team investigates and responds |
| No gaming | People focus on outcomes, not manipulating numbers |
| Trends are visible | You can show how metrics have changed over months |
| Balance maintained | Speed and quality are both tracked; neither is sacrificed |
Failure Modes and Mitigations¶
The Dashboard Graveyard¶
Symptom: Dashboards exist but nobody looks at them. Metrics are technically available but effectively invisible.
Root cause: No review rhythm. Metrics aren't connected to decisions. Too many dashboards.
Mitigation: Pick 3-5 metrics. Review them regularly. Connect them to actions. Sunset unused dashboards.
The Metric Mandate¶
Symptom: Leadership demands metrics improve without providing support or context. Teams feel pressure to hit numbers.
Root cause: Metrics used as targets. No understanding of what affects them.
Mitigation: Educate leadership on what metrics mean and what affects them. Discuss barriers, not just results. Make it safe to surface problems.
The Local Optimization¶
Symptom: Team A improves their cycle time by pushing work to Team B. Overall system doesn't improve.
Root cause: Metrics measured at team level without system view.
Mitigation: Track end-to-end metrics that span teams. Look at value streams, not just team boundaries.
The Goodhart Trap¶
Symptom: The metric improves but outcomes don't. Cycle time is great, but customers aren't happier.
Root cause: Optimizing for the measure, not the outcome it represents.
Mitigation: Pair delivery metrics with outcome metrics (customer satisfaction, revenue, etc.). Ask whether better numbers mean better results.
Copy-Paste Artifact: Delivery Metrics Dashboard Spec¶
Use this to define what to track and how.
## Delivery Metrics Dashboard
**Team:** [Name]
**Last updated:** [Date]
### Core Metrics
| Metric | Definition | Source | Target | Current |
| --------------------- | ------------------------------------ | ----------------------- | ---------------------- | ------- |
| Deployment Frequency | Deploys to production per day/week | CI/CD system | [e.g., daily] | |
| Lead Time for Changes | Commit to production (median) | Version control + CI/CD | [e.g., < 2 days] | |
| Change Failure Rate | % of deploys causing rollback/hotfix | Incident tracking | [e.g., < 15%] | |
| MTTR | Detection to resolution (median) | Incident tracking | [e.g., < 4 hours] | |
| Cycle Time | In Progress to Done (median) | Issue tracker | [e.g., < 5 days] | |
| WIP | Items in progress | Issue tracker | [e.g., < 2x team size] | |
### Review Cadence
| Frequency | Who | What |
| --------- | ----------------- | ------------------------------- |
| Weekly | Team | Quick check on anomalies |
| Sprint | Team | Review metrics, discuss actions |
| Monthly | Team + leadership | Trends, strategic decisions |
### Data Sources
| Metric | Tool | Collection method |
| ------------------- | --------------- | ---------------------------- |
| Deploy frequency | [CI/CD tool] | [Automated/Manual] |
| Lead time | [Git + CI/CD] | [Automated] |
| Change failure rate | [Incident tool] | [Manual tag on incidents] |
| MTTR | [Incident tool] | [Calculated from timestamps] |
| Cycle time | [Issue tracker] | [Automated from workflow] |
Copy-Paste Artifact: Metrics Review Agenda¶
## Monthly Metrics Review
**Date:** [Date]
**Attendees:** [Team + stakeholders]
**Duration:** 45 minutes
### Metrics Snapshot
| Metric | Last Month | This Month | Trend | Target |
| -------------------- | ---------- | ---------- | ----- | ------ |
| Deployment Frequency | | | ↑/↓/→ | |
| Lead Time | | | ↑/↓/→ | |
| Change Failure Rate | | | ↑/↓/→ | |
| MTTR | | | ↑/↓/→ | |
| Cycle Time | | | ↑/↓/→ | |
### Discussion
1. **What moved?** Any significant changes in metrics?
2. **Why?** What caused the change? (Process, team, external factors)
3. **So what?** Do we need to act on this?
### Actions from Last Month
| Action | Status | Impact |
| -------- | ---------------------------- | ------------------- |
| [Action] | Done/In progress/Not started | [Effect on metrics] |
### New Actions
| Action | Owner | Due |
| ------ | ----- | --- |
| | | |
### Open Questions
-
Copy-Paste Artifact: Metric Investigation Template¶
Use when a metric moves unexpectedly.
## Metric Investigation: [Metric Name]
**Date:** [Date]
**Investigator:** [Name]
### The Signal
**Metric:** [Which metric]
**Expected:** [What we expected or the baseline]
**Actual:** [What we observed]
**Period:** [When this occurred]
### Potential Causes
| Hypothesis | Evidence For | Evidence Against |
| ---------------- | ----------------- | ---------------- |
| [Possible cause] | [Data supporting] | [Data refuting] |
| [Possible cause] | [Data supporting] | [Data refuting] |
### Root Cause
**Most likely cause:** [Description]
**Contributing factors:**
- [Factor]
- [Factor]
### Impact
**Who/what is affected:** [Description]
**Severity:** [Low / Medium / High]
### Recommended Actions
| Action | Priority | Owner | Due |
| ------ | -------- | ----- | --- |
| | | | |
### Monitoring
**How we'll know if it's fixed:** [Metric target or behavior]
**Check-in date:** [Date]
Further Reading¶
- Accelerate by Nicole Forsgren, Jez Humble, and Gene Kim – The research behind DORA metrics
- The Phoenix Project by Gene Kim, Kevin Behr, and George Spafford – DevOps principles in narrative form
- Measuring and Managing Performance in Organizations by Robert Austin – Why measurement often backfires
- The State of DevOps Report – Annual research on software delivery performance
Related¶
- Engineering Metrics – Comprehensive metrics guide
- Quality and CI – The practices that drive good metrics
- Continuous Improvement – Acting on what metrics reveal
- Planning and Slicing – How to improve cycle time through smaller work