The DevOps engineer's handbook

DevOps metrics

Metrics for software delivery have a sketchy past. In the pursuit of valuable measures, we’ve seen countless mistakes. From lines of code to story-point velocity, focusing on easy-to-measure numbers and using them in unhealthy ways has damaged trust in experimental approaches.

There is a way forward, however. You must keep these past mistakes in mind, but you can introduce metrics without causing problems.

Many software teams already use metrics to track their production systems. Observability can tell you the state of your infrastructure and applications. If your observability practice is mature, you might even revert a change based on its impact on key application metrics, such as sales volumes or the value of orders placed.
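
To make this concrete, here’s a minimal sketch of such a guardrail, assuming you can read a business metric (like orders per minute) before and after a deployment. The values, threshold, and function name are illustrative assumptions, not a specific tool’s API.

```python
# A minimal sketch of a metric-based revert check. The baseline, current
# reading, and threshold are illustrative; a real check would query your
# observability platform for them.

def should_revert(baseline: float, current: float, max_drop: float = 0.2) -> bool:
    """Flag a rollback when the metric falls more than max_drop below baseline."""
    if baseline <= 0:
        return False  # no meaningful baseline to compare against
    drop = (baseline - current) / baseline
    return drop > max_drop

baseline_orders = 120.0  # orders per minute before the deployment (example)
current_orders = 80.0    # orders per minute after the deployment (example)

if should_revert(baseline_orders, current_orders):
    print("Order volume fell more than 20% after the deployment; consider reverting")
```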

Bringing observability to your teams and processes is just as important. You need an objective way to create improvement ideas and test their effectiveness.

To protect the value of your measurement system, it’s crucial to avoid misuse. Some organizations turned software delivery performance into a goal, but delivering software is not the objective. The purpose of both the software and the organization is to provide value.

There are many downsides to an inappropriate measurement system. The most significant damage from misusing metrics occurs when teams or individuals treat a measurement as more important than the organization’s mission.

A healthy measurement system

Your organizational culture will often predict the health of your measurement system. You need a high-trust, low-blame culture to avoid problems with elevation, aggregation, and misuse.

Elevation occurs when a metric gets reported too far up an organization. Rather than using metrics to inform decisions, teams and individuals over-emphasize tasks that move the number. This results in sub-optimization, where outcomes worsen even as the elevated metric improves.

The aggregation problem happens when you measure at the wrong scale. A metric that’s useful for measuring team and application performance may be misleading or damaging when applied at a different level of aggregation, such as an individual or a whole department.

You misuse a metric when you change the purpose of the measurement. The classic misuse example is taking measures meant for continuous improvement and using them to compare teams. When you misuse metrics, you create the conditions where teams can game measurement systems. This destroys the value of the metrics and damages performance at the organizational level.

Metrics for navigation

You might find it helpful to think about metrics as navigation aids. Imagine taking a long journey. You need a mix of different signals to arrive at your destination. At your first waypoint, like a train station, you need information to tell you which platform to use and if trains are running on time.

As you continue your journey, you use short-term signals and then discard them. Long-term signals, like your eventual destination and the stations where you change trains, are more stable.

This metaphor allows you to assign one of 4 categories to each metric you use.

  • Destination: The outcome that aligns with the organization’s purpose.
  • Waypoints: Solid indicators that you’re heading towards your destination metrics’ goals.
  • Signposts: Informational metrics that help you orient yourself. Not all of these are directional; some show other information you need for your journey.
  • Tracks: Hints about your surroundings. Sometimes you’ll focus closely on them; at other times you won’t pay them much attention.

Destination metrics

Destination metrics show whether you’re achieving your organization’s goal and purpose. This is often financial, but not always. For example, the Against Malaria Foundation’s destination metrics track clinical malaria rates in their operating regions.

While destination metrics are stable over the long term, they aren’t necessarily permanent. They tend to reflect the organization’s mission, so they may change if the mission changes.

Waypoint metrics

Waypoint metrics tell you if you’re heading in the right direction. Each time you pass a waypoint, you reduce uncertainty about the outcome. The metrics you choose as waypoints will respond earlier than destination metrics, yet the relationship you’ve established gives you confidence they’ll lead to the outcomes you want.

For example, you might use opportunities in the pipeline to predict future revenue. You must be cautious with goals for waypoint metrics. Having more opportunities is only helpful if you maintain some probability they’ll become revenue. Adding many low-probability opportunities will damage your destination metrics, as you’ll waste time on them rather than likelier sales.
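
A small worked example (with made-up numbers) illustrates this. Weighting each opportunity by its probability of closing gives an expected revenue figure, and padding the pipeline with low-probability deals inflates the opportunity count without meaningfully moving that figure:

```python
# Illustrative only: expected revenue from a sales pipeline, weighting each
# opportunity's value by its probability of closing. All figures are made up.

pipeline = [
    {"value": 50_000, "probability": 0.6},
    {"value": 20_000, "probability": 0.4},
    {"value": 80_000, "probability": 0.1},
]

expected = sum(o["value"] * o["probability"] for o in pipeline)
print(f"{len(pipeline)} opportunities, expected revenue: {expected:,.0f}")
# 3 opportunities, expected revenue: 46,000

# Adding ten low-probability opportunities more than quadruples the count...
pipeline += [{"value": 10_000, "probability": 0.02}] * 10

expected = sum(o["value"] * o["probability"] for o in pipeline)
print(f"{len(pipeline)} opportunities, expected revenue: {expected:,.0f}")
# 13 opportunities, expected revenue: 48,000 - barely moved
```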

Signpost metrics

Signpost metrics provide concrete signals for smaller groups within the organization. These signals connect the group’s work to the organization’s efforts. The metrics will differ for each team and have a recognizable local flavor.

A contact center team will have very different measurements from a sales or finance team. When a team improves performance against a signpost metric, it should connect to organizational performance. That doesn’t mean you should create goals for these metrics, as this can have counter-productive effects on outcomes.

Tracking metrics

A tracking metric is local and often temporary. You would use a tracking metric to narrow the focus on part of your role and improve it. For example, if your build times were causing a bottleneck, you’d measure them while you made improvements. Once the constraint moves, you’ll stop measuring build times and focus on the next one.

It’s important to keep trimming tracking metrics to avoid overloading yourself with information. Removing them from your dashboard makes room for new metrics, though you could leave an automated alarm to tell you if things slip.
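
As an example of that kind of lightweight alarm, the sketch below appends each build’s duration to a log and warns when the recent average slips past a threshold. The log format, window size, and threshold are assumptions for illustration.

```python
# A minimal sketch of a build-time alarm. The log format, window, and
# threshold are illustrative assumptions.
from pathlib import Path

LOG = Path("build_times.log")  # one build duration in seconds per line
THRESHOLD_SECONDS = 600        # warn when recent builds average over 10 minutes
RECENT_BUILDS = 20

def record_build(duration_seconds: float) -> None:
    """Append the latest build duration to the log."""
    with LOG.open("a") as f:
        f.write(f"{duration_seconds}\n")

def check_alarm() -> None:
    """Warn if the average of the most recent builds exceeds the threshold."""
    if not LOG.exists():
        return
    durations = [float(line) for line in LOG.read_text().split() if line]
    recent = durations[-RECENT_BUILDS:]
    if not recent:
        return
    average = sum(recent) / len(recent)
    if average > THRESHOLD_SECONDS:
        print(f"Build times are slipping: recent average is {average:.0f}s")

record_build(640.0)  # example: a 10-minute, 40-second build
check_alarm()
```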

You should stop collecting tracking metrics when their value no longer justifies the effort to collect them.

Differences between metric levels

Higher up this classification system, the metrics are more stable. While they might still change, they’ll do so less often. These measurements tend to confirm you’re heading in the right direction.

Lower down, metrics should change more often in response to your context. These metrics are good for prediction as they move earlier. They are more responsive to changes you introduce. Their value comes from reducing the lag between cause and effect. You must continuously check the relationship they have to outcomes, though.

If you find the relationship between lower-level measures and destination metrics is no longer valid, you must replace them.
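
One lightweight way to check such a relationship, assuming you have paired historical samples of both metrics, is a simple correlation. This sketch uses made-up monthly figures and an arbitrary cutoff; it’s a sanity check, not a substitute for judgment.

```python
# Sketch: checking whether a lower-level metric still tracks a destination
# metric. Data and cutoff are illustrative. Requires Python 3.10+.
from statistics import correlation

waypoint = [12, 15, 14, 18, 21, 19, 24]            # e.g., monthly qualified leads
destination = [100, 120, 115, 150, 170, 160, 200]  # e.g., monthly revenue ($k)

r = correlation(waypoint, destination)
print(f"Correlation: {r:.2f}")

if abs(r) < 0.3:  # arbitrary illustrative cutoff
    print("Weak relationship: consider replacing this metric")
```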

Prioritize measurement health

In an ideal world, improvements to your work would cause an immediate response in your destination metrics. This rarely happens. Most gains are marginal, and their impact is delayed. It’s also risky to pause all other changes while you wait to see if your improvement makes a difference.

Metrics that are more sensitive to local changes give you a faster feedback loop as you experiment to find better ways of working. You must carefully maintain the health of these metrics, which makes avoiding misuse, elevation, and bad aggregation even more important.

The individuals and interactions involved in building software create a complex and dynamic system, so it’s challenging to measure. You need your best efforts to operate metrics that are healthy, not harmful.

You need to:

  • Collect and use the data for its intended purpose
  • Balance the need for fast feedback with the importance of tracking real outcomes
  • Constantly adjust to keep things healthy

Wherever possible, let teams collect measurements for their own use. By limiting the extent of publication for the metrics, you’ll have more control over how they get used. The metrics should help you to identify improvement opportunities. You should not use metrics to compare teams or rate individual performance.

As a team makes progress, internal and external customers will notice the improvement without needing the numbers.

Our white paper on measuring Continuous Delivery and DevOps gives general guidance on metric design and explains types and levels of measurement. You can also learn about the common mistakes in DevOps metrics on our blog.

In Out of the Crisis, W. Edwards Deming warned of the dangers of depending only on numbers to manage work:

“Management by use only of visible figures, with little or no consideration of figures that are unknown or unknowable.”

You shouldn’t rule out metrics that are hard to get or include metrics just because your existing tools make them easy to find. Select measurements that give you a real sign of progress and work out how to report on them.

It takes more than numbers to be successful in DevOps. Alongside a healthy set of metrics, you must have a strong sense of purpose for your organization and its products and services. You’ll also need to look beyond short-term outcomes in your search for success over the longer term.

Metrics for software delivery

You may get overwhelmed by all the different frameworks and measurements. Don’t be. You don’t need to measure everything all the time. Use your situational awareness to apply the right metrics at the right time.

In most cases, you should collect measurements at the team level for use within the team. This keeps the metrics manageable and easy to change when they are no longer useful.

DORA metrics

Developed by DORA (DevOps Research and Assessment), the DORA metrics balance competing software delivery concerns by measuring both throughput and stability. These operational measures help unlock the potential benefit of software delivery to organizational outcomes.
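
As a sketch of how two of these measures can be derived, the snippet below computes deployment frequency (throughput) and change failure rate (stability) from a hypothetical list of deployment records. The record format is an assumption; only the metric definitions come from DORA.

```python
# Sketch: deriving two DORA metrics from deployment records. The record
# format is a made-up example; the metric definitions come from DORA.
from datetime import date

deployments = [
    {"day": date(2024, 5, 1), "caused_failure": False},
    {"day": date(2024, 5, 2), "caused_failure": True},
    {"day": date(2024, 5, 6), "caused_failure": False},
    {"day": date(2024, 5, 8), "caused_failure": False},
]

days_in_period = 28  # the reporting window

deployment_frequency = len(deployments) / days_in_period  # throughput
change_failure_rate = (
    sum(d["caused_failure"] for d in deployments) / len(deployments)  # stability
)

print(f"Deployment frequency: {deployment_frequency:.2f} per day")
print(f"Change failure rate: {change_failure_rate:.0%}")
```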

Learn more about DORA metrics.

The SPACE framework

The SPACE framework provides a structure that encourages good metric design. By combining measurements across a range of categories, you can create a robust measurement strategy for your organization.

Learn more about the SPACE framework.

MONK metrics

If you have a platform engineering team, MONK metrics will help you track the effectiveness of the internal developer platform (IDP).

Learn more about MONK metrics.

Developer experience metrics

One of the more challenging things to measure is developer productivity. The developer experience (DevEx) metrics provide a 3-dimensional system for measuring teams.

Learn more about DevEx metrics.

Continuous Delivery statements

You can use the Continuous Delivery statements to perform a qualitative assessment of your software delivery performance. Although you can’t display the results in a graph, this is a powerful way to check your capability and explore areas to improve.

Learn more about Continuous Delivery statements.
