Is My Software Quality Bad? Three Metrics to Find Out
Three simple metrics that software engineering leaders can use to measure software quality.
If you are wondering whether your team’s software quality could be better, you may already have some data.
Are you seeing any of these concerning signs?
- QA is always falling behind or being a bottleneck.
- New projects are started without any test strategy or plan.
- There is no test automation, or what exists is mainly UI automation.
- You have stabilization sprints before releases.
- You are losing engineers.
- You do not measure quality and do not have a good quality dashboard.
These are all quality-related frustrations, and they should be fixed, but they are not enough on their own to say the software has quality problems. Quality is defined by your customer experience.
You can measure your customer satisfaction or Net Promoter Score, but a poor score may just be an indication that you don’t have the right features, or that your support and customer success departments are not doing their jobs. What you want to know is: “Is my engineering team producing high-quality software?”
Three metrics to measure your software quality
To check your quality quickly, try the following three metrics:
The first metric is the amount of time your team spends fixing customer-found defects
Take a look at the figure Engineering Time Spent by Area below, which describes what teams can spend their time on. The left-right axis describes work that is past-facing or future-facing. The top-down axis describes work that is done for external customers or on internal technology to support those customers. You want your teams to spend time building new features. However, that can’t be all the team does. They must also spend time updating the architecture to support future features. Your teams need to spend time paying down technical debt to prevent it from building up and slowing down development. Technical debt includes software upgrades, replacing out-of-date components, refactoring to improve code that was not ideal when it was first written, and so on.
The thing you least want engineers doing is correcting customer-found defects. For this metric, don’t worry about the time it takes to fix internally found bugs; that’s just part of your software development process. The concern is the time it takes to fix customer-found bugs, including bugs that block new deployments. Time spent fixing customer bugs slows down new features and makes customers unhappy.
So how do you measure the time spent on defect correction?
If you have a dedicated team that is fixing customer bugs, then just divide the size of that team by the whole engineering team size (Dev and QA, but not managers) and you’ll get the percentage of effort spent fixing bugs. By the way, having a dedicated bug-fix team is a bad sign for your quality. In general, it is not a good idea to have such a team except as a short-term workaround to get through an unexpected bug backlog – say six months. If you have to keep the team longer or repeatedly build such a team, there’s a quality problem.
If you are using planning poker for estimation, you can compare the number of story points for bugs vs. other work. Story points are not comparable across teams, but you can look at the number of points each team spends on fixing customer bugs vs. their other work.
If you don’t have story points, you can use the counts of bugs and stories, weighted by average effort. For example, say your average story takes 3 days and your average bug takes 1.5 days (counting both development and QA time). Each story then costs as much as two bugs, so customer bugs / (bugs + (stories * 3/1.5)) gives your percentage.
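To make the arithmetic concrete, here is a minimal Python sketch covering all three ways to estimate the percentage (dedicated team, story points, and weighted counts). The function names and example numbers are illustrative assumptions, not part of any tool:

```python
# Illustrative sketch: three ways to estimate the share of engineering
# effort spent fixing customer-found bugs. All numbers are made up.

def pct_by_team_size(bugfix_team_size: int, total_engineers: int) -> float:
    """Approach 1: dedicated bug-fix team vs. whole team (Dev + QA, no managers)."""
    return bugfix_team_size / total_engineers

def pct_by_story_points(bug_points: float, total_points: float) -> float:
    """Approach 2: story points spent on customer bugs vs. all work, per team."""
    return bug_points / total_points

def pct_by_counts(customer_bugs: int, stories: int,
                  avg_story_days: float = 3.0,
                  avg_bug_days: float = 1.5) -> float:
    """Approach 3: raw counts weighted by average effort per item."""
    # Each story costs avg_story_days / avg_bug_days bug-equivalents,
    # so this reduces to bugs / (bugs + stories * 3/1.5) from the text.
    return customer_bugs / (customer_bugs + stories * (avg_story_days / avg_bug_days))

print(f"{pct_by_team_size(4, 40):.0%}")  # 10% -> OK, but watch it
print(f"{pct_by_counts(12, 40):.1%}")    # ~13% -> begun to smell bad
```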
So, what does good quality look like?
Based on the 100+ companies I’ve looked at, 5% or less of engineering time spent fixing customer bugs is ideal. 5% to 10% is OK. 10% to 15% isn’t the end of the world, but it’s begun to smell bad. Spending more than 15% of the team’s time fixing customer bugs indicates a significant quality problem that is wasting development time you could be spending on new features.
The second metric is the Change Failure Rate, also called the Hotfix Rate
What percentage of your releases had a problem you needed to fix as soon as possible (without waiting for the next release)?
Count your last 6 to 10 releases (or 1 to 2 years’ worth if releases are very infrequent). The number of releases with issues divided by the total number of releases is your hotfix rate. If you are having trouble matching hotfixes to releases, you can approximate it by adding up all the hotfixes over 6 months or a year, counting the releases over the same period, and dividing the number of hotfixes by the number of releases.
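As a quick sketch of the arithmetic (the counts below are made up for illustration):

```python
# Illustrative sketch: change failure (hotfix) rate over a window where
# you can count both hotfixes and releases.

def change_failure_rate(hotfixes: int, releases: int) -> float:
    """Urgent out-of-band fixes divided by total releases in the same window."""
    return hotfixes / releases

# Example: 3 hotfixes across 10 releases in the last year.
print(f"{change_failure_rate(3, 10):.0%}")  # 30% -> Medium in the 2022 DORA bands below
```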
The Change Failure Rate is one of Google’s DevOps Research and Assessment (DORA) metrics, which Google measures every year across many companies. In 2022, companies with a failure rate of
- 46% to 60% or higher were rated Low Quality
- 16% to 30% were rated Medium Quality
- up to 15% were rated High Quality
The change failure rate is not a perfect metric; it is highly correlated with your release frequency. If you release more often, each release carries fewer changes, so more of your releases will ship without failures. However, it’s still a good way to see whether you have a problem from a customer’s point of view.
The third metric is the bug production rate or incoming bug rate
You can think of software engineering teams as producing features, but they also produce bugs, and they tend to do so at a relatively steady rate. We can measure that rate.
Measure the customer-found bugs over some time and divide them by the size of the engineering team. Here’s the formula:
Bug Production Rate = (number of customer bugs) / (groups of 8 engineers) / (the number of 2-week sprints in a time period)
You can get the number of customer-found defects by looking at ones escalated by customer support. If you have a service, you may also want to add in the operational failures that cause service issues like downtime.
I normalize by groups of 8 engineers because I want to emphasize that we are not measuring the bug production of an individual developer. We’re not trying to find blame. We want to measure the quality of the software. The number of engineers includes developers and QA, but not managers or people who don’t code (Product Owners, non-coding leads, non-coding Scrum Masters, etc.).
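Putting the formula and the headcount rules together, here is a minimal sketch; the inputs are illustrative:

```python
# Illustrative sketch: bug production rate, normalized to groups of 8
# engineers and 2-week sprints as in the formula above.

def bug_production_rate(customer_bugs: int, coding_engineers: int,
                        weeks: int) -> float:
    """Customer-found bugs per group of 8 engineers per 2-week sprint.

    coding_engineers counts developers and QA only -- no managers,
    Product Owners, or non-coding leads/Scrum Masters.
    """
    groups_of_8 = coding_engineers / 8
    sprints = weeks / 2
    return customer_bugs / groups_of_8 / sprints

# Example: 18 customer bugs from 24 coding engineers over 12 weeks.
rate = bug_production_rate(18, 24, 12)
print(f"{rate:.1f} bugs / team of 8 / sprint")  # 1.0 -> good enough quality
```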
This is a slowly changing metric, so measure it across a minimum of 3 months. You want several releases to be covered. It’s not a very good metric to look at every sprint to see if you are doing better, as it will fluctuate based on release schedules and when customers use the features.
Based on the 100+ companies I’ve looked at, companies with 2 or fewer bugs per team of 8 per sprint have good enough quality that it isn’t worth trying to improve further. Companies with 4 or more bugs per team per sprint have a quality problem that needs fixing. Between 2 and 4 is the “it’s starting to smell; you don’t have to fix it right now, but a little quality love wouldn’t hurt” zone.
So, now what?
Let’s say you’ve confirmed your suspicion, and your software quality could be improved. Can you use these metrics to help you make improvements? Unfortunately, no. These metrics are great for showing whether you have a quality problem, but they are not diagnostic enough to tell you what the problem is or how to fix it. Also, because each metric is slow to change, check them only every 3 or 4 releases to track improvement, not more frequently. In the short term, there is a lot of noise in the data that will obscure the trend.
If you have a quality problem, you’re going to need to diagnose its root causes to come up with a transformation plan that fixes them. Root causes vary with every company! I’ve seen companies with a disconnect between requirements and engineering, others with integration problems between teams, others that lacked deployment automation and suffered release issues, and so on. This is where I can help you!
My Experience
If you need help improving your software quality, I have been transforming the software quality of companies as a consultant for over 8 years. I have
- over 30 years of experience leading teams to improve quality, service resilience, and availability.
- acted as an interim leader guiding quality engineering for several companies.
- led multiple Quality, DevOps, and Scaled Agile transformations across various industries.
I can help you find the root causes of your quality problems, create a plan to fix them, and coach you through the transformation.
Copyright © 2023, All Rights Reserved by Bill Hodghead, shared under a Creative Commons 4.0 license