Software Testing Series: Top Three Measurements of Quality


The three most important things to understand about the quality of your software are the three things most relevant to your business and your stakeholders (and arguably, your boss).

Top three measurements of software quality

  1. How do people perceive our quality?
  2. How big of a problem is our quality?
  3. How bad is our software, really?

Going into some detail on each area…

1. How do people perceive our quality? This is our desert island* metric. Perception is reality. If we release tragically buggy code, but none of our users find the bugs, then our perceived quality level is good. If we have one bug in a million lines of code, but it causes our software to bluescreen for our CEO at Comdex, our perceived quality level is bad.

There are two metrics at the top of our list for tracking perception of quality.
A. Tracking reported defects. Keeping track of the bugs that are submitted against our software is the most direct way to get insight into how our software is perceived. When we use a bug-tracking system, this process is usually completely automated. We can look at all kinds of statistics about who submitted bugs, when, and against which release. Ideally, we would also know about their environment – operating system, running applications, available memory, drivers, etc. The Windows operating system and the Firefox web browser are two common applications that include environmental information in automated bug submissions.
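To make this concrete, here is a minimal sketch in Python of what an automated submission with environment capture might look like. The field names and the capture_environment() helper are hypothetical, invented for illustration – not any particular tracker's API:

```python
import platform
import sys
from dataclasses import dataclass, field


@dataclass
class BugReport:
    """A defect submission, with the reporter's environment attached."""
    title: str
    description: str
    release: str
    reporter: str
    environment: dict = field(default_factory=dict)


def capture_environment() -> dict:
    """Collect basic environment details to attach to a submission."""
    return {
        "os": platform.platform(),            # operating system and version
        "machine": platform.machine(),        # hardware architecture
        "python_version": sys.version.split()[0],
    }


report = BugReport(
    title="Export dialog crashes on save",
    description="Clicking Save with an empty filename crashes the app.",
    release="2.3.1",
    reporter="user@example.com",
    environment=capture_environment(),
)
```

With the environment captured automatically, the statistics described above (who, when, which release, what configuration) fall out of simple queries against the tracker.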

B. Tracking qualified defects. Many of the bugs that are submitted are not actually bugs. A submission represents a user's belief that the software should behave differently than it does. Sometimes, it misbehaves because of a bug. Sometimes, the user makes a mistake, reporting something that never actually happened. And sometimes, the user wants the software to behave a particular way, and it doesn’t – this is what developers usually call a feature request.
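One way to picture this triage is as a three-way classification. The sketch below is hypothetical (the function name and inputs are invented for illustration), but it captures the three buckets just described: a qualified defect, a report of something that never happened, and a feature request:

```python
from enum import Enum, auto


class Disposition(Enum):
    """The three outcomes of qualifying a submitted report."""
    QUALIFIED_DEFECT = auto()   # the software genuinely misbehaves
    NOT_REPRODUCIBLE = auto()   # the reported behavior never actually happened
    FEATURE_REQUEST = auto()    # behaves as specified, but not as desired


def qualify(reproducible: bool, matches_spec: bool) -> Disposition:
    """Sort a submission into one of the three buckets."""
    if not reproducible:
        return Disposition.NOT_REPRODUCIBLE
    if matches_spec:
        # Works as documented -- the user simply wants different behavior.
        return Disposition.FEATURE_REQUEST
    return Disposition.QUALIFIED_DEFECT
```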

This developer perspective is a little misleading, however. To developers, if the software performs according to the spec, it meets the requirements, and is therefore good software. To a product manager, the same report can represent a requirements-writing mistake: the desired behavior should have been in the spec, but it was never documented. In short, the source of this bug is in the requirements.

The reason we care about tracking both qualified and reported bugs as part of tracking perception is that we encourage teams to provide feedback to the users about their reported bugs. An automated email saying “This bug has been acknowledged” or “Thank you for reporting this, it is not considered a bug, please see this entry in our FAQ [link]” is usually sufficient. There are other alternatives as well: providing read-only public access to our bug-tracking system is a good idea, and personal contact and follow-up is even better. Closing the loop also gives us an opportunity to build stronger relationships with our users and gather additional feedback from them.
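Here is a sketch of that automated feedback, assuming the three-bucket triage above. The template wording and the FAQ URL are placeholders:

```python
ACK_TEMPLATES = {
    "qualified_defect": "This bug has been acknowledged.",
    "not_reproducible": (
        "Thank you for your report; we could not reproduce this behavior."
    ),
    "feature_request": (
        "Thank you for reporting this. It is not considered a bug; "
        "please see this entry in our FAQ: {faq_url}"
    ),
}


def acknowledgment(disposition: str,
                   faq_url: str = "https://example.com/faq") -> str:
    """Build the feedback message sent back to the reporter."""
    # str.format ignores unused keyword arguments,
    # so one call covers all three templates.
    return ACK_TEMPLATES[disposition].format(faq_url=faq_url)
```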

2. How big of a problem is our quality? When reporting quality status to a non-technical manager, we are often asked “OK, there are 100 bugs – how bad are they?” The same issues that we struggle with when prioritizing requirements are also at the root of our problems in prioritizing bug fixes.

There are two important axes, severity and priority, that can be used to describe how bad a bug is. The best way to visualize this is with a Venn diagram. The “hot” red area contains the bugs most likely to burn you, and the “cool” green area contains the bugs least likely to cause real problems.
[Venn diagram: bug severity vs. priority]
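One simple way to turn the two axes into the “hot” and “cool” regions of the diagram is to score each bug by the product of its severity and priority. The scales and thresholds below are invented for illustration:

```python
def temperature(severity: int, priority: int) -> str:
    """Place a bug on the hot/cool grid.

    severity: 1 (cosmetic) to 5 (crash or data loss) -- hypothetical scale
    priority: 1 (fix someday) to 5 (fix before release)
    """
    score = severity * priority
    if score >= 16:
        return "hot"    # most likely to burn you
    if score <= 4:
        return "cool"   # least likely to cause real problems
    return "warm"
```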
3. How bad is our software, really? Armed with data about how many bugs we have, we can now ask the question, “How buggy is our software?” With 100 reported bugs, we are in a lot of trouble if we’ve written a small, simple application. A hundred reported bugs in an operating system is, proportionally, a much smaller problem.

Proportionality is the key, and we’ll suggest different ways to measure it in another post. A couple of easy ways to put bug levels in perspective (with a quick sketch after the list) are

  • Bugs per line of code
  • Bugs per man-month of development
  • Bugs as a function of usage (per user, per use, etc.)
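Here is a quick sketch of how normalization changes the picture. The code sizes below are invented for illustration, not measurements:

```python
def defect_density(bug_count: int, kloc: float) -> float:
    """Bugs per thousand lines of code -- one common normalization."""
    return bug_count / kloc


def defects_per_effort(bug_count: int, person_months: float) -> float:
    """Bugs per man-month of development effort."""
    return bug_count / person_months


# The same 100 reported bugs, at two very different scales:
print(defect_density(100, kloc=10))       # small app: 10.0 bugs per KLOC
print(defect_density(100, kloc=40_000))   # operating system: 0.0025 per KLOC
```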

Thanks to Harry Nieboer for his post and links.

* Explanation of the desert island meme.

– – –

Check out the index of software testing series posts for more articles.

