
Communicate Relevant Quality Metrics


Most teams think about testing in terms of code coverage – what % of the lines of code are covered? What matters to our stakeholders is how well the software works. More precisely, how well does the software let the users work? We should be targeting our quality message in relevant terms, because users care about what they can do with software, not how well we programmed it.

The problem is that we tend to think about our software from the inside out, and our customers think about it from the outside in. We need a way to communicate our understanding of the insides within the customer’s framework of understanding the outside.

Inside Out Quality Measurement

Inside-out measurement of quality is what most developers and testers think about. Users don’t. Executives don’t. Customers don’t. This section recaps the inside-out view in order to contrast it with the outside-in view.

We’ve talked about how to view software as a framework of structured requirements, designs, code, and tests. This is the right way to think about software from inside our process. A diagram we’ve used before to show the structured requirements view of the world puts it pretty succinctly.

Wiegers' view of structured requirements

Interpreting the Inside-Out Diagram

  • The user has goals.
  • Each goal is achieved by enabling one or more use cases.
  • Each use case is enabled by implementing one or more functional requirements with a set of characteristics and subject to restrictions.
  • Each functional requirement drives a set of design decisions.
  • Each design element is implemented by writing software (code).
  • Each element of code is tested with one or more tests. These are generally unit tests and, by definition, white box tests (a code sketch of this chain follows the list).
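
To make that chain concrete, here is a minimal sketch (in Python; the class and field names are ours for illustration, not part of Wiegers’ model) of how the traceability links from goals down to tests might be represented:

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative traceability records - hypothetical names, but each level
# holds references to the level below it, mirroring the list above.

@dataclass
class Test:
    name: str            # typically a white box unit test
    passed: bool = False

@dataclass
class CodeElement:
    name: str
    tests: List[Test] = field(default_factory=list)

@dataclass
class DesignElement:
    name: str
    code: List[CodeElement] = field(default_factory=list)

@dataclass
class Requirement:
    text: str
    design: List[DesignElement] = field(default_factory=list)

@dataclass
class UseCase:
    name: str
    requirements: List[Requirement] = field(default_factory=list)

@dataclass
class Goal:
    description: str
    use_cases: List[UseCase] = field(default_factory=list)
```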

Incorporating an interaction design process into this approach results in a more complex, blended view of the world. Wiegers’ view is simpler, so we’ll focus on how to communicate with our users in this framework. These ideas can be easily extended to other frameworks.

One Step Back

Taking one step back, we can see a slightly bigger picture. In the following diagram, we collapse all of the requirements elements into a single rectangle, and add testing.

inside view

This diagram shows a single use case, enabled through two requirements, each of which is driving a design element. Each design element is implemented with a section of code, and each section of code is also tested with one or more white box tests.

The Problem with Inside-Out

While inside-out is the way that we have to think about our software when developing it, it couldn’t be more wrong as a way to describe our quality to our stakeholders. We might be able to communicate the overly simplified diagram above to a client, but even adding one level of complexity will derail the message. The diagram below will make most stakeholders’ eyes water, even though it is still simplistic.
inside view

When we deliver a release, we need to communicate about the quality of the release. We can do this by providing the results of our test suite. The test suite is represented by the “T” boxes in these simplified diagrams. We can tell our stakeholders that we have 90% code coverage, or that 85% of our tests pass. Most measurements of quality are meaningless once you get outside of the box.

More gibberish for our customers.

Outside-In Quality Measurement

Our customers view software from the outside in.

Outside view of software

Interpreting the Customer-View Diagram

  • The user has one or more goals. (WHY?)
  • The user achieves those goals by enacting use cases. (WHAT?)
  • The use cases are enabled by buying software. (HOW?)

We engage with users during development in an agile or iterative process. During that engagement, users will care about the next level of detail (requirements), but only because requirements enable the use cases (or scenarios) they actually care about. We need to write requirements so that users can get value out of the software. That responsibility is ours; what users care about is how they use the software.

Using Inside Knowledge for Outside Communication

We need to communicate some message about the quality of each release to our stakeholders, because keeping them in the loop keeps our project alive. It sets expectations, can create momentum, and prevents surprises. All of these are very good things(tm).

We can do this by providing the results of our test suite. What we want to tell them is “Use Case 1 has 100% passing tests, UC 2 has 100% passing tests, and UC 3 has 50% passing tests.” This lets our stakeholders know that Use Case 1 and UC 2 are both ready to start generating ROI for them. UC 3 is not ready yet, and needs more work. When we combine this “quality by use case” message with “release planning by use case” we provide a clean message for our customers – one that is targeted to them and makes sense from their perspective.

In the diagrams, each level is supported by the level below it and, conversely, supports the level above it. By following the arrows backwards, we can see which code a given test case supports, then which design element that code supports, and keep moving up until we reach the use cases. In our example, the mappings look like the following:

Mapped use cases

Interpreting the Use Case Mapping Diagram

  • The first three test cases all support Use Case 1.
  • The next two test cases support UC 2.
  • The last two test cases support both UC 2 and UC 3.

The last two test cases are doing double duty in our example, because both UC 2 and UC 3 depend upon the same requirement. This is a very common element of real-world diagrams like this one. The tests of a common code element will support multiple use cases.
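
As a minimal sketch of how this roll-up could be automated (the test names, mapping, and results below are hypothetical, not from a real project), we can report a pass rate per use case instead of a single suite-wide number:

```python
# Hypothetical test-to-use-case traceability; in practice this mapping would be
# derived by following the requirements and design links described above.
tests_to_use_cases = {
    "test_1": ["UC 1"],
    "test_2": ["UC 1"],
    "test_3": ["UC 1"],
    "test_4": ["UC 2"],
    "test_5": ["UC 2"],
    "test_6": ["UC 2", "UC 3"],   # shared tests do double duty
    "test_7": ["UC 2", "UC 3"],
}

# Hypothetical results from the latest test run.
results = {"test_1": True, "test_2": True, "test_3": True,
           "test_4": True, "test_5": True, "test_6": False, "test_7": True}

def pass_rate_by_use_case(mapping, results):
    """Roll test results up to every use case each test supports."""
    totals, passed = {}, {}
    for test, use_cases in mapping.items():
        for uc in use_cases:
            totals[uc] = totals.get(uc, 0) + 1
            if results[test]:
                passed[uc] = passed.get(uc, 0) + 1
    return {uc: passed.get(uc, 0) / totals[uc] for uc in totals}

print(pass_rate_by_use_case(tests_to_use_cases, results))
# -> {'UC 1': 1.0, 'UC 2': 0.75, 'UC 3': 0.5}
```

Note how the one failing shared test drags down both UC 2 and UC 3 at once – exactly the weighting effect discussed in the next section.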

Quality Measurement and Motivation

Suddenly, some test cases are more important than others. When there is a system of metrics in place, people tend to optimize on those metrics.

With inside-out quality measurements, all test cases are created equal. If 5 of 1000 tests fail, quality is really good. Maybe. What if those failed test cases are in the database connection that is critical to every use case? Five tests fail (half a percent!) and nothing works.

With outside-in quality measurements, critical test cases carry the most weight. The five failing test cases will cause all of our use cases to fail, and they will get the attention they deserve.

The same approach can be used for measuring code coverage, cyclomatic complexity, or any other (normally) inside-out metric. Developers are smart. When they see that they can kill N birds with one stone, they jump at the chance. Fixing a critical bug or adding a well-placed test case can have a multiplied impact with this approach.

Use cases that are isolated will get the least attention – unless we also prioritize them.

Conclusion

We write software for our customers. They buy it because it is valuable to them. Our customers think about that value in terms of what they can accomplish with the software.

When we communicate with our customers about quality, it should be on our customers’ terms, not ours.

Measuring the Cost of Quality: Software Testing Series


Should we test our software? Should we test it more?

The answer to the first question is almost invariably yes. The answer to the second question is usually “I don’t know.”

We write a lot about the importance of testing. We have several other posts in our series on software testing. How do we know when we should do more automated testing?

Determining whether to test more is an ROI analysis. Kent Beck has a great position –

If testing costs more than not testing, then don’t test.

At first glance, the statement sounds trite, but it really is the right answer. If we don’t increase our profits by adding more testing, we shouldn’t do it. Kent is suggesting that we only increase the costs and overhead of testing to the point that there are offsetting benefits.

We need to compare the costs and benefits on both sides of the equation. We’ll start with a baseline of the status quo (keeping our current level of testing), and identify the benefits and costs of additional testing, relative to our current levels.

We should do more automated testing when the benefits outweigh the costs.

We’ll limit our analysis to increasing the amount of automated testing, and exclude manual testing. We will assume that more testing now will reduce the number of bugs introduced in the future. This assumption will only hold true when developers can run the automated tests as part of their personal development process. We’ve written before about the sources of bugs in the software development process, and in other posts in this series we show how automated testing can prevent future bugs (unlike manual testing, which can only identify current bugs).

We are also assuming that developers are running whitebox unit tests and the testing team is running blackbox tests. We don’t believe that has an impact on this analysis, but it may be skewing our perspective.

Benefits

  • Reduced costs of bugs in the field. Bugs in the field can force us into “emergency releases” to fix them. They increase the costs for internal teams who use our software and must work around the bugs. They can delay sales. Bugs cause lost customers.
  • Reduced costs of catching future bugs. When developers can run a regression suite to validate that their code didn’t break anything before asking the testing team to test it, they can prevent introducing regression bugs – and thereby avoid the costs of finding, triaging, and managing those bugs.
  • Reduced costs of developing around existing bugs. Developers can debug new code faster when they can isolate its effects from other (buggy) code.
  • Reduced costs of testing around existing bugs. There is a saying we’ve heard when testers are trying to validate a release – “What’s the bug behind the bug?” A bug is discovered, and the slack time in the schedule is used fixing that bug. The code is then resubmitted to test to confirm that the bug was fixed – and another bug was hiding behind it, untestable because the first bug obscured it. Addressing the second bug introduces unplanned testing costs. Preventing the first bug reduces the cost of testing for the latent bug.

Costs

Most of these increased costs are easy to measure once they are identified – they are straightforward tasks that can be measured as labor costs.

  • Cost of time spent creating additional tests.
  • Cost of time spent waiting for test results.
  • Cost of time spent analyzing test results.
  • Cost of time spent fixing discovered bugs.
  • Cost of incremental testing infrastructure. If we are in a situation where we have to increase our level of assets dedicated to testing (new server, database license, testing software licenses, etc) in order to increase the amount of automated testing, then this cost should be captured.
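
As a back-of-the-envelope sketch of that comparison (every number below is hypothetical – substitute your own estimates), the decision reduces to a sum on each side of the equation:

```python
# Hypothetical incremental figures per release cycle, relative to the status quo.
incremental_costs = {
    "creating additional tests":        40 * 75,   # hours of effort * loaded hourly rate
    "waiting for / analyzing results":  10 * 75,
    "fixing newly discovered bugs":     20 * 75,
    "incremental test infrastructure":  500,
}

incremental_benefits = {
    "fewer emergency releases":   1 * 4000,   # releases avoided * cost per release
    "fewer regression bugs":      6 * 600,    # bugs avoided * find/triage/manage cost
    "less debugging around bugs": 15 * 75,    # hours saved * loaded hourly rate
}

costs = sum(incremental_costs.values())
benefits = sum(incremental_benefits.values())

print(f"incremental costs:    ${costs:,}")
print(f"incremental benefits: ${benefits:,}")
print("Do more automated testing." if benefits > costs else "Hold at the current level.")
```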

Conclusion

This is a good framework for making the decision to increase automated testing. By focusing on the efficiency of our testing approaches and tools, we can reduce the costs of automated testing. This ultimately allows us to do more automated testing – shifting the Pareto-optimal point so that reducing our incremental costs increases our incremental benefits.

Foundation Series: Unit Testing of Software


What are unit tests?

monkey at keyboard

Testing software is more than just manually banging around (also called monkey testing) and trying to break different parts of the software application. Unit testing is testing a subset of the functionality of a piece of software. A unit test is different from a system test in that it provides information only about a particular subset of the software. In our previous Foundation series post on black box and white box testing, we used the inspections that come bundled with an oil change as examples of unit tests.

Unit tests don’t show us the whole picture.

A unit test only tells us one specific piece of information. When working with a client whose company makes telephone switches, and whose internal software development team did not use unit tests, we discussed the following analogy.

Unit tests let us see very specific information, but not all of the information. Unit tests might show us the following:

bell

A bell that makes a nice sound when ringing.

dial

A dial that lets us enter numbers.
horn

A horn that lets us listen to information.

We learn a lot about the system from these “pictures” that the unit tests give us, but we don’t learn everything about the system.

phone

We knew (ahead of time) that we were inspecting a phone, and with our “unit tests” we now know that we can dial a phone number, listen to the person on the other end of the line, and hear when the phone is ringing. Since we know about phones, we realize that we aren’t “testing” everything. We don’t know if the phone can process sounds originating at our end. We don’t know if the phone will transmit signals back and forth to other phones. We don’t know if it is attached to the wall in a sturdy fashion.

Unit testing doesn’t seem like such a good idea – there’s so much we need to know that these unit tests don’t tell us. There are two approaches we can take. The first is to combine our unit tests with system tests, which inspect the entire system – also called end-to-end tests. The second is to create enough unit tests to inspect all of the important aspects. With enough unit tests, we can characterize the system (and know that it is a working phone that meets all of our requirements).

old phone with unit tests

Software developers can identify which parts of their software need to be tested. In fact, this is a key principle of test-driven development (TDD) – identify the tests, then write the code. When the tests pass, the code is done.
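
As a minimal illustration of that order of operations – the dial_digit function and its rules are invented for this example – a developer would write tests like these first, watch them fail, and then write just enough code to make them pass:

```python
import unittest

def dial_digit(current_number: str, digit: str) -> str:
    """Append a dialed digit; numbers in this example are capped at 10 digits."""
    if len(digit) != 1 or not digit.isdigit():
        raise ValueError("a single digit 0-9 is required")
    if len(current_number) >= 10:
        raise ValueError("number is already complete")
    return current_number + digit

class DialDigitTest(unittest.TestCase):
    # In TDD these tests exist (and fail) before dial_digit is written;
    # the implementation above is then written to make them pass.
    def test_appends_digit(self):
        self.assertEqual(dial_digit("512", "5"), "5125")

    def test_rejects_non_digit(self):
        with self.assertRaises(ValueError):
            dial_digit("512", "x")

    def test_rejects_overlong_number(self):
        with self.assertRaises(ValueError):
            dial_digit("5125551234", "9")

if __name__ == "__main__":
    unittest.main()
```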

Why not use system tests?

The system test inspects (or at least exercises) everything in the software. It gives us a big picture view. Ultimately, our stakeholders care about one thing – does the software work? And for them, that means everything has to work. The intuitive way to test, then, is to have tests that test everything. System testing is also known as functional testing.
old phone

These comprehensive tests tell us everything we want to know. Why don’t we use them?

There is a downside to system testing. In the long run, it’s more expensive than unit testing. But the right way to approach continuous integration is to do both kinds of testing.

In our Software testing series post on blackbox and whitebox testing we discuss several tradeoffs associated with the different types of testing. For most organizations, the best answer is to do both kinds of testing – do some of each. This is known as greybox testing, or grey box testing.

System tests are more expensive, because they are more brittle and require more maintenance effort to keep the tests running. The more your software changes, the faster these costs add up. Furthermore, with Agile practices, where portions of the system are built and tested incrementally, with changes along the way, system tests can be debilitatingly expensive to maintain.

Because unit tests only inspect a subset of the software, they only incur maintenance costs when that subset is modified. Unit testing is done by the developers, who write tests to assure that sections of the software behave as designed. This is different from functional testing, which assures that the overall software meets the requirements.

There are more articles on software testing in our software testing series.
– – –

Check out the index of the Foundation series posts for other introductory articles.

Software Testing Series: Black Box vs White Box Testing


Should I use black box testing or white box testing for my software?

You will hear three answers to this question – black, white, and gray. We recently published a foundation series post on black box and white box testing – which serves as a good background document. We also mention greybox (or gray box) testing as a layered approach to combining both disciplines.

Given those definitions, let’s look at the pros and cons of each style of testing.

Black box software testing


pros

  • The focus is on the goals of the software, with a requirements-validation approach to testing (thanks, Roger, for pointing that out on the previous post). These tests are most commonly used for functional testing.
  • Easier to staff a team. We don’t need software developers or other experts to perform these tests (note: expertise is required to identify which tests to run, etc). Manual testers are also easier to find at lower rates than developers – presenting an opportunity to save money, or test more, or both.

cons

  • Higher maintenance cost with automated testing. Application changes tend to break black-box tests, because of their reliance on the constancy of the interface.
  • Redundancy of tests. Without insight into the implementation, the same code paths can get tested repeatedly, while others are not tested at all.

White box software testing


pros

  • More efficient automated testing. Unit tests can be defined that isolate particular areas of the code and test them independently. This enables faster test suite processing.
  • More efficient debugging of problems. When a regression error is introduced during development, the source of the error can be more efficiently found – the tests that identify an error are closely related (or directly tied) to the troublesome code. This reduces the effort required to find the bug.
  • A key component of TDD. Test driven development (an Agile practice) depends upon the creation of tests during the development process – implicitly dependent upon knowledge of the implementation. Unit tests are also a critical element for continuous integration.

cons

  • Harder to use to validate requirements. White box tests incorporate (and often focus on) how something is implemented, not why it is implemented. Since product requirements express “full system” outputs, black box tests are better suited to validating requirements. However, careful white box tests can be designed to test requirements as well.
  • Hard to catch misinterpretation of requirements. Developers read the requirements. They also design the tests. If they implement the wrong idea in the code because the requirement is ambiguous, the white box test will also check for the wrong thing. Specifically, the developers risk testing that the wrong requirement is properly implemented.
  • Hard to test unpredictable behavior. Users will do the strangest things. If they aren’t anticipated, a white box test won’t catch them. I recently saw this with a client, where a bug only showed up if the user visited all of the pages in an application (effectively caching them) before going back to the first screen to enter values in the controls.
  • Requires more expertise and training. Before someone can run tests that utilize knowledge of the implementation, that person needs to learn about how the software is implemented.

Which testing approach should we use?

There is also the concept of gray box testing, or layered testing – using both black box and white box techniques to balance the pros and cons for a project. We have seen this approach work very effectively for larger teams. Developers use white box tests to prevent submitting bugs to the testing team, which uses black box tests to validate that requirements have been met (and to perform system-level testing). This approach also allows for a mixture of manual and automated testing. Any continuous integration strategy should utilize both forms of testing.
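
Here is a minimal sketch of what that layering can look like in practice (the pricing functions and the tests are hypothetical): the developer’s white box test reaches into an internal helper, while the tester’s black box test exercises only the public behavior described by a requirement.

```python
import unittest

# --- hypothetical system under test ---------------------------------------
def _normalize(price: float) -> float:
    """Internal helper: round to cents (an implementation detail)."""
    return round(price, 2)

def total_with_tax(prices, tax_rate=0.08):
    """Public entry point: the behavior the requirements (and black box tests) see."""
    return _normalize(sum(prices) * (1 + tax_rate))

# --- white box test: developers target internal pieces of the code --------
class WhiteBoxTests(unittest.TestCase):
    def test_normalize_rounds_to_cents(self):
        self.assertEqual(_normalize(3.14159), 3.14)

# --- black box test: testers exercise only the public behavior ------------
class BlackBoxTests(unittest.TestCase):
    def test_total_matches_requirement(self):
        # Requirement: a $10.00 basket is billed at $10.80 with 8% tax.
        self.assertEqual(total_with_tax([4.00, 6.00]), 10.80)

if __name__ == "__main__":
    unittest.main()
```

When application internals change, only the white box tests need to follow; when the public interface changes, the black box tests break – which is exactly the maintenance tradeoff described above.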
Weekend reading (links with more links warning):

White box vs. black box testing by Grig Gheorghiu. Includes links to a debate and examples.

Black box testing by Steve Rowe.

A case study of effective black box testing from the Agile Testing blog.

Benefits of automated testing from the Quality Assurance and Automated Testing blog.

What book should I read to learn more?

Software Testing, by Ron Patton (the eBook version, which is cheaper).

Here’s a review from Randy Rice, “Software Testing Consultant & Trainer” (Oklahoma City, OK):

Software Testing is a book oriented toward people just entering or considering the testing field, although there are nuggets of information that even seasoned professionals will find helpful. Perhaps the greatest value of this book would be as a resource for test team leaders to give to their new testers or test interns. To date, I haven’t seen a book that gives a better introduction to software testing with this amount of coverage. Ron Patton has written this book at a very understandable level and gives practical examples of every test type he discusses in the book. Plus, Patton uses examples that are accessible to most people, such as basic Windows utilities.

I like the simplicity and practicality of this book. There are no complex formulas or processes to confuse the reader who may be getting into testing for the first time. However, the importance of process is discussed. I also have to say a big THANK YOU to Ron Patton for drawing the distinction between QA and testing! Finally, the breadth of coverage in Software Testing is super. Patton covers not only the most important topics, such as basic functional testing, but also attribute testing, such as usability and compatibility. He also covers web-based testing and test automation – and, as in all topics covered in the book, Patton knew when to stop. If you want to drill deeper on any of the topics in this book, there are other fine books that can take you there!

I love this book because it is practical, gives a good introduction to software testing, and has some things that even experienced testers will find of interest. This book is also a tool to communicate what testing and QA are all about. This is something that test organizations need as they take the message to management, developers, and users. No test library should be without a copy of Software Testing by Ron Patton!

– – –

Check out the index of software testing series posts for more articles.