Software testing series: Pairwise testing

Before we explain pairwise testing, let’s describe the problem it solves.

Very large and complex systems can be difficult and expensive to test. We often inherit legacy systems with multiple man-years of development effort already in place. These systems are in the field and of unknown quality, and there are frequently huge gaps in their requirements documentation. On many projects, we’re called in precisely because there is a quality problem. Pairwise testing provides a way to test these large, existing systems.

We are faced with the challenge of quickly improving, or at least quickly demonstrating momentum and improvement in the quality of this existing software. We may not have the time to go re-gather the requirements, document them, and validate them through testing before our sponsor pulls the plug (or gets fired). We’re therefore faced with the need to approach the problem with blackbox (or black box) testing techniques.

For a complex system, the amount of testing required can be overwhelming. Imagine a product with 20 controls in the user interface, each of which has 5 possible values. We would have to test 5^20 different combinations (95,367,431,640,625) to cover every possible set of user inputs.
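The arithmetic is easy to check. The short Python sketch below (the variable names are ours, chosen for illustration) computes the exhaustive count for this example and, for contrast, the much smaller number of distinct value pairs that exist across the same 20 controls.

```python
from math import comb

controls = 20            # number of UI controls in the example
values_per_control = 5   # possible values for each control

# Exhaustive testing: every control can take any value independently.
exhaustive = values_per_control ** controls

# Number of ways to pick two different controls.
control_pairs = comb(controls, 2)

# Every pair of controls, times every combination of their two values.
value_pairs = control_pairs * values_per_control ** 2

print(f"{exhaustive:,}")   # 95,367,431,640,625
print(control_pairs)       # 190
print(value_pairs)         # 4750
```

A single test sets all 20 controls at once, so it exercises 190 of those 4,750 value pairs. That is why a small suite can cover them. No suite can do so with fewer than 25 tests, because any two controls alone have 25 value combinations.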

The power of pairwise

With pairwise testing, we can achieve on the order of 90% coverage of our code in this example with 54 tests! The exact amount of coverage will vary from application to application, but analysis consistently puts the value in the neighborhood of 90%. The following are some results from pairwise.org.

We measured the coverage of combinatorial design test sets for 10 Unix commands: basename, cb, comm, crypt, sleep, sort, touch, tty, uniq, and wc. […] The pairwise tests gave over 90 percent block coverage.

Our initial trial of this was on a subset [of] Nortel’s internal e-mail system where we [were] able [to] cover 97% of branches with less than 100 valid and invalid testcases, as opposed to 27 trillion exhaustive testcases.

[…] a set of 29 pair-wise AETG tests gave 90% block coverage for the UNIX sort command. We also compared pair-wise testing with random input testing and found that pair-wise testing gave better coverage.

Got our attention!

How does pairwise testing work?

Pairwise testing builds upon an understanding of the way bugs manifest in software. Usually, a bug is caused not by a single variable, but by a particular combination of two variables. For example, imagine a control that calculates and displays shipping charges in an eCommerce website. The website also calculates taxes for shipped products (when there is a store in the same state as the recipient, sales taxes are charged; otherwise, they are not). Both controls were implemented and tested and work great. However, when shipping to a customer in a state that charges taxes, the shipping calculation is incorrect. It is the interplay of the two variables that causes the bug to manifest.

If we test every unique combination of every pair of variables in the application, we will uncover all of these bugs. Studies have shown that the overwhelming majority of bugs are caused by the interplay of two variables. We can increase the number of combinations to look at every three, four, or more variables as well – this is called N-wise testing. Pairwise testing is N-wise testing where N=2.
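To make "every unique combination of every pair of variables" concrete, here is a minimal Python sketch of a naive greedy generator. It is not how jenny or any other real tool works; the function names (`all_pairs`, `pairs_in`, `greedy_pairwise`) and the `candidates` and `seed` parameters are our own, chosen for illustration. It lists every value pair that must be covered, then keeps adding the random candidate test that covers the most pairs still missing.

```python
from itertools import combinations, product
import random

def all_pairs(sizes):
    """Every (param_i, value_i, param_j, value_j) combination to be covered."""
    return {
        (i, a, j, b)
        for i, j in combinations(range(len(sizes)), 2)
        for a in range(sizes[i])
        for b in range(sizes[j])
    }

def pairs_in(test):
    """The value pairs that one complete test exercises."""
    return {(i, test[i], j, test[j])
            for i, j in combinations(range(len(test)), 2)}

def greedy_pairwise(sizes, candidates=50, seed=0):
    """Naive greedy generator (illustrative only, not an optimal algorithm)."""
    rng = random.Random(seed)
    uncovered = all_pairs(sizes)
    suite = []
    while uncovered:
        # Force one still-uncovered pair into every candidate so that
        # each round is guaranteed to make progress.
        i, a, j, b = next(iter(uncovered))
        pool = []
        for _ in range(candidates):
            t = [rng.randrange(n) for n in sizes]
            t[i], t[j] = a, b
            pool.append(tuple(t))
        # Keep the candidate that covers the most missing pairs.
        best = max(pool, key=lambda t: len(pairs_in(t) & uncovered))
        suite.append(best)
        uncovered -= pairs_in(best)
    return suite

sizes = [3, 3, 3, 3]                 # four parameters, three values each
suite = greedy_pairwise(sizes)
exhaustive = len(list(product(*map(range, sizes))))
print(len(suite), "pairwise tests instead of", exhaustive, "exhaustive tests")
```

Real generators use much better search strategies and usually produce smaller suites, but the goal is the same: every value pair appears in at least one test.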

How do we determine the set of tests to run?

There are several commercial and free software packages that will calculate the required pairwise test suite for a given set of variables, and some that will calculate N-wise tests as well. Our favorite is a public domain (free) software package called jenny, written by Bob Jenkins. jenny will calculate N-wise test suites, and its default mode is to calculate pairwise tests. jenny is a command line tool, written in C, and is very easy to use. To calculate the pairwise tests for our example (20 controls, each with 5 possible inputs), we simply type the following:

jenny 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 > output.txt

And jenny generates results that look like the following:

1a 2d 3c 4d 5c 6b 7c 8c 9a 10c 11b 12e 13b 14d 15a 16c 17a 18d 19a 20e
1b 2e 3a 4a 5d 6c 7b 8e 9d 10a 11e 12d 13c 14c 15c 16e 17c 18a 19d 20d
1c 2b 3e 4b 5e 6a 7a 8d 9e 10d 11d 12a 13e 14e 15b 16b 17e 18e 19b 20c
1d 2a 3d 4c 5a 6d 7d 8b 9b 10e 11c 12b 13d 14b 15d 16d 17d 18b 19e 20a
1e 2c 3b 4e 5b 6e 7e 8a 9c 10b 11a 12c 13a 14a 15e 16a 17b 18c 19c 20b
1a 2a 3c 4e 5e 6a 7b 8c 9d 10b 11b 12b 13e 14a 15d 16d 17c 18c 19b 20d […]

Where the numbers represent each of the 20 controls, and the letters represent each of the five possible selections.
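Output in this form is easy to post-process. The Python sketch below parses the first five lines shown above and counts how many distinct value pairs they already exercise. The token format (a control number followed by one letter) is assumed from the sample output, and `parse_line` is our own helper, not part of jenny.

```python
from itertools import combinations

# First five lines of the sample output shown above.
sample = """\
1a 2d 3c 4d 5c 6b 7c 8c 9a 10c 11b 12e 13b 14d 15a 16c 17a 18d 19a 20e
1b 2e 3a 4a 5d 6c 7b 8e 9d 10a 11e 12d 13c 14c 15c 16e 17c 18a 19d 20d
1c 2b 3e 4b 5e 6a 7a 8d 9e 10d 11d 12a 13e 14e 15b 16b 17e 18e 19b 20c
1d 2a 3d 4c 5a 6d 7d 8b 9b 10e 11c 12b 13d 14b 15d 16d 17d 18b 19e 20a
1e 2c 3b 4e 5b 6e 7e 8a 9c 10b 11a 12c 13a 14a 15e 16a 17b 18c 19c 20b
"""

def parse_line(line):
    """Turn '1a 2d 3c' into {1: 'a', 2: 'd', 3: 'c'}."""
    test = {}
    for token in line.split():
        # Assumed format: digits for the control, then a single letter.
        test[int(token[:-1])] = token[-1]
    return test

tests = [parse_line(line) for line in sample.splitlines() if line.strip()]

# Collect every (control, value, control, value) pair seen in any test.
covered = {
    (i, t[i], j, t[j])
    for t in tests
    for i, j in combinations(sorted(t), 2)
}

total = 190 * 25   # pairs of controls times pairs of values
print(len(tests), "tests cover", len(covered), "of", total, "value pairs")
```

Running the same check over the complete output file is a quick way to confirm that every value pair appears in at least one test before the suite is handed to the test team.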

What’s the catch?

There are two obvious catches. First, when we use a tool like jenny, we must run all of the tests that it identifies; we can’t pick and choose. Second, pairwise testing doesn’t find everything. What if our earlier example bug about taxes and shipping only manifested when the user is a first-time customer? Pairwise testing would not be guaranteed to catch it. We would need to use N-wise testing with N >= 3. Our experience has been that N=3 is effective for almost all bugs.
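A tiny example shows how a three-variable bug can slip through. In the Python sketch below, the three yes/no variables (taxable state, shipping charged, first-time customer) are a hypothetical encoding of the scenario above, and the four-test suite is one we wrote by hand. The suite covers every pair of values, yet never runs the one combination where all three conditions are true.

```python
from itertools import combinations, product

# Hypothetical encoding: (taxable_state, shipping_charged, first_time_customer)
suite = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

def covered(suite, n):
    """All n-way value combinations exercised by the suite."""
    return {
        (cols, tuple(t[c] for c in cols))
        for t in suite
        for cols in combinations(range(3), n)
    }

def required(n):
    """All n-way value combinations that exist for three yes/no variables."""
    return {
        (cols, vals)
        for cols in combinations(range(3), n)
        for vals in product((0, 1), repeat=n)
    }

pairwise_complete = required(2) <= covered(suite, 2)
missed_triples = required(3) - covered(suite, 3)
bug_trigger = (1, 1, 1)   # taxable state, shipping charged, first-time customer

print(pairwise_complete)        # True: every pair of values is covered
print(len(missed_triples))      # 4 of the 8 three-way combinations are missing
print(bug_trigger in suite)     # False: the bug would never be exercised
```

A pairwise suite may still hit such a combination by luck, but only a suite built for N=3 guarantees it.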

There is also a sneaky catch – test generators like jenny assume that the order of variables is irrelevant. Sometimes we are testing dynamic user interfaces, where the order of value selection in controls is relevant. There is a solution to this, and we will update this post with a link to that solution when it is available.

– – –

  • Scott Sehlhorst

    Scott Sehlhorst is a product management and strategy consultant with over 30 years of experience in engineering, software development, and business. Scott founded Tyner Blain in 2005 to focus on helping companies, teams, and product managers build better products. Follow him on LinkedIn, and connect to see how Scott can help your organization.

11 thoughts on “Software testing series: Pairwise testing”

  1. Chris, thanks for commenting and welcome to Tyner Blain!

    With four years of PWT experience, do you find that N-wise is too much effort (marginal benefit over pairwise testing)?

  2. I need a PowerPoint presentation on pairwise testing related to the computer field.
    Yours is a good web page, but it is not related to my requirement.

  5. Hi Scott, I read the article and it gave me quite an insight. Could you give me some suggestions on how to do pairwise testing on a control flow graph vertically, where the nodes are different components and a link between them represents a dependency?

    Thank you in advance.

    1. Thanks Anvika,

      I apologize, but I don’t know that I know enough about what you’re asking. Can you explain – or link to an example of a control flow graph? I might know what it is, but by a different name.

  6. Hello Scott,

    What is the relationship between the number of test cases and the parameters, their cardinality, and the order of combination (strength) in pairwise testing? I am using Microsoft PICT to generate pairwise tests. For 4 parameters with cardinalities of 5, 4, 4, 4, it generates 24 test cases for a test strength of 2 (pairwise). For a test strength of 3, it generates 115 tests. How do I mathematically calculate the number of tests it will generate?

    Thanks
