Foundation Series: Basic PERT Estimate Tutorial

PERT = Program Evaluation and Review Technique

PERT is a technique for estimating how long it will take to complete tasks. We often estimate, or scope, the amount of time a task or set of tasks will take. PERT lets us provide not only an estimate, but also a measure of how good that estimate is. Good estimates are a critical element in any software planning strategy. In this post, we introduce PERT, explain how it works, and show how to interpret PERT estimates.

Multiple estimates

Imagine we’ve been asked to predict how long it will take us to go to the store and get some vanilla ice cream. “It depends,” we think. If the traffic is light, and there is a short line at the checkout counter, 15 minutes. If traffic is average and we have to wait behind several people at the checkout line, 30 minutes. If there’s an accident on the way and traffic is backed up, it could take an hour.

We may not realize it, but we just created a PERT estimate.

Definition of PERT

A PERT estimate uses three estimates for any given task to provide both an expected duration for the task and an understanding of how much we might be off in our estimate. To create a PERT estimate, we first create three separate estimates of the time it will take to complete the task.

Optimistic, or best-case scenario. 15 minutes to get ice cream, if everything goes right.
Likely scenario. 30 minutes is how long we think it will probably take to get the ice cream.
Pessimistic, or worst-case scenario. 60 minutes if everything bad that could reasonably happen happens.

Next, we combine these estimates to create a single number that best predicts how long it will take. We create a weighted average of the three values, counting the “likely” estimate four times as heavily as either the optimistic or the pessimistic estimate. The result is the mean estimate.
mean = (optimistic + (4 * likely) + pessimistic) / 6

The PERT mean estimate for our ice cream delivery is (15 + (4 * 30) + 60) / 6 = 32.5 minutes. We often see PERT estimates documented as all three values, 15/30/60.
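
As a quick check of the arithmetic, here is a minimal Python sketch of the mean formula. The function name pert_mean is our own label for this illustration, not part of any standard library.

```python
def pert_mean(optimistic: float, likely: float, pessimistic: float) -> float:
    """Weighted average that counts the likely estimate four times."""
    return (optimistic + 4 * likely + pessimistic) / 6


# The ice cream trip from the article: 15/30/60 minutes.
ice_cream_mean = pert_mean(15, 30, 60)
print(ice_cream_mean)  # 32.5
```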

Basic PERT interpretation

A PERT estimate includes an approximation of standard deviation (stdev) based on the optimistic and pessimistic values. Stdev provides us with insight into the shape of the probability curve represented by the estimates.

stdev = (pessimistic - optimistic) / 6

The PERT stdev for our ice cream delivery is (60 - 15) / 6 = 7.5 minutes.
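
The same calculation as a small Python sketch. The name pert_stdev is a hypothetical helper chosen for this example.

```python
def pert_stdev(optimistic: float, pessimistic: float) -> float:
    """Approximate standard deviation: one sixth of the optimistic-to-pessimistic range."""
    return (pessimistic - optimistic) / 6


# The ice cream trip from the article: optimistic 15, pessimistic 60.
ice_cream_stdev = pert_stdev(15, 60)
print(ice_cream_stdev)  # 7.5
```

Note that the likely value plays no part in this approximation. Only the width of the range matters.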

The PERT estimate is a representation of a beta distribution. The math is pretty complex, and applying it to a single estimate can give us a false sense of precision. Remember that the numbers we put together in the first place are our best guesses, so doing more precise math on rough estimates doesn’t give us a more precise PERT estimate, just more math.

The beta distribution is very similar to a normal distribution (the familiar bell curve), when it is balanced. This similarity is used to simplify the application of PERT. The standard simplification in using PERT is to treat estimates as if they represented the following probabilities:

Optimistic: 5% of the time we will complete our task in less than the optimistic time estimate.
Likely: 50% of the time we will complete our task in less time than the likely time estimate.
Pessimistic: 95% of the time we will complete our task in less than the pessimistic time estimate.

More advanced PERT analyses

What happens when we want to combine several PERT estimates for our project? Imagine we had to make two trips to buy ice cream, and we want an estimate of how much total time it would take.

Here’s the wrong solution. Add the two PERT estimates. 15/30/60 + 15/30/60 = 30/60/120 with a mean of 65 minutes and a stdev of 15 minutes. This would say that there is a 90% chance that we will spend between 30 and 120 minutes getting ice cream.

Again, without much math, there’s an intuitive reason why this is wrong. We are not taking into account the likelihood that our two trips will take different amounts of time. If we hit bad traffic on one trip, that doesn’t mean we are likely to hit it on the second trip as well. If there is 1 chance in 20 that we hit our pessimistic estimate, then there is only 1 chance in 400 that it would happen twice. So really, there is a 99.5% chance that we would have between 30 and 120 minutes of total travel. We can narrow down this estimate a lot.
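
The arithmetic behind those odds can be sketched in a few lines of Python. This follows the reasoning in the paragraph above and assumes the two trips are independent, and that the optimistic extreme is just as unlikely as the pessimistic one. The variable names are ours.

```python
from fractions import Fraction

# Assumption from the text: 1 chance in 20 of hitting an extreme on a single trip.
p_extreme = Fraction(1, 20)

# Independent trips: the chance of hitting the same extreme twice in a row.
p_twice = p_extreme * p_extreme

# Two tails (both trips optimistic, or both trips pessimistic) fall at the edges.
p_inside = 1 - 2 * p_twice

print(p_twice)          # 1/400
print(float(p_inside))  # 0.995
```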

To combine the two estimates, we combine the underlying probability distributions that they represent. With the PERT approximations, we would do this as follows:

The combined mean is the sum of the two old means. 32.5 + 32.5 = 65 minutes.
The combined stdev is the square root of the sum of the squares of the two old stdevs. SQRT(7.5^2 + 7.5^2) = 10.6.
The combined optimistic estimate is the mean minus (3 * stdev). Optimistic = 65 - (3 * 10.6) = 33 minutes.
The combined pessimistic estimate is the mean plus (3 * stdev). Pessimistic = 65 + (3 * 10.6) = 97 minutes.
Our combined PERT estimate is 33/65/97 with a mean of 65 minutes and a stdev of 10.6 minutes.
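
The steps above can be sketched in Python as follows. This is an illustration under the same assumptions as the text (independent tasks, the 1+4+1 weighting, and a range of plus or minus three stdevs). The class and function names are made up for this example.

```python
import math
from typing import NamedTuple


class PertEstimate(NamedTuple):
    optimistic: float
    likely: float
    pessimistic: float

    @property
    def mean(self) -> float:
        return (self.optimistic + 4 * self.likely + self.pessimistic) / 6

    @property
    def stdev(self) -> float:
        return (self.pessimistic - self.optimistic) / 6


def combine(estimates):
    """Return (optimistic, mean, pessimistic, stdev) for a set of independent tasks."""
    mean = sum(e.mean for e in estimates)
    # Standard deviations add as the square root of the sum of the squares.
    stdev = math.sqrt(sum(e.stdev ** 2 for e in estimates))
    return mean - 3 * stdev, mean, mean + 3 * stdev, stdev


trip = PertEstimate(15, 30, 60)
low, mean, high, stdev = combine([trip, trip])
print(f"{low:.0f}/{mean:.0f}/{high:.0f}, stdev {stdev:.1f}")  # 33/65/97, stdev 10.6
```

Because combine takes a list, the same sketch works for tasks with different estimates, not just two identical trips.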

This combined window is narrower than the “just double everything” approach. It reflects that the law of averages will narrow the interval for our estimate as we combine more tasks, because some tasks take longer than estimated and some take less time. It all averages out in the end, as long as the estimates are good to begin with.

Beyond PERT

One question that always comes up when planning is how you created your estimates.
Managing releases with timeboxes.
Updating an existing release schedule with requirements changes.

Summary

We have enough information to know how to create a PERT estimate for a single task. We also know how to combine those PERT estimates to provide a multi-task estimate. There is a lot more we could cover, such as showing the impact of this statistical technique on critical path analysis, or incorporating an approximation of the correlation of estimates (if this task takes longer than estimated, then that task will probably take longer than we estimated). That’s too much detail for this introductory article – just know that it’s out there. What we cover here represents what the vast majority of project managers understand about PERT (and maybe a bit more).

– – –

Check out the index of the Foundation Series posts which will be updated whenever new posts are added.

Scott Sehlhorst is a product management and strategy consultant with over 30 years of experience in engineering, software development, and business. Scott founded Tyner Blain in 2005 to focus on helping companies, teams, and product managers build better products. Follow him on LinkedIn, and connect to see how Scott can help your organization.

13 thoughts on “Foundation Series: Basic PERT Estimate Tutorial”

Juan Carlos says:

May 7, 2008 at 9:00 am

PERT is a good option for estimating. However, there is a specific PERT formula for estimating software projects:

(optimistic + (3 * likely) + (pessimistic*2)) / 6
Source: Estimating Software-Intensive Systems

I think that we should use PERT to estimate work items, but not the core of the project. That means you can estimate meetings, documentation, general tasks, etc.

Scott Sehlhorst says:

May 7, 2008 at 10:02 pm

Hey Juan Carlos (or is it just Juan?),

Thanks for commenting on this estimation article too. Now all you need to do is join the conversation on our use-case-points estimation series :).

That formula for a PERT estimate is interesting – it biases the pessimistic numbers. Essentially, instead of saying “this is what we think it should be” with a beta distribution, it hides the “we think we’re wrong” part within the math.

I suspect that you lose the ability to quantify the probability of delivering at a point in time when you do this. I prefer to use the “pure” PERT estimate of 1+4+1, where I know the math works. Then, I recognize that my calculation of the standard deviation is also correct. When I want to be conservative, I project the PERT estimate plus two standard deviations.

This is, in my opinion, a better approach, because it is more transparent, and easier to follow during a review. It doesn’t “pollute” the original PERT number by baking the conservative view into the original numbers.
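
To make the comparison concrete, here is a rough Python sketch that applies both approaches to the 15/30/60 ice cream numbers from the article. The function names are labels invented for this illustration. In particular, pessimistic_weighted is just our name for the 1+3+2 formula quoted above.

```python
def pert_mean(o: float, l: float, p: float) -> float:
    """Standard 1+4+1 weighting."""
    return (o + 4 * l + p) / 6


def pert_stdev(o: float, p: float) -> float:
    return (p - o) / 6


def pessimistic_weighted(o: float, l: float, p: float) -> float:
    """The 1+3+2 weighting quoted in the comment above."""
    return (o + 3 * l + 2 * p) / 6


o, l, p = 15, 30, 60
weighted = pessimistic_weighted(o, l, p)
conservative = pert_mean(o, l, p) + 2 * pert_stdev(o, p)

print(weighted)      # 37.5
print(conservative)  # 47.5
```

With the second approach, the mean (32.5) and the added margin (two stdevs, or 15 minutes) stay visible as separate numbers during a review.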

David Paul says:

June 5, 2009 at 10:50 am

How would you take into account the fact that you get better or more proficient at driving to the store the 2nd and subsequent times you go? I know this may not make sense with the “store” analogy (unless you consider short cuts you might discover or speed traps you learn about, etc), but with software development, it is a reality.

Also, what’s the math for estimating X trips to the store? If 1 trip is PERT:15/30/60, then what will 15 trips be?

1. Scott Sehlhorst says:
  
  June 15, 2009 at 11:36 am
  
  Hey David, thanks for the question. Generally speaking, as you learn, you make better (tighter) estimates. For example, you could go from 15/30/60 to 20/30/45. Each range is your assessment of uncertainties. Ideally, you would also get “better” at something while also getting more accurate – going from 15/30/60 to 15/25/40.
  
  For estimating X trips, there are a few answers, depending on how you’re asking the question. If you’re wondering what type of “estimation learning” to bake into your estimates, I’d be inclined to say “none.” While you will get better at estimating as time goes on, you also have increased uncertainty the further you look into the future. Your ability to accurately predict 15/30/60 right now is higher than to predict 15/30/60 six months from now. I don’t know which factor will dominate, so I’d be inclined to ignore both factors, assuming that they cancel each other out.
  
Scott Sehlhorst says:

June 18, 2009 at 10:59 pm

To read more, check out the follow-up article I just wrote: Advanced PERT Estimation.

It addresses how to aggregate multiple PERT estimates at the task level into a PERT estimate for a larger item.
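
For the 15-trip question asked above, a minimal Python sketch using the combination rule from this article might look like the following. It assumes every trip is an independent 15/30/60 estimate with no learning effect, and it uses the article's range of plus or minus three stdevs. The function name is hypothetical.

```python
import math


def combine_identical(optimistic: float, likely: float, pessimistic: float, count: int):
    """Combine `count` independent tasks that all share the same PERT estimate."""
    mean_one = (optimistic + 4 * likely + pessimistic) / 6
    stdev_one = (pessimistic - optimistic) / 6
    mean = count * mean_one
    # sqrt(count * stdev_one^2) simplifies to sqrt(count) * stdev_one.
    stdev = math.sqrt(count) * stdev_one
    return mean, stdev


mean, stdev = combine_identical(15, 30, 60, 15)
low, high = mean - 3 * stdev, mean + 3 * stdev
print(f"mean {mean:.1f} min, stdev {stdev:.1f} min")  # mean 487.5 min, stdev 29.0 min
print(f"range {low:.0f} to {high:.0f} min")           # range 400 to 575 min
```

The mean grows in proportion to the number of trips, while the stdev grows only with its square root, so the range becomes narrower relative to the total.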

beni says:

February 5, 2012 at 9:54 pm

Thank you for the information you provided in the blog. I just want to say that the square root of 7.5 is 2.74 and the standard deviation for the multi task in the ice cream example will not be 10.6.
SQRT(7.5^2 + 7.5^2) = 10.6.
(2.74+2.74) = 5.48

1. Scott Sehlhorst says:
  
  February 6, 2012 at 10:04 am
  
  Hey beni, thanks for the comment, and welcome to Tyner Blain!
  
  I agree that the square root of 7.5 is 2.74, and that the sum of the square roots would be 5.48.
  
  However, the math we’re going for here is the square root of the sum of the squares. You have to square 7.5 (to get 56.25) and add those up – yielding 112.5. Then you take the square root of 112.5 to get 10.6.
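
The difference between the two calculations is easy to check with a couple of lines of Python (the variable names are just for this illustration):

```python
import math

stdev = 7.5

# Sum of the square roots: not what the combination rule calls for.
sum_of_roots = math.sqrt(stdev) + math.sqrt(stdev)

# Square root of the sum of the squares: the rule used in the article.
root_sum_squares = math.sqrt(stdev ** 2 + stdev ** 2)

print(f"{sum_of_roots:.2f}")      # 5.48
print(f"{root_sum_squares:.1f}")  # 10.6
```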
  
  I hope that helps!
  
  Scott
  