Outside reading: correlation and causality


A while ago, we asked you to send us links to good blogs. Jeff Kinsey sent us a link to his blog, Ski’s throughput on command, where we found a good post on logical thinking processes. Thanks, Ski, for sending us the link!

His post discusses the difference between correlation and causality.

From the article:

Without a means to apply critical thinking to specific situations, people can only resort to unstructured intuition – “gut feelings.” The problem with intuition is that without a means to structure cause and effect, it’s difficult for people to differentiate correlation from cause and effect. 

Correlation is the occurrence of two phenomena in close time proximity. Cause and effect is the relationship between two phenomena in which one can be demonstrably verified as the cause and the other the effect. The problem arises when correlation is assumed to be cause and effect, when in reality it is not.

This is a great point.

A tangible example of critical thinking, correlation and causality applied to software product success

We were recently asked to pull together a set of operational statistics for a development team, to look for causal or correlating relationships with business processes. The goal was to give a non-technical manager insight into the development process and to propose changes to that process. One dimension we were asked to study was the number of lines of code in a set of applications, so that we could compare the size of the applications with the cost of maintaining them. We looked at about two dozen applications ranging from 10,000 to 250,000 lines of code. We also looked at several other easy-to-measure metrics, like the number of controls in the user interface and the number of classes in the code.

We found some correlation in the data (more code means more work, as our intuition tells us), but we could not find any causality. The two largest applications (in lines of code and in the other low-level quantitative metrics) had very different maintenance burdens: one was the most expensive to maintain, and the other was in the middle of the pack.
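
To make the distinction concrete, here is a minimal sketch of the kind of calculation involved. The application names and numbers below are invented purely for illustration; they are not the client's data. The sketch computes a Pearson correlation coefficient between size and maintenance cost, then shows that the two largest applications can still have very different costs.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL data: (name, thousands of lines of code, relative maintenance cost).
apps = [
    ("A", 10, 20),
    ("B", 40, 35),
    ("C", 80, 65),
    ("D", 120, 80),
    ("E", 240, 60),   # large, recently written with a planned architecture
    ("F", 250, 200),  # large, grown organically, heavily used
]

sizes = [kloc for _, kloc, _ in apps]
costs = [cost for _, _, cost in apps]

r = pearson(sizes, costs)
print(f"correlation between size and cost: {r:.2f}")

# The two largest applications are nearly the same size,
# yet their maintenance costs differ by a wide margin.
largest_two = sorted(apps, key=lambda app: app[1])[-2:]
for name, kloc, cost in largest_two:
    print(f"app {name}: {kloc}k lines, cost {cost}")
```

A clearly positive coefficient says only that size and cost tend to move together across the sample. It says nothing about why, and it does not explain the gap between the two largest applications.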

What was different between the two applications?

  • One has grown organically over ten years; the other was written entirely within the last two years, with a pre-planned architecture.
  • One has significantly more usage (10x) than the other, with correspondingly more bug-discovery and feature-request activity.

We also recognized (more intuition) that there were identifiable but not easily measurable factors that completely overwhelm the statistics: the quality of the requirements being given to the two teams, and the level of experience of the development teams.

Ultimately, we had to present the following messages to our client:

  • The statistics, while demonstrating some correlation, are overwhelmed by external factors that aren’t easy to quantify.
  • Without demonstrable causality in the data, we cannot draw conclusions or propose changes along the measured dimensions.

While data can be your friend, it is also important to know when it just doesn’t help.

In a recent post about measuring product manager performance, we talked about how easy-to-measure information may not be the most useful information. Critical thinking has a lot of applications!
