Over 90% of the cost of software development is software maintenance (cite). This alarming trend was predicted as early as 1972. McKinsey suggests that CIOs should spend no more than 40-60% of their budgets on maintenance. Gartner’s IT Spending and Demand Survey (2002) reports that CIOs are spending 80% of their budgets on maintenance (page 12 of the presentation). Agile development can help reverse this trend.
The Cost Trends of Software Maintenance
Jacoozi published an analysis of the impact of continuous refactoring on software maintenance costs. Continuous refactoring is an element of agile software development, where the developers continuously make minor improvements to the architecture and design as they maintain the code.
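As a concrete (hypothetical) illustration of what one of those minor improvements looks like – the names and numbers below are invented – a duplicated, unexplained discount rule gets a name, while the external behavior stays exactly the same:

```python
# A small "continuous refactoring" step: same external behavior, clearer code.

# Before: the discount rule is inlined and its magic numbers are unexplained.
def invoice_total_before(items):
    total = sum(price * qty for price, qty in items)
    if total > 1000:
        total = total * 0.95
    return total

# After: the rule gets a name and a home; behavior is unchanged.
BULK_DISCOUNT_THRESHOLD = 1000
BULK_DISCOUNT_RATE = 0.05

def apply_bulk_discount(total):
    if total > BULK_DISCOUNT_THRESHOLD:
        return total * (1 - BULK_DISCOUNT_RATE)
    return total

def invoice_total(items):
    return apply_bulk_discount(sum(price * qty for price, qty in items))

items = [(600, 2), (50, 1)]
assert invoice_total(items) == invoice_total_before(items)
```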
Thanks to Levent Gurses for providing this modified version of the chart from his article, Continuous Refactoring and ROI. In his article, he discusses both recurring “big bang” refactoring (the pink curve) and continuous refactoring (the green curve).
What Levent’s chart shows with the black line is that the cost of maintenance grows at a significant rate over time when you don’t refactor the code.
We can re-use the diagram from our earlier analysis of agile development and ROI, and overlay this cost structure. The green curve represents sales volume, contrasted with the shaded curve representing development costs.
In the product lifecycle diagram above, there is an initial “hump” of development cost. Note that when you are using incremental development, the hump extends past the end of the development stage of the product life cycle.
The largest part of the shaded area represents the ongoing costs – 90% of which are maintenance costs. Note – we are assuming that companies use a rational investment strategy – they continue to maintain the software until the costs equal or exceed the revenue. The investment should stop when the opportunity cost of continued investment exceeds the benefits of continued investment.
Broken Windows
Continuous refactoring is making small investments in improving the code over time. The absence of those investments allows the code to grow more expensive to maintain over time. Gartner estimated that 50% of the cost of ongoing maintenance labor is spent trying to understand the existing code base. This is very inefficient.
In The Tipping Point, Malcolm Gladwell described this phenomenon by analogy to the broken windows in East New York City – just one gripping example from that book.
As the code gets more convoluted over time, these two factors serve to increase costs dramatically – the increased difficulty of doing the job, combined with increased apathy about doing it right. Costs would go up simply because the code gets larger (as the software is expected to do more and more). These factors accelerate the rate at which the phenomenon occurs – consistent with Levent’s faster than linear cost growth function.
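Here is a back-of-the-envelope sketch of that dynamic (the growth rates and dollar figures are assumptions for illustration, not data from Levent’s chart):

```python
# Sketch: cost of a typical change per release under two assumed growth patterns.
# Without refactoring, the "understanding the code" overhead compounds each release;
# with continuous refactoring, a small recurring investment keeps growth roughly linear.

BASE_COST = 10_000           # assumed cost of a change in release 1 ($)
COMPOUND_RATE = 0.15         # assumed compounding penalty per release without refactoring
REFACTOR_INVESTMENT = 1_000  # assumed per-release cost of continuous refactoring ($)
LINEAR_GROWTH = 500          # assumed growth per release when the code stays clean ($)

def cost_without_refactoring(release):
    return BASE_COST * (1 + COMPOUND_RATE) ** (release - 1)

def cost_with_refactoring(release):
    return BASE_COST + LINEAR_GROWTH * (release - 1) + REFACTOR_INVESTMENT

for release in (1, 5, 10, 20):
    print(release,
          round(cost_without_refactoring(release)),
          round(cost_with_refactoring(release)))
# By release 20 the compounding curve is several times the refactored one with
# these made-up numbers; the shape of the curves, not the figures, is the point.
```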
Fixing The Windows
When faced with the challenge of reducing the ongoing maintenance costs, you have a few choices:
- Eliminate Ongoing Maintenance and Development. This was the Autodesk strategy (fire the engineers, milk the product for revenue). It worked great for a very short time. Profit growth was tremendous. This of course accelerated the decline in sales, eliminating revenue (and therefore profit).
- Reduce Spending On Maintenance and Development. Simply reducing the budget has all of the negative impact of eliminating it, but with fewer of the gains. A reduced budget (with no other changes) increases the frustration both of unsatisfied customers and of overwhelmed developers. This is a bad idea.
- Refactor The Code To Improve Efficiency. You can make ongoing maintenance more efficient by making the code easier to understand and modify. This generates cost savings – the “big money” in ROI calculations.
Improving efficiency reduces the costs of ongoing development, as the following diagram shows:
Other Benefits
By reducing the cost of ongoing maintenance, you can improve the profitability of the product. You also free up resources for investment in new product development. This helps move your organization to McKinsey’s recommended 40% to 60% maintenance budget.
You will also get intangible benefits and improved efficiency by reducing the number of broken windows in the product’s code base. This results in increased motivation of the team members and greater job satisfaction. The increase in motivation will decrease the cost of development of other functionality.
You may also extend the useful life of the product – by extending the amount of time when ongoing maintenance is still profitable. This additional development work could result in increased sales – extending the product life cycle.
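As a rough illustration of that extended-life point (all revenue and cost figures below are invented): the product stays worth maintaining for as long as revenue covers the maintenance cost, so slowing cost growth directly pushes out the end-of-life date.

```python
# Sketch: how many years maintenance stays profitable under two assumed
# cost-growth rates, with revenue modeled as slowly declining.

def profitable_lifetime(annual_cost_growth, initial_cost=200_000,
                        initial_revenue=1_000_000, revenue_decline=0.08):
    """Years until maintenance cost meets or exceeds revenue (capped at 50)."""
    cost, revenue, years = initial_cost, initial_revenue, 0
    while cost < revenue and years < 50:
        years += 1
        cost *= 1 + annual_cost_growth
        revenue *= 1 - revenue_decline
    return years

print(profitable_lifetime(annual_cost_growth=0.25))  # unmanaged code base: ~6 years
print(profitable_lifetime(annual_cost_growth=0.08))  # continuously refactored: ~11 years
```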
In the image cited at the beginning of the article, the green line shows the cost of change, but does not consider the cost of the refactoring itself. If you add the exponentially growing cost of *continuous* refactoring to the cost of change, the resulting curve will go above black line. So the green area shown in the last image in fact shows the savings from switching from the “refactoring” approach back to the traditional “factoring” approach. As my experience shows, refactoring as a tool to reduce maintenance efforts is not a best choice for new projects: better is to establish the code structure and interface conventions before coding starts, so you know the proper place where new code should be added. Refactoring is also not a good choice for complex legacy projects, where rewriting is generally cheaper. The refactoring really helps when you add external code to your existing product, but that does not happen often. YMMV…
AVA,
Thanks for the comment. You’ve obviously never done refactoring in a continuous environment. Let me try to answer your points one by one:
1. “…the green line shows the cost of change, but does not consider the cost of the refactoring itself…” – this is not a Total Cost of Ownership (TCO) diagram. It addresses the Cost of Change (CC). Why should it consider the cost of refactoring?
2. “…If you add the exponentially growing cost of *continuous* refactoring…”. Let’s see the math here. Exponentially growing cost of refactoring to me is this: If my factor is 2 and if I spend $1 for refactoring in Iteration 1, then in subsequent iterations I spend $2, $4, $8, $16, $32, $64, $128…you get the picture. By Iteration 11 I am spending $1024 on refactoring alone. This could not be farther from what happens in a real-life continuous refactoring project. I can say with some level of confidence that, BECAUSE OF continuous refactoring, the cost is kept under control.
3. “… the resulting curve will go above black line…” – so you’re saying that doing nothing actually saves the company money?
4. “As my experience shows, refactoring as a tool to reduce maintenance efforts is not a best choice for new projects: better is to establish the code structure and interface conventions before coding starts, so you know the proper place where new code should be added…” – In my 11+ years coding and architecting I am yet to meet the person who can “establish the code structure and interface conventions before coding starts”.
5. “The refactoring really helps when you add external code to your existing product, but that does not happen often.” – wrong definition of refactoring. Refactoring application code means making changes to the internal structure of the code while preserving its external behavior. So it is not about adding new features. And, yes, it does happen a lot – as it should. The customer should be able to add/remove features as their priorities change.
I suggest you read up on Agile and Refactoring. You may find it helpful for your projects. Martin Fowler is a good place to start.
Have a nice day.
Levent Gurses
Scott,
Are these charts “notional” or do they have field data behind them? The conjecture that continuous refactoring is cheaper has always been the mantra of agile. As agile practitioners, we have never seen actual data from the field – time cards, defect numbers, sunk cost figures for a side-by-side comparison.
Any sources for these possibly “notional” charts?
Maybe data isn’t needed to answer this question if we can leverage our cultural intelligence:
How often does your kitchen get cleaned? After each meal? At the end of the day? Once a month? Once a year?
Which is more cost effective?
If a study isn’t needed to guide us on kitchen cleaning, we shouldn’t need a study to decide how often to clean the code.
Hey Glen,
Yeah, “real data” is the holy grail of agile. I’ve heard several speakers wish for data. The only study I’ve found so far is one that uses a probabilistic approach (using an options-valuation model) to assessing the cost-benefits of refactoring versus the anticipated level of change. A quick quote from the article:
Thus, refactoring is likely to add to the system a value, if ten or more changes need to be exercised during the next three years.
That should help some.
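To make that threshold concrete, here is the simple break-even logic behind that kind of claim – a much cruder model than the study’s options valuation, and the dollar figures are made up:

```python
# Break-even sketch: refactoring pays off when the expected savings from future
# changes exceed its up-front cost. All dollar figures are assumptions.

REFACTORING_COST = 20_000   # assumed one-time cost of the refactoring ($)
SAVINGS_PER_CHANGE = 2_000  # assumed saving on each subsequent change ($)

def refactoring_pays_off(expected_changes):
    return expected_changes * SAVINGS_PER_CHANGE >= REFACTORING_COST

for changes in (5, 10, 15):
    print(changes, refactoring_pays_off(changes))
# With these numbers the break-even lands at ten changes, echoing the
# "ten or more changes over three years" threshold in the quote above.
```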
Anecdotal data from my decade in the software world is that the only enterprise project I’ve been on with more than a couple man-years of effort that hit ALL of the deliverables and adapted to schedule and team changes was one that continuously refactored (at just under 5% of total dev effort, if I remember correctly).
While Levent’s curves may be notional, they are definitely consistent with my experiences and expectations.
AVA and Levent,
I think you guys are both coming from different backgrounds, and using the same terms to mean different things.
My approach to continuous refactoring doesn’t mean “rewrite all of the code at all times” – which could lead to an increase in the cost of refactoring as the code base gets larger. Even then, I would expect that cost to grow more slowly than the code base does, because the changes aren’t arbitrary, they improve things. Regardless – don’t do this.
When I’ve planned projects for my teams, I’ve budgeted 5 hours per day of “on task” time, with remaining time for refactoring, self-education, and the other intangibles that come with knowledge-work. I’ve encouraged refactoring without mandating it, and most developers do it. I’ve had to make specific suggestions for more junior developers in the past – but those also served as training exercises for them (and were fun for me).
Note: by “5 hours per day” I specifically mean that tasks with 25 hours worth of combined estimated effort were assigned to each developer per week.
Specific, larger, “we need to refactor X” tasks were estimated and incorporated into the schedule. That’s where the 5% figure in my previous comment comes from.
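In planning-spreadsheet terms, that works out something like this (the team size below is an assumption, purely for illustration):

```python
# Sketch of the capacity math behind "5 on-task hours per day" and the ~5% figure.
ON_TASK_HOURS_PER_DAY = 5
DAYS_PER_WEEK = 5
TEAM_SIZE = 6                      # assumed team size, purely for illustration
PLANNED_REFACTORING_SHARE = 0.05   # the ~5% of total dev effort mentioned above

hours_per_dev = ON_TASK_HOURS_PER_DAY * DAYS_PER_WEEK       # 25 hours of tasks per week
team_hours = hours_per_dev * TEAM_SIZE                      # 150 hours of estimated tasks
refactoring_hours = team_hours * PLANNED_REFACTORING_SHARE  # ~7.5 hours of "refactor X" tasks

print(hours_per_dev, team_hours, refactoring_hours)
```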
Thanks ALL for the great conversation, let’s keep it going.
Levent Gurses, let me in turn comment on your comments.
1. “… diagram…addresses the Cost of Change (CC). Why should it consider the cost of refactoring?”
Because the refactoring changes the code.
2. “By Iteration 11 I am spending $1024 on refactoring alone… the cost is kept under control.”
Knowing how much money is lost is not keeping cost under control. Preventing the loss is what keeps cost under control.
3. “‘… the resulting curve will go above black line…’ – so you’re saying that doing nothing actually saves the company money?”
Yes, of course. The refactoring changes the code without changing the product – from the customer’s point of view, that’s throwing away resources. The customer says: if you cannot spend my time and money on product improvement, just don’t start touching my code. In this sense, when you do nothing (related to refactoring), you actually save resources (and can use them to improve the product).
4. “In my 11+ years coding and architecting I am yet to meet the person who can ‘establish the code structure and interface conventions before coding starts’.”
Anybody who follows international and national software engineering standards can. I did and do it in ~30 of my projects. I didn’t do it in 3 projects (all 3 were unsuccessful).
The use of standards written by experts reduces numerous development risks. Search the Internet for the keywords “software engineering development standard ISO” to see the trend.
5. “‘The refactoring really helps when you add external code to your existing product, but that does not happen often.’ – wrong definition of refactoring.”
I use the same definition as you. I see that I did not provide enough details in the original post. The scenario is the following: you want to merge the sources of two working products, A and B. Product A is your own – you designed and implemented it, so you don’t have maintenance problems. Product B doesn’t follow your code structure conventions, which may lead to trouble during maintenance. So you create the refactoring support infrastructure for product B, verify it, then refactor the code. Product B is then ready to merge with A. “Product” may be just a free code sample from the Internet.
The idea of my original post is that refactoring is too expensive a tool for producing maintainable code. Cheaper alternatives exist, such as check-in diff review. If your codebase needs multiple and frequent refactorings, that’s a warning sign: your developers need training. Simply reading “Code Complete” and “Writing Solid Code” helps a lot. An obligatory written explanation of each check-in, and review of added/changed/removed code:
– prevent undisciplined codebase changes
– and keep the code from degrading (one way to enforce the written explanation is sketched below).
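For example, a minimal check-in gate along these lines could enforce the written explanation – shown here as a git commit-msg hook, though the same idea applies to any version control system; the length threshold is arbitrary:

```python
#!/usr/bin/env python3
# Sketch of a commit-msg hook that rejects check-ins without a written explanation.
# Save as .git/hooks/commit-msg and make it executable; git passes the path of the
# commit-message file as the first argument. The minimum length is arbitrary.
import sys

MIN_EXPLANATION_LENGTH = 20  # arbitrary threshold for "an actual explanation"

def main(message_path):
    with open(message_path, encoding="utf-8") as f:
        # Ignore the comment lines that git appends to the message template.
        message = "\n".join(
            line for line in f.read().splitlines() if not line.startswith("#")
        ).strip()
    if len(message) < MIN_EXPLANATION_LENGTH:
        sys.stderr.write("Rejected: every check-in needs a written explanation.\n")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main(sys.argv[1]))
```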
Scott Sehlhorst,
In my current projects developers do refactoring during development, just as they do debugging, so the refactoring is not separated from the factoring. There is typically zero, sometimes one, refactoring iteration per check-in (it looks like the 5% from your post is a “worst-case” estimate for my case). Changes to code that do not add new features and do not fix bugs are prohibited: “the changes aren’t arbitrary, they improve things”. That saves significant effort and keeps the codebase stable.
Thanks, AVA, for explaining more. Your comments reminded me of how my own work often happened as a developer.
There is a distinction that might help the discussion:
Rewriting at the time of development is refactoring writ small. It is redesigning, rewriting, testing, commenting, etc. – many of the activities you describe, and that I remember doing as a developer as part of my personal development process.
Refactoring writ large is changing the design so that it adapts to new requirements. It is not just improving the encapsulated (Requirement X yields code Y) code, but rewriting things that were done “too quickly” in order to get them done “quickly.” It’s the whole code-debt issue – sometimes quickness is optimized over correctness.
Some examples using patterns so I type less:)
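One hypothetical illustration (all names invented): a report that was written “too quickly” hard-wires its output format; the writ-large refactoring moves the format behind a small strategy object, so the next requirement becomes a new class rather than a rewrite.

```python
# Hypothetical "writ large" refactoring: the quick version hard-wired plain-text
# output; the new requirement is to also emit HTML without rewriting the report.

# Before: format logic tangled into the report itself.
def render_report_quick(rows):
    return "\n".join(f"{name}: {value}" for name, value in rows)

# After: the format is a strategy object, so new requirements become new classes.
class TextFormat:
    def render(self, rows):
        return "\n".join(f"{name}: {value}" for name, value in rows)

class HtmlFormat:
    def render(self, rows):
        items = "".join(f"<li>{name}: {value}</li>" for name, value in rows)
        return f"<ul>{items}</ul>"

def render_report(rows, fmt):
    # External behavior for the existing text case is unchanged; HTML is a plug-in.
    return fmt.render(rows)

rows = [("defects", 3), ("velocity", 21)]
assert render_report(rows, TextFormat()) == render_report_quick(rows)
print(render_report(rows, HtmlFormat()))
```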
After typing this, I find myself wondering – did I move the discussion forward, or sideways? Time will tell.
Hey all, the boat is rocking again!
I must say I am more confused now, after reading AVA’s explanations, than I was before. In fact, as Scott pointed out earlier, it’s almost certain that we are talking about different things here. I guess we are enjoying the fruits of the internet as a communication medium!
Anyway, AVA thank you for sharing your experiences. Mine are different.
Steve McConnell is still one of my best reads, but man, in both “Code Complete” (1 and 2) and “Rapid Development”, some of the stuff feels so yesterday.
Scott, thanks for providing an environment to exchange ideas. As far as real data for the charts is concerned, I have to agree with you one more time: it’s very difficult to get real numbers. But they’ll come. There are a number of studies coming out lately which demonstrate Agile method adoption and success rates. I am sure we’ll see one on this subject pretty soon.
Thank you all,
Levent Gurses