On Cringely On Refactoring
Bob Cringely has been upsetting some programmers with his comments on refactoring. Initially, he had this to say:
“Cleaning up code” is a terrible thing. Redesigning WORKING code into different WORKING code (also known as refactoring) is terrible. The reason is that once you touch WORKING code, it becomes NON-WORKING code, and the changes you make (once you get it working again) will never be known. It is basically a programmer’s ego trip and nothing else. Cleaning up code, which generally does not occur in nature, is a prime example of amateur Open Source software.
He then followed it up with this, in which he’s more reasoned and suggests that only “too much” refactoring is bad.
He seems to intentionally miss the point that not all code changing is refactoring: some of it is still just hacking. Refactoring is risk management…
I’ve noticed three distinct types of refactoring. The first is something I’ve always done and never used to refer to as refactoring; it’s more a case of finishing. You do some work on some code: add a new feature or fix a bug. You get the change to work, which may involve exploring your understanding of the problem and trying different approaches. You check in your completed change and then you refactor the code change so that it’s clean. At that point you’re done. The refactoring itself could be viewed as optional, but failure to do it results in the code base starting to become untidy. Code in this area may be harder to change again due to the scars left by previous changes.
Schedule pressure may make you think it’s impossible to clean up the code right now. You’ve hacked in a fix and you have lots more things to do, so you rush on to the next item on your list. This is almost always a mistake. When you’re hacking in the change you’re usually focused on the problem. Refactoring straight after implementing a change means that you have to think about your solution. Often this moment of reflection will help you to create a better solution. People who say that I should be able to make my change without then needing to refactor it straight after should write some code in the real world once in a while…
So, the first kind of refactoring is more a case of not stopping until you’re done.
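To make that first kind of refactoring concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration (the shipping rules, the function names, the constants); the only point is the shape of the work: first the change that got hacked in until it worked, then the same behaviour tidied up before moving on.

```python
# Hypothetical example: all names and numbers are made up for illustration.

def shipping_cost_hacked(weight_kg, country):
    """The change as first made to work: correct, but the fix is bolted on."""
    cost = 5.0
    if weight_kg > 10:
        cost = cost + (weight_kg - 10) * 0.5
    if country == "NO":  # surcharge added while fixing a bug
        cost = cost + 7.5
    if country == "CH":  # same fix, copy-pasted for a second country
        cost = cost + 7.5
    return cost


# The same behaviour after the "finishing" pass: magic numbers are named
# and the duplicated special case becomes a single rule.
BASE_COST = 5.0
FREE_WEIGHT_KG = 10
COST_PER_EXTRA_KG = 0.5
SURCHARGE = 7.5
SURCHARGE_COUNTRIES = {"NO", "CH"}


def shipping_cost(weight_kg, country):
    """Cleaned-up version; intended to return exactly what the hacked one does."""
    extra_kg = max(0, weight_kg - FREE_WEIGHT_KG)
    surcharge = SURCHARGE if country in SURCHARGE_COUNTRIES else 0.0
    return BASE_COST + extra_kg * COST_PER_EXTRA_KG + surcharge
```

Nothing about the behaviour is meant to change between the two versions. What changes is how much the next person has to untangle before adding a third country.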
If you don’t get a chance, don’t have the discipline, or just don’t care, then eventually the code will have accumulated lots of scar tissue and it will be harder to understand and maintain. At this point we may get to use the second kind of refactoring. You need to make a change, but it’s in the scar tissue. Due to previous unfinished changes or initially poor design, the change is hard to make or it’s hard to be sure that it doesn’t have unintended side effects. Refactoring the code around the area that you need to change makes it easier to make the change. You may end up changing more code than you would otherwise have had to change, and this introduces a risk that you might break something. You need to balance this risk against the risk that you’ll break something anyway because you’re working on scarred code. Good tests help to reduce the risk on both sides of the equation, so having good tests isn’t a justification in itself to refactor before making the change.
Deciding to perform this second type of refactoring often comes down to the fact that it might not actually take much longer to refactor and apply the change than it would to hack in a quick fix. To be able to hack in a quick fix and be sure that it’s the right fix, you need to spend time investigating the surrounding code to make sure you have no unintended side effects. Because we’re in scar tissue this may take longer and be harder. You can either investigate, understand, fix and move on, or investigate, understand, refactor, fix and move on. Often when you reach the point where you understand, the refactoring required is obvious. If you have to perform more fixes in this area then it may take equally long to investigate and understand next time, or for the next person. If you refactor as you investigate you can end up in a situation where the next time a change in this area is required you’re no longer working with scar tissue and the change is easier to make.
The second kind of refactoring is deciding that the best way to change some code might require redesigning some of the surrounding code.
The third kind of refactoring tends to occur when the people working on a code base have not done either of the first two forms of refactoring. This is the after-the-fact refactoring. You analyse an existing code base and refactor it to improve its design and maintainability.
Like the GoF book, Fowler’s book gives programmers a way of talking about something that many of them already did. He defines the semantics. Most professional programmers were probably doing the first two forms of refactoring all along. Now they can talk about it using the same terms.
Not all code changing is refactoring. Refactoring requires discipline. You need to balance the risk of refactoring with the risk of not refactoring. You need to know when to stop and you need to do it because it’s the right thing to do for the project, not just because you can. Changing code without this kind of thought and discipline is just hacking (in the worst sense of the word).
All refactoring is risk management. Sure, you’re creating a risk that when you refactor you will break the code. You mitigate that risk, if possible, by having strong tests. If you choose to refactor then you’re accepting that the risk of refactoring is lower than or equal to the risk of not refactoring. So, what’s the risk of not refactoring?
Choosing not to refactor a heavily scarred code base is accepting that the code may be harder to understand and may take longer to change. This might be fine. You might only ever change the code once in a while and the changes could be simple; your team might fully understand the complexity of the code. However, you may find that your code is buggy and it takes a long time to locate and fix your bugs. Each fix may introduce new bugs. You might not have anybody left on the team that actually understands the code. Changes that would appear to be simple are taking vast amounts of time… Or you may find that someone is suggesting that the design and code are now so broken that you should throw it all away and start again. As Joel points out, this is always a bad idea.
Choosing to refactor allows you to make small, incremental improvements to the code and move it towards something that is ‘better’. You can define better before you start, and every project’s better will be different. Yes, there’s a chance that you’ll break things, but when you choose to refactor you’re accepting that risk.
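The strong tests mentioned above as a mitigation can be as simple as pinning down what the existing code does before you touch it. The sketch below is one rough way to do that in Python, under stated assumptions: the discount functions and the `find_regressions` helper are hypothetical names invented for this example, and comparing old and new output over a handful of sample inputs is only as good as the samples you pick.

```python
# Hypothetical example: a tiny safety net for a refactoring step.

def find_regressions(old_fn, new_fn, sample_inputs):
    """Return (args, expected, actual) for every input where new_fn disagrees with old_fn."""
    regressions = []
    for args in sample_inputs:
        expected = old_fn(*args)
        actual = new_fn(*args)
        if expected != actual:
            regressions.append((args, expected, actual))
    return regressions


def legacy_discount(total, is_member):
    """Scarred original: nested conditions that are easy to misread."""
    if is_member == True:
        if total > 100:
            return total * 0.9
        else:
            return total
    else:
        return total


def discount(total, is_member):
    """Refactored version, intended to behave identically."""
    return total * 0.9 if is_member and total > 100 else total


def broken_discount(total, is_member):
    """A refactoring slip: '>' quietly became '>='."""
    return total * 0.9 if is_member and total >= 100 else total


# Include the boundary value, since that is where slips tend to hide.
SAMPLES = [(50, True), (100, True), (150, True), (150, False)]

clean_run = find_regressions(legacy_discount, discount, SAMPLES)
slipped_run = find_regressions(legacy_discount, broken_discount, SAMPLES)
```

With these samples, `clean_run` comes back empty, while `slipped_run` flags the `(100, True)` case. That is the kind of break you would otherwise discover much later.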
It seems that Bob got his inspiration from a friend of his, Paul Tyma. Here Paul explains a bit of the background behind the ‘cleaning up code’ quote. Right near the end Paul says something that hits home.
I’m surprised so much trust is afforded to each and every refactoring programmer.
So am I. But then, I’m also surprised that so much trust is afforded to each and every programmer. But that’s another story…