Kevin Barnes on TDD

Kevin Barnes over at Code Craft has just written an interesting piece on TDD. In it he claims that “Excessive unit testing infrastructure hampers your practical ability to refactor things efficiently. People scream at me when I say this. Unit tests are supposed to enable refactoring by spotting unexpected bugs resulting from the refactoring effort. I agree. Unit tests help with this, but only up to a point. When they cross the line into extreme-testing they can be pretty much guaranteed to break every single time you refactor the code”.

He has a point; as with all things, you need to be pragmatic…

This started out as a comment on Kevin’s posting, but it got a little long so I decided to post it here instead. First go and read the full original article; I’ll still be here when you get back…

I think Kevin has a good point here. Testing is good. Test Driven Development is good. Things done to excess where you forget about what you’re actually trying to achieve by doing the thing can be bad…

Firstly I think Kevin’s 85% coverage rule is a pretty good metric. I don’t often use coverage tools when I’m writing tests as I don’t believe that striving for 100% coverage buys you anything but being able to tick a box that says you have 100% coverage. When I do use coverage tools I use them like this, i.e. just to get a feel for what we might be missing. I like the idea of automating the generation of coverage numbers into the build but it’s one of those metrics that I don’t think too many people should really get to see. It can distort people’s view on what’s really important.

Likewise, I agree that it doesn’t really matter when you write the tests, the important thing is having some tests. Sure some tests generate a lot of design value if you write them before the code but some don’t. You don’t need to beat yourself up over it. The important thing is having tests to support you when you later need to change the code. The trick is to be pragmatic and to realise that the purpose of it all is to generate working code quickly and to support you when you need to change that code. I don’t subscribe to the idea that you should spend hours trying to work out how to write a test for simple functionality - unless the aim of those hours is actually to work out how to write tests (for a book, perhaps) rather than to deliver code to clients.

I’m not quite so convinced by the argument that because the tests might break when you refactor the code, the code ends up being refactored less than it otherwise would be. Without the tests, how can you be sure that a refactoring doesn’t break anything? Sure, I’ve been on projects where some developers didn’t run the tests after they’d made a change because the change caused the tests to break and they couldn’t be bothered to work out whether the break was a real problem or just a consequence of the way the tests were testing the original code. Sometimes they were right, and sometimes they put regressions back into production. The question you need to ask yourself is how important it is that you meet the deadline; you then need to balance that against delivering code that works after the deadline or code that might work by the deadline… The kind of project that you’re working on, the kind of client that you have and the kind of developer that you are will all adjust what’s correct for you. Religious arguments about TDD are like most religious arguments: pretty pointless.

I completely agree that the unit size is important and that sometimes you’ll get a better return on the time spent writing tests if you test at a larger granularity. This is often especially true when you’re retro-fitting tests to a legacy code base. If you only have the time to write one set of unit tests for a single unit then pick one that proves that lots of your legacy code still works; choose an effective inflection point and get the biggest bang for your buck. Later, if there is a later, you can write more unit tests for smaller parts of the initial large ‘unit’. Likewise, even when doing TDD at the class level you sometimes need to test several classes together as a composite unit. This proves that the classes actually work together; there’s no sense in mocking everything up and then finding that when you plug all of the real pieces together nothing works… Again I find that you have to be careful how you decide on the composition of your unit; practice, experience and mistakes help here, as always.

I disagree with some of Kevin’s comments about order dependent testing. Sometimes this kind of interaction testing gives you fragile tests, sometimes it gives you rigid tests - it just depends on the way you look at it and what is important. I do a lot of simple interaction testing and I agree that sometimes it becomes a pain to change things because of the expectations that the tests have of the code under test. The problem is that, about as often, I find that the rigidity of the expectations finds errors that might not otherwise be found. A simple change might mean that a service provider gets called several times per operation rather than once; that call might be expensive and the additional calls might not really be required, but the rigidity of the tests shows the problem straight away and you can, perhaps, refactor the additional calls away… As I’ve said before, I find that the tests support the code and the code supports the tests; it’s almost a symbiotic relationship. The tests keep the code working as you’d expect whilst you refactor the code, and the code keeps the tests working as you’d expect whilst you refactor the tests. If you view things this way then what may seem to be fragile tests to some can be viewed as an advantage; the tests are rigid, not necessarily fragile.
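To make that a little more concrete, here’s a minimal sketch of the kind of rigid expectation I’m talking about, using a hand-rolled mock rather than any particular mocking framework; the names (IPriceService, MockPriceService, PriceOrder) are invented purely for illustration:

```cpp
#include <cassert>
#include <string>

// The service provider that is expensive to call.
struct IPriceService
{
   virtual ~IPriceService() = default;

   virtual double GetPrice(const std::string &symbol) = 0;
};

// A hand-rolled mock that simply records how many times it was called.
class MockPriceService : public IPriceService
{
   public :
      double GetPrice(const std::string & /*symbol*/) override
      {
         ++m_calls;

         return 42.0;
      }

      int Calls() const { return m_calls; }

   private :
      int m_calls = 0;
};

// The code under test; a careless refactoring might end up calling
// GetPrice() once per item rather than once per operation.
double PriceOrder(IPriceService &prices, const std::string &symbol, const int quantity)
{
   return prices.GetPrice(symbol) * quantity;
}

int main()
{
   MockPriceService prices;

   const double total = PriceOrder(prices, "WIDGET", 3);

   assert(total == 126.0);

   // The 'rigid' expectation: the expensive service should be hit exactly
   // once per operation. If a refactoring quietly adds extra calls, this is
   // where the problem shows up.
   assert(prices.Calls() == 1);

   return 0;
}
```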

Kevin’s comments on database testing seem to follow the standard mantra of people who don’t isolate their external systems. Yes, you test your database interaction, but that interaction can generally be confined to a small number of classes. You don’t need all of your classes to talk to your database; many can (I’d go as far as saying “should”) rely on accessing the database via a reasonably narrow “interface” that you can stub out for most tests… In summary, you shouldn’t be building SQL queries all over the place, and if you’re not then you can isolate the code that works with the database from the code that simply needs to access things that are stored in the database. This is equally true when testing all external systems.
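By way of illustration, here’s a rough sketch of the sort of narrow “interface” I mean; the names (IWidgetStore, FakeWidgetStore, GrowWidget) are mine and purely hypothetical. The real implementation would build its SQL behind IWidgetStore, and only that implementation would need database-level tests:

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

struct Widget
{
   std::string name;
   int size;
};

// The only place that the rest of the code 'knows' about persistence.
struct IWidgetStore
{
   virtual ~IWidgetStore() = default;

   virtual Widget Load(const std::string &name) const = 0;
   virtual void Save(const Widget &widget) = 0;
};

// An in-memory stand-in for tests; no database required.
class FakeWidgetStore : public IWidgetStore
{
   public :
      Widget Load(const std::string &name) const override
      {
         const auto it = m_widgets.find(name);

         if (it == m_widgets.end())
         {
            throw std::runtime_error("no such widget: " + name);
         }

         return it->second;
      }

      void Save(const Widget &widget) override
      {
         m_widgets[widget.name] = widget;
      }

   private :
      std::map<std::string, Widget> m_widgets;
};

// Code that simply needs things that happen to live in the database can be
// tested against the fake.
int GrowWidget(IWidgetStore &store, const std::string &name, const int amount)
{
   Widget widget = store.Load(name);

   widget.size += amount;

   store.Save(widget);

   return widget.size;
}

int main()
{
   FakeWidgetStore store;

   store.Save(Widget{"sprocket", 10});

   assert(GrowWidget(store, "sprocket", 5) == 15);
   assert(store.Load("sprocket").size == 15);

   return 0;
}
```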

I’ve never come across anyone who mocks system objects, but I guess they exist… Generally I find that I rarely use ‘bare’ system objects, like map and list. I usually have more specialised ‘collection’ abstractions that expose only the exact interface that the code around them needs. In practice this means that whilst the CWidgetCollection class might use a map, I’d test no lower than the CWidgetCollection level… In fact, thinking about it, this fits well with the “external system” isolation thing above. You should be dealing in terms of appropriate abstractions within your system. Just as a map is probably not an appropriate abstraction, and so you create a CWidgetCollection that’s implemented in terms of a map, a database isn’t often an appropriate abstraction either; it’s far too general purpose, so you provide an appropriately limiting abstraction and work in terms of that. That said, I have had value out of mocking very low level things to ensure that very low level wrapper classes work correctly. Once again it depends on the level of abstraction that you’re working at. I’ve got lots of value out of mocking the Win32 registry to test registry access wrappers, for example.
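For what it’s worth, a CWidgetCollection along the lines I describe above might look something like this; the details here are invented, but the point is that the tests work at this level rather than at the level of the std::map that happens to live inside it:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

class CWidget
{
   public :
      explicit CWidget(const std::string &name) : m_name(name) {}

      const std::string &Name() const { return m_name; }

   private :
      std::string m_name;
};

// Exposes only what the surrounding code actually needs; the map is an
// implementation detail that could change without the tests caring.
class CWidgetCollection
{
   public :
      void Add(const CWidget &widget)
      {
         m_widgets.insert(std::make_pair(widget.Name(), widget));
      }

      bool Contains(const std::string &name) const
      {
         return m_widgets.find(name) != m_widgets.end();
      }

      std::size_t Count() const { return m_widgets.size(); }

   private :
      std::map<std::string, CWidget> m_widgets;
};

int main()
{
   CWidgetCollection widgets;

   widgets.Add(CWidget("sprocket"));

   // The tests talk to the abstraction, never to the map inside it.
   assert(widgets.Contains("sprocket"));
   assert(!widgets.Contains("flange"));
   assert(widgets.Count() == 1);

   return 0;
}
```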

With interfaces I tend to let the design drive where they’re appropriate. I often work in terms of concrete objects initially, and only when I find that I need two implementations do I decide to slip an interface in to allow polymorphism. In a way this goes hand in hand with selecting the appropriate unit size. If you come at the problem from the TDD zealot end of the scale then everything has to use interfaces, because everything has to be tested in isolation with a unit size of a single class. If you come at it from the TDD pragmatist end of the scale then you only need to insert interfaces to decouple your units…
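As a small, hypothetical example of what I mean (the CFileLogger, CNullLogger and CWidgetProcessor names are invented for this sketch): the code below started life holding a concrete CFileLogger, and the ILogger interface only appeared once a second implementation - here a do-nothing one for tests - was actually needed to decouple the unit:

```cpp
#include <iostream>
#include <string>

// Slipped in once a second implementation was needed; before that the
// processor just held a concrete CFileLogger.
struct ILogger
{
   virtual ~ILogger() = default;

   virtual void Log(const std::string &message) = 0;
};

// The original, concrete implementation.
class CFileLogger : public ILogger
{
   public :
      void Log(const std::string &message) override
      {
         // Stand-in for real file output.
         std::cout << message << std::endl;
      }
};

// The second implementation that justified the interface: a do-nothing
// logger so that tests for the surrounding code stay quiet and fast.
class CNullLogger : public ILogger
{
   public :
      void Log(const std::string & /*message*/) override
      {
      }
};

// The unit being decoupled; it doesn't care which logger it gets.
class CWidgetProcessor
{
   public :
      explicit CWidgetProcessor(ILogger &logger) : m_logger(logger) {}

      void Process(const std::string &widgetName)
      {
         m_logger.Log("processing: " + widgetName);
      }

   private :
      ILogger &m_logger;
};

int main()
{
   CNullLogger nullLogger;

   CWidgetProcessor processor(nullLogger);

   processor.Process("sprocket");

   return 0;
}
```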

I completely agree that you need to stay focussed on the real purpose of the work that you’re doing. For me that purpose is to deliver working code to clients quickly, with as few bugs as possible and to be able to update that code quickly and without introducing regressions. Unit testing helps me do that, sometimes. Sometimes it’s appropriate to develop code in full on TDD style and sometimes it’s not. TDD is just another tool. Like most things in software development the skill is in using the tools and blending the techniques to give the required results. TDD, like most other methodologies, is no substitute for skill and experience.