Our first test
The refactoring project reached an exciting new stage on Friday. We were finally able to refactor some code to the point where several classes could be extracted from the application project and built in a library that the application then linked to. This required that we reduce the coupling of the classes and increase the cohesion. Once these classes were extracted we were able to set up a test harness project and write our first test case!
Our application is a client/server data entry system. The refactoring project itself is interested in the client side code.
We have several situations where the data that a user enters needs to be validated against some reference data. For example, a user can only enter valid customers; and these customers are present in a validation data structure on the client… We had a couple of new requirements and a week to implement them. I decided that the new requirements would be easier and safer to implement if we refactored the existing code first.
The data structure that the validation code used was far from ideal. A pair of vectors of strings provided a simple key/value data structure in which a linear search was required for validation. These were stored in a globally accessible map which associated the key and value vectors with a name which was used to differentiate between different kinds of validation data. The dialog that the user saw if they entered an incorrect value allowed the user to gradually refine the search by editing a search string. As the search was refined so the number of available entries in the display was reduced. The existing functionality was spread between the display dialog on the main application, the map or pairs of vectors was purely data. The application would ask the display dialog to validate the user input and, if it was valid the dialog would never be displayed, if it was invalid the dialog would pop up.
Step one of our refactoring was to find out where the global validation data was used and how it was used. Changing the name of the typedef that was used to reference the map allowed us to follow the compilers errors to locate all of the call sites. We then replaced the typedef with a class that wrapped the map so that we could isolate the actions that the users of the validation code were performing. We narrowed the interface on the validation data from the full interface of a map down to just the operations that the client code required. We then added the operations that the client code performed on the pair of key/value vectors so that the class was a complete replacement for the existing validation data structures. We had abstracted the validation into a validator object which hid the details of how it managed the data for validation.
Step two dealt with performance issues, validating against 60,000 items took too long. Why there are 60,000 valid items is something that’s on the list for future investigation. We adjusted the way that the key value pairs were stored and added a new class that managed the interim results of the user’s search. No point starting from scratch to find values that contain “AB” when we already have the search results from searching for values that contain “A”.
Once performance was better we removed the global nature of the validation data by using the Parameterise From Above pattern and passing the validation data explicitly to all call sites. There were still too many areas of code coupled to the validation data, but at least the coupling was now explicit via a parameter rather than implicit via a global.
Then we moved the code around between classes. Now that the data access was more explicit it was clear that some operations should exist inside the new classes rather than outside them.
Finally we added the last of our new requirements, the ability to search via subsets of the values, and a way for the user to configure the subsets. The work that we’d intended to do was done and we had around a day to spare, so we performed some more refactoring to reduce the coupling between the validation code and the rest of the application.
The application project currently includes over 300 files. It’s hard to manage and the coupling causes long build times. We hope to move much of the code into libraries and structure for ease of testing and packaging. Near the end of the day on Friday we took a leap of faith and extracted all of the new validation classes out of the application and into a library. The code almost compiled. We had one remaining tie to break. The display dialog would remain in the application and was dependant on the new library code, but the part of the validator that popped up the display was now dependant on application code. We broke the circular dependency with an interface. The validator is now given an object that implements a ‘user search’ interface. This interface is part of the library code and the display dialog implements the interface.
The library compiled and the code worked. At least our “monkey’s at the keyboard” tests passed. Given that we had an hour or so left before the weekend and we now had nicely isolated validation code we created a test harness executable that was dependant on the library code. We then wrote our first test case…
We have appalling test coverage at present. We have one test that tests a small part of one object. But we have a test harness, that’s a very big step in the right direction. We now have one small set of classes that are sufficiently loosely coupled and highly cohesive that we can test them in isolation. At the start of the week that was unthinkable.