Code Generation

I’ve spent much of this week moving from a hand-crafted prototype to a generated version of the same code. The code is all repetitive boilerplate, much like the stuff that MIDL generates for you. The generated version is considerably better and has evolved faster than the hand-written code, because I was free to adapt and improve it at the “single entity” level; while I was prototyping, every improvement had to be made to all of the entities involved. Now that the code generator knows how to generate each of the entities, I only need to refactor the template… Fairly obvious stuff, but I’m wondering whether I should have opted for generating the code earlier rather than later.

The biggest challenge for me was to make sure that my code generator was like MIDL rather than like a Visual Studio App Wizard. Requirement number 1 was that you should be able to run the code generator as part of the build, with only the input file checked in to source control. If I needed to extend the generated code, the extensions would have to live in separate files. This was important because I’m pretty sure that I won’t get the code generator right first time, so I envisage updating it and re-running it over and over again. I didn’t want that to affect the work I was doing that uses the generated code, and besides, it would just be laziness if the tool didn’t work this way.
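As a rough sketch of what “extensions in separate files” can look like: the class and file names below are made up for illustration, and inheritance is only one possible way of keeping hand-written code out of the generated file. It shouldn’t be read as how the real generator does it.

```cpp
#include <string>

// --- CustomerGenerated.h (hypothetical name) ---
// Produced by the generator on every build; never edited, never checked in.

class CustomerGenerated
{
   public :

      explicit CustomerGenerated(const std::string &name) : m_name(name) {}

      const std::string &GetName() const { return m_name; }

   private :

      std::string m_name;
};

// --- Customer.h (hypothetical name) ---
// Written by hand and checked in; the generator never touches this file.

class Customer : public CustomerGenerated
{
   public :

      // Reuse the generated constructor unchanged.
      using CustomerGenerated::CustomerGenerated;

      // A hand-written extension built on top of the generated accessors.
      std::string GetGreeting() const { return "Hello, " + GetName(); }
};
```

With a split like this the generator can be changed and re-run as often as needed; as long as the generated interface stays compatible, the hand-written file is unaffected.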

Requirement number 2 was that the generator should minimise the amount of duplicate code that it generates. The generator could rely on the generated code being linked with library code, so anything that would be identical in the output for two different input files shouldn’t be generated at all; it belongs in a library.
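To illustrate the split, here is a minimal sketch under assumed names: `EntityBase`, `Customer` and `Order` are all hypothetical. Everything that would be the same for every entity sits in a hand-written library class, and the generated classes contain only what differs.

```cpp
#include <map>
#include <string>

// --- Library code: written once by hand and linked with all generated code ---

// Hypothetical base class holding the storage logic common to every entity.
class EntityBase
{
   public :

      void Set(const std::string &field, const std::string &value)
      {
         m_values[field] = value;
      }

      // Returns an empty string if the field has never been set.
      std::string Get(const std::string &field) const
      {
         const auto it = m_values.find(field);

         return it != m_values.end() ? it->second : std::string();
      }

   private :

      std::map<std::string, std::string> m_values;
};

// --- Generated code: only the parts that vary from entity to entity ---

class Customer : public EntityBase
{
   public :

      void SetName(const std::string &name) { Set("Name", name); }

      std::string GetName() const { return Get("Name"); }
};

class Order : public EntityBase
{
   public :

      void SetReference(const std::string &reference) { Set("Reference", reference); }

      std::string GetReference() const { return Get("Reference"); }
};
```

The generated classes end up as thin, typed wrappers, so a fix to the shared logic is made once in the library rather than in the template.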

Number 3 was that the code shouldn’t be ugly. ;) It should look just like code that has been hand written. This was the easy one.

I had a quick look at a few tools and decided that they were all too complex for what I wanted, so I rolled my own. My input language is C++, my output language is C++ and the templates themselves are C++. Possibly not the most inspired choice for the template language, but it works and it was quick to write, because I don’t find writing C++ hard and I’m supported by a vast library of well-tested code that I use every day…
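A “template” written in C++ can be as simple as a function that streams C++ source text into a string. The sketch below is only an illustration of that idea; the `Field` and `Entity` structures and the `GenerateClass` function are hypothetical and are not taken from the real generator.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical description of one entity, as a generator might hold it
// after reading its input file.
struct Field
{
   std::string type;
   std::string name;
};

struct Entity
{
   std::string name;
   std::vector<Field> fields;
};

// The "template": plain C++ that writes out the text of a C++ class.
std::string GenerateClass(const Entity &entity)
{
   std::ostringstream out;

   out << "class " << entity.name << "\n{\n   public :\n\n";

   // One accessor declaration per field.
   for (const Field &field : entity.fields)
   {
      out << "      " << field.type << " Get" << field.name << "() const;\n";
   }

   out << "\n   private :\n\n";

   // One data member per field.
   for (const Field &field : entity.fields)
   {
      out << "      " << field.type << " m_" << field.name << ";\n";
   }

   out << "};\n";

   return out.str();
}
```

Calling `GenerateClass` for each entity and writing the result to a file is then all the build step needs to do, and refactoring the generated code means changing this one function.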

The good news is that I’ve achieved my goals quicker than I expected, and the ability to generate the boilerplate code has meant that the project has leapt ahead; but that was always the plan. What was interesting is that writing the code generator made me think about how I write code, but more about that in another posting.