GoogleTest - the first test

I’m in the process of investigating GoogleTest and the experience has been interesting. I’ve been unit testing code and doing Test Driven Development for a long time now; almost 20 years, and I’m still learning. I’ve had my own test code, a simple test framework, that has served me well, but it’s not always appropriate, so I’m looking for alternatives. I’ll no doubt write about my thoughts on GoogleTest later, once I’ve played a bit more and put my thoughts in order. Initial impressions are good, though I’m still working out how to use it in my code. I’ve started with the simplest approach and integrated it using CMake, as suggested in the GoogleTest User’s Guide.
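For reference, the CMake setup from the GoogleTest User’s Guide’s quickstart looks roughly like this; the project and file names here are my own placeholders, and the guide recommends pinning googletest to a specific commit archive rather than tracking main:

```cmake
cmake_minimum_required(VERSION 3.14)
project(ReusableIdManagerTests)

# GoogleTest requires at least C++14.
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

include(FetchContent)
FetchContent_Declare(
  googletest
  # Pin a specific commit archive here rather than main, ideally.
  URL https://github.com/google/googletest/archive/refs/heads/main.zip
)
# For Windows: prevent overriding the parent project's
# compiler/linker settings.
set(gtest_force_shared_crt ON CACHE BOOL "" FORCE)
FetchContent_MakeAvailable(googletest)

enable_testing()
add_executable(ReusableIdManagerTest ReusableIdManagerTest.cpp)
target_link_libraries(ReusableIdManagerTest GTest::gtest_main)

include(GoogleTest)
gtest_discover_tests(ReusableIdManagerTest)
```

This is the part that worked without any fuss; the complications came from the code under test rather than from GoogleTest itself.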

TL;DR

This isn’t about GoogleTest; those posts will come later.

This is about how illuminating it can be to write the first test for a piece of code and how writing the first test exposes you to all of the dependencies and baggage that a simple piece of code can have.

I decided to select a simple piece of code to test; the “reusable id manager” code that I’ve recently used for learning some Rust programming. This is a collection of numerical ids which can be allocated, used and then released back to a pool for later reuse. I tend to use this kind of code for network protocols where something like a peer-id is required; these ids can be within a given range (usually limited by the data type that represents them) and are “in use” for a period and then return to the collection. This allows us to have, say, an id for a connection that is represented by a BYTE where we may have up to 255 active connections, each with a unique id, at one time. Once a connection closes it releases the id and then the id can be used again for another connection.
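The shape of the thing can be sketched in a few lines. This is purely illustrative and not the real TReusableIdManager, whose interface differs; the class and method names here are my own placeholders:

```cpp
#include <cstdint>
#include <limits>
#include <set>
#include <stdexcept>

// Illustrative sketch of a reusable id manager, not the framework's
// actual TReusableIdManager.
template <typename T>
class IdPool {
public:
    IdPool() : m_next(0) {}

    // Hand out the lowest previously released id if there is one,
    // otherwise the next id that has never been used.
    T Allocate() {
        if (!m_free.empty()) {
            const T id = *m_free.begin();
            m_free.erase(m_free.begin());
            return id;
        }
        if (m_next == std::numeric_limits<T>::max()) {
            throw std::runtime_error("no ids available");
        }
        return m_next++;
    }

    // Return an id to the pool so that it can be reused.
    void Release(const T id) {
        m_free.insert(id);
    }

private:
    std::set<T> m_free;   // released ids, available for reuse
    T m_next;             // next never-used id
};
```

The interesting behaviour is the reuse: once an id is released it becomes available again, so long-running systems don’t simply exhaust the id space.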

Writing this code in Rust and writing tests for it was fairly easy and my plan was to use this code, rather than the more complex code from my Practical Testing series of articles as it would be easier to work with when my focus is actually on understanding the testing.

I’ve done quite a lot of work with CMake and generally use it from within CLion for my Linux and macOS builds of The Server Framework, so I expected it to be quite easy to get things up and running. The GoogleTest side of things was easy, but as soon as I added the code that I wanted to test, things got more complex.

The code in question is a single header file; however, since it’s part of my framework code, it uses some other headers in the framework. I wanted to focus on converting the tests that I have from my test framework to GoogleTest, and so started with a single test, one that constructed the object.

#include "ReusableIdManager.h"
#include <gtest/gtest.h>

using JetByteTools::Core::TReusableIdManager;

TEST(ReusableIdManagerTest, TestConstruct) {
    TReusableIdManager<unsigned int> idManager;
}

Writing the first test always exposes you to all the dependencies of the code under test. For this piece of work I decided to start at the top and chase the dependencies down until I had something that built and ran; this took longer than I expected and by the time I was done I had the following result from cloc.

S:\googletest\git\JetByteTools>cloc-1.96.1.exe *
     159 text files.
     159 unique files.
       1 file ignored.

github.com/AlDanial/cloc v 1.96  T=0.20 s (794.2 files/s, 168914.2 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C++                             46           3522           2163          12690
C/C++ Header                   108           3229           3999           8010
CMake                            5             21              4            180
-------------------------------------------------------------------------------
SUM:                           159           6772           6166          20880
-------------------------------------------------------------------------------

This is, quite frankly, utterly ridiculous, and purely the result of the normal process of accretion in code that has been around since 1997. Even with regular ongoing maintenance, stuff gets added, value gets added and dependencies increase. It can be difficult to see this happening when you always link with the whole of the library that contains the code; you don’t see it when adding unit tests for a new class either because, with my current testing style, the test links with the whole library that contains the code under test. It’s painfully exposed here because I’m trying to test one class in isolation.

I faced this issue with the code in the Practical Testing series but worked around it by removing dependencies and slimming the code down, basically avoiding the issue.

The legacy nature of some of my code and how to move things forward is something that I’ve been thinking about a lot recently. This attempt to test a simple piece of code really brings home to me exactly what the problems are.

This is what the file under test needs to include…

#include "JetByteTools/Admin/StaticAssert.h"

#include "Exception.h"
#include "ToString.h"
#include "DebugTrace.h"

#include <set>

#include <limits>

  • StaticAssert.h is due to the complexity of getting static_assert to work in pre C++11 code. That complexity is now gone, and the header itself has been slimmed down to practically nothing to reflect that; it should be removed, as static_assert can now be used directly in all the compilers that I support. However, this isn’t really part of the problem here: it’s a single header with no other dependencies.
  • Exception.h - a custom exception type. The exception is reasonably simple, but it pulls in the whole _tstring concept whereby, depending on the build configuration, the standard string that my framework uses is either std::string or std::wstring. This dates back to the original code I was writing for Windows 95 and NT 4.0, where I decided that, since the Win32 API could be wide or narrow depending on the platform, and since most of the Windows code that I worked with used the _T macros for strings and could take both types, that’s how the framework would be too. Looking back, this doesn’t seem as useful now that most of the code doesn’t touch the Win32 API. Right now I have a situation where the Linux and macOS versions of the code build with options that result in the use of std::string and the Windows version builds with options that result in std::wstring; mostly, I think, we should switch to using std::string and only use other things where they’re needed.
  • ToString.h - this could be replaced, in this code at least, with std::to_string.
  • DebugTrace.h - this is used because the destructor of the class in this file has an optional exception handling block which catches, logs and swallows any exceptions that are produced. It’s horrible ‘bandage’ code that exists to allow otherwise fatal issues to be logged and ignored rather than properly dealt with. This is the header that pulls in the most additional code, as it provides access to a singleton-based logging system, which includes crash dump creation code, call stack dumping code, locks, mutexes, semaphores, etc, etc.
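As an illustration of the direction this can take, the standard library now covers most of what those helpers were doing. None of the names below come from the real codebase; this is just a sketch of static_assert, std::to_string and a standard exception type doing the jobs of StaticAssert.h, ToString.h and Exception.h:

```cpp
#include <limits>
#include <stdexcept>
#include <string>

// Illustrative only: standard facilities replacing framework helpers.
// static_assert replaces the StaticAssert.h wrapper, std::to_string
// replaces ToString.h, and std::out_of_range (from <stdexcept>)
// replaces the custom Exception.h type.
template <typename T>
T CheckedNarrow(const long long value) {
    static_assert(std::numeric_limits<T>::is_integer,
                  "CheckedNarrow only supports integral types");

    if (value < 0 ||
        value > static_cast<long long>(std::numeric_limits<T>::max())) {
        throw std::out_of_range(
            "value out of range: " + std::to_string(value));
    }

    return static_cast<T>(value);
}
```

DebugTrace.h is the harder one; there’s no drop-in standard replacement for a logging singleton, which is exactly why it drags in so much code.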

After adjusting the code to remove these dependencies, we end up with it including this:

#include <stdexcept>
#include <string>
#include <set>

#include <limits>

and the following results from cloc:

S:\googletest\git2>cloc-1.96.1.exe ReusableIdManager.h
       1 text file.
       1 unique file.
       0 files ignored.

github.com/AlDanial/cloc v 1.96  T=0.01 s (112.8 files/s, 73121.0 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C/C++ Header                     1            146             58            444
-------------------------------------------------------------------------------

We have a single file to test and a single test file and that’s it.

This was an interesting exercise and something I’ve been thinking about for some time. Whilst I’ve been quite good at removing unrequired code as we drop support for older compilers, there’s so much more that can be done to maintain an existing codebase. The fact that the code has its origins back in the 90s isn’t an excuse to do nothing. I’m better now at making sure that the code has tests, even if it’s usually possible to add more tests and get more coverage. Back in 2005 I bemoaned my legacy code as code that didn’t have unit tests. Now my legacy code is code that has acquired too many dependencies on code, concepts and styles that may no longer be appropriate…

The main reason for some of the cruft is that I have lots of clients using the code and making breaking changes, even for good reasons, is hard. I’ve been planning on modernising the code for new clients and then moving existing clients to the new codebase as and when they wish to. Step one is to modernise some of the code, slim down the dependencies, work out exactly why things are as they are and then determine how the code can be adjusted to still perform as expected but without any historical artefacts that are no longer appropriate. It should be a fun exercise.