One of my long-term clients has hundreds of cloud machines running instances of their server, each server maintains thousands of reliable UDP connections using a custom protocol that we’ve developed over the years. When things go wrong it’s often hard to work out why. Even though we have reasonable unit test coverage of the code that runs the UDP protocol, it’s hard to build tests that cover every possible scenario.
I’ve been investigating the ‘sparsely documented’ \Device\Afd interface that lies below the Winsock2 layer. Today I use a test driven method for understanding and documenting the API.
TDU - Test Driven Understanding When trying to understand a new API I always like to end up with executable documentation in the form of tests that show the behaviour of the API. I write these tests in the same way that I write any tests; writing a test that fails and then adjusting so that it passes.
I’m in the process of investigating GoogleTest and the experience has been interesting. I’ve been unit testing code and doing Test Driven Development for a long time now; almost 20 years and I’m still learning. I’ve had my own test code, a simple test framework, that has served me well but it’s not always appropriate so I’m looking for alternatives. I’ll no doubt write about my thoughts on GoogleTest later, once I’ve played a bit more and put my thoughts in order.
UPDATED: 23 August 2021 see here
One of my clients runs game servers on the cloud. They have an AWFUL lot of them and have run them for a long time. Every so often they have problems with DDOS attacks on their servers. They have upstream DDOS protection from their hosting providers but these take a while to recognise the attacks and so there’s usually a period when the servers are vulnerable.
Previously on “Practical Testing”… The last time I updated the code was in 2016, things have changed quite a lot since then with several new compilers and several compilers that I no longer support. There are very few actual code changes, but the code now builds with Visual Studio 2019 (16.6 preview 3.0) and Visual Studio 2017. I’ve removed support for anything before Visual Studio 2015.
The code is here on GitHub and new rules apply.
Previously on “Practical Testing”… Having just resurrected this series of blog posts about testing a non-trivial piece of real-world C++ code I’ve fixed a few bugs and done a bit of refactoring. There’s one more step required to bring the code in this article right up to date.
The timer queue that is the focus of these blog posts is part of The Server Framework. This body of code has been around since 2001 and has evolved to support new platforms and compilers.
Previously on “Practical Testing”… I’ve just fixed a new bug in the timer queue and in doing so I updated the code used in the blog posts to the latest version that ships with The Server Framework. This code included two fixes that had been fixed some time ago and which I hadn’t documented here. They also lacked unit tests… Last time, I wrote tests for, and fixed, the first bug.
Previously on “Practical Testing”… I’ve just fixed a new bug in the timer queue and in doing so I updated the code used in the blog posts to the latest version that ships with The Server Framework. This code included two fixes that had been fixed some time ago and which I hadn’t documented here. They also lacked unit tests… In this episode I find and fix the first of these issues by writing a unit test that triggers the issue.
Back in 2004, I wrote a series of articles called “Practical Testing” where I took a piece of complicated multi-threaded code and wrote tests for it. I then rebuild the code from scratch in a test driven development style to show how writing your tests before your code changes how you design your code. Since the original articles there have been several bug fixes and redesigns all of which have been supported by the original unit tests and many of which have led to the development of more tests.
I’ve just spent a day tracking down a bug in a pending release of The Server Framework. It was an interesting, and actually quite enjoyable, journey but one that I shouldn’t have had to make. The bug was due to a Windows API call being inserted between the previous API call and the call to GetLastError() to retrieve the error code on failure. The new API call overwrote the previous error value and this confused the error handling code for the previous API call.