I’m debugging performance issues with a C++ server that has been stalling and then failing to recover. I’ve reached a point where we can reproduce the problem using a network interruption that causes multiple connections to disconnect at the same time. The fixed-size thread pool that services these connections becomes overloaded with the work of cleaning up the connection objects for the disconnected connections, and all of the threads in the pool spend far too long fighting over the heap lock as they try to return memory to the heap.
Back in August 2012 I shared my scripts for building OpenSSL on Windows. These have changed a little since the ones I had for the 1.0.x and 0.9.x releases of OpenSSL. The main idea is the same: the scripts build the OpenSSL code as both static libs and DLLs for both x86 and x64 and allow you to have all of the files next to each other in the same directory by adding various ‘warts’ to the file names.
I’ve been writing a series of blog posts, called “Practical Testing”, about testing real-world, multi-threaded code. Up until now I’ve used my own, home-grown, unit testing framework. When I started out with this series back in 2004 there wasn’t much in the way of mature testing frameworks for C++ and, over the years, I haven’t really found a need to switch from my own stuff that I understand and that works pretty well.
Nineteen years ago I began a series of blog posts, called “Practical Testing”, about testing real-world, multi-threaded code. As with most code that works well, and is used by lots of people, we’re still changing it and improving it and using it. I’ve just done a precis of how we got here and now it’s time to continue the journey.
As I hinted at the end of the last episode, there were some outstanding issues to deal with and some new functionality to add and test.
I’m in the process of investigating GoogleTest and the experience has been interesting. I’ve been unit testing code and doing Test Driven Development for a long time now, almost 20 years, and I’m still learning. I’ve had my own test code, a simple test framework, that has served me well but it’s not always appropriate so I’m looking for alternatives. I’ll no doubt write about my thoughts on GoogleTest later, once I’ve played a bit more and put my thoughts in order.
So, this morning I’m back from my Easter break and working on some code for a client and the first thing I do is kick off my CI build and things start failing. It seems that my “cunning plan” to have my CI build use the preview version of Visual Studio 2022 whilst the client uses earlier versions has paid off again…
We build with all warnings enabled and treat warnings as errors.
I’ve been playing around with Rust recently and whilst investigating asynchronous programming in Rust I was looking at Tokio, an async runtime. From there I started looking at Mio, the cross-platform, low-level, I/O code that Tokio uses.
For Windows platforms Mio uses wepoll, which is a Windows implementation of the Linux epoll API and is based on the code that is used by libuv for Node.js. This is NOT your standard high-performance Windows networking code using I/O completion ports; instead it uses the ‘sparsely documented’ \Device\Afd interface that lies below the Winsock2 layer.
Back in 2004 I started a series of blog posts called “Practical Testing”, about unit testing a non-trivial piece of C++ code. The idea was to show how adding unit tests to existing, real-world code could be useful and could support future development and refactoring. Episodes 1 through 13 were written in 2004 and covered getting the code under test and fixing some bugs that were complicated to reproduce in a live environment.
I’ve finally done what I should have done several years ago and shut down the LockExplorer site.
I haven’t had the time required to keep the tool up to date and fewer people were interested than I originally expected.
I may make the source code available at some point.
One of my clients has been reporting an intermittent issue with the deployment of new releases of their game server. This runs as a Windows service on many, many cloud machines and, just sometimes, the service seems to have issues during start up after upgrading the code on a machine on which it has otherwise been running fine.
I’ve been adding debug code to our service start up code to try and work out what’s going on and today we had our first hit.