When your machine has multiple NUMA nodes it’s often useful to restrict a process to using just one for performance reasons. It’s sometimes hard to fully utilize multiple NUMA nodes and, if you get it wrong, it can cost you performance: the nodes need to keep their caches consistent and may have to access memory over a link that is slower than the one to the memory closest to the node. These things can be relatively expensive.
We recently had an old client contact us with an unusual request. We last worked with VEXIS Systems Inc. back in 2010 when we extended the telephony server we’d built for them to support CLR hosting, using The Server Framework’s CLR Hosting Option. We then built a managed plugin system that integrated with the existing unmanaged system so that they could write their business logic in either unmanaged code or in a managed language such as C#.
I’ve always found testing multi-threaded code in C++ a humbling experience. It’s just so easy to make stupid mistakes and for those mistakes to lurk in code until the circumstances are just right for them to show themselves.
This weekend a unit test that has run thousands of times for many years started to fail. The reason for the failure was a fairly obvious race condition in the test code. This issue had lain dormant and only been exposed by running the tests on my new development machine.
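As a hypothetical illustration of this kind of race (this is not the actual test that failed), a test that starts a worker thread and then reads the result without waiting for it will usually pass on one machine and fail on another with different timing. A minimal sketch of the safe pattern, assuming standard C++ and a made-up `RunWorkerAndWait()` helper, is to block on a `std::future` rather than hope the worker has finished:

```cpp
#include <cassert>
#include <future>
#include <thread>

// Hypothetical helper: runs some work on another thread and returns its result.
int RunWorkerAndWait()
{
   std::promise<int> done;
   std::future<int> result = done.get_future();

   // The worker publishes its value through the promise.
   std::thread worker([&done]() { done.set_value(42); });

   // A racy test would read a shared variable here and assume that the
   // worker had already written it. Waiting on the future removes that
   // assumption: get() blocks until set_value() has been called.
   const int value = result.get();

   worker.join();

   return value;
}

// Hypothetical test: only inspects the result once the worker has published it.
void TestWorkerPublishesResult()
{
   assert(RunWorkerAndWait() == 42);
}
```

The point is that the test's correctness no longer depends on how quickly the machine schedules the worker thread.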
I’ve had a sick PC for several weeks now. It has cost me a surprising amount of time and thought.
It started with my main work machine randomly hanging. This is Windows 11 with a Ryzen 9 5900X, and it has previously run faultlessly for two years or so. The hangs were, at first, annoying and I assumed that it was some driver that had been updated and was playing up, and initially I hoped that it would just fix itself with another update.
One of my long-term clients has hundreds of cloud machines running instances of their server; each server maintains thousands of reliable UDP connections using a custom protocol that we’ve developed over the years. When things go wrong it’s often hard to work out why. Even though we have reasonable unit test coverage of the code that runs the UDP protocol, it’s hard to build tests that cover every possible scenario.
The manual process around updating broken links is due to be replaced by a simple link checker that I’ve been writing in Rust. It’s not quite ready yet but it’s nearly there…
I was updating a few broken links today and came across this from 2004:
“Software development is about discipline and detail; code quality starts to decay as soon as developers forget this. All code decays, but tests can help to make this decay obvious earlier.
This blog has been around a long time and the internet tends to rot. This means that quite a lot of the links on old posts are broken. I’m slowly fixing these broken links to use “The Wayback Machine” but it’s complicated to automate as the resulting URLs need to include a timestamp of a valid snapshot and can’t just include a ‘rough idea of the date’. So I’m fixing the broken links manually by watching the posts that are accessed the most and manually checking the links and fixing them up.
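To show why a rough date isn’t enough, here is a minimal sketch of building a Wayback Machine link. It assumes the usual `https://web.archive.org/web/<timestamp>/<original URL>` form with a 14-digit `YYYYMMDDhhmmss` timestamp; the function names are made up for illustration, and the code does not check that a snapshot actually exists at that timestamp, which is the part that makes automation awkward.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if `timestamp` is exactly 14 digits (YYYYMMDDhhmmss).
bool IsFullWaybackTimestamp(const std::string &timestamp)
{
   if (timestamp.size() != 14)
   {
      return false;
   }

   for (const unsigned char c : timestamp)
   {
      if (!std::isdigit(c))
      {
         return false;
      }
   }

   return true;
}

// Hypothetical helper: builds the archive link for a snapshot timestamp.
// It only validates the shape of the timestamp, not that the snapshot exists.
std::string MakeWaybackUrl(
   const std::string &timestamp,
   const std::string &originalUrl)
{
   if (!IsFullWaybackTimestamp(timestamp))
   {
      throw std::invalid_argument("expected a 14-digit YYYYMMDDhhmmss timestamp");
   }

   return "https://web.archive.org/web/" + timestamp + "/" + originalUrl;
}

// Usage example with a placeholder URL and timestamp.
void WaybackUrlExample()
{
   assert(
      MakeWaybackUrl("20040615120000", "http://example.com/page.html") ==
      "https://web.archive.org/web/20040615120000/http://example.com/page.html");
}
```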
On the 3rd of May 2003 I posted the first entry on this blog. I then proceeded to “back fill” the blog with various things that had either been posted before in other places or had been lying around waiting for me to have somewhere to put them. This is why, although the blog began in 2003, the archives go back to 1992.
What I said on the 10th anniversary of this blog is still apt:
I’ve been investigating the ‘sparsely documented’ \Device\Afd interface that lies below the Winsock2 layer. Today I use a test driven method for understanding and documenting the API.
TDU - Test Driven Understanding
When trying to understand a new API I always like to end up with executable documentation in the form of tests that show the behaviour of the API. I write these tests in the same way that I write any tests; writing a test that fails and then adjusting so that it passes.
Yesterday I was bemoaning encapsulation and how it was hiding what was going on inside my objects (and quite right too, what good would it be otherwise?). The issue is that the object I was interested in, and each of the objects that formed it, were allocating more memory than expected. It wasn’t so much that the object was bigger than expected, just that there were more allocations than I expected and that, for some reason, destroying lots of these objects was taking longer than I would expect.