Setting the preferred NUMA node for a Windows Service (and making it work after a reboot)
When your machine has multiple NUMA nodes it’s often useful to restrict a process to using just one for performance reasons. It’s sometimes hard to fully utilize multiple NUMA nodes and, if you get it wrong, it can cost you performance: the nodes need to keep their caches consistent, and a thread may have to access memory attached to another node over a slower link than memory that is local to its own node. Both of these things can be relatively expensive.
Windows allows you to start a process on a specific NUMA node. For service
processes you can use ChangeServiceConfig2() and SERVICE_CONFIG_PREFERRED_NODE
to tell the Service Control Manager to start your service on your preferred
node. Recently a client has been complaining that one of their services that uses
this functionality has been having problems after a machine restart. The service
worked fine if it was installed and then run, always starting on the correct node,
but failed after the machine was rebooted.
The failure mode was strange. The SCM stores persistent service configuration in
the registry. The registry for the service in question contained the expected “PreferredNode”
entry of the correct type and with the correct value but the SCM was not using it.
When our service starts it does some sanity checks to ensure that all of the NUMA settings are as expected. We do this because if these settings are incorrect then performance suffers, and people complain. To allow these sanity checks to work we need to know how we configured the service, so we call QueryServiceConfig2() to retrieve the value that we set for SERVICE_CONFIG_PREFERRED_NODE when we installed our service. We then check that various things are running on that node and that key pieces of memory have been allocated on that node.
In this case we never got as far as the sanity checks because the call to QueryServiceConfig2() itself was failing with error 87, ERROR_INVALID_PARAMETER. We weren’t passing an invalid parameter. The same code worked just fine as long as the machine had not been rebooted between the service installing itself, which calls ChangeServiceConfig2() with SERVICE_CONFIG_PREFERRED_NODE, and a later, separate run of the service calling QueryServiceConfig2() to retrieve the value. The persisted registry configuration was correct.
We then switched to reading the “PreferredNode” value directly from the registry ourselves. This worked, but then the sanity checks failed, correctly, whenever the SCM had started our service on the wrong node. That clearly showed that it wasn’t just querying the value that was broken; the SCM also failed to use the value when starting the service.
Testing this code on my very old NUMA test box showed that it worked just fine. The old box runs Windows Server 2012 R2. Testing on a Windows Server 2016 box worked fine. The code failed on Windows Server 2019 (and, presumably, Server 2022). I’m pretty sure that the code HAS worked on Windows Server 2019 in the past, so it may be an update that has changed the behaviour.
It seems we’re not the only ones suffering from this problem, though my searches haven’t turned up that many hits. However, this and this look like the same issue.
I’ve worked around the problem with an extra level of indirection. The service that we need to run on a preferred node now depends on a second, bootstrap, service. The bootstrap service scans the registry for our main service’s entries, reads the “PreferredNode” setting and calls ChangeServiceConfig2() for the main service to configure the SCM to use the setting. The main service then starts and the SCM uses the correct node (because it works if there’s no reboot between setting and requesting the values).
It took quite a while to get here. The solution appears to be robust and our installation and removal process has been changed to ensure that it’s all set up correctly on the operating systems that need it. It’s a horrible hack though.
Using Process Monitor to create a boot log of system activity shows that services.exe looks for and, in the case of our service, reads the PreferredNode registry setting correctly. It also writes the value when our bootstrap service reconfigures the service. It never reads it again, but that’s consistent with it having a cache. So the problem here is somewhere in services.exe between where it reads the value and where it caches it. When using the ChangeServiceConfig2() API the internal cache is updated correctly; when reading from the registry at boot time the cache is not updated correctly and the SCM returns ERROR_INVALID_PARAMETER when queried.
This kind of problem solving and detective work gets in the way of the “real” work, but I actually quite enjoy it…