One of the first things that I tried to do with the latest release of my TickShifter tool was to run it on itself. Of course, that didn’t work.
There are several reasons why the tool might have had problems running on itself, but I thought I’d addressed most of them when I dealt with the issues around getting multiple copies of the tool running at the same time. The problems there were mostly around the names that I used for the named kernel objects that were required. The control program communicates with the dll that it injects so that it can control the dll’s operation and collect data, and this communication requires some named objects (pipes, shared memory, events, etc.) so that the dll can correctly hook up with the control program. Initially these names were fixed, which meant that only one instance of the tool could run at a time; adding the process ID of the running instance of the tool to the names fixed that little problem.
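As a rough illustration of the naming fix, here is a minimal sketch. The helper name, the separator and the example base name are all made up for this post's purposes and are not taken from the TickShifter source. On Windows the process ID would typically come from GetCurrentProcessId() in the control program, and the injected dll would need to be told which ID to use.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: builds a per-instance name for a named kernel object
// (pipe, shared memory section, event, ...) by appending the process ID of
// the controlling tool instance to a fixed base name.
std::string MakeInstanceObjectName(const std::string &baseName, std::uint32_t processId)
{
    return baseName + "-" + std::to_string(processId);
}
```

Two instances of the tool with different process IDs then end up with different object names, so they no longer collide.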
Updated 4th May 2023 to fix broken links
The tools all use API hooking to do their work. The control program, the bit of the tool that you see and interact with, injects a dll into the target program and this dll hooks specific APIs within the target program and communicates back with the control program. The API hooking code is pretty standard stuff, based on the design that’s straight out of Richter’s books. Part of this design ensures that when each new dll is loaded into the target process it’s hooked correctly. This requires that the code that does the hooking of your target APIs also hooks LoadLibrary() so that it can hook all newly loaded dlls and GetProcAddress() so that it can lie about function addresses to anyone that asks.
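The two extra hooks can be sketched with a small portable model. This is not real Win32 code and not the tool's implementation: addresses are plain integers, a module is reduced to its import table, and every name here (ModuleModel, HookManagerModel, the example API names) is hypothetical.

```cpp
#include <map>
#include <string>

// In this model a function "address" is just a number.
using Address = unsigned long long;

// A loaded module, reduced to its import table:
// imported API name -> address the module will call.
struct ModuleModel
{
    std::string name;
    std::map<std::string, Address> imports;
};

class HookManagerModel
{
public:
    // Registers a replacement address for an API.
    void AddHook(const std::string &api, Address replacement)
    {
        m_hooks[api] = replacement;
    }

    // Models the work done by the hooked LoadLibrary(): once a new module
    // has been loaded, patch its imports so that it calls the replacements.
    void OnModuleLoaded(ModuleModel &module) const
    {
        for (auto &entry : module.imports)
        {
            const auto hook = m_hooks.find(entry.first);
            if (hook != m_hooks.end())
            {
                entry.second = hook->second;
            }
        }
    }

    // Models the hooked GetProcAddress(): given the address that the real
    // lookup produced, return the replacement instead if the API is hooked.
    Address OnGetProcAddress(const std::string &api, Address realAddress) const
    {
        const auto hook = m_hooks.find(api);
        return hook != m_hooks.end() ? hook->second : realAddress;
    }

private:
    std::map<std::string, Address> m_hooks;
};
```

In the model, a module loaded after the hooks are installed gets its import table patched, and anyone asking for a hooked function by name is handed the replacement rather than the real address.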
The problem I had was that to inject the dll into the target process the control program needs to find out the address of LoadLibrary() in itself. Since LoadLibrary() is implemented in Kernel32.dll, and that’s always loaded in the same place in all processes, that address is the same in the target process. The address is used in the call to create a remote thread in the target process and this remote thread loads the dll into the target. My problem was that in the tool that was running under a tool, the address that I was getting back for LoadLibrary() was the address of the hooked function in the tool process and not the address of the real function in Kernel32. Since the hooked function only existed in the hooked tool and not in the target process, the target process died with an access violation when the remote thread tried to call it.
It’s a fairly obvious problem once you realise what’s going on. Of course getting to that realisation stage took a while. Debugging debuggers that inject code into target processes isn’t the simplest thing in the world and I ended up doing it the “old fashioned way” with OutputDebugString(). This also wasn’t as easy as it could have been because my tool, the control program, is a Win32 debugger and, as such, it receives the OutputDebugString() output from all of its child processes. So, to see that child process output I needed to have the debugger call OutputDebugString() on the strings that it was passed via the
Once I’d located the problem, solving it took a little while. The obvious solution is to simply provide the real address when required. Unfortunately, possibly due to my design, that’s a little harder than it should be. When I fixed the code so that it worked with the tool that had been hooked by another tool, the tool that wasn’t hooked stopped working… Eventually I figured it out and the latest, as yet unreleased, version of TickShifter can now shift the ticks on an instance of itself which can shift the ticks on an instance of itself, etc…
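I won't reproduce the actual fix here, but one way that "provide the real address when required" could look is sketched below as a portable model. It assumes that the hooking code remembers the original address whenever it installs a hook, which may or may not match how the real code does it. The class and method names are hypothetical and addresses are plain integers.

```cpp
#include <map>
#include <string>
#include <utility>

// In this model a function "address" is just a number.
using Address = unsigned long long;

class AddressResolverModel
{
public:
    // Starts with the addresses that an unhooked lookup would return.
    explicit AddressResolverModel(std::map<std::string, Address> exports)
        : m_visible(std::move(exports))
    {
    }

    // Installing a hook saves the original address and changes what
    // ordinary lookups see. Unknown names are ignored.
    void InstallHook(const std::string &api, Address hookAddress)
    {
        const auto it = m_visible.find(api);
        if (it == m_visible.end())
        {
            return;
        }
        // emplace() leaves an existing entry alone, so the first saved
        // original survives even if the API is hooked more than once.
        m_originals.emplace(api, it->second);
        it->second = hookAddress;
    }

    // What an ordinary caller sees; 0 if the name is unknown.
    Address Lookup(const std::string &api) const
    {
        const auto it = m_visible.find(api);
        return it != m_visible.end() ? it->second : 0;
    }

    // The address to hand to the remote thread: the saved original if the
    // API has been hooked, otherwise whatever the ordinary lookup returns.
    Address RealAddress(const std::string &api) const
    {
        const auto it = m_originals.find(api);
        return it != m_originals.end() ? it->second : Lookup(api);
    }

private:
    std::map<std::string, Address> m_visible;
    std::map<std::string, Address> m_originals;
};
```

The point of the fallback in RealAddress() is that the same call has to work in both situations: a tool that has been hooked by another tool and a tool that hasn't been hooked at all.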
This is handy as it means that I can now run my deadlock detection tool on the other tools whilst I’m developing them, should I need to do so!