I've been lazy this week

As I mentioned in an earlier posting, I’ve been working on a tool this week. I’m too lazy to do a job manually, so I decided to write a tool to help me do it…

Note: the deadlock detector mentioned in this blog post is now available for download from www.lockexplorer.com.

The tool is designed to help me track down deadlocks in code. I decided I needed this tool because I wrote a piece about debugging deadlocks in Visual C++ and realised that using trial and error to locate deadlocks in some client code simply wasn’t good enough. The trial and error thrash test required that the code under test actually ended up in a deadlock and, of course, the main problem with locating deadlocks is that they often never happen under test and always happen in production on slightly different hardware with slightly different scheduling. Thus the main aim for the tool was to reliably tell me if a program could deadlock, even if it didn’t deadlock during that particular run.

Yesterday evening I got to the point where the report coming out of the program could pinpoint potential deadlocks and today I used that information to locate and fix one of the issues in the code. Cool!

The part of the report that I used this morning was this. It shows two sequences of lock acquisition:

Sequence 1:       0x057f5634 @ 1960,  1:[0x0012facc @ 1970], 2:[0x0012fa5c @ 1995], 0x00571d60 @ 2004
Threads: 3904

Sequence 2:    2:[0x0012fa5c @ 2722],    0x057f4e74 @ 2723,  1:[0x0012facc @ 2729], 0x0012f948 @ 2738, 0x00571d60 @ 2741
Threads: 1920

Only one thread uses each of these sequences, but the sequences acquire the locks tagged as 1 and 2 in different orders. This means that if both threads are executing at the same time they can deadlock each other: thread 3904 acquires lock 1 and, before it can acquire lock 2, thread 1920 acquires lock 2 and then tries to acquire lock 1. Each thread then waits forever for the lock that the other one holds.
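To make the "different orders" check concrete, here is a minimal sketch of how two recorded sequences could be compared. It is not the tool's actual code; the names `LockId`, `LockSequence` and `findInversions` are made up for illustration, and it assumes that each sequence lists locks in the order they were acquired, with every earlier lock still held when the next one is taken.

```cpp
#include <algorithm>
#include <cassert>   // only needed if you want to assert on the results
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical types: a lock is identified by its address, as in the report.
using LockId = std::uintptr_t;
using LockSequence = std::vector<LockId>;

// Returns every pair (a, b) where `first` acquires a before b
// but `second` acquires b before a. Locks that appear in only
// one of the two sequences cannot form an inversion and are skipped.
std::vector<std::pair<LockId, LockId>> findInversions(
    const LockSequence &first,
    const LockSequence &second)
{
    std::vector<std::pair<LockId, LockId>> inversions;

    for (std::size_t i = 0; i < first.size(); ++i)
    {
        for (std::size_t j = i + 1; j < first.size(); ++j)
        {
            // Locate both locks in the other sequence.
            const auto posA = std::find(second.begin(), second.end(), first[i]);
            const auto posB = std::find(second.begin(), second.end(), first[j]);

            // An inversion exists if both are present and b comes before a.
            if (posA != second.end() && posB != second.end() && posB < posA)
            {
                inversions.emplace_back(first[i], first[j]);
            }
        }
    }

    return inversions;
}
```

Fed the two sequences from the report above, this returns a single pair, 0x0012facc and 0x0012fa5c, which are the locks tagged 1 and 2. The lock at 0x00571d60 also appears in both sequences, but it is acquired last in each, so it isn't flagged.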

The @ XXXX bits are code locations. They can be expanded like this:

Location: 1970
MTSCSS - I:\JetByteTools\Win32Tools\CriticalSection.cpp: 99 - Win32::CCriticalSection::Owner::Owner
MTSCSS - I:\MTSCSS\ConnectionCacheBase.cpp: 300 - CConnectionCacheBase::GetConnection
MTSCSS - I:\MTSCSS\ProtocolHandler.cpp: 486 - CProtocolHandler::Connect
MTSCSS - I:\MTSCSS\ProtocolHandler.cpp: 340 - CProtocolHandler::ProcessCommand
MTSCSS - I:\MTSCSS\ProtocolHandler.cpp: 262 - CProtocolHandler::OnLocalDataReceived
MTSCSS - I:\MTSCSS\ProtocolHandler.cpp: 162 - CProtocolHandler::OnDataReceived
MTSCSS - ..\JetByteTools\IOTools\TProtocolHandlerImpl.h: 114 - IO::TProtocolHandlerExImpl<1,IO::IProtocolHandler>::OnDataReceived
MTSCSS - I:\MTSCSS\SocketServer.cpp: 215 - CSocketServer::ReadCompleted
MTSCSS - I:\JetByteTools\SocketTools\AsyncSocketConnectionManager.cpp: 336 - Socket::CAsyncSocketConnectionManager::HandleOperation
MTSCSS - I:\JetByteTools\SocketTools\AsyncSocket.cpp: 582 - Socket::CAsyncSocket::HandleOperation
MTSCSS - I:\JetByteTools\IOTools\IOPool.cpp: 222 - IO::CIOPool::WorkerThread::Run
MTSCSS - I:\JetByteTools\Win32Tools\Thread.cpp: 149 - Win32::CThread::ThreadFunction
MTSCSS - threadex.c: 212 - _threadstartex

With this information I can locate the potential deadlock and work out how to fix it. The good thing about running the tool is that it alerted me to this potential deadlock without the deadlock ever actually having to happen, and this makes it considerably more useful, to me, than, say, John Robbins’ Deadlock Detection Tool.

It’s been quite hard work to get the tool to this state, and it’s not finished yet. When it’s finished it will be able to tell me if code will deadlock and I won’t have to look through all the code manually to try and work that out… Lazy is good! ;)