More on the socket server leaks

2006-01-21

It seems that the leak that I found isn’t likely to be the one that my new user of the code is having problems with. They’ve built the code with Visual Studio 2005, made a few (correct) changes to the code to get it to build with the stricter compiler and they find that several of the servers leak; at first it looked like it was just the more complex servers but I’ve just tested the simplest echo server and that leaks in just the same way. They’re obvious (big) leaks. Unfortunately, though the leak itself is obvious, the cause of the leak is not. I’m very busy with client work at the moment so I can’t promise that I’ll post a fix for this free code for at least a couple of weeks. But I guess the good news is that I will be posting a new version of the socket server code shortly and it will build (and not leak) with Visual Studio 2005.

My gut instinct is that it’s an STL thing…

[Update: My gut instinct seemed to be right… This seems to be due to a bug in VS2005’s STL. See here for a workaround.]

And now, more about the leak that I …

The ThreadPoolLargePacketEchoServer is, as I mentioned yesterday, the most contrived of the server samples. It presents a server that runs with IO buffers of one, small, size and messages that are much, much bigger. The server keeps a single buffer of a larger size per connection and accumulates the message into the large buffer until it has a complete message and at that point it sends it across to a second thread pool to “process” it. It’s contrived and I’ve never put such a design into production but it demonstrates several techniques to keep the completion of multiple pending reads in sequence.

There are, at least, 3 issues with the code as it stands. Firstly, as mentioned in the comments on the original article, there’s a server event handler missing for when sockets are released. This is called by the framework when all activity has completed on the socket and the socket has been closed and it about to be released to the pool (or destroyed). It’s the place where “per socket” data can be safely cleaned up and, well, the code as it stands doesn’t bother to clean it up.

Add this:

void CSocketServer::OnSocketReleased(
   Socket *pSocket)
{
   Output(_T("OnSocketReleased"));
  
   CPerConnectionData *pPerConnectionData = GetPerConnectionData(pSocket);
  
   SetPerConnectionData(pSocket, 0);
  
   delete pPerConnectionData;
  
   m_pool.OnSocketReleased(pSocket);
}

Next we have the code that keeps the buffers in sequence. I’ve never really liked this code. The latest version is a bit better than this old version but it’s still fairly crufty. Anyway, it needs a destructor in case it’s ever destroyed whilst containing buffers - this is unlikely but it could happen.

CIOBuffer::InOrderBufferList::~InOrderBufferList()
{
   for (BufferSequence::iterator it = m_list.begin(); it != m_list.end(); ++it)
   {
      CIOBuffer *pBuffer = it->second;
  
      pBuffer->Release();
   }
}

Finally the leak that I was looking for yesterday. The code that processes the buffers in the “business logic” thread pool jumps through all manner of hoops to try and deal with the fact that the old code’s design was fairly poor at handling writes to sockets that had been closed. Given how the test harness that ships with the code works there’s a real chance that the server will be in the middle of echoing a message back to the test harness when the test decides that all is done and closes the socket. Yes the test harness should behave better but the code in question in the server should also deal with the situation better. Right now it tries to but, if the socket closure happens at just the right time it will leak the current data buffer.

Add this:

   if (pBuffer)
   {
      pBuffer->Release();
   }

right at the end of void CThreadPoolWorkerThread::ProcessMessage().

To be quite honest, I’d be surprised to find an architecture where this particular example server would be useful. It’s an interesting demonstration of some gnarly problems but it’s not code to just take and use without understanding what it’s trying to demonstrate.

I’ll post a complete new code drop for these servers once I get a chance to switch them to VS2005 and test them. Note that this will not be the all new, redesigned, refactored, much more performant version of the code that I sell to my clients. It’ll be a minor fix to the existing code.