Log file rotation

I’m in the process of adjusting the asynchronous file log that we use in some of our servers. The log works well and write performance is good for a variety of reasons that I deal with here. The current changes are mainly due to the fact that we have some clients who want to automatically create a new log file every so often whilst the server is running. This hasn’t been a requirement before because most of our servers don’t continually spew out log data. This is probably due to my feelings about debug trace logs, but we simply don’t have that many clients who actually run with this kind of logging turned on. However, we’ve had a couple of recent requests for this kind of thing.

Adding the ability to create new log files based on a time schedule isn’t especially hard, the problem comes from maintaining the performance that you gain from having the file written in an async manner. The first issue that I came across was that I couldn’t destroy the old log file until all of the outstanding writes to it had completed. Up until now the CAsyncFileWriter class exported a Wait() method which could be used to wait until any pending writes had completed. The destructor to the class called this, so a simple call to delete on an instance of the async file writer could take a while… It’s usually better to call Wait() explicitly before you do the delete to make your intentions clearer but even so, a simplistic file replacement strategy of “create new file, swap for old, delete old” would cause the thread that was doing the delete to wait for all outstanding writes to complete. On a busy log file this could be unacceptable.

The solution is to add a new callback method to the callback interface that users of the async file log currently use to be informed of errors during the async writes and other such stuff. By adding a method that is called when the last outstanding write occurs we can use that to trigger the deletion of the object. Of course the class that implements the callback interface needs to hang around long enough to do that but that’s fine for us. So, the way we replace the file log is as follows: “store old log file class in a set of ‘previously active logs’, set the old log file to callback on final write completion, create new file.”. The callback will occur on one of the log’s IO threads and when it does we can look in the set of previously active logs, remove the log and delete it. If the class that houses all of these logs is deleted whilst there are items in the previously active logs set then it can wait for them to finish writing and then delete them.

The next problem is that creating a new file and switching to it was a potentially time consuming operating which, again, could stall the actual logging. I’m currently investigating the performance implications, but I hope that we should be able to create the new log file fractionally ahead of time on a timer queue thread and then switch to the new log simply by exchanging pointers at the actual time that we want to change the log over…