Async Pop

2004-03-25

A while back I finally started on the async version of the POP3 client. It ended up as a state machine and seemed to work well. The next step was to write an async version of the mail collector that I use to pull all of the mail from a mailbox on a server and, optionally, delete it.

The synchronous version of the mail collector is fairly simple. You pass it the mailbox and server details and it decides how to log in: it uses APOP if the server supports it and USER/PASS if it doesn’t. Once logged in, it issues a STAT to find out how much mail is waiting and then calls RETR and, optionally, DELE for each message; simple stuff. The async version works in a similar way but uses the async client; it has its own state machine and operates at a level above the state machine in the POP3 client. So you just pass it the connection details, tell it to start and push all of the data that comes in on the connection into it, and it reports on its current state as the state machine collects mail. Works nicely and uses much less processor time than the sync version. :)
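To make the push-driven idea concrete, here is a minimal sketch in Python rather than the language of the real code. It is not the actual collector: the `Collector` class, its state names and the `send` callback are all made up for illustration, and it folds the POP3 client and the collector into a single state machine for brevity, whereas the real design keeps them as two layers. It assumes that a server advertises APOP by including a `<timestamp>` in its greeting, as RFC 1939 describes.

```python
import hashlib
import re


class Collector:
    """Push-driven POP3 mail collector (hypothetical sketch, not the real code)."""

    def __init__(self, user, password, send, delete=False):
        self.user, self.password = user, password
        self.send = send        # callback that writes bytes to the connection
        self.delete = delete    # issue DELE after each RETR when True
        self.state = "GREETING"
        self.buffer = b""       # bytes received but not yet split into lines
        self.messages = []      # collected message bodies
        self.count = 0          # number of messages reported by STAT
        self.current = 0        # number of the message being processed
        self.body = []          # lines of the message being retrieved

    def feed(self, data):
        """Push bytes read from the connection; handle every complete line."""
        self.buffer += data
        while b"\r\n" in self.buffer and self.state not in ("DONE", "FAILED"):
            line, self.buffer = self.buffer.split(b"\r\n", 1)
            self._on_line(line)

    def _command(self, text, next_state):
        self.send(text.encode("ascii") + b"\r\n")
        self.state = next_state

    def _on_line(self, line):
        if self.state == "RETR_BODY":
            self._on_body_line(line)
        elif not line.startswith(b"+OK"):
            self.state = "FAILED"
        elif self.state == "GREETING":
            stamp = re.search(rb"<[^>]+>", line)
            if stamp:  # timestamp in the greeting: the server supports APOP
                secret = stamp.group(0) + self.password.encode()
                digest = hashlib.md5(secret).hexdigest()
                self._command(f"APOP {self.user} {digest}", "AUTH")
            else:      # fall back to USER/PASS
                self._command(f"USER {self.user}", "USER")
        elif self.state == "USER":
            self._command(f"PASS {self.password}", "AUTH")
        elif self.state == "AUTH":
            self._command("STAT", "STAT")
        elif self.state == "STAT":
            self.count = int(line.split()[1])  # "+OK <count> <octets>"
            self._next_message()
        elif self.state == "RETR":
            self.state = "RETR_BODY"
        elif self.state == "DELE":
            self._next_message()
        elif self.state == "QUIT":
            self.state = "DONE"

    def _on_body_line(self, line):
        if line == b".":  # a lone dot terminates the multi-line response
            self.messages.append(b"\r\n".join(self.body))
            self.body = []
            if self.delete:
                self._command(f"DELE {self.current}", "DELE")
            else:
                self._next_message()
        else:
            # Undo byte-stuffing: a leading dot on a body line was doubled.
            self.body.append(line[1:] if line.startswith(b".") else line)

    def _next_message(self):
        self.current += 1
        if self.current <= self.count:
            self._command(f"RETR {self.current}", "RETR")
        else:
            self._command("QUIT", "QUIT")
```

The caller owns the socket: whenever a read completes it calls `feed()` with whatever bytes arrived, however they happen to be chunked, and inspects `state` and `messages` afterwards. Nothing in the sketch ever blocks waiting for the network.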

This was all done test first, without any network connectivity required. The tests simply mocked up the input and output streams, pushed data into the collector and watched the cogs turn. Once the tests were passing I integrated the code into a real client app which used the newly refactored server framework’s IOCP-based client connection handler to deal with the network. Once started, the collector only runs in response to an IOCP read completion packet on the socket: data arrives, it gets fed into the collector, the collector passes it to the client, the client’s state changes as commands complete and the collector sends more commands. What was nice was that I didn’t need to bend the code at all during integration. Of course, I know the domain fairly well now, but it was nice to see that the tests hadn’t taken me off in the wrong direction…
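A test in this style might look something like the sketch below, written in Python for brevity. Everything in it is hypothetical: `TinyClient` is a deliberately trivial stand-in for the code under test, and `run_script` is an invented helper that plays a canned server conversation against it. The output stream is just a list that records what was written, and the input stream is whatever the test chooses to push in.

```python
class TinyClient:
    """Trivial stand-in for the code under test: greeting, STAT, QUIT."""

    def __init__(self, send):
        self.send = send      # the mocked output stream
        self.state = "GREETING"
        self.waiting = None   # message count reported by STAT

    def feed(self, data):
        # Assumes one whole response line per call, to keep the sketch short.
        line = data.decode("ascii").strip()
        if not line.startswith("+OK"):
            self.state = "FAILED"
        elif self.state == "GREETING":
            self.send(b"STAT\r\n")
            self.state = "STAT"
        elif self.state == "STAT":
            self.waiting = int(line.split()[1])
            self.send(b"QUIT\r\n")
            self.state = "QUIT"
        elif self.state == "QUIT":
            self.state = "DONE"


def run_script(client, outbox, greeting, script):
    """Play a canned conversation: check each command, then push the reply."""
    client.feed(greeting)
    for expected_command, reply in script:
        assert outbox.pop(0) == expected_command
        client.feed(reply)
    assert outbox == []  # the client sent nothing we did not expect


def test_collects_message_count():
    outbox = []  # records everything the client writes
    client = TinyClient(outbox.append)
    run_script(client, outbox, b"+OK ready\r\n", [
        (b"STAT\r\n", b"+OK 3 120\r\n"),
        (b"QUIT\r\n", b"+OK bye\r\n"),
    ])
    assert client.state == "DONE"
    assert client.waiting == 3
```

Because the client only ever sees bytes pushed into `feed()` and a callback to write to, swapping the list for a real connection later does not need any change to the client itself.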

With the async client running fine as a standalone application, I decided to merge the client, filter chain and server into one application for ease of management. Now that the client used the same networking design as the server, they could share IO threads; the structure of the program was cleaner and it ran more efficiently. I hard-coded some of my mailboxes and set up the client to kick off the collections so that they would run and then queue themselves to run again after a certain period of time. When a collection completes, the filter chain for that mailbox runs. The server runs independently and allows local mail clients to access the filtered mail.
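The "run, then queue yourself to run again" arrangement can be sketched with a small timer queue. This is only an illustration in Python under some assumptions: `TimerQueue` and `make_collection` are invented names, time is passed in explicitly so the behaviour is easy to test, and the collection itself is a synchronous stand-in that just logs, whereas the real one re-queues itself when the async collection completes.

```python
import heapq


class TimerQueue:
    """Minimal timer queue: callbacks ordered by the time they fall due."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so equal due times never compare callbacks

    def schedule(self, due, callback):
        heapq.heappush(self._heap, (due, self._seq, callback))
        self._seq += 1

    def run_due(self, now):
        """Run every callback whose due time has passed."""
        while self._heap and self._heap[0][0] <= now:
            _, _, callback = heapq.heappop(self._heap)
            callback(now)


def make_collection(name, interval, timers, log):
    """Build a stand-in collection that re-queues itself after each run."""
    def collect(now):
        log.append((now, name))                   # stand-in for the real work
        timers.schedule(now + interval, collect)  # interval must be positive
    return collect


timers, log = TimerQueue(), []
timers.schedule(0, make_collection("home", 60, timers, log))
timers.schedule(0, make_collection("work", 90, timers, log))
for tick in (0, 60, 100):  # pretend the clock reads 0, 60 and 100 seconds
    timers.run_due(tick)
```

Each mailbox keeps its own interval, and a collection that runs late simply schedules its next run relative to when it actually ran.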

There are a couple of low-priority items on my list at present which are more technical challenges than actual requirements. The message store currently interacts with the file system using blocking reads and writes, and I’d prefer to adjust it so that it uses IOCP for file I/O as well… The filter chain should also be running in its own thread pool rather than on the IO threads. A dirty desire for instant gratification took me down a route where I simply run the filter chain when the collector finishes, which means it runs in the context of the same completion notification as the collector, and it shouldn’t; filtering blocks an IO thread for a length of time, so ideally we’d fire off a request to the other thread pool to filter the mailbox. These will, no doubt, wait as what I have “works”…
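The shape of the fix for the filter chain is roughly the following, sketched in Python with the standard `concurrent.futures.ThreadPoolExecutor` standing in for a dedicated filter thread pool. The function names, and the idea of a filter as a simple predicate over a message, are assumptions made for the example and not how the real filters are written.

```python
from concurrent.futures import ThreadPoolExecutor


def run_filter_chain(messages, filters):
    """Keep only the messages that every filter accepts (potentially slow)."""
    return [m for m in messages if all(accept(m) for accept in filters)]


def on_collection_complete(messages, filters, pool):
    """Called on an IO thread: queue the filtering and return at once."""
    return pool.submit(run_filter_chain, messages, filters)


# A separate pool, so slow filters never hold up an IO thread.
filter_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="filter")

future = on_collection_complete(
    ["hello", "buy now", "lunch?"],
    [lambda m: "buy" not in m],  # hypothetical spam filter
    filter_pool,
)
kept = future.result(timeout=5)  # only the example waits; an IO thread would not
filter_pool.shutdown()
```

The IO thread's only job on completion is the `submit` call, which returns immediately; the filtered result is picked up later from the future or from a callback attached to it.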

The actual requirements are more interesting; top of the list is to remove the hard-coded configuration. At present the system is set up to run for four of my POP3 accounts and filter them in a particular hard-coded way. This needs to become configurable, and once it is we need a way to change the configuration. I expect we’ll plug a simple web server into the app and use that for configuration… Meanwhile I could work on the filters some more; I have more filters to write, and the framework as a whole is now working pretty well and is getting some decent testing (I still haven’t switched my main email accounts over yet though…)