Dogfood

2004-04-19

I’ve been running my main mail feeds through my POP3 code for several weeks now. All my email gets pulled from the POP3 servers into my home-brew email system, where it runs through my hardcoded filters, which split “bad” mail from mail that might be good. Finally, Outlook connects to my mail server and pulls the email from my system back into production-quality code land…

Occasionally I see ‘interesting things’, and these either become the focus of a swift retesting session or find their way into FogBugz…

Currently the most interesting unresolved issue appears to be a weirdness with Norton Anti Virus’ incoming email scanning. I have an email account that is now such a permanent spamfest that I never read anything that comes in via it; it’s a great source of test emails though… Sometimes this one account gets a mail that Norton AV barfs on. It barfs in such a way that it simply closes the connection to the client; this ‘feature’ exposes an annoying bug in my naive mail collection strategy…

Norton AV’s email scanning is probably implemented as a Winsock LSP. It intercepts socket traffic, works out that it’s POP3 traffic and does ‘clever stuff’™ to the bodies of the emails that you download. Unless you use SSL, that is, when it can’t, because it can’t ‘see’ the POP3 protocol for the SSL…

My mail system acts as a client to external POP servers. So that I can test both the SSL and non-SSL code paths, I run Outlook as an SSL client against my server for some accounts and as a non-SSL client for others. To make sure I get one dose of Norton on each email pipeline, I run the mail gatherer as non-SSL on the SSL client accounts and as SSL on the non-SSL client accounts…

A couple of emails a week on one particular POP3 account cause Norton to give up the ghost and close the client’s connection. I’ve yet to be interested enough to work out exactly what the root cause is, but…

Currently my rather naive email collection strategy is this: connect to the server, issue a STAT command to find out how many mails are waiting, then for(size_t i = 1; i <= resultOfSTAT; ++i) retrieve the email at index i using RETR i. If there’s an error, go bang, catch the banger, clean up and try again a bit later… When Norton decides it doesn’t like a particular email, my system blocks up as it continually fails to collect the message at position i and then waits and tries again later. Until the bad mail at i is downloaded successfully the system won’t bother to try and get any later emails; bugger!
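To make the blocking behaviour concrete, here is a minimal sketch of the naive loop. It is an illustration only, not the real collection code: FakePop3Session, poisonNumber and CollectNaively are hypothetical names, and the "server" is just an in-memory vector in which one message number always kills the connection.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a POP3 session; not a real client.
struct FakePop3Session
{
   std::vector<std::string> messages;   // message n lives at messages[n - 1]
   std::size_t poisonNumber;            // RETR of this number "drops the connection"; 0 = none

   // Mimics STAT: how many messages are waiting.
   std::size_t Stat() const { return messages.size(); }

   // Mimics RETR n, where n is a 1-based POP3 message number.
   std::string Retr(std::size_t n) const
   {
      assert(n >= 1 && n <= messages.size());
      if (n == poisonNumber)
      {
         throw std::runtime_error("connection closed by peer");
      }
      return messages[n - 1];
   }
};

// One pass of the naive strategy: STAT, then RETR 1..count, giving up on the
// first error. Returns whatever was collected before the failure.
std::vector<std::string> CollectNaively(const FakePop3Session &session)
{
   std::vector<std::string> collected;
   try
   {
      const std::size_t count = session.Stat();
      for (std::size_t i = 1; i <= count; ++i)
      {
         collected.push_back(session.Retr(i));
      }
   }
   catch (const std::runtime_error &)
   {
      // "go bang, catch the banger": abandon this pass and try again later
   }
   return collected;
}
```

With four messages and message 3 poisoned, every pass stops after message 2, so message 4 is never reached however many times the collection is retried.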

Outlook does it a little more sensibly. The POP3 protocol exposes an ‘optional’ command called UIDL that provides ‘a unique ID listing’ for each message. Outlook grabs the UIDL for all available messages and then downloads the ones that it hasn’t flagged as downloaded before; this makes it more robust to failures that might otherwise result in duplicate message downloads and also allows it to work around ‘broken emails’ like the one that causes Norton AV to die… It seems that once Outlook starts to download a message and issues a RETR it flags the UIDL as downloaded. When Norton craps out, the session fails but Outlook never tries to download that message again, so it sidesteps the problem, whilst leaving the poison message on the server…
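A rough sketch of that kind of UIDL-aware pass might look like the following. It assumes the behaviour guessed at above (flag the unique ID before issuing the RETR) and is not Outlook’s actual implementation; FakeMailbox, poisonUid and CollectByUidl are hypothetical names, and the mailbox is an in-memory stand-in rather than a real POP3 connection.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for a POP3 mailbox; not a real client.
struct FakeMailbox
{
   // One (unique id, body) pair per message; message n lives at messages[n - 1].
   std::vector<std::pair<std::string, std::string>> messages;
   std::string poisonUid;   // RETR of the message with this id "drops the connection"

   // Mimics UIDL with no argument: the unique id of every message, in order.
   std::vector<std::string> Uidl() const
   {
      std::vector<std::string> uids;
      for (const auto &message : messages)
      {
         uids.push_back(message.first);
      }
      return uids;
   }

   // Mimics RETR n, where n is a 1-based POP3 message number.
   std::string Retr(std::size_t n) const
   {
      assert(n >= 1 && n <= messages.size());
      if (messages[n - 1].first == poisonUid)
      {
         throw std::runtime_error("connection closed by peer");
      }
      return messages[n - 1].second;
   }
};

// One collection pass. 'attempted' must persist between passes.
std::vector<std::string> CollectByUidl(
   const FakeMailbox &mailbox,
   std::set<std::string> &attempted)
{
   std::vector<std::string> collected;
   try
   {
      const std::vector<std::string> uids = mailbox.Uidl();
      for (std::size_t i = 0; i < uids.size(); ++i)
      {
         // insert() reports whether the id was new; flag it BEFORE the RETR
         if (attempted.insert(uids[i]).second)
         {
            collected.push_back(mailbox.Retr(i + 1));
         }
      }
   }
   catch (const std::runtime_error &)
   {
      // Session lost. The poison id is already flagged, so the next pass skips it.
   }
   return collected;
}
```

The first pass still dies on the poison message, but the next pass skips it and carries on with the later mail. The trade-off of flagging before the RETR is that a message whose download fails for a purely transient reason is never retried either.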

Using a UIDL-aware message download strategy is pencilled in for a future version of my code, but for now, HTTP-based configuration remains in the number one new feature slot.