09 May 2007

The two tail problem

A and B are rarely separated by a clear, unambiguous line. Statisticians refer to this as the two tail experiment. Any attempt to separate A and B into two categories automatically will actually result in four groups:
  • correctly categorised as A (positives)
  • incorrectly categorised as A (false positives)
  • correctly categorised as B (negatives)
  • incorrectly categorised as B (false negatives)
In most real situations there isn't enough cash to reduce the number of false positives and false negatives to zero, so the designer has to decide which kind of false result to accept more of. Over the weekend I had an opportunity to review how one of my email providers had set the limits on their junk mail filters.
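The trade-off can be made concrete with a small sketch. It assumes, purely for illustration, that a filter gives each message a junk score between 0 and 1 and flags anything at or above a threshold. The scores, the labels and the `tally` function are all invented here and are not taken from any real mail provider.

```python
from collections import Counter

def tally(scored_messages, threshold):
    """Count the four outcome groups for a given junk-score threshold.

    scored_messages: iterable of (score, is_junk) pairs (hypothetical data).
    A message is flagged as junk when its score is >= threshold.
    """
    counts = Counter()
    for score, is_junk in scored_messages:
        flagged = score >= threshold
        if flagged and is_junk:
            counts["positive"] += 1          # correctly categorised as junk
        elif flagged and not is_junk:
            counts["false_positive"] += 1    # real mail put in the junk box
        elif not flagged and not is_junk:
            counts["negative"] += 1          # correctly left in the inbox
        else:
            counts["false_negative"] += 1    # junk that reached the inbox
    return counts

# Invented scores: (junk score, whether the message really is junk)
SAMPLE = [(0.95, True), (0.80, True), (0.60, False),
          (0.55, True), (0.30, False), (0.10, False)]

aggressive = tally(SAMPLE, threshold=0.5)   # catches more junk, risks real mail
cautious = tally(SAMPLE, threshold=0.9)     # protects real mail, lets junk through

print("aggressive:", dict(aggressive))
print("cautious:  ", dict(cautious))
```

Moving the threshold does not remove the errors, it only shifts them from one group to the other, which is the decision the designer has to make.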

Like most people I get junk mail. Like most people I moan a bit about it but generally ignore it. Over the weekend one of my email providers had a significant problem and in only 4 days 87 megabytes of junk mail was sent into my junk mail box - almost 3000 messages - most of which were delivered last Friday and Saturday. That's a pretty unusual blip and since then it seems to be running at a more normal 1 megabyte per day.

Clearly, this email provider had attempted to reduce the level of false positives (emails incorrectly categorised as junk) at the expense of a higher level of false negatives (emails put into my normal inbox which were actually junk). I reviewed the junk mail box to check whether there was anything in there worth keeping and found 4 false positives. I was able to sort the junk in a different way than my email provider and immediately throw out everything from mailer daemons, postmasters, mail delivery systems, mail delivery subsystems and so on, which meant that in the end I needed to review only a handful of messages to find the 4, which have now been correctly classified in their filter system.
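The second pass I did by hand could be sketched roughly as follows. This is only an illustration: the sender strings, the message layout (a dict with a "from" field) and the `needs_review` function are assumptions of mine, not anything my provider exposes.

```python
# Sender fragments that mark automated bounce traffic (illustrative list only).
AUTOMATED_SENDERS = (
    "mailer-daemon",
    "postmaster",
    "mail delivery system",
    "mail delivery subsystem",
)

def needs_review(messages):
    """Return the messages whose sender does not look like an automated bounce."""
    return [
        msg for msg in messages
        if not any(tag in msg["from"].lower() for tag in AUTOMATED_SENDERS)
    ]

# Invented junk box contents
junk_box = [
    {"from": "MAILER-DAEMON@example.net", "subject": "Undelivered mail"},
    {"from": "Postmaster <postmaster@example.org>", "subject": "Delivery failure"},
    {"from": "Mail Delivery Subsystem <bounce@example.com>", "subject": "Returned mail"},
    {"from": "A Friend <friend@example.com>", "subject": "Lunch next week?"},
]

to_check = needs_review(junk_box)
print(len(to_check), "of", len(junk_box), "messages left to read by hand")
```

Anything that survives this cut still has to be read by a person; the point is only to shrink the pile before doing so.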

No system is perfect and we need to keep checking that the results we are offered by these types of system are actually correct.