More on Spam

This is a follow-up to my previous post on spam blocking. Here I’ll detail my setup (pretty straightforward) and give some idea of the problems I had.

Stefan suggests I try Mozilla 1.3. I did know about it but I wanted to move my email delivery away from an all-in-one approach to one where I can control the processing more easily. If I think there is a better alternative to bogofilter, I want to be able to swap it in. If I want to add additional filtering, I want to be able to do that. Anyway, I think kMail is superior to Mozilla-Mail. I have also been experiencing crashes when browsing with Mozilla 1.2.1 and 1.3. It’s annoying when these take out your email too.

My system is a standard RedHat 8.0 system. I started by installing fetchmail. I’ve had some problems with fetchmail, which I’ll detail further below. Configuring fetchmail is pretty easy. Fetchmail retrieves mail from my POP server and forwards it on to the local SMTP port. Sendmail picks this email up and forwards it to procmail. This is all standard on RedHat and I did not have to change any config there.

To invoke bogofilter, I needed a .procmailrc file. Stripped of its comments, this file looks like this:

PMDIR=$HOME/.procmail
LOGFILE=$PMDIR/pmlog
MAILDIR=$HOME/.procmail       # Make sure this directory exists!
INCLUDERC=$PMDIR/rc.bogo
INCLUDERC=$PMDIR/rc.main

The rc.bogo file contains procmail rules to run bogofilter. It looks like this:

:0fw
| bogofilter -2 -u -e -p

# if bogofilter failed, return the mail to the queue, the MTA will
# retry to deliver it later
# 75 is the value for EX_TEMPFAIL in /usr/include/sysexits.h
:0e
{ EXITCODE=75 HOST }

# file the mail to spam-bogofilter if it's spam.
:0:
* ^X-Bogosity: Yes, tests=bogofilter
spam-bogofilter

This is pretty much straight from the bogofilter docs. It runs bogofilter and causes it to classify mail as spam or not. It also updates the word lists (good and spam). Mail that is determined to be spam is placed into the mbox file “spam-bogofilter”.

The rc.main file is:

:0:
inbox

This just transfers mail from my system spool file to a file, “inbox”, in the .procmail directory. The next step is to configure kMail to pick up its email from these mbox files. Each mbox file, the inbox and spam files, is added as a receiving source with procmail locking. There are a lot of warnings in the documentation about locking mail files in kMail. It is important that the procmail rules use lock files, which the above do. kMail can send each receiving source into a separate folder, which is what I do.

Fetchmail problems.

The problems I have had with fetchmail come about mostly because I access my POP mailbox from both my work system and my home system. For this setup to work, I leave mail on the server when it is downloaded, since I like to have access to complete threads on each of my systems. Unfortunately, I found that fetchmail would not download all mail on both systems. It seems to decide that some mail has already been read and does not download it. If I direct it to fetch all mail, it duplicates some email even though I have asked it to use UIDL to avoid that.
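
The UIDL bookkeeping involved here can be sketched in a few lines. This is only an illustration in Python, not fetchmail's actual logic: each client keeps its own set of unique IDs it has already fetched and compares that set against the server's UIDL listing. The function name `unseen_messages` and the sample IDs are made up for the example. The listing format (one `b"<number> <unique-id>"` entry per message) is what Python's `poplib.POP3.uidl()` returns as the second element of its result.

```python
def unseen_messages(uidl_lines, seen_uids):
    """Return (message_number, uid) pairs this client has not fetched yet.

    uidl_lines: entries such as b"1 abc123", as found in the second element
                of the tuple returned by poplib.POP3.uidl().
    seen_uids:  set of unique-id strings this client has already downloaded.
    """
    fresh = []
    for line in uidl_lines:
        # Each entry is "<message number> <unique id>".
        number, uid = line.decode("ascii").split(None, 1)
        if uid not in seen_uids:
            fresh.append((int(number), uid))
    return fresh


# Hypothetical listing: the server holds three messages and this
# machine has already downloaded the first two.
listing = [b"1 aaa111", b"2 bbb222", b"3 ccc333"]
seen_here = {"aaa111", "bbb222"}
new = unseen_messages(listing, seen_here)
```

Because the set of seen IDs lives on each machine rather than on the server, leaving mail on the server should not stop a second machine from fetching the same messages, and no machine should fetch a message twice.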

Also, I need to clean up the old mail to stop my POP account from overflowing. I do that by manually flushing the account. I miss the Outlook feature that allows you to delete messages that are over a certain age. According to the fetchmail FAQ, this feature is often requested but won’t be implemented, as it is apparently out of fetchmail’s scope.
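
An age-based cleanup rule of this kind could look roughly like the sketch below. It is an assumption-laden illustration in Python, not how Outlook or any particular tool does it: it assumes the message's `Date` header is a reasonable stand-in for its age, and the name `is_expired` and the 30-day limit are hypothetical. A real tool would go on to issue a POP3 `DELE` for each expired message.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_expired(date_header, now, max_age_days):
    """Return True if the Date header is more than max_age_days before now."""
    sent = parsedate_to_datetime(date_header)
    if sent.tzinfo is None:
        # A "-0000" zone yields a naive datetime; treat it as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    return now - sent > timedelta(days=max_age_days)


# Hypothetical example: two messages checked in mid-April with a 30-day limit.
now = datetime(2003, 4, 15, 12, 0, tzinfo=timezone.utc)
old = is_expired("Mon, 03 Mar 2003 10:00:00 +0000", now, max_age_days=30)
recent = is_expired("Mon, 14 Apr 2003 10:00:00 +0000", now, max_age_days=30)
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the rule easy to check against known dates.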

Faced with these problems, I have written a small JavaMail-based utility to perform the role of fetchmail with a few added features. I called it “popper” and it seems to be working quite nicely. I’ll blog about that in a few days’ time.

So, my email system now works as follows. Popper pops the mail from my POP servers and passes it to my local SMTP server, which is sendmail. Sendmail invokes procmail. Procmail invokes bogofilter to classify the email and finally kMail fetches the processed email and allows me to at last read it. It probably sounds more complicated than it is but it’s better than wading through spam.
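
The first hop in that chain, from the POP server to the local SMTP server, can be sketched as follows. This is not the popper utility itself (that one is written with JavaMail) but a rough Python equivalent of the idea, using the standard `poplib` and `smtplib` call shapes. The function name `forward_message`, the host names and the addresses are all placeholders.

```python
def forward_message(pop, smtp, number, sender, recipient):
    """Fetch one message over POP3 and hand it to an SMTP server.

    pop:  an object with poplib.POP3's retr() method.
    smtp: an object with smtplib.SMTP's sendmail() method.
    Returns the size in bytes of the forwarded message.
    """
    # retr() returns (response, list of lines without line endings, octets).
    _response, lines, _octets = pop.retr(number)
    raw = b"\r\n".join(lines) + b"\r\n"
    smtp.sendmail(sender, [recipient], raw)
    return len(raw)


# Usage against real servers (hosts and credentials are placeholders):
#   import poplib, smtplib
#   pop = poplib.POP3("pop.example.com")
#   pop.user("me")
#   pop.pass_("secret")
#   smtp = smtplib.SMTP("localhost")
#   forward_message(pop, smtp, 1, "me@example.com", "me@localhost")
```

Everything after that hop (sendmail, procmail, bogofilter) is unchanged, since the local SMTP server cannot tell which program handed it the message.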

2 Replies to “More on Spam”

  1. I didn’t mean to suggest you should use Mozilla 1.3,
    I wouldn’t read my mail with Mozilla either 😎

    My setup is quite similar to yours – well, in a way –
    and I’ve gotten around the fetchmail problem by not
    leaving mail on the server but “rsync”ing my mail
    between home and work.

    Writing a fetchmail replacement tailored to my needs
    would have been option 2, I guess.

  2. Thanks for the detailed information, Conor. I also have a setup similar to yours, except that I store my mail on my own server and access it via IMAP.

    This way I use both Microsoft Outlook (when I have my laptop) and SquirrelMail (when I have a browser only) to read my e-mail. I do not have any sync problems, as this is the problem IMAP was set up to solve.

    The other advantage is that by using procmail to sort my e-mail, I have my sorting rules applied on the server. This is pure magic.
