More on Spam

This is a follow-up to my previous blog on spam blocking. Here I’ll detail my setup (pretty straightforward) and give some idea of the problems I had.

Stefan suggests I try Mozilla 1.3. I did know about it but I wanted to move my email delivery away from an all-in-one approach to one where I can control the processing more easily. If I think there is a better alternative to bogofilter, I want to be able to swap it in. If I want to add additional filtering, I want to be able to do that. Anyway, I think kMail is superior to Mozilla-Mail. I have also been experiencing crashes when browsing with Mozilla 1.2.1 and 1.3. It’s annoying when these take out your email too.

My system is a standard RedHat 8.0 system. I started by installing fetchmail. I’ve had some problems with fetchmail, which I’ll detail further below. Configuring fetchmail is pretty easy. Fetchmail retrieves mail from my POP server and forwards it on to the local SMTP port. Sendmail picks this email up and forwards it to procmail. This is all standard on RedHat and I did not have to change any config there.

To invoke bogofilter, I needed a .procmailrc file. Stripped of its comments, this file looks like this:

PMDIR=$HOME/.procmail
LOGFILE=$PMDIR/pmlog
MAILDIR=$HOME/.procmail       # Make sure this directory exists!
INCLUDERC=$PMDIR/rc.bogo
INCLUDERC=$PMDIR/rc.main

The rc.bogo file contains procmail rules to run bogofilter. It looks like this:

:0fw
| bogofilter -2 -u -e -p

# if bogofilter failed, return the mail to the queue, the MTA will
# retry to deliver it later
# 75 is the value for EX_TEMPFAIL in /usr/include/sysexits.h
:0e
{ EXITCODE=75 HOST }

# file the mail to spam-bogofilter if it's spam.
:0:
* ^X-Bogosity: Yes, tests=bogofilter
spam-bogofilter

This is pretty much from the bogofilter docs. It runs bogofilter and causes it to classify mail as spam or not. It also updates the word lists (good and spam). Mail that is determined to be spam is placed into the mbox file “spam-bogofilter”.

The rc.main file is:

:0:
inbox

This just transfers mail from my system spool file to a file, “inbox”, in the .procmail directory. The next step is to configure kMail to pick up its email from these mbox files. Each mbox file, the inbox and spam files, is added as a receiving source with procmail locking. There are a lot of warnings in the documentation about locking mail files in kMail. It is important that the procmail rules use lock files, which the above do. kMail can send each receiving source into a separate folder, which I do.

Fetchmail problems.

The problems I have had with Fetchmail come about mostly because I access my POP mailbox from both my work system and my home system. For this setup to work I leave mail on the server when it is downloaded. I like to have access to complete threads on each of my systems. Unfortunately I found that fetchmail would not download all mail on both systems. It seemed to decide that some mail had already been read and did not download it. If I directed it to fetch all mail, it would duplicate some email even though I had asked it to use UIDL to avoid that.

Also, I need to clean up the old mail to stop my POP account from overflowing. I do that by manually flushing the account. I miss the Outlook feature that allows you to delete messages which are over a certain age. According to the fetchmail FAQ this feature is often requested but won’t be implemented as it is apparently out of fetchmail’s scope.

Faced with these problems, I have written a small JavaMail-based utility to perform the role of fetchmail with a few added features. I called it “popper” and it seems to be working quite nicely. I’ll blog about that in a few days’ time.

So, my email system now works as follows. Popper pops the mail from my POP servers and passes it to my local SMTP server, which is sendmail. Sendmail invokes procmail. Procmail invokes bogofilter to classify the email and finally kMail fetches the processed email and allows me to at last read it. It probably sounds more complicated than it is but it’s better than wading through spam.

Compulsory Voting

I have to vote this weekend. Voting in Australia is compulsory. That’s pretty unusual around the world. If you don’t vote here you get fined! We never hear about low voter turnout figures.

I guess it is an unusual model but I’m pretty used to it. Is voting more than a right? Is it a duty in a democratic system? I heard once that the theory behind compulsory voting is to stop people from being intimidated against showing up to vote.

Actually, strictly speaking, only attendance at a polling booth is compulsory – once there you can do what you like with the ballot paper including not voting at all.

This is a state election and the choice this weekend comes down to the right-wing government and the right-wing opposition. The campaign has been pretty subdued due to the larger issues in play around the world in the last few weeks.

Another unusual aspect of the Australian voting system is that it is preferential. You get to order your preference for each candidate. I like this system – it means you can vote for a minority candidate while still having a say in the selection of the major party candidates.

So, I’m off to number the boxes on Saturday …

C++ Retro

For the last few months at work I have been working on PocketPC development in C++. It’s funny to go back to a language I once knew so well and get back into its groove. It has helped to heighten my appreciation of Java, while also showing me some things I like about C++ that Java does not have.

Firstly, the separation of the interface and implementation of C++ classes (into .h and .cpp files) probably sounds like good practice but it’s just a pain. Having to deal with a class split over two files, especially when some of the code is inlined in the header, is tedious. Java’s all-in-one approach is much easier for me.

Having to know who owns allocated memory is blah. If I receive a buffer from a caller, am I expected to free it or will the caller? Are we in sync? Double frees and memory leaks are often difficult to track down. I resurrected my reference-counting templates to remove some of this headache. I also have to interact with a VB UI in this project and I’d not done a lot of that before so I’m still worried about all those BSTRs.
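A minimal sketch of what a reference-counting template can look like, assuming modern C++ and single-threaded use. The names RefPtr and useCount are hypothetical, chosen for illustration; this is not the template from the project, and std::shared_ptr covers the same ground in today's standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical shared-ownership pointer: the last RefPtr to go away frees
// the object, so no caller has to remember who owns the buffer.
// Not thread-safe: the count is a plain integer.
template <typename T>
class RefPtr {
public:
    explicit RefPtr(T* p = nullptr)
        : ptr_(p), count_(p ? new std::size_t(1) : nullptr) {}

    // Copying shares the object and bumps the count.
    RefPtr(const RefPtr& other) : ptr_(other.ptr_), count_(other.count_) {
        if (count_) ++*count_;
    }

    // Copy-and-swap: the by-value parameter holds the old state and
    // releases it when it goes out of scope.
    RefPtr& operator=(RefPtr other) {
        std::swap(ptr_, other.ptr_);
        std::swap(count_, other.count_);
        return *this;
    }

    ~RefPtr() {
        if (count_ && --*count_ == 0) {
            delete ptr_;
            delete count_;
        }
    }

    T& operator*() const { assert(ptr_ != nullptr); return *ptr_; }
    T* operator->() const { assert(ptr_ != nullptr); return ptr_; }

    // Number of RefPtr instances currently sharing the object.
    std::size_t useCount() const { return count_ ? *count_ : 0; }

private:
    T* ptr_;
    std::size_t* count_;
};
```

With something like this, a buffer can be handed to a callee as a RefPtr and neither side needs to agree on who frees it.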

Templates are cool, when they work. Of course, getting templates to instantiate is always something of a black art. Either they are not instantiated at all or you get multiply defined symbols. I usually end up inlining most things in my template classes.
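To illustrate the "inline everything" approach, here is a small sketch, assuming a standard-conforming compiler rather than the embedded toolchain of the day. FixedStack is a hypothetical class invented for this example. Every member is defined inside the class body, which makes it implicitly inline, so the header can be included from many source files without duplicate-symbol errors.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical header-only template: all members are defined in the class
// body, so the compiler instantiates them wherever they are used.
template <typename T, std::size_t N>
class FixedStack {
public:
    // Returns false instead of overflowing when the stack is full.
    bool push(const T& value) {
        if (size_ == N) return false;
        items_[size_++] = value;
        return true;
    }

    // Copies the top element into out; returns false when empty.
    bool pop(T& out) {
        if (size_ == 0) return false;
        out = items_[--size_];
        return true;
    }

    const T& top() const {
        assert(size_ > 0);
        return items_[size_ - 1];
    }

    std::size_t size() const { return size_; }

private:
    T items_[N] = {};
    std::size_t size_ = 0;
};

// Optional explicit instantiation: asks the compiler to generate every
// member for this one combination of arguments in this translation unit.
template class FixedStack<int, 4>;
```

The explicit instantiation on the last line is the other common way to take control: it pins code generation to one source file instead of leaving it to the compiler and linker.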

I love operator overloading. I never abuse it – really. I like to be able to have dynamic structures look like arrays.
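As a sketch of a dynamic structure that looks like an array, the class below overloads operator[] so that indexing past the end simply grows the storage. GrowableInts is a hypothetical name, and building it on std::vector is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container that behaves like an int array with no fixed size.
class GrowableInts {
public:
    // Writable access: indexing past the end grows the storage, filling
    // the new slots with zero.
    int& operator[](std::size_t i) {
        if (i >= data_.size()) data_.resize(i + 1, 0);
        return data_[i];
    }

    // Read-only access cannot grow the storage, so the index must be valid.
    int operator[](std::size_t i) const {
        assert(i < data_.size());
        return data_[i];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<int> data_;
};
```

Client code can then write values[3] = 7 on an empty container. One caution: a reference returned by the writable operator can be invalidated by a later access that grows the storage.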

Resource Acquisition is Initialization (RAII). This technique is great for things like associating critical sections with code blocks and not having to remember to cover all exit paths, etc. Sort of like a Java synchronized block. In C++ you are always thinking about allocation and management of everything you do. In Java, you mostly don’t worry about this. That ease of use can, however, lull you into not properly managing non-memory resources such as JDBC connections.
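A minimal sketch of the RAII idea applied to a critical section, assuming standard C++ with std::mutex standing in for a Win32 critical section. ScopedLock and Counter are hypothetical names; the standard library's std::lock_guard does the same job as ScopedLock.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical guard: acquires the lock in the constructor and releases it
// in the destructor, so every exit path from the enclosing block unlocks.
class ScopedLock {
public:
    explicit ScopedLock(std::mutex& m) : mutex_(m) { mutex_.lock(); }
    ~ScopedLock() { mutex_.unlock(); }

    // A guard owns exactly one acquisition, so copying is disabled.
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    std::mutex& mutex_;
};

// Hypothetical shared counter whose total must never go negative.
class Counter {
public:
    int add(int amount) {
        ScopedLock guard(mutex_);  // critical section starts here
        if (amount < 0) {
            // Early exit by exception: the guard's destructor still runs.
            throw std::invalid_argument("amount must not be negative");
        }
        total_ += amount;
        assert(total_ >= 0);
        return total_;
    }  // normal exit: the guard's destructor releases the lock

private:
    std::mutex mutex_;
    int total_ = 0;
};
```

If the exception path left the mutex locked, the next call to add would block forever; with the guard there is no unlock call to forget.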

The Microsoft IDE for embedded development makes some of these things easier but it has some annoying failings. Many times my classes just disappear from the class view. I don’t know what triggers this. When this happens lots of the automated wizardy things stop working. Adding a method to an interface under these conditions will not get a stub implementation added to the class providing the interface.

The most grievous problem, however, is that sometimes the IDE grinds to a halt whenever a line is added or removed in the current buffer. I used the SysInternals tools to try and understand what is going on. During these editor pauses the IDE is checking, thousands of times, whether a disk is still mounted. I’m guessing this is something in my environment but it has affected different machines (laptops, desktops, etc). I have had to resort to an external editor.

The PocketPC emulator sometimes actually starts up. When it’s running it seems to pop up at the most inconvenient times.

I see from Simon Fell’s Blog that there may be an IDE update. I’ll have to check it out.

Anyway it’s good to look again at how different languages do things. It’s relaxing to come back to Java but perhaps the refreshed awareness of all those allocations will make me more careful about object creation in Java.

My First Program

Matt blogs about the first program he ever wrote. His reverie has prompted me to relate the story of my own first program and early programming experiences.

My first program was written in BASIC in 1979 and executed on my brother’s Unix account on a PDP-11. He had given me some class notes he had for BASIC as part of his Chemical Engineering course. I was absorbed and wrote the program on paper, executing it with paper virtual variables and writing it out many times. The program was, of course, a game – a ballistic cannon ball thing – enter the amount of powder and the angle to hit a target at a random distance. It took a while before I got access to the actual BASIC interpreter. It worked first go, which was pretty cool.

I was hooked, then and there. My parents spent a fortune to buy a Tandy Model 1 TRS-80. Thinking of the machine’s specs now is almost funny (1 MHz 8 bit processor, 16k of RAM, 1k video RAM, 128 x 48 graphics res, cassette tape storage). Yet, as pathetic as it now seems, I would learn an immense amount from getting to know that machine.

I remember using POKE and VARPTR in a program to perform a mysterious operation called a “left shift”. I was disappointed to find it just multiplied by 2 – I could do that in BASIC already. Later I would get into hardware through the machine’s expansion port and a 40-way edge connector. I remember slowing down programs by pulling the HALT line low for a split second. If you left it too long, it would blank the DRAM since the refresh also stopped, causing a crash.

I’m glad I started with such humble stuff because I squeezed the most I could from it. It was simple enough to understand everything about it – something which is probably not true today. I even still like to occasionally write assembler programs.

From that day I have used a variety of machines, rarely the most powerful available. The idea is to make the most of what you’ve got, not to covet the latest and greatest available. I did that once – anyone want a PowerMac 8100/80AV?

Disclaimer: I may be off on the MHz values, etc – all a bit hazy now.

Peaceful Thoughts or a Guinness

It seems we are on course for war. We all expect Australia to be a member of the coalition of the willing (is this the nation-sized equivalent of the lynch mob?). The population appears, by and large, unwilling.

My wife suggests peaceful thoughts. It won’t do any good but what else is there? Well considering the day, I guess there is always a pint of Guinness. Eat, drink and be merry …