I've been checking my spamassassin scores, and wasn't particularly happy with its ability to rate SPAM. Given its reputation, I was pretty sure that this was due to a configuration error on my part.
I had a sneaking suspicion that not all the "tests" I was running were actually happening. I ran spamd in debug mode, and started seeing a bunch of interesting errors. In short, I was missing Net::DNS, so anything that required any sort of DNS lookup was failing.
Simply did the perl -MCPAN -e 'install Net::DNS' thing and that got DNS lookups working.
Then started to debug the other errors in the log. Apparently the bayes plugin was having problems talking to my bayes files because I didn't have DB_File. Tried the CPAN trick again, which failed because it couldn't find db.h. Turns out I didn't have Berkeley DB installed. Short trip to sunfreeware.com later:
- db-4.2.52.NC-sol10-intel-local
It installs in /usr/local/BerkeleyDB.4.2. Given that CPAN was looking in /usr/local/BerkeleyDB, I created a symlink, and hey presto, perl -MCPAN -e 'install DB_File' worked fine.
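The link step can be sketched roughly as below, assuming the package's install directory is /usr/local/BerkeleyDB.4.2. The helper name link_bdb is hypothetical. It refuses to overwrite anything already sitting at the link location, which seemed like a sensible precaution rather than something the original setup required.

```shell
#!/bin/sh
# link_bdb is a hypothetical helper; both paths are supplied by the caller.
# It creates a symlink at $2 pointing to the directory $1, and does nothing
# if $1 is not a directory or if something already exists at $2.
link_bdb() {
    target=$1
    link=$2
    if [ ! -d "$target" ]; then
        echo "no such directory: $target" >&2
        return 1
    fi
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "already exists, leaving alone: $link" >&2
        return 1
    fi
    ln -s "$target" "$link"
}

# Usage on a system laid out as described (assumed paths, needs root):
#   link_bdb /usr/local/BerkeleyDB.4.2 /usr/local/BerkeleyDB
```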
At that point I ran sa-learn against my Junk folder to generate some content, and checked the permissions of the bayes files. Note: bayes_path in the local.cf for spamassassin needs to be the directory plus the filename prefix of the bayes files, not just the directory. For example:
use_bayes 1
bayes_path /export/home/spamd/bayes/bayes
bayes_file_mode 0666
# ls /export/home/spamd/bayes/
bayes_journal bayes_seen bayes_toks bayes.mutex
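Since the prefix detail is easy to get wrong, here is a small sketch that checks a bayes_path value against the files on disk. The helper name bayes_files_present is hypothetical, and it only looks for the _toks and _seen files shown in the listing above.

```shell
#!/bin/sh
# bayes_files_present is a hypothetical helper, not part of spamassassin.
# Given a bayes_path value (directory plus filename prefix), it checks that
# the token and seen databases exist under that prefix.
bayes_files_present() {
    prefix=$1
    for suffix in toks seen; do
        if [ ! -f "${prefix}_${suffix}" ]; then
            echo "missing: ${prefix}_${suffix}" >&2
            return 1
        fi
    done
    return 0
}

# Usage with the path from the example above:
#   bayes_files_present /export/home/spamd/bayes/bayes && echo "bayes files found"
```

Passing only the directory (/export/home/spamd/bayes) fails the check, because the function then looks for /export/home/spamd/bayes_toks, which does not exist.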
Now that it's all sorted, I'm getting much higher SPAM scores, and it seems much more accurate.
Some would say I should have just used Blastwave, but this was more fun, and a good learning experience.