The Procmail Filters Kit

(NOT CURRENTLY BEING MAINTAINED)


News

This set of anti-spam filters is very old and has not been updated in more than two years. I am working somewhat sporadically on a much better set of filters. The sanitizer portion of this is, however, actively being maintained and if you decide to use this kit even though it's very old, you should replace the html-trap.procmail from the kit with the current release of the sanitizer.

Due to very serious Denial-of-Service and buffer overflow bugs in IE4.x, MS Outlook and Outlook Express, Netscape Mail and Eudora Pro (see the bugtraq archives), I strongly recommend that anyone who uses these email clients, or who provides email service to people who use them, visit my Enhancing E-Mail Security With Procmail page. If site security is all you are interested in, please visit that page before you grab the entire anti-spam kit. The full kit is not recommended for sitewide installation as a default filter.


Introduction

Procmail is a program that processes email messages looking for particular information in the headers or body of each message, and takes actions based on what it finds. If you're familiar with the concept of "rules" as provided in many major user mail clients (such as the cc:Mail client), then you are already familiar with the concept of automatically processing email messages based on their content.

Procmail is a much more powerful tool than such client-side rules, and has the added advantage of being run at the time the mail is received, rather than when you run your mail client program to read your mail.

Procmail processes messages based on filters. Basically a filter is a set of strings of characters to look for within the message, and instructions about what to do if those strings are found. These strings are encoded as "Regular Expressions" - a Regular Expression is a very powerful way of specifying how to look for a given string of text.
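
As a rough illustration of what such a pattern looks like, the sketch below uses grep -E -i as a stand-in for procmail's own matching. It assumes, per the procmailrc man page, that procmail's expressions are egrep-style and case-insensitive by default. The pattern and the sample header are invented for this example and are not taken from the kit.

```shell
# Hypothetical pattern: a Subject header containing "make money fast".
pattern='^Subject:.*make money fast'

# matches_pattern HEADER_LINE
# Succeeds if the header line matches the pattern. grep -E -i only
# approximates procmail's egrep-style, case-insensitive matching.
matches_pattern() {
    printf '%s\n' "$1" | grep -E -i -q -e "$pattern"
}

if matches_pattern 'Subject: MAKE MONEY FAST!!!'; then
    echo "would be trapped"
fi
```

In a procmail filter the same expression would appear on a condition line, for example "* ^Subject:.*make money fast".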

The procmail filters in this kit are intended to be installed on your ISP's computer so that they filter your email before you ever try to transfer it to your computer at home. This is the most efficient way to block messages - you never have to deal with them.

Of course, this assumes a few things. The most important assumption is that your ISP has procmail installed, and this implies that your ISP is running some flavor of UNIX. ISPs based on other operating systems (such as Windows NT) probably won't have the necessary tools available to you.

The second assumption is that you have a "shell account" on your ISP's computer. That is, you can log on to your ISP's computer, store files there, and run commands there.

If you don't know whether these two requirements can be met, contact your ISP and ask whether or not they have procmail installed, and whether or not you have a shell account.
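
If you can already log in, a quick first check is to see whether a procmail binary is on your search path. This is only a sketch: procmail may be installed somewhere that is not on your PATH, so a "not found" result is a reason to ask your ISP rather than proof that it is missing.

```shell
# check_procmail
# Reports whether a procmail binary is on the PATH of the current shell.
check_procmail() {
    if command -v procmail >/dev/null 2>&1; then
        echo "procmail found at $(command -v procmail)"
    else
        echo "procmail not found on PATH"
    fi
}

check_procmail
```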

If the answer to either of these questions is "no," or "we're not willing to provide that," you're not completely out of luck - simply run a UNIX variant, such as Linux, on your computer and filter your mail there. This is a little less efficient than filtering on your ISP's computer as the messages must be transferred to your system before they can be filtered.


Installation

Once you have determined where procmail will be run, you need to retrieve the filters kit. You can retrieve a copy from:
[ FTP Mirror 1 (US: UT) | HTTP Mirror 1 (US: WA) ]

The file is a compressed tar archive of procmail filter files. The filter files need to be uncompressed and extracted from the archive before they can be used. The following command line will do this:

tar -zxvf procmail-kit-beta.tgz

If your tar complains about the -z argument, it is an older version that does not support compression internally. Try:

gzcat procmail-kit-beta.tgz | tar -xvf -
or
gzcat procmail-kit-beta.tgz > procmail-kit-beta.tar
tar -xvf procmail-kit-beta.tar

This will create a bunch of files named *.procmail in the current directory. If you have problems extracting the files from the archive, you can check whether the archive itself is intact by running the command:

gunzip -tv procmail-kit-beta.tgz

Take a look at root.procmail, the central file in the filters kit. If you're already using procmail, you may want to incorporate portions of this file into your existing .procmailrc file. You will also want to edit it to change the portion that deals with mailing lists, so that it matches the mailing lists you are subscribed to rather than the examples.

If this is your first exposure to procmail, then the simplest way to proceed is to copy root.procmail to .procmailrc, which is where procmail expects to find its filters.
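
The copy step might look like the sketch below. The function name and the .procmailrc.orig backup name are made up for this example. The chmod is a precaution based on the assumption that procmail is strict about rcfile permissions and may refuse a file that others can write to.

```shell
# install_rc KIT_DIR TARGET_DIR
# Copies KIT_DIR/root.procmail to TARGET_DIR/.procmailrc. If a .procmailrc
# already exists it is first saved as .procmailrc.orig (hypothetical name).
install_rc() {
    kit_dir=$1
    target_dir=$2
    if [ -f "$target_dir/.procmailrc" ]; then
        cp -p "$target_dir/.procmailrc" "$target_dir/.procmailrc.orig" || return 1
    fi
    cp "$kit_dir/root.procmail" "$target_dir/.procmailrc" || return 1
    # Assumption: keep the rcfile writable by its owner only.
    chmod 600 "$target_dir/.procmailrc"
}

# Typical use, run from the directory where the kit was extracted:
#   install_rc . "$HOME"
```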

All trapped messages are briefly logged in a log file named procmail.log, and backup copies of the last fifty messages received are kept in a directory named backup. (You should create this directory now.)
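
Creating the directory is a one-line job. The sketch below assumes you pass in the directory that root.procmail works in. That location is not stated here, so check the settings at the top of root.procmail before choosing it.

```shell
# make_backup_dir BASE_DIR
# Creates BASE_DIR/backup if it does not already exist and makes it
# private to its owner. BASE_DIR is an assumption: use whichever
# directory root.procmail expects.
make_backup_dir() {
    mkdir -p "$1/backup" && chmod 700 "$1/backup"
}

# For example:
#   make_backup_dir "$HOME"
```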

In addition, the trapped messages themselves are saved in spambox, frauds or quarantine as appropriate. You will want to periodically read and purge these files. They are mailbox files, so you can read them using, for example, Pine. You'll also want to periodically read and delete procmail.log.
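
To see at a glance whether a trap file needs attention, you can count its messages. The sketch below relies on the mbox convention that every message begins with a line starting with "From " (with a trailing space). The function names are made up for this example.

```shell
# count_messages MBOX_FILE
# Prints the number of messages in an mbox file by counting the "From "
# separator lines that begin each message.
count_messages() {
    grep -c '^From ' "$1"
}

# purge_mailbox MBOX_FILE
# Empties the mailbox after you have reviewed it. This sketch does no
# locking, so a message delivered at the same moment could be lost.
purge_mailbox() {
    : > "$1"
}

# For example:
#   count_messages spambox
#   purge_mailbox spambox
```

To read a file rather than count it, Pine can open it directly with "pine -f spambox".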

Including text in automatic reply messages

You have the option of including some text when a message is automatically bounced after being trapped, for example, stating that a message trapped by the forgery rules is a forgery and that it should be investigated by the postmaster. The text files go in the same directory as the filter files, and have these filenames:

forgery-notice
fraud-notice
spam-notice

These files are not included in the kit (yet).
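
Until they are included, you can create them yourself as ordinary text files. The sketch below writes one of them. The function name and the wording of the notice are placeholders, not text supplied with the kit.

```shell
# write_notice KIT_DIR NAME TEXT
# Writes TEXT to the file NAME in the directory holding the filter files.
write_notice() {
    printf '%s\n' "$3" > "$1/$2"
}

# For example, with placeholder wording:
#   write_notice . spam-notice 'This message was rejected by an automated spam filter.'
```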

Microsoft email client users note:

The root.procmail included in the filters kit contains a call to a filter that strips off Microsoft formatting attachments from inbound email messages. Many Microsoft email client users neglect to turn off these attachments for Internet email, and for anyone who uses a non-Microsoft email reader they are an annoying waste of disk space and, more importantly if you pay by the hour, download time.

If you are using a Microsoft email client, you probably don't mind receiving these, so comment out the following filter by adding a # at the beginning of the line:

INCLUDERC=msbloat-trap.procmail
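
If you would rather not edit the file by hand, the change can be sketched with sed as below. The function name is made up, and the pattern assumes the line in your rcfile looks like the one shown above, optionally indented or with a path in front of the filename.

```shell
# disable_msbloat RCFILE
# Prefixes the msbloat-trap INCLUDERC line with "#" so procmail ignores it.
# The result goes to a temporary file first because in-place sed options
# differ between systems; cat is used to keep RCFILE's permissions.
disable_msbloat() {
    rc=$1
    tmp="$rc.tmp.$$"
    sed 's/^\([[:space:]]*INCLUDERC=.*msbloat-trap\.procmail\)/#\1/' "$rc" > "$tmp" \
        && cat "$tmp" > "$rc" \
        && rm -f "$tmp"
}

# For example:
#   disable_msbloat "$HOME/.procmailrc"
```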

Unfortunately I cannot (yet) filter out the equally annoying habit some email clients have of sending the message, then adding a second complete copy of the message with (typically terribly inefficient) HTML formatting.


Limitations

These filters are not perfect. A short spam from a small-time spammer will probably be passed, as there is not enough context available for the filter to safely decide to trap it.

These filters are also very aggressive about spamhauses (those sites that are well known for spamming or for providing network access to lots of domains that spam - e.g. Cyberpromo, LLV, NancyNet, ad nauseam). If you have legitimate correspondence with someone at a spamhaus, or with someone who for some reason sends their mail through a spamhaus or one of the spamhaus minions, then you will have to add a filter that explicitly accepts mail from that person before the spam trap filters run. If you don't, their mail will be trapped and you won't see it.
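
One way to set up such an accept filter is sketched below: a small shell function that writes a one-recipe filter file. The function name, the filename accept-friends.procmail and the address friend@example.com are all hypothetical. The recipe it writes uses ordinary procmail syntax: a ":0:" recipe with a "From:" condition that delivers to $DEFAULT, your normal mailbox.

```shell
# write_accept_rule FILE ADDRESS
# Writes a procmail recipe that delivers mail from ADDRESS straight to the
# default mailbox. Only dots in ADDRESS are escaped; other characters that
# are special in regular expressions (such as +) would need escaping too.
write_accept_rule() {
    escaped=$(printf '%s\n' "$2" | sed 's/\./\\./g')
    cat > "$1" <<EOF
# Accept mail from a known correspondent before any spam traps run.
:0:
* ^From:.*$escaped
\$DEFAULT
EOF
}

# For example:
#   write_accept_rule accept-friends.procmail 'friend@example.com'
```

The new file then needs an INCLUDERC line of its own in your .procmailrc, placed above the lines that pull in the spam traps.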

The same applies to mailing lists. I have erred on the side of openness here. It would be easy to place the mailing list section after the spamhaus section, thus preventing spams to my mailing lists, but this would also bounce legitimate mail from spamhaus users to the mailing list. In my experience the spam on my mailing lists has been much less than the legitimate postings, so I let mailing list mail through largely unfiltered.


Updates

I have designed this kit to be easy for you to maintain - simply extract the updated kit on top of the existing kit to get the latest spam domain list - and easy for you to use - see the notes at the top of the notify and domain-trap filters for instructions on how to use them to add your own traps.
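
Since extracting over the existing kit overwrites the old filter files, it can be worth keeping a copy of them first. The sketch below does that and then unpacks the new archive. The function name and the kit-old directory are made up for this example, and ARCHIVE should be an absolute path because the function changes directory.

```shell
# update_kit KIT_DIR ARCHIVE
# Saves the current *.procmail files into KIT_DIR/kit-old (hypothetical
# name), then unpacks ARCHIVE on top of KIT_DIR. gzip -dc piped into tar
# is used so that this also works with a tar that lacks -z.
update_kit() {
    kit_dir=$1
    archive=$2
    mkdir -p "$kit_dir/kit-old" || return 1
    # Ignore the error cp reports when there are no filter files yet.
    cp -p "$kit_dir"/*.procmail "$kit_dir/kit-old/" 2>/dev/null
    ( cd "$kit_dir" && gzip -dc "$archive" | tar -xf - )
}

# For example:
#   update_kit "$HOME/procmail-kit" "$HOME/procmail-kit-beta.tgz"
```

If your .procmailrc is a copy of root.procmail, extracting the update does not change that copy.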

I have recently moved many more of the traps out of root.procmail into separate filter files. You'll want to redo your .procmailrc to take advantage of this. It should make keeping things up-to-date somewhat easier. I will eventually strip root.procmail down as much as possible.

I am currently incorporating enhancements such as a scripts directory, easier configuration and updating, and simpler goldlisting.


Notes

The regular expression syntax used by procmail is not exactly like that documented by the Regular Expressions link in the Introduction. Some of the more complex constructs, such as backreferences and GNU "Match-whatever" operators, are not supported, and there are some procmail-specific extensions. See the procmail man pages for details.

I can be contacted at <jhardin@impsec.org> (except by users in spam domains, of course :) - you could also visit my home page.

Fight spam!  Help stop spam - join CAUCE!



$Id: procmail-kit.html,v 1.44 2001-11-25 10:21:39-08 jhardin Exp $