Cut Spam with SpamAssassin Rules

I don’t admit it often, but I read email headers. It’s my form of entertainment. Like everyone, I get a good amount of spam. The other day I decided to take an active role in blocking it. While I had always used the SpamAssassin options available through my control panel, I had never really gone deep into SpamAssassin rules.

SpamAssassin is an open-source project that helps filter spam. It runs on the server side and is available through many hosting plans. The popular cPanel interface has an icon dedicated to it. The examples in this post are based on the cPanel configuration, which will be enough for most users. If you are interested in more advanced configuration options, I provide a link below.

The idea is simple, while the actual function is complex. SpamAssassin examines all incoming emails and assigns scores based on a combination of header and text analysis, Bayesian filtering, DNS blocklists, and collaborative filtering databases. Each test that matches adds its points to the message’s score (a few tests carry negative points and lower it), and the total is then used to decide whether the email is spam. The higher the number, the more likely it is that the message is spam.
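To make the scoring idea concrete, here is a minimal Python sketch. It is not how SpamAssassin is implemented (the real thing is far more involved); the function name and the sample hits are made up for illustration. It only shows the basic arithmetic: add up the points from the tests that matched and compare the total to the required score.

```python
# Illustrative only: a toy version of "sum the test scores, compare to a threshold".
def is_spam(test_scores, required_score=5.0):
    """Return (total, verdict) for a dict of {test_name: points}."""
    total = round(sum(test_scores.values()), 1)
    return total, total >= required_score

# Hypothetical hits for one message; real test names and points vary by version.
hits = {"MIME_HTML_ONLY": 1.7, "HTML_MIME_NO_HTML_TAG": 1.1}

print(is_spam(hits))                      # judged against the default of 5
print(is_spam(hits, required_score=2.7))  # judged against a stricter setting
```

The same message can be clean under one required score and spam under another, which is why tuning that number matters.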

The most common settings are:

Enable/Disable: Enables or disables the filtering
Required Score: The score at which an email is considered spam
Auto Delete Enable/Disable: Enables auto-deletion of emails
Auto Delete Score: The score at which the emails will be deleted automatically
Blacklist: A list of email addresses or patterns that should always be blocked
Whitelist: A list of email addresses or patterns that should always be allowed
Rule/Test Scores: The ability to override the default scoring system

Many of these are self-explanatory, but the scoring system deserves a closer look.

Note that, in the list above, there are three scores: Required Score, Auto Delete Score, and Rule/Test Scores. The third is the one I have come to see as very powerful, and we’ll get to that in a moment, but first let’s examine the others. Both the Required Score and the Auto Delete Score default to 5. (Remember that a higher number means the message is more likely to be spam.) What many people don’t realize is that these two scores do not need to be the same. Before you start changing them, you need to understand that if you are too aggressive you may end up with false positives, which would mark valid emails as spam. Worse yet, you could auto-delete an important email. Your clients will frown on this, so be careful. If you adjust the scores in small amounts, and then watch your spam folder, you can tune them over time.

I have my Required Score set at 2.7. That means any email that scores 2.7 or higher will be marked as spam. SpamAssassin then adds ***SPAM*** to the start of the subject line, and adds a flag in the email header as well. Your email client can then use the header information to filter the spam into a Spam / Junk Mail folder. I suggest you watch those messages, read their headers, and learn what scores they are getting. Are there any false positives? If so, adjust the settings as needed.
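Most mail clients let you build that filter through their own rules screen, but the check itself is simple. As a rough sketch in Python (the function name and the sample message are my own, for illustration only), this is all a filter has to look for:

```python
from email import message_from_string

def is_flagged_as_spam(raw_message):
    """True if the message carries an X-Spam-Flag header set to YES."""
    msg = message_from_string(raw_message)
    return msg.get("X-Spam-Flag", "").strip().upper() == "YES"

# A made-up message with the kind of headers SpamAssassin adds.
raw = (
    "Subject: ***SPAM*** Cheap watches\n"
    "X-Spam-Status: Yes, score=2.9\n"
    "X-Spam-Flag: YES\n"
    "\n"
    "Body text\n"
)

print(is_flagged_as_spam(raw))
```

Filtering on the header rather than on the ***SPAM*** subject prefix is a design choice: the header stays put even if someone replies to or forwards the message with an edited subject.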

I have the Auto Delete Score set at 4. That means any email scoring 4 or higher is removed. I never see it. The result is that any email scoring 2.6 or less makes it to my inbox, anything from 2.7 to 3.9 is flagged as spam, and anything at 4.0 or higher is deleted.
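Put another way, the two thresholds split every message into one of three buckets. A small Python sketch of that logic, using my settings as defaults (the function and bucket names are just for illustration):

```python
def triage(score, required_score=2.7, auto_delete_score=4.0):
    """Return which bucket a message with this score ends up in."""
    if score >= auto_delete_score:
        return "deleted"
    if score >= required_score:
        return "flagged"
    return "inbox"

for s in (2.6, 2.7, 3.9, 4.0):
    print(s, triage(s))
```

Keeping the auto-delete threshold above the required score leaves a buffer zone where questionable mail is flagged for review instead of disappearing.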

These settings worked for a long time. Recently, however, I have noticed some changes. I have seen an increase in emails that are clearly spam but do not score high enough to be auto-deleted. These spammers, while I hate them, are getting smarter. They figured out how to beat the default, well-published rules. No problem, we can change the default scoring to match their messages. First, we need to know what rules to change, and that’s where reading the headers comes in. Below is a sample email header from a recent spam email:

X-Spam-Status: Yes, score=2.9

X-Spam-Score: 29

X-Spam-Bar: ++

X-Spam-Report: Content analysis details:   (2.9 points, 2.7 required)

     pts rule name              description
    ---- ---------------------- --------------------------------------------------
    -0.0 SPF_HELO_PASS          SPF: HELO matches SPF record
     0.2 HTML_IMAGE_RATIO_04    BODY: HTML has a low ratio of text to image area
     0.0 HTML_MESSAGE           BODY: HTML included in message
     1.7 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
     1.1 HTML_MIME_NO_HTML_TAG  HTML-only message, but there is no HTML tag

X-Spam-Flag: YES

In the list of rules you will notice that each rule is preceded by the score it contributed to this email. For example:

1.1 HTML_MIME_NO_HTML_TAG  HTML-only message, but there is no HTML tag

In this case, the test name is HTML_MIME_NO_HTML_TAG and it has a default score of 1.1. Why would an HTML message not have an HTML tag? Most clients, such as Apple Mail or Outlook, would create properly written emails – we hope. So SpamAssassin gave this email 1.1 points for the error. What you need to do is identify a pattern in the spam messages you are getting.
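If you would rather not count rules by eye, a short script can do it. The Python sketch below is one way to go about it, assuming each report has one rule per line in the "points, rule name, description" layout; the function names, the regular expression, and the sample text are my own and not part of SpamAssassin.

```python
import re
from collections import Counter

# Matches a report line such as " 1.1 HTML_MIME_NO_HTML_TAG  HTML-only message, ..."
RULE_LINE = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]+)\s+(.*)$")

def parse_report(report_text):
    """Return a list of (points, rule_name) tuples found in one report."""
    hits = []
    for line in report_text.splitlines():
        m = RULE_LINE.match(line)
        if m:
            hits.append((float(m.group(1)), m.group(2)))
    return hits

def tally_rules(reports):
    """Count how many reports each rule name appears in."""
    counts = Counter()
    for report in reports:
        counts.update({name for _, name in parse_report(report)})
    return counts

sample = """\
 pts rule name              description
---- ---------------------- ------------------------------
-0.0 SPF_HELO_PASS          SPF: HELO matches SPF record
 1.7 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 1.1 HTML_MIME_NO_HTML_TAG  HTML-only message, but there is no HTML tag
"""

# Pretend the same report showed up in two different spam messages.
for name, count in sorted(tally_rules([sample, sample]).items()):
    print(name, count)
```

Feed it the reports from a week or two of spam and the rules that keep showing up are your candidates for a custom score.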

In my case I found the rule RDNS_NONE was very common. This test checks the reverse DNS for the last untrusted relay. The default score for this rule is 0.1 and I was seeing it in well over 50% of my spam. I decided to change it to 2. After a week or so, I noticed that the new score was causing more spam to be caught. I was getting no false positives, and some obvious spam messages were now being flagged properly. For most, the resulting score was around 2.9. That meant the message would have scored only about 1.0 before my change. A lot of messages were now landing in the 3.5 to 3.8 range. In an attempt to push those into the auto-delete range, I bumped the RDNS_NONE score to 2.5 and the amount of Inbox spam fell greatly. Some was auto-deleted, and some was simply moved to the Spam / Junk Mail folder. While I still check the Spam folder for false positives, the main goal was achieved: it was not in my Inbox.
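The arithmetic behind those numbers is easy to replay. In this Python sketch the message is hypothetical: RDNS_NONE at its 0.1 default plus 0.9 points lumped together from other tests (OTHER_TESTS is a placeholder, not a real rule name). The function simply swaps in any overridden score before adding everything up.

```python
def rescore(hits, overrides):
    """Total for {rule: default_points} after applying {rule: new_points} overrides."""
    return round(sum(overrides.get(rule, pts) for rule, pts in hits.items()), 1)

# Hypothetical message: RDNS_NONE plus 0.9 points from everything else.
hits = {"RDNS_NONE": 0.1, "OTHER_TESTS": 0.9}

print(rescore(hits, {}))                  # 1.0 with default scores
print(rescore(hits, {"RDNS_NONE": 2.0}))  # 2.9 after the first change
print(rescore(hits, {"RDNS_NONE": 2.5}))  # 3.4 after the second change
```

By the same arithmetic, a message that landed at 3.5 with RDNS_NONE set to 2 lands at 4.0 once the rule is worth 2.5, which is exactly the auto-delete line with my settings.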

The process for entering a test with a custom score is not obvious, so the next screenshot displays the test entered before it is applied:

It is important to note that the RDNS_NONE test is not without risk. As stated on the official site:

Note that this may be done by interpreting information in the relevant Received header – if reverse DNS checks are not performed by the first trusted relay, or if they are not recorded in the Received header, this test will be triggered (regardless of the actual rDNS status).

That means that false positives are possible, but that is why my Auto Delete Score is higher than the Required Score. While a legitimate message may get 2.5 points for this test, I am gambling it won’t get enough total points to be auto-deleted.

RDNS_NONE is just one example of possible tests, but why stop there? By customizing a well-thought-out list of test scores you can greatly cut down on the amount of spam in your inbox. You can find a list of the available tests for your version of SpamAssassin, and their default scores, here.

Those who want more control will need access to a few configuration files. Depending on the file, changes can be site-wide or user-specific. See the documentation for more information.
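As a rough illustration of the user-specific case: in SpamAssassin's own configuration syntax a score override is a single line of the form "score TEST_NAME value", and the per-user file is commonly ~/.spamassassin/user_prefs (your host may keep it somewhere else, so treat that path as an assumption). The Python helper below is hypothetical and only builds the text of those lines; it does not write to any file.

```python
def render_score_lines(overrides):
    """Build 'score TEST_NAME value' lines from a dict of {test_name: points}."""
    return "\n".join(
        f"score {name} {value}" for name, value in sorted(overrides.items())
    )

print(render_score_lines({"RDNS_NONE": 2.5}))
# prints: score RDNS_NONE 2.5
```

Keeping your overrides in one place like this also gives you a record of what you changed, which helps when you need to back out a score that turned out to be too aggressive.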

2 Responses to “Cut Spam with SpamAssassin Rules”

  1. Darren says:

    Thanks for this post. I am always looking for other real-world examples of people using SpamAssassin and how they are combating spam. I don’t read headers that often, but with a recent increase of spam to my inbox, I’m reviewing all my settings and reading headers once again. Thanks for sharing your experience!

  2. Razz says:

    Hey Darren, glad you liked it and I hope it helps you. I have included a few more below with descriptions and default scores (local, net, with bayes, with bayes+net). They all deal with “Day Old Bread”, or new domains.

    DNS_FROM_DOB = Sender from new domain (Day Old Bread): 0 0.341 0 0.732
    RCVD_IN_DOB = Received via relay in new domain (Day Old Bread): 0 0.835 0 1.103
    URIBL_RHS_DOB = Contains an URI of a new domain (Day Old Bread): 0 0.901 0 1.083

    Thanks for stopping by!

    Razz