Spam filtering
Our website uses postfix as its MTA (mail transfer agent), which can also forward incoming mail to so-called milters ("mail filters"). One of the milters configured on our server is the spam filter Rspamd.
The WebUI of the service is located at rspamd.muonpi.org.
The rspamd service itself is running on localhost:11332, with the UI running on localhost:11334.
Terminology
Spam is unwanted mail, while Ham refers to legitimate (wanted) mail. A ham message that the filter wrongly classifies as spam is a false positive.
From Wikipedia on Spam (food):
Spam has affected popular culture, including a Monty Python skit, which repeated the name many times, leading to its name being borrowed to describe unsolicited electronic messages, especially email.
Setup
Setup and configuration were done according to this guide.
Configuring postfix is simple: add inet:localhost:11332 to the smtpd_milters field:
smtpd_milters = unix:/opendkim/opendkim.sock,inet:localhost:11332
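The same change can be sketched as a small shell helper that appends a milter to an existing smtpd_milters value only if it is not already listed. The function name append_milter is hypothetical, and the sketch assumes the value is a comma-separated list without spaces, as in the line above.

```shell
#!/bin/sh
# append_milter CURRENT NEW   (hypothetical helper, not part of postfix)
# Prints CURRENT with NEW appended as a comma-separated entry,
# or CURRENT unchanged if NEW is already in the list.
# Assumption: entries are separated by commas only, without spaces.
append_milter() {
    current=$1
    new=$2
    case ",$current," in
        *",$new,"*) printf '%s\n' "$current" ;;       # already listed
        ",,")       printf '%s\n' "$new" ;;           # list was empty
        *)          printf '%s\n' "$current,$new" ;;  # append
    esac
}
```

It could be used together with postconf, for example postconf -e "smtpd_milters = $(append_milter "$(postconf -h smtpd_milters)" inet:localhost:11332)", followed by postfix reload.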
Training the filter
Rspamd can learn to filter spam better. The training data is stored in a Redis database.
Training can be done in two ways: via the WebUI's Scan/Learn section or via the command line interface rspamc.
The WebUI
After logging into the WebUI you can see the status of the filter. On the right-hand side, a pie chart shows how many mails have been processed and the portions of rejected or annotated mails. The table named Bayesian statistics shows how many mails have been declared as 'Spam' or as 'Ham'.
Training is done in the Scan/Learn tab of the WebUI. Paste the raw message source into the text field and click Scan message. Below you will see the result of the scan. The 'action' indicates what Rspamd would do if it were to receive that mail. The symbols listed are the criteria on which the message was evaluated. Each symbol has a positive or negative score: a positive score counts towards spam, a negative score towards ham.
You can tell rspamd to learn from this message by choosing Upload Ham or Upload Spam.
The CLI
Similar to the WebUI, you can check whether a given mail is spam by calling rspamc suspicious.eml.
After analysis, the symbols found and the total score are shown.
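For scripting, the action can be pulled out of the scan output. This is only a sketch: the helper name rspamc_action is hypothetical, and it assumes the output contains a line starting with "Action: ", which may differ between rspamc versions and output formats.

```shell
#!/bin/sh
# rspamc_action   (hypothetical helper)
# Reads rspamc scan output on stdin and prints the value of the
# first line that starts with "Action:". Prints nothing if there
# is no such line.
# Assumption: the scan output contains a line like "Action: add header".
rspamc_action() {
    awk -F': ' '/^Action:/ { print $2; exit }'
}
```

A possible call is rspamc suspicious.eml | rspamc_action.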
To train the filter on this mail, call rspamc learn_spam suspicious.eml if it is spam, or rspamc learn_ham suspicious.eml if it is legitimate.
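When several messages have been collected, training can be looped over a directory. The following is a minimal sketch, not an established procedure: the function name learn_dir and the RSPAMC override variable are hypothetical, and it assumes all files ending in .eml in the directory belong to the same class.

```shell
#!/bin/sh
# Command used for training. Override for a dry run, e.g. RSPAMC=echo.
RSPAMC=${RSPAMC:-rspamc}

# learn_dir CLASS DIR   (hypothetical helper)
# Calls "rspamc learn_CLASS FILE" for every *.eml file in DIR.
# CLASS must be "spam" or "ham"; anything else returns status 2.
learn_dir() {
    class=$1
    dir=$2
    case "$class" in
        spam|ham) ;;
        *) echo "usage: learn_dir spam|ham DIR" >&2; return 2 ;;
    esac
    for f in "$dir"/*.eml; do
        [ -e "$f" ] || continue   # the glob matched nothing
        $RSPAMC "learn_$class" "$f"
    done
}
```

For example, learn_dir spam ./spam-samples would train on every .eml file in ./spam-samples (a made-up directory name). Setting RSPAMC=echo first prints the commands instead of running them.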
Dovecot CLI: 'doveadm'
Most of the incoming mail is received by OSTicket via support@muonpi.org, which does not provide the raw message source in its tickets, so doveadm is used to get the message source.
NOTE: This can only be done with superuser permissions and gives full read/write access to all users' mail. So be careful!
Use doveadm search [-u <user>|-A] [-S <socket_path>] <search query> to search for mails. See wiki.dovecot.org for command reference and this page for search_query reference.
This one-liner will search and save mails from user <user> which were sent/received on the date <YYYY-MM-DD>:
doveadm search -u <user> ON <YYYY-MM-DD> | while read -r guid uid; do doveadm fetch -u <user> text mailbox-guid "$guid" uid "$uid" > "$uid.eml"; done
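Before passing a saved file to rspamc, it is worth looking at its first line. As an assumption to verify on your system: depending on doveadm's output formatter, the fetched message may be preceded by a label line reading "text:". The hypothetical helper below removes such a line if present and passes everything else through unchanged.

```shell
#!/bin/sh
# strip_fetch_header   (hypothetical helper)
# Reads a saved "doveadm fetch ... text" result on stdin and drops the
# first line if it is exactly "text:". All other lines are kept as-is.
# Assumption: the label, when present, is the very first line.
strip_fetch_header() {
    sed '1{/^text:$/d;}'
}
```

A possible use is strip_fetch_header < 42.eml > 42.clean.eml, followed by rspamc 42.clean.eml, where 42.eml stands for one of the files written by the one-liner.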