<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Digital-Traffic.net &#187; SpamAssassin</title>
	<atom:link href="http://digital-traffic.net/tag/spamassassin/feed" rel="self" type="application/rss+xml" />
	<link>http://digital-traffic.net</link>
	<description>Public thoughts of a systems administrator</description>
	<lastBuildDate>Tue, 03 Apr 2012 17:16:34 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>SpamAssassin: Dealing with unrecognized spam</title>
		<link>http://digital-traffic.net/technology/spamassassin-dealing-with-unrecognized-spam</link>
		<comments>http://digital-traffic.net/technology/spamassassin-dealing-with-unrecognized-spam#comments</comments>
		<pubDate>Sat, 03 May 2008 20:07:23 +0000</pubDate>
		<dc:creator>Brian Shacklett</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[Maildir]]></category>
		<category><![CDATA[scripting]]></category>
		<category><![CDATA[spam]]></category>
		<category><![CDATA[SpamAssassin]]></category>

		<guid isPermaLink="false">http://digital-traffic.net/blog/?p=25</guid>
		<description><![CDATA[Everyone hates spam, and one of the main ways that people are fighting it is through the use of SpamAssassin. I&#8217;ve been using it for a while now and have Sieve detecting spam headers and moving them to my Junk folder. The Problem Dealing with spam that went unrecognized has been more of a manual <a href='http://digital-traffic.net/technology/spamassassin-dealing-with-unrecognized-spam' class='excerpt-more'>[...]</a>]]></description>
			<content:encoded><![CDATA[<p>Everyone hates spam, and one of the main ways that people are fighting it is through the use of <a href="http://spamassassin.apache.org/">SpamAssassin</a>. I&#8217;ve been using it for a while now and have Sieve detecting the spam headers it adds and moving the flagged messages to my Junk folder.
</p>
<h3>The Problem</h3>
<p>
Dealing with spam that went unrecognized has been more of a manual process. Every once in a while, I&#8217;d have to segregate all of my useful mail from the spam and run &#8220;sa-learn&#8221; on the leftovers. This isn&#8217;t horrible, because I tend to shell into my server fairly frequently, but I really prefer to have menial tasks like this automated.
</p>
<h3>A solution</h3>
<p>
First of all, I created a folder in my mailbox called &#8220;Unrecognized Spam&#8221;. The name isn&#8217;t important, really. It just needs to be a place to file away all of those messages that SpamAssassin didn&#8217;t catch on the way in.<br />
Once that was done, I wrote a very simple little script, which I dropped in /etc/cron.daily/:
</p>
<p><span id="more-25"></span><br />
[sourcecode language='bash']#!/bin/bash</p>
<p>SPAM_DIR="/home/bshacklett/Maildir/Unrecognized Spam/cur"</p>
<p># Stop if the folder is missing, so rm never runs in the wrong directory<br />
cd "$SPAM_DIR" || exit 1<br />
sa-learn --spam .<br />
rm *<br />
[/sourcecode]</p>
<p>
Nasty, I know, but it did the job. All I had to do when I got spam that went unnoticed by SpamAssassin was drag it into my &#8220;Unrecognized Spam&#8221; folder, and it would be learned and gone within 24 hours. Of course, I was also getting mail from the cron daemon complaining whenever there weren&#8217;t any emails to learn from or delete.
</p>
<h3>Improvements</h3>
<p>
This morning I had a little spare time, so I decided to improve on the script a bit:
</p>
<p>[sourcecode language='bash']#!/bin/bash</p>
<p># Constants<br />
SPAM_PATH="Maildir/.Unrecognized Spam/cur";</p>
<p># Find all of the directories directly under /home/<br />
homeDirectories=(`find /home/ -maxdepth 1 -mindepth 1 -type d`);</p>
<p># Loop through the found directories and check for spam<br />
for homeDirectory in "${homeDirectories[@]}"<br />
do<br />
    fullSpamPath=${homeDirectory}/${SPAM_PATH};</p>
<p>    #Check if the spam directory exists under this home directory<br />
    if [ -d  "${fullSpamPath}" ]; then</p>
<p>        # Check if there is mail under the spam directory<br />
        if [ "$( ls -A "${fullSpamPath}" )" ]; then<br />
            sa-learn --spam "$fullSpamPath";<br />
            rm "${fullSpamPath}/"*;<br />
        fi<br />
    fi</p>
<p>done<br />
[/sourcecode]</p>
<p>
Now I know I&#8217;m not a great shell scripter, but this is working pretty well. It basically scans all of the home directories and looks for the &#8220;Unrecognized Spam&#8221; directory under each one. If it finds it, it will test to make sure that there are emails in the folder, then learn them and remove them.
</p>
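<p>
One weakness of the find-into-array approach above is that it splits on whitespace, so a home directory with a space in its name would be missed. Here is a rough sketch of a glob-based variant that avoids this and only deletes messages when sa-learn exits successfully. The function names and the LEARN_CMD override are my own assumptions, added so the logic can be dry-run without SpamAssassin installed; treat it as a starting point rather than a tested replacement.
</p>

```shell
#!/bin/bash

# Folder layout copied from the script above.
SPAM_PATH="Maildir/.Unrecognized Spam/cur"

# Hypothetical helper: learn from one spam folder, then empty it.
# LEARN_CMD is an illustrative override; it defaults to the real sa-learn.
learn_and_clear() {
    local spamDir="$1"
    local messages=()
    local message

    # Nothing to do if this user has no such folder.
    [ -d "$spamDir" ] || return 0

    # Collect regular files only; an unmatched glob stays literal and is skipped.
    for message in "$spamDir"/*; do
        [ -f "$message" ] && messages+=("$message")
    done

    # Empty folder: stay silent so cron has nothing to mail about.
    [ "${#messages[@]}" -gt 0 ] || return 0

    # Remove the messages only if learning succeeded.
    "${LEARN_CMD:-sa-learn}" --spam "$spamDir" && rm -- "${messages[@]}"
}

# Hypothetical driver: scans /home unless another root is passed in.
scan_home_directories() {
    local homeRoot="${1:-/home}"
    local homeDirectory

    # The trailing slash in the glob matches directories only,
    # and quoting keeps names with spaces in one piece.
    for homeDirectory in "$homeRoot"/*/; do
        learn_and_clear "${homeDirectory}${SPAM_PATH}"
    done
}
```

<p>
To use it from cron, a call to scan_home_directories would go at the end of the script. Note that the glob skips hidden directories under /home, which find does not.
</p>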
<h3>Caveats</h3>
<ul>
<li> This isn&#8217;t going to scale all that well. I&#8217;m guessing it would be fine for 200 users or fewer, as it runs at night, but it would need some tweaking for anything more.</li>
<li> As it is, this requires that your mail be stored in the <a href="http://en.wikipedia.org/wiki/Maildir">Maildir</a> format. I know that sa-learn can work with mbox stores, but I&#8217;m not sure how you&#8217;d target it effectively.</li>
</ul>
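<p>
For the mbox case, one possible approach is to point sa-learn at the folder file with its --mbox option and then truncate the file rather than delete it. This is only a sketch: the function name and the LEARN_CMD override are hypothetical, where the mbox file lives depends entirely on your mail server, and there is no locking here, so a client writing to the folder at the same moment could lose a message.
</p>

```shell
#!/bin/bash

# Hypothetical helper: learn from one mbox file, then truncate it.
# LEARN_CMD is an illustrative override; it defaults to the real sa-learn.
learn_mbox_and_truncate() {
    local mboxFile="$1"

    # -s is true only for a file that exists and is not empty.
    [ -s "$mboxFile" ] || return 0

    # --mbox tells sa-learn to treat the target as an mbox file.
    # Truncate in place only if learning succeeded, so the mail client
    # still finds the (now empty) folder file.
    "${LEARN_CMD:-sa-learn}" --spam --mbox "$mboxFile" && : > "$mboxFile"
}
```

<p>
It would be called once per folder file, with whatever path your server actually uses for the &#8220;Unrecognized Spam&#8221; folder.
</p>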
]]></content:encoded>
			<wfw:commentRss>http://digital-traffic.net/technology/spamassassin-dealing-with-unrecognized-spam/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

