[Spamassassin Wiki] Trivial Update of "CorpusCleaning" by JustinMason
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Spamassassin Wiki" for change notification.

The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/CorpusCleaning

------------------------------------------------------------------------------
First, run a mass-check, which produces a 'spam.log' and a 'ham.log' file. Run these commands to list the 200 lowest-scoring spams, build an mbox file containing just those messages, and open that mbox in the "mutt" mail client:

{{{
+ cd /path/to/your/spamassassin/masses
sort -n -k 2 spam.log | head -n 200 > id.low
./mboxget < id.low > mbox
mutt -f mbox
@@ -29, +30 @@

The same operation for FalseNegatives is similar but reversed: sort 'ham.log' in descending order to get the 200 highest-scoring hams. Here are the commands to do that:

{{{
+ cd /path/to/your/spamassassin/masses
sort -rn -k 2 ham.log | head -n 200 > id.hi
./mboxget < id.hi > mbox
mutt -f mbox
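
Both command sequences share the same selection step, which can be sketched as a pair of shell helpers. This is only a sketch: the function names `lowest_scoring` and `highest_scoring` are hypothetical, and it assumes that each mass-check log line carries the score in its second whitespace-separated field and that comment lines start with `#`. It uses sort's `-k` option to select the score field.

```shell
#!/bin/sh
# Sketch only. Assumptions (not confirmed by the page above):
#   - the score is the second whitespace-separated field of each log line
#   - comment lines in the log start with '#'

# lowest_scoring LOG [COUNT]
# Print the COUNT (default 200) lowest-scoring entries of LOG.
lowest_scoring() {
    grep -v '^#' "$1" | sort -n -k 2,2 | head -n "${2:-200}"
}

# highest_scoring LOG [COUNT]
# Print the COUNT (default 200) highest-scoring entries of LOG.
highest_scoring() {
    grep -v '^#' "$1" | sort -rn -k 2,2 | head -n "${2:-200}"
}
```

For example, `lowest_scoring spam.log > id.low` would produce the list that `./mboxget` reads in the first set of commands, and `highest_scoring ham.log > id.hi` the list for the second.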