Mailing List Archive

selling a client on SA
I have a client who is vacillating between SA and some commercial
products. Most of the commercial products are not as accurate as SA but
my client has some valid points. I have limited experience setting up SA
(always plain vanilla installs through RH and single user systems) and I
wanted to get some feedback on what is possible/feasible from the list.
It would be used in a multi-mail-server environment where all external mail
would be routed through it for checks; spam would be tagged through modified
headers and subjects.
The issues are:
1) Does SA auto upgrade to newer versions (of the spam defs more
importantly) or could it be set up to do so through an automated CPAN
install?
2) Can individual users easily have their own training files (I know it
is possible; I am asking more about feasibility) when they do not have an
account on the system, i.e. their training file would be located via the
address the email was sent to? If so, is there anything out there to manage
that? If not, would it be possible to write a mail handler to parse the mail
addressed to a spam@host mailbox that users could forward spam to? This
script could then use the From line to determine which user to train on.
3) What other issues will I run into in a multi user environment where
the SA server is simply forwarding on mail between MTAs and not the
final delivery agent?

Thanks in advance,
Tom
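
The mail handler asked about in question 2 could be sketched roughly as
follows. This is a minimal Python sketch under stated assumptions, not a
tested deployment: the drop box is assumed to hand each forwarded message to
the script as raw bytes, KNOWN_USERS is a hypothetical table of addresses
allowed to train, and the sa-learn command line is only built, never run.
Because the From line is trivially forged, the sketch ignores senders it does
not know.

```python
from email import message_from_bytes, policy
from email.utils import parseaddr

# Hypothetical mapping from sender address to the name used for that
# user's training data; a real site would load this from its user database.
KNOWN_USERS = {"alice@example.com": "alice", "bob@example.com": "bob"}


def training_user(raw_message: bytes):
    """Return the training user for a message sent to spam@host, or None."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    # parseaddr splits 'Alice <alice@example.com>' into (name, address).
    _name, address = parseaddr(str(msg.get("From", "")))
    return KNOWN_USERS.get(address.lower())


def learn_command(user: str, mbox_path: str, is_spam: bool = True):
    """Build (but do not run) an sa-learn command line for one user.

    Assumption: the installed sa-learn supports --username and a per-user
    (for example SQL-backed) Bayes store; check `man sa-learn` first.
    """
    kind = "--spam" if is_spam else "--ham"
    return ["sa-learn", kind, "--username=" + user, "--mbox", mbox_path]
```

A caller would first resolve the user with training_user() and, only if that
returns a name, pass the extracted messages to the command from
learn_command().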
RE: selling a client on SA [ In reply to ]
> 1) Does SA auto upgrade to newer versions (of the spam defs more
> importantly) or could it be set up to do so through an automated CPAN
> install?

Not that I'm aware of. I suspect it could be done, but in the year or so
I've been using it on our relay here I just spend ten minutes whenever a new
release comes out and update it. It doesn't take long and (so far) just
works.

> 3) What other issues will I run into in a multi user
> environment where
> the SA server is simply forwarding on mail between MTAs and not the
> final delivery agent?

None that I've encountered. I'm surprised it does as well as it does, given
that it's using one set of settings for 600 or so mailboxes. (I'd be
interested in the answer to question #2.)
Re: selling a client on SA [ In reply to ]
From: "Thomas Bolioli" <tpblists@terranovum.com>

<original message snipped; quoted in full above>

Setup is not all that hard. It takes time to train the filters, and my
experience indicates this is required. Generally each user must train
their own Bayesian filter, or else generic training must be set up by
the administrator. That involves looking at both spam and ham as they
come through. *I* would not do that since it involves looking at
another person's mail.

And in an environment with (we're stuck with it) Windows Outlook Express
for the mail readers and Linux for the spam filters, generating the spam
and ham databases is a royal PITA. (I know *I* am not about to run
Outlook. And converting to another mail tool at this point is somewhat
er "awkward", to say nothing of painful.)

Note that I am still using SpamAssassin in preference over other
potential tools. It works. I have personal control over it. But then,
I am a programmer by trade these days. I'd not try to install it for
my brother under these conditions, for example. (He's a rather er
unimaginative counter of Ford automobile dealership beans.)

{^_^}
Re: selling a client on SA [ In reply to ]
Thomas Bolioli wrote:

Hi Thomas

> 1) Does SA auto upgrade to newer versions (of the spam defs more
> importantly) or could it be set up to do so through an automated CPAN
> install?

No, and you might not like it if it did. Auto-update sounds too commercial to
me, and it takes away from you the privilege of knowing what goes on during
an update (including what went wrong and why).

Look, when I moved from SA 2.55 to 2.61 I had to rebuild the Bayes
database, which required stopping the spamd daemon. 2.7 might work only on
Perl 5.8+ ... you don't want auto-update to update Perl as well, do
you?! :-)

> 2) Can individual users easily have their own training files (I know it
> is possible, looking more for feasibility) when they do not have an

Look, here we route about 2GB of incoming Internet mail every day
(40000+ mails per day) to about 5-6000 mailboxes spread over a
dozen domains. None of the individuals ever asked for rules
customization. I think it's a matter of how you "sell" the antispam
solution (guarantee 90% hit ratio, so they'll tolerate what passes
through, then produce statistics).

> 3) What other issues will I run into in a multi user environment where
> the SA server is simply forwarding on mail between MTAs and not the
> final delivery agent?

Bounces! One of the domains here receives only spam (90% of the whole
traffic), and the resulting bounces usually go back to nonexistent addresses.

High-Availability is another issue.

Rather than marking subjects you might consider quarantining spam on the
transit MTA. This reduces bounces too!

Paolo
Re: selling a client on SA [ In reply to ]
On Monday 16 February 2004 16:08, Thomas Bolioli wrote:
> 2) Can individual users easily have their own training files (I know it
> is possible, looking more for feasibility) when they do not have an
> account on the system. ie; their training file could be located via the
> addressed email. If so, is there anything out there to manage that? If
> not would it be possible to write a mail handler to parse the mail
> addressed to a spam@host mail box that users could forward spam to? This
> script could then use the from line to determine which user to train on.

Actually, I'm deploying SpamAssassin as we speak in just the kind of
environment you describe. For the record, I'm trying to automate a server
that provides Virtual Servers for clients. They get an all-in package at my
place, which means unlimited email accounts, unlimited forwarders,
SpamAssassin, virus scanning, etc.

I'm using PostgreSQL as a central information server. Postfix (my preferred
MTA) gets most of its configuration from PostgreSQL, and Courier-IMAP and -POP
get their information from the same PostgreSQL catalog. This week, I'm going
to install the SA that's in Subversion, because it sports fully DB-based
Bayes and AWL. That means everything will be done in the database; users only
need disk space for their maildir (which could actually be in the DB too, but
we've chosen a different path for that).
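
To illustrate the idea of keeping per-user settings in a central database,
here is a small Python sketch. Everything in it is an assumption made for
illustration: sqlite3 stands in for PostgreSQL, and the userpref table, its
columns and the "@GLOBAL" fallback row are hypothetical names, not a schema
confirmed anywhere in this thread.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory stand-in for the central preferences database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: one row per (user, preference) pair.
    conn.execute(
        "CREATE TABLE userpref (username TEXT, preference TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO userpref VALUES (?, ?, ?)",
        [
            ("@GLOBAL", "required_score", "5.0"),  # site-wide default
            ("alice@example.com", "required_score", "7.5"),  # user override
        ],
    )
    return conn


def lookup_pref(conn, username, preference):
    """Return the user's own value if present, else the site-wide default."""
    row = conn.execute(
        "SELECT value FROM userpref "
        "WHERE preference = ? AND username IN (?, '@GLOBAL') "
        "ORDER BY CASE WHEN username = ? THEN 0 ELSE 1 END LIMIT 1",
        (preference, username, username),
    ).fetchone()
    return row[0] if row else None
```

The point of the fallback row is that users who never customize anything
still get sensible behaviour without needing a row of their own.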

It should be possible with Michael's neat DB patches for Bayes and AWL in a
DB. I'm going to try this week to get it to work and, if I succeed, I will
hopefully write a HowTo on how to do this in the coming six weeks.

The largest difficulty I see at this time is how to get procmail in
order, so people can teach their own filters and stuff. I'll find a way,
though.

--
Kind regards,
Tim Stoop
Complete Internet Development
http://www.cidev.nl

Random quote/fortune:
You're dead, Jim. -- McCoy, "Amok Time", stardate 3372.7
Re: selling a client on SA [ In reply to ]
>
> And in an environment with (we're stuck with it) Windows Outlook Express
> for the mail readers and Linux for the spam filters generating the spam
> and ham databases is a royal pita. (I know *I* am not about to run
> Outlook. And converting to another mail tool at this point is somewhat
> er "awkward" to say nothing about painful.)
>

I assume you mean that it's a pain to submit SPAM and HAM messages to feed
bayes...

It isn't as big a PITA as you might think. (I added this to the Wiki
recently, so perhaps you've not seen it.) It is more work than "auto"
processing - but for small sites, I much prefer to hand check submissions.
(We are using a shared Bayes DB for, say, 50 or fewer users.)

You CAN get messages from users unmodified pretty easily. You don't have to
use IMAP shared folders either.

First I set up two mail drop boxes - call them spam@domain.com and
ham@domain.com.

Then I simply have users do this in Outlook/Outlook Express:

1) Open a new mail message.
2) Address the new message to the correct drop box (spam@domain.com or
ham@domain.com).
3) Drag the messages that apply, ham or spam, into the new message - they'll
be sent as attachments. (Make sure users don't put both ham and spam in the
same message. That will make life a pain!)

Then I can pick up the messages myself with IMAP and review them for real
hammy/spammy-ness - and drag them into another IMAP folder for processing.
Then I use sa-learn to teach Bayes on the IMAP folder using the --mbox option.
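
For sites that would rather script the extraction step than drag messages
around by hand, the drop-box procedure above could be sketched in Python as
below. This is only a sketch under assumptions: the submission is available
as raw bytes, the dragged messages arrive as message/rfc822 attachments, and
the function names and mbox path are made up for illustration. The wrapper
message itself is deliberately not written out, so only the attached
messages would be learned.

```python
import mailbox
from email import message_from_bytes, policy


def attached_messages(raw_wrapper: bytes):
    """Return the messages a user dragged into a drop-box submission."""
    wrapper = message_from_bytes(raw_wrapper, policy=policy.default)
    found = []
    stack = [wrapper]
    while stack:
        part = stack.pop()
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-item list
            # holding the embedded message; do not descend any further.
            found.append(part.get_payload(0))
        elif part.is_multipart():
            # Reverse so that parts come off the stack in document order.
            stack.extend(reversed(part.get_payload()))
    return found


def append_to_mbox(messages, mbox_path: str) -> int:
    """Append messages to an mbox file and return how many were written."""
    box = mailbox.mbox(mbox_path)
    try:
        for msg in messages:
            box.add(msg.as_bytes())
        box.flush()
    finally:
        box.close()
    return len(messages)
```

After hand review, the resulting file could be fed to something like
"sa-learn --spam --mbox" for the spam drop box, or "--ham" for the ham one.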

(Plus, I can do all of this remotely. I'm a consultant, and the more I can
do from off site, the better!)
(I'm a newbie too, so perhaps I'm misunderstanding things - but this is a
pretty decent way to do things.)

Greg

----- Original Message -----
From: "jdow" <jdow@earthlink.net>
To: <spamassassin-users@incubator.apache.org>
Sent: Monday, February 16, 2004 7:26 AM
Subject: Re: selling a client on SA


<quoted text snipped; see above>
Re: selling a client on SA [ In reply to ]
Gregory Sloop, Sloop Network & Computer Consulting said:

> I assume you mean that it's a pain to submit SPAM and HAM messages to feed
> bayes...
<how to submit ham/spam snipped>
I have SA running in a relay situation. It has handled only 50K messages so
far, and I have yet to train it with either spam or ham.
I just set my scores high enough, and only catch 70% of the incoming spam,
but I don't have false positives, or only statistically insignificant false
positives.
I'd have to manually examine the scores to determine my actual block rate;
this is a guesstimate based on the number of mails at each spam score.

Really it all comes down to how much spam you are willing to block and how
many false positives you are willing to tolerate.
It's fascinating that while the volume of spam varies day to day, the
average spam score varies by less than .25 points in any day.
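
A rough way to reproduce that kind of estimate is to collect the scores from
the X-Spam-Status headers and see what fraction of mail lands at or above
each candidate threshold. The Python sketch below assumes the header lines
have already been pulled out of logs or mailboxes; the function names are
illustrative. Note that it measures how much mail gets tagged at a given
threshold, not the true block rate, which would need hand-classified mail.

```python
import re

# Matches the score in headers such as "X-Spam-Status: Yes, score=7.3 ..."
# (older SpamAssassin releases wrote "hits=" instead of "score=").
SCORE_RE = re.compile(
    r"^X-Spam-Status:.*?\b(?:score|hits)=(-?\d+(?:\.\d+)?)", re.IGNORECASE
)


def scores_from_headers(header_lines):
    """Pull the numeric score out of each X-Spam-Status header line."""
    scores = []
    for line in header_lines:
        match = SCORE_RE.match(line)
        if match:
            scores.append(float(match.group(1)))
    return scores


def tagged_fraction(scores, threshold):
    """Fraction of scored messages at or above the threshold."""
    if not scores:
        return 0.0
    return sum(1 for s in scores if s >= threshold) / len(scores)
```

Running tagged_fraction() over a range of thresholds shows how much extra
mail would be tagged for each step the required score is lowered.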

--
Luke Computer Science System Administrator
Security Administrator,College of Engineering
Montana State University-Bozeman,Montana
RE: selling a client on SA [ In reply to ]
Hrrm - are you sure it's learning properly with the emails as attachments
(and with multiple attachments)?

I know that when you open the attachment you can see the properties, but when
it's stored won't Bayes misunderstand, since the spam is encapsulated within
a non-spam message (or are you manually extracting the attachments for
placement within IMAP)? As a quick notation, call the wrapper message 0 and
the attachments 1, 2 and 3, so the message as sent is 0+1+2+3. Is Bayes, when
it learns tokens, going to drop 0 and learn 1, 2 and 3 separately, or is
it going to learn 0+1+2+3 as a whole and make additional tokens based on
improper correlations between attachments 1, 2 and 3?

If Bayes is learning just the attachment portions and not the original
message, how much of a support issue is it to ensure that these attachments
arrive from your clients in a Bayes-readable format (i.e. not in winmail.dat)?
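
One way to gauge that support burden would be to check each submission
before it goes anywhere near sa-learn. The Python sketch below is an
assumption-laden illustration: it treats a submission as usable only if it
carries at least one message/rfc822 part, and flags TNEF content by its
usual content type (application/ms-tnef) or file name (winmail.dat). The
function name and the wording of the messages are made up.

```python
from email import message_from_bytes, policy


def submission_problems(raw_wrapper: bytes):
    """List reasons a drop-box submission may not be usable for training."""
    wrapper = message_from_bytes(raw_wrapper, policy=policy.default)
    parts = list(wrapper.walk())
    types = [part.get_content_type() for part in parts]
    filenames = [(part.get_filename() or "").lower() for part in parts]
    problems = []
    if "application/ms-tnef" in types or "winmail.dat" in filenames:
        problems.append("contains a TNEF (winmail.dat) part")
    if "message/rfc822" not in types:
        problems.append("no message/rfc822 attachment found")
    return problems
```

An empty list means the submission at least has the expected shape;
anything else could be bounced back to the user with the reasons listed.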

This seems to me to be a whole lot of effort beyond a global imap
connection.

Todd

-----Original Message-----
From: Gregory Sloop, Sloop Network & Computer Consulting
[mailto:lsgregs@sloop.net]
Sent: Monday, February 16, 2004 10:27 AM
To: jdow; spamassassin-users@incubator.apache.org
Subject: Re: selling a client on SA

<quoted text snipped; see above>