Mailing List Archive

Pipe Transport + SpamAssassin = Double Exim Processes. Solution?
We're testing Exim 4.05, Maildirs, and SpamAssassin 2.31. We need to do
filtering only for customers who have "opted in", and save any spam to
"$home/SpamMaildir", which the user owns.

About half of our daily 120,000 messages need to be filtered.

Nearly every recommended method for using SpamAssassin involves a pipe
transport within Exim, and a command such as:

command = "spamassassin -P | exim -oMr spam-scanned -i -f"

Along with a system filter.

This seems to triple the amount of work. Exim handles the message as it
comes in, then sends the entire message to spamassassin or spamc, who
sends it to another Exim, which is delivered by the system filter. What
if the message is 10M in size? And keep in mind, it has to do this 2,500
times an hour.
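
As a quick check on that figure, here is a rough back-of-envelope sketch in Python. It assumes mail arrives evenly around the clock, which real traffic will not, and the constant names are only illustrative.

```python
# Rough load estimate from the figures quoted in the post.
MESSAGES_PER_DAY = 120_000   # total daily volume stated above
FILTERED_FRACTION = 0.5      # "about half" of the messages need filtering

filtered_per_day = MESSAGES_PER_DAY * FILTERED_FRACTION
filtered_per_hour = filtered_per_day / 24  # assumes an even spread over 24 hours

# In the pipe setup each filtered message is handled three times:
# by the first Exim, by spamassassin/spamc, and by the second Exim.
PASSES_PER_MESSAGE = 3
handlings_per_hour = filtered_per_hour * PASSES_PER_MESSAGE

print(f"{filtered_per_hour:.0f} filtered messages per hour")
print(f"{handlings_per_hour:.0f} message handlings per hour")
```

Peak-hour load will be higher than this average, so the estimate is best read as a lower bound.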

People have implemented exiscan, which avoids pipes by scanning the
message at SMTP time. However, failing a message after the DATA stage is
considered a "soft error" by some MTAs and RFCs. These MTAs will
continue to attempt delivery of the "spam", over and over. We don't want
to reject or bounce the spam anyway, as the return address is usually
fake, and we want our customers to be able to see the mail we have
labeled "spam".

We are so far unable to use Procmail. Procmail does not allow us to use
Maildir options like Exim does, such as "maildir_tag", which modifies
message filenames, and is needed for Maildir NFS quotas. Having Procmail
do final deliveries for half our customers, and Exim for the other half,
is something we'd like to avoid, too.

So, is using a pipe transport our only option? Is there no way to run
"spamc -c" from Exim, and use that exit code to make a delivery
determination? It'd be great to see something like:

spamfilter:
  driver = accept
  check_local_user
  require_files = $home/.filterme
  command = "spamc -c"
  condition = ${if eq{$exitcode}{}{1}{0}}
  transport = spamdeliver

spamdeliver:
  driver = appendfile
  maildir_format = true
  directory = $home/SpamMaildir
  delivery_date_add
  etc.

That way, only email for filtered persons is ever checked, Procmail and
.forward files are avoided, there's no need to pipe back to exim, no
need to check the header for easily-faked X flags, and no need to run as
a special user to use received_protocol.
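
The exit-code idea can be modelled outside Exim with a small Python sketch. This is not Exim configuration and not an existing Exim feature; it only illustrates the decision being asked for. It relies on "spamc -c" exiting with status 1 for spam and 0 otherwise. The function names, the "Maildir" directory for clean mail, and the FAKE_* stand-in commands are assumptions made for illustration.

```python
import subprocess
import sys

def is_spam(message: bytes, checker=("spamc", "-c")) -> bool:
    """Feed the message to the checker on stdin and inspect its exit status.

    "spamc -c" exits 1 for spam and 0 for non-spam. It also exits 0 when
    processing fails, so a broken scanner fails open (mail counts as clean).
    """
    result = subprocess.run(
        list(checker),
        input=message,
        stdout=subprocess.DEVNULL,  # discard the score/threshold line
        check=False,                # a non-zero status is a verdict, not an error
    )
    return result.returncode == 1

def choose_maildir(home: str, message: bytes, checker=("spamc", "-c")) -> str:
    """Pick the delivery directory; "Maildir" for clean mail is an assumed name."""
    folder = "SpamMaildir" if is_spam(message, checker) else "Maildir"
    return f"{home}/{folder}"

# Stand-ins for trying the logic on a machine without spamd: tiny Python
# commands that drain stdin and exit with a fixed status.
FAKE_SPAM = (sys.executable, "-c", "import sys; sys.stdin.buffer.read(); sys.exit(1)")
FAKE_HAM = (sys.executable, "-c", "import sys; sys.stdin.buffer.read(); sys.exit(0)")
```

For example, choose_maildir("/home/rich", raw_message) would call the real spamc, while passing FAKE_SPAM or FAKE_HAM as the checker exercises both branches without a running spamd.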

Right now, the pipe transport considers a message failed if a command
returns any output. Maybe there's a way to change delivery based on that
output?

Rich
richs@whidbey.net
Re: Pipe Transport + SpamAssassin = Double Exim Processes. Solution?
On Wed, 2002-07-24 at 21:28, Richard, WhidbeyNet NOC wrote:
> This seems to triple the amount of work. Exim handles the message as it
> comes in, then sends the entire message to spamassassin or spamc, who
> sends it to another Exim, which is delivered by the system filter. What
> if the message is 10M in size? And keep in mind, it has to do this 2,500
> times an hour.
>
By default SpamAssassin lets 'large' messages through unscanned, and by
default 'large' means anything over 250KB in size. A 10MB message is not
likely to be spam anyway.

> So, is using a pipe transport our only option? Is there no way to run
> "spamc -c" from Exim, and use that exit code to make a delivery
> determination? It'd be great to see something like:
>
Take a look at Exim's ${run...} expansion item, which sets $runrc.
spamc/spamd can be run simply to check a message and return an exit
code. We don't use it, but it may be what you want.

In our case, yes, SA seems to cause twice the work: the message arrives,
is processed by SA, and is then re-injected into Exim. The logs show two
entries per scanned message. I'm looking at using SA via local_scan
instead, to scan the message once and then add a header saying whether
it needs per-recipient processing, or something like that.


Regards,

John.
--
John Horne, University of Plymouth, UK Tel: +44 (0)1752 233914
E-mail: jhorne@plymouth.ac.uk
PGP key available from public key servers
Re: Pipe Transport + SpamAssassin = Double Exim Processes. Solution?
On 24 Jul 2002, John Horne wrote:

> > So, is using a pipe transport our only option? Is there no way to run
> > "spamc -c" from Exim, and use that exit code to make a delivery
> > determination? It'd be great to see something like:

Remember that a message may have many recipients. You certainly don't
want to run it through a spam checker for each of them.

The original poster mentioned an opt-in system. Therefore, you may have
a single message, some of whose recipients require the checking, and
some not.

You could use a router to select those recipients for which you want
checking, and then set batch_max on the transport to send a single copy
to the checker via a pipe. However, that does require a double pass
through Exim.

The best way of doing this is surely to run the checker once per
message, using the local_scan() interface, and use it to add headers to
the message. These can then be used subsequently, in filters or in
routers, to do different things for different recipients.
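
The scan-once, decide-per-recipient flow described above might be modelled like this in Python. It is only a sketch of the control flow, not of Exim's local_scan() interface, which is written in C. The header name X-Spam-Flag, the function names, and the "Maildir" directory for clean mail are assumptions for illustration.

```python
from email import message_from_bytes
from email.message import Message

SPAM_HEADER = "X-Spam-Flag"  # hypothetical header name for the verdict

def scan_once(raw: bytes, looks_like_spam) -> Message:
    """Run the checker a single time per message and record the verdict."""
    msg = message_from_bytes(raw)
    # Remove any copy supplied by the sender so the header cannot be faked.
    del msg[SPAM_HEADER]
    msg[SPAM_HEADER] = "YES" if looks_like_spam(raw) else "NO"
    return msg

def route(msg: Message, recipients, opted_in) -> dict:
    """Choose a folder per recipient from the header, without rescanning."""
    flagged = msg[SPAM_HEADER] == "YES"
    return {
        rcpt: "SpamMaildir" if flagged and rcpt in opted_in else "Maildir"
        for rcpt in recipients
    }
```

The checker runs once however many recipients the message has. Only opted-in recipients have flagged mail diverted, and everyone else receives it as usual.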

--
Philip Hazel University of Cambridge Computing Service,
ph10@cus.cam.ac.uk Cambridge, England. Phone: +44 1223 334714.