Mailing List Archive

Is pyzor recommended by folks on this list?
I just installed pyzor and did a random spot check of about 10 spam
emails to try to evaluate it using this command:

pyzor check < some_spam

Only one message gave me a hit on pyzor.
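
For a sample larger than ten messages, the output of each check could be tallied with a small script. This is a minimal Python sketch, assuming each line of `pyzor check` output has the form `server:port (status) report-count whitelist-count`. The function names are hypothetical, and the rule that a "hit" means at least one report and no whitelist entries is an assumption for illustration, not pyzor's own definition.

```python
import re

# Assumed line shape: "public.pyzor.org:24441  (200, 'OK')  3  0"
LINE_RE = re.compile(r"^(\S+)\s+(\(.*\))\s+(\d+)\s+(\d+)\s*$")


def parse_check_line(line):
    """Return (server, report_count, whitelist_count) for one output line."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unexpected pyzor output: {line!r}")
    server, _status, reports, whitelisted = match.groups()
    return server, int(reports), int(whitelisted)


def tally(lines):
    """Return (hits, total) over an iterable of `pyzor check` output lines."""
    hits = 0
    total = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _server, reports, whitelisted = parse_check_line(line)
        total += 1
        # Assumption: reported at least once and never whitelisted.
        if reports > 0 and whitelisted == 0:
            hits += 1
    return hits, total
```

The lines could be collected by running `pyzor check < file` once per sample and appending the output to a file that is then passed to `tally`.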

But I take my results with a grain of salt because I may not have pyzor
configured optimally.

For one, I'm using the public pyzor server. Maybe there are other more
useful servers?

Second, I'm not sure if my tests will work on my spam samples which have
the spam encapsulated with the "report_safe" setting set to a value of
"1". By the way, anyone know of a CLI utility for extracting the
original spam email from these files?

So before I explore pyzor any further, I'm wondering if the default
rules built into SA are good enough or if pyzor improves the accuracy of
SA enough to be worth the extra cycles to install it and keep it
functional.

What do you think?
Re: Is pyzor recommended by folks on this list?
On 2021-04-11 15:13, Steve Dondley wrote:

> What do you think?

pyzor is useful if you run pyzord locally. IMHO pyzor was designed
around a local pyzord that the pyzor client queries, with pyzord able
to fetch results from other pyzord server farms, but sadly that has
never happened.

The reason I say this is that I used pyzord locally with a well-trained
spam corpus, and it gave better results than the remote pyzord servers
ever did.

Now I don't need it anymore.
Re: Is pyzor recommended by folks on this list?
On 2021-04-11 09:34 AM, Benny Pedersen wrote:
> On 2021-04-11 15:13, Steve Dondley wrote:
>
>> What do you think?
>
> pyzor is useful if you run pyzord locally. IMHO pyzor was designed
> around a local pyzord that the pyzor client queries, with pyzord able
> to fetch results from other pyzord server farms,

Interesting. I wonder if it might be worth it to set up my own pyzor
server for my own network of mail servers. That's probably going to be
easier than sharing spam/ham samples around between users.
Re: Is pyzor recommended by folks on this list?
On 2021-04-11 16:04, Steve Dondley wrote:

> Interesting. I wonder if it might be worth it to set up my own pyzor
> server for my own network of mail servers. That's probably going to be
> easier than sharing spam/ham samples around between users.

Yes, using it this way is lighter in Sieve scripting than training
Bayes. I don't know whether results from pyzor are used in the Bayes
digest; if they are, that would be wonderful.
Re: Is pyzor recommended by folks on this list?
On 11.04.21 09:13, Steve Dondley wrote:
>I just installed pyzor and did a random spot check of about 10 spam
>emails to try to evaluate it using this command:
>
>pyzor check < some_spam
>
>Only one message gave me a hit on pyzor.

I have pyzor enabled, and I have certainly changed pyzor_timeout to 5.
Logs for the last week show about 24 hits out of 577 mails (roughly 4%).

Some mail got rejected or pushed over the required score thanks to pyzor.

>But I take my results with a grain of salt because I may not have
>pyzor configured optimally.
>
>For one, I'm using the public pyzor server. Maybe there are other more
>useful servers?

I don't know of any.
Using public servers is best if you want to know how many people reported
the mail as spam.

I would only consider using a private server in addition to the public servers.

>Second, I'm not sure if my tests will work on my spam samples which
>have the spam encapsulated with the "report_safe" setting set to a
>value of "1". By the way, anyone know of a CLI utility for extracting
>the original spam email from these files?

I recommend using report_safe 0.

>So before I explore pyzor any further, I'm wondering if the default
>rules built into SA are good enough or if pyzor improves the accuracy
>of SA enough to be worth the extra cycles to install it and keep it
>functional.
>
>What do you think?

Install and enable Razor and DCC as well. All of them help.

--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Enter any 12-digit prime number to continue.
Re: Is pyzor recommended by folks on this list?
> value of "1". By the way, anyone know of a CLI utility for extracting
> the original spam email from these files?

Here's a very crude perl script that does the trick:

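#!/usr/bin/perl
# Crude extractor for spam wrapped by SpamAssassin's "report_safe 1".
# The output still begins with the MIME part headers of the embedded
# message, so it is not a clean email on its own.

use strict;
use warnings;

# Slurp the whole wrapped message from STDIN (or files named in @ARGV).
my $email = do { local $/; <> } // '';

# The first boundary parameter belongs to the outer wrapper.
my ($boundary) = $email =~ /boundary="([^"]+)"/
    or die "no MIME boundary found\n";

# Capture everything between the last part delimiter and the closing
# delimiter. \Q...\E keeps boundary characters from acting as regex syntax.
my ($orig_content) = $email =~
    /^--\Q$boundary\E.*^--\Q$boundary\E(.*)^--\Q$boundary\E--/ms
    or die "no encapsulated part found\n";

print $orig_content;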

You would use it like this:

./spam_extractor.pl < email_file_with_encapsulated_spam
Re: Is pyzor recommended by folks on this list?
On 11 Apr 2021, at 13:21, Steve Dondley wrote:

>> value of "1". By the way, anyone know of a CLI utility for extracting
>> the original spam email from these files?

spamassassin -d < wrappedspam.eml

As documented in the spamassassin-run man page or by running
'spamassassin --help'

>
> Here's a very crude perl script that does the trick:
>
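> #!/usr/bin/perl
> # Crude extractor for spam wrapped by SpamAssassin's "report_safe 1".
> # The output still begins with the MIME part headers of the embedded
> # message, so it is not a clean email on its own.
>
> use strict;
> use warnings;
>
> # Slurp the whole wrapped message from STDIN (or files named in @ARGV).
> my $email = do { local $/; <> } // '';
>
> # The first boundary parameter belongs to the outer wrapper.
> my ($boundary) = $email =~ /boundary="([^"]+)"/
>     or die "no MIME boundary found\n";
>
> # Capture everything between the last part delimiter and the closing
> # delimiter. \Q...\E keeps boundary characters from acting as regex syntax.
> my ($orig_content) = $email =~
>     /^--\Q$boundary\E.*^--\Q$boundary\E(.*)^--\Q$boundary\E--/ms
>     or die "no encapsulated part found\n";
>
> print $orig_content;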
>
> You would use it like this:
>
> ./spam_extractor.pl < email_file_with_encapsulated_spam

Ewww. FWIW, that leaves the MIME headers for the embedded message in
place with a blank line before and after, so the output is not a proper
email message.
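
For comparison, a MIME-aware approach avoids that problem by parsing the wrapper instead of matching boundaries with a regex. The following is a minimal Python sketch using only the standard library `email` package. It assumes the wrapper carries the original as a `message/rfc822` part, which is how "report_safe 1" attaches it. The function name `extract_original` is hypothetical, and the re-serialized output is not guaranteed to be byte-identical to the original message.

```python
import email
from email import policy


def extract_original(wrapped_bytes):
    """Return the first embedded message/rfc822 part as bytes, or None."""
    msg = email.message_from_bytes(wrapped_bytes, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the embedded message object.
            inner = part.get_payload(0)
            return inner.as_bytes()
    return None  # no encapsulated message found
```

Called on the bytes of a wrapped file, this returns only the embedded message, without the MIME part headers of the wrapper. For real use, `spamassassin -d` remains the documented tool.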

--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: Is pyzor recommended by folks on this list?
On 2021-04-11 03:09 PM, Bill Cole wrote:
> On 11 Apr 2021, at 13:21, Steve Dondley wrote:
>
>>> value of "1". By the way, anyone know of a CLI utility for extracting
>>> the original spam email from these files?
>
> spamassassin -d < wrappedspam.eml

Ah, ok. I was familiar with the -d option but did not know it could be
used to redirect to output like this:

spamassassin -d < filtered_email > orig_email

I tried it and it did what I needed. Thanks.
Re: Is pyzor recommended by folks on this list?
On Sun, 11 Apr 2021 09:13:26 -0400
Steve Dondley wrote:



> Second, I'm not sure if my tests will work on my spam samples which
> have the spam encapsulated with the "report_safe" setting set to a
> value of "1".

I wouldn't expect it to work at all. "report_safe" encapsulation
creates a new email that isn't itself spam.
Re: Is pyzor recommended by folks on this list?
>> Second, I'm not sure if my tests will work on my spam samples which
>> have the spam encapsulated with the "report_safe" setting set to a
>> value of "1".
>
> I wouldn't expect it to work at all. "report_safe" encapsulation
> creates a new email that isn't itself spam.

From what I read on pyzor's home page and how it works, pyzor strips off
all headers. So I would assume it doesn't matter if it's encapsulated. I
could be, and quite likely am, totally wrong about this, of course.
Re: Is pyzor recommended by folks on this list?
On Sun, 11 Apr 2021 10:04:03 -0400
Steve Dondley wrote:

> On 2021-04-11 09:34 AM, Benny Pedersen wrote:
> > On 2021-04-11 15:13, Steve Dondley wrote:
> >
> >> What do you think?
> >
> > pyzor is useful if you run pyzord locally. IMHO pyzor was designed
> > around a local pyzord that the pyzor client queries, with pyzord able
> > to fetch results from other pyzord server farms,
>
> Interesting. I wonder if it might be worth it to set up my own pyzor
> server for my own network of mail servers. That's probably going to
> be easier than sharing spam/ham samples around between users.

I don't see the advantage. You might just as well submit to the shared
server so everyone benefits.

Pyzor is not a realistic substitute for Bayes.
Re: Is pyzor recommended by folks on this list?
On 2021-04-11 23:20, RW wrote:
> On Sun, 11 Apr 2021 10:04:03 -0400
> Steve Dondley wrote:
>
>> On 2021-04-11 09:34 AM, Benny Pedersen wrote:
>> > On 2021-04-11 15:13, Steve Dondley wrote:
>> >
>> >> What do you think?
>> >
>> > pyzor is useful if you run pyzord locally. IMHO pyzor was designed
>> > around a local pyzord that the pyzor client queries, with pyzord able
>> > to fetch results from other pyzord server farms,
>>
>> Interesting. I wonder if it might be worth it to set up my own pyzor
>> server for my own network of mail servers. That's probably going to
>> be easier than sharing spam/ham samples around between users.
>
> I don't see the advantage. You might just as well submit to the shared
> server so everyone benefits.
>
> Pyzor is not a realistic substitute for Bayes.

And centralizing problems is just another problem.

I prefer Jabber over IRC or any other solution. My point is valid as
written: remote pyzor servers don't know what is or isn't spam locally.
They could share results if wanted, but this was never implemented in
pyzord or the pyzor client.
Re: Is pyzor recommended by folks on this list?
On Sun, 11 Apr 2021 16:57:54 -0400
Steve Dondley wrote:

> >> Second, I'm not sure if my tests will work on my spam samples which
> >> have the spam encapsulated with the "report_safe" setting set to a
> >> value of "1".
> >
> > I wouldn't expect it to work at all. "report_safe" encapsulation
> > creates a new email that isn't itself spam.
>
> From what I read on pyzor's home page and how it works, pyzor strips
> off all headers. So I would assume it doesn't matter if it's
> encapsulated. I could be, and quite likely am, totally wrong about
> this, of course.

But the new email has its own body text and the whole of the original
email, including headers, is part of the body.
Re: Is pyzor recommended by folks on this list?
On Sunday 11 April 2021 at 23:27:26, Benny Pedersen wrote:

> On 2021-04-11 23:20, RW wrote:
> >
> > I don't see the advantage. You might just as well submit to the shared
> > server so everyone benefits.
> >
> > Pyzor is not a realistic substitute for Bayes.
>
> And centralizing problems is just another problem.

Why do you say that? Surely sharing a larger common pool of spam indicators
is in everyone's interests?

Every individual trying to solve the same problems on their own is
certainly the least efficient solution possible?

> I prefer Jabber over IRC or any other solution.

I don't understand the relevance of that comment?

> My point is valid as written: remote pyzor servers don't know what is or
> isn't spam locally. They could share results if wanted, but this was never
> implemented in pyzord or the pyzor client.

I must be confused then - what do you believe *is* the purpose of pyzor?


Antony.

--
These clients are often infected by viruses or other malware and need to be
fixed. If not, the user at that client needs to be fixed...

- Henrik Nordstrom, on Squid users' mailing list

Please reply to the list;
please *don't* CC me.
Re: Is pyzor recommended by folks on this list?
On Sun, April 11, 2021 17:44, Antony Stone wrote:
>> My point is valid as written: remote pyzor servers don't know what is or isn't spam locally. They
>> could share results if wanted, but this was never implemented in pyzord or the pyzor client.
>
> I must be confused then - what do you believe *is* the purpose of pyzor?

I think the OP is confused as to what a pyzor listing means.

A pyzor or DCC listing along with other badness will add a point or two in our scoring system depending on what the badness is.

IMO using pyzor or DCC as a spam indicator is a bad idea.
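
The additive approach described above, where a digest listing only contributes a point or two alongside other evidence, can be sketched as follows. This is an illustrative Python sketch only: the rule names, the weights, and the threshold of 5.0 are hypothetical and do not come from any real ruleset.

```python
REQUIRED_SCORE = 5.0  # assumption: threshold at which mail is treated as spam

# Hypothetical rule names and weights, for illustration only.
RULE_SCORES = {
    "DIGEST_PYZOR": 1.5,    # listed on a pyzor server
    "DIGEST_DCC": 1.0,      # listed as bulk by DCC
    "BAYES_HIGH": 3.5,      # high Bayes probability
    "SUSPICIOUS_URI": 2.0,  # some other "badness"
}


def total_score(hits):
    """Sum the weights of all rules that fired."""
    return sum(RULE_SCORES[name] for name in hits)


def is_spam(hits):
    """A message is spam only if the combined score reaches the threshold."""
    return total_score(hits) >= REQUIRED_SCORE
```

With these weights, a pyzor and DCC listing alone totals 2.5 and is not enough, while a pyzor listing combined with a high Bayes score reaches the threshold.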

John Capo
Re: Is pyzor recommended by folks on this list?
On 2021-04-12 18:21, John Capo wrote:
> On Sun, April 11, 2021 17:44, Antony Stone wrote:
>>> My point is valid as written: remote pyzor servers don't know what
>>> is or isn't spam locally. They could share results if wanted, but
>>> this was never implemented in pyzord or the pyzor client.
>>
>> I must be confused then - what do you believe *is* the purpose of
>> pyzor?
>
> I think the OP is confused as to what a pyzor listing means.
>
> A pyzor or DCC listing along with other badness will add a point or
> two in our scoring system depending on what the badness is.
>
> IMO using pyzor or DCC as a spam indicator is a bad idea.

I am no expert, but remote servers never know what is spam for me. The
only valid point of DCC, pyzor, and Razor is that they indicate a
message was sent to other mailboxes, not just to me. Training any of
them locally does not help the remote servers at all, hence it could
still be useful to run the server part locally.

YMMV

PS: there are other digests, e.g. iXhash and Hashcash; they all fail :=)
Re: Is pyzor recommended by folks on this list?
>>> My point is valid as written: remote pyzor servers don't know what is or isn't spam locally. They
>>> could share results if wanted, but this was never implemented in pyzord or the pyzor client.

>On Sun, April 11, 2021 17:44, Antony Stone wrote:
>> I must be confused then - what do you believe *is* the purpose of pyzor?

On 12.04.21 12:21, John Capo wrote:
>I think the OP is confused as to what a pyzor listing means.
>
>A pyzor or DCC listing along with other badness will add a point or two in
> our scoring system depending on what the badness is.

>IMO using pyzor or DCC as a spam indicator is a bad idea.

Razor is an indicator of spamminess: spam mails are to be reported manually.

pyzor was originally Razor rewritten in Python, but it now uses its own
servers, with the same intention AFAIK.

DCC is an indicator of bulkiness: all mails are to be reported automatically.
DCC version 2 contains indicators of server reputation.

Using all of them as indications of spamminess is fine, but not enough.

--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
- Holmes, what kind of school did you study to be a detective?
- Elementary, Watkins. -- Daffy Duck & Porky Pig
Re: Is pyzor recommended by folks on this list?
On Tue, 13 Apr 2021 14:10:02 +0200
Matus UHLAR - fantomas wrote:


> pyzor was originally Razor rewritten in Python, but it now uses its
> own servers, with the same intention AFAIK.

It's not just a matter of servers; they do very different things. Pyzor
hashes selected lines from a preprocessed version of the email. Razor2
is based on URIs and body length.
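
To make the idea of hashing selected lines from a preprocessed body concrete, here is a toy Python sketch. It is not pyzor's actual algorithm: the normalization (dropping all whitespace), the line selection (keeping lines of at least `min_len` characters), and the name `toy_digest` are assumptions chosen only to illustrate the principle. It handles single-part text messages only.

```python
import hashlib
from email import message_from_string


def toy_digest(raw_message, min_len=8):
    """Hash the normalized, sufficiently long body lines of a message."""
    # Headers are parsed off and ignored; only the body is hashed.
    body = message_from_string(raw_message).get_payload()
    selected = []
    for line in body.splitlines():
        normalized = "".join(line.split())  # remove all whitespace
        if len(normalized) >= min_len:      # skip short or empty lines
            selected.append(normalized)
    return hashlib.sha1("".join(selected).encode("utf-8")).hexdigest()
```

Two messages with different headers but the same body text produce the same digest under this scheme. A "report_safe" wrapper produces a different one, because the report text and the original headers become part of the hashed body, which matches the point made earlier in the thread.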