Spam with Pyzor and DCC scores
Hi everyone,

A few times a month we have spam messages getting through, often in
German, that have some spam score but not enough to be marked/discarded.
These messages are always marked by DCC, since they're of course bulk
spam, but it's also not uncommon to see Pyzor as well. I've been
wondering if there are realistic cases where both DCC and Pyzor would
mark as spam while the message was ham. I feel like when both co-occur
it's a pretty solid sign it's spam. Therefore, I'm wondering if an
upstream amplification (or a local one) would make sense.

Some examples (I can also supply full emails, but fear this might
prevent my message from arriving):
X-Spam-Status: No, score=4.082 tagged_above=-9999 required=5
    tests=[.DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001,
    HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_IMAGE_RATIO_08=0.001,
    HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985,
    SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]
X-Spam-Status: No, score=4.816 tagged_above=-9999 required=5
    tests=[.DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001,
    HEADER_FROM_DIFFERENT_DOMAINS=0.248, HTML_IMAGE_ONLY_28=0.726,
    HTML_IMAGE_RATIO_02=0.001, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1,
    PYZOR_CHECK=1.985, SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652,
    T_REMOTE_IMAGE=0.01, T_SCC_BODY_TEXT_LINE=-0.01]
X-Spam-Status: No, score=4.109 tagged_above=-9999 required=5
    tests=[.DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.029,
    HEADER_FROM_DIFFERENT_DOMAINS=0.249, HTML_IMAGE_RATIO_04=0.001,
    HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985,
    SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]
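
To make the arithmetic in these headers concrete, here is a minimal Python sketch that parses an X-Spam-Status value, checks whether DCC_CHECK and PYZOR_CHECK co-occur, and shows what a local bonus would do to the total. The bonus of 1.0 and the function names are assumptions for illustration only, not SpamAssassin defaults. The leading dot in front of DCC_CHECK is stripped on the assumption that it is an artifact of how the header was recorded. Note that all three headers already hit DIGEST_MULTIPLE at 0.001, which by its name appears to be a rule for several digest checks agreeing, so a local change might amount to raising the score of a rule that already fires.

```python
import re

# First example header from this message, copied verbatim.
HEADER = """No, score=4.082 tagged_above=-9999 required=5
    tests=[.DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001,
    HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_IMAGE_RATIO_08=0.001,
    HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985,
    SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]"""


def parse_spam_status(header):
    """Return (score, required, {test_name: score}) from an X-Spam-Status value."""
    text = " ".join(header.split())  # unfold the continuation lines
    score = float(re.search(r"score=(-?[\d.]+)", text).group(1))
    required = float(re.search(r"required=(-?[\d.]+)", text).group(1))
    tests = {}
    for item in re.search(r"tests=\[(.*?)\]", text).group(1).split(","):
        name, _, value = item.strip().partition("=")
        # Assumption: the leading "." seen on DCC_CHECK is not part of the name.
        tests[name.lstrip(".")] = float(value)
    return score, required, tests


def with_digest_bonus(header, bonus=1.0):
    """Apply a hypothetical bonus when both DCC_CHECK and PYZOR_CHECK hit.

    Returns (new_score rounded to 3 decimals, whether it reaches 'required').
    """
    score, required, tests = parse_spam_status(header)
    both_hit = "DCC_CHECK" in tests and "PYZOR_CHECK" in tests
    new_score = score + bonus if both_hit else score
    return round(new_score, 3), new_score >= required


if __name__ == "__main__":
    score, required, tests = parse_spam_status(HEADER)
    print("reported score:", score, "sum of tests:", round(sum(tests.values()), 3))
    print("with hypothetical bonus:", with_digest_bonus(HEADER))
```

With these numbers, any bonus of roughly 1 point or more would push all three examples over the required score of 5, which is also why the false-positive question matters before choosing a value.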

What are people's opinions here?

Kind regards,
Bert Van de Poel
ULYSSIS
Re: Spam with Pyzor and DCC scores
On 11.07.22 12:57, Bert Van de Poel wrote:
>A few times a month we have spam messages getting through, often in
>German, that have some spam score but not enough to be
>marked/discarded. These messages are always marked by DCC, since
>they're of course bulk spam, but it's also not uncommon to see Pyzor
>as well. I've been wondering if there are realistic cases where both
>DCC and Pyzor would mark as spam while the message was ham.

this is likely to happen if the message is empty or nearly empty.
some people are stupid and send one or two words or a short link in a
message without a Subject: ...
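
As a rough illustration of why near-empty messages are a problem for digest checks, here is a small Python sketch. The `toy_digest` function is hypothetical and is not the algorithm used by Pyzor or DCC; it only assumes that a digest is computed over a normalised body with URLs and whitespace removed, so that short messages from unrelated senders can end up with the same digest.

```python
import hashlib
import re
from collections import Counter


def toy_digest(body):
    """Hash a body after crude normalisation (NOT Pyzor's or DCC's real algorithm)."""
    text = re.sub(r"https?://\S+", "", body)   # drop URLs
    text = re.sub(r"\s+", "", text).lower()    # drop whitespace, ignore case
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical bodies from unrelated senders.
bodies = [
    "See this http://example.com/a",
    "see this\nhttps://example.org/b\n",
    "",
    "   \n",
    "Hi Anna, the quarterly report is attached, please review section 3 by Friday.",
]

# Count how often each digest is seen, as a stand-in for a shared digest database.
counts = Counter(toy_digest(b) for b in bodies)

if __name__ == "__main__":
    for body in bodies:
        print(repr(body[:30]), "seen", counts[toy_digest(body)], "time(s)")
```

In this toy setup the two "see this" messages collide, as do the two empty ones, while the longer message stays unique. A legitimate one-line mail could therefore look "bulk" to a digest service even though nobody mass-mailed it.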

> I feel like when both co-occur it's a pretty solid sign it's spam.
>Therefore, I'm wondering if an upstream amplification (or a local one)
>would make sense.

>Some examples (I can also supply full emails, but fear this might
>prevent my message from arriving):
>X-Spam-Status: No, score=4.082 tagged_above=-9999 required=5
>    tests=[.DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001,
>    HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_IMAGE_RATIO_08=0.001,
>    HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985,
>    SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]
>X-Spam-Status: No, score=4.816 tagged_above=-9999 required=5
>    tests=[.DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001,
>    HEADER_FROM_DIFFERENT_DOMAINS=0.248, HTML_IMAGE_ONLY_28=0.726,
>    HTML_IMAGE_RATIO_02=0.001, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1,
>    PYZOR_CHECK=1.985, SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652,
>    T_REMOTE_IMAGE=0.01, T_SCC_BODY_TEXT_LINE=-0.01]
>X-Spam-Status: No, score=4.109 tagged_above=-9999 required=5
>    tests=[.DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.029,
>    HEADER_FROM_DIFFERENT_DOMAINS=0.249, HTML_IMAGE_RATIO_04=0.001,
>    HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985,
>    SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]

looks like you should enable bayes.
since these headers are generated by amavis, you could train the bayes
database that amavis uses.

--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Atheism is a non-prophet organization.
Re: Spam with Pyzor and DCC scores
On 11/07/2022 15:44, Matus UHLAR - fantomas wrote:
> On 11.07.22 12:57, Bert Van de Poel wrote:
>> A few times a month we have spam messages getting through, often in
>> German, that have some spam score but not enough to be
>> marked/discarded. These messages are always marked by DCC, since
>> they're of course bulk spam, but it's also not uncommon to see Pyzor
>> as well. I've been wondering if there are realistic cases where both
>> DCC and Pyzor would mark as spam while the message was ham.
>
> this is likely to happen if the message is empty or nearly empty.
> some people are stupid and send one or two words or a short link in a
> message without a Subject: ...
>
Oh yeah, that's a case I hadn't thought of, good point!
>> I feel like when both co-occur it's a pretty solid sign it's spam. 
>> Therefore, I'm wondering if an upstream amplification (or a local
>> one) would make sense.
>
>> Some examples (I can also supply full emails, but fear this might
>> prevent my message from arriving):
>> X-Spam-Status: No, score=4.082 tagged_above=-9999 required=5
>>     tests=[.DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001,
>>     HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_IMAGE_RATIO_08=0.001,
>>     HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985,
>>     SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]
>> X-Spam-Status: No, score=4.816 tagged_above=-9999 required=5
>>     tests=[.DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001,
>>     HEADER_FROM_DIFFERENT_DOMAINS=0.248, HTML_IMAGE_ONLY_28=0.726,
>>     HTML_IMAGE_RATIO_02=0.001, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1,
>>     PYZOR_CHECK=1.985, SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652,
>>     T_REMOTE_IMAGE=0.01, T_SCC_BODY_TEXT_LINE=-0.01]
>> X-Spam-Status: No, score=4.109 tagged_above=-9999 required=5
>>     tests=[.DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.029,
>>     HEADER_FROM_DIFFERENT_DOMAINS=0.249, HTML_IMAGE_RATIO_04=0.001,
>>     HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985,
>>     SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]
>
> looks like you should enable bayes.
> since these headers are generated by amavis, you could train the bayes
> database that amavis uses.
>
We have Bayes running on the main server, but my own local server
doesn't have it, which is why it's missing. I did however take all the
spam I received myself in 2022 that wasn't caught and fed it to sa-learn
(for the amavis user), thx for that suggestion. Let's hope it works to
remove this minor inconvenience :)
Re: Spam with Pyzor and DCC scores
On 2022-07-12 00:09, Bert Van de Poel wrote:

> We have Bayes running on the main server, but my own local server
> doesn't have it, which is why it's missing. I did however take all the
> spam I received myself in 2022 that wasn't caught and fed it to
> sa-learn (for the amavis user), thx for that suggestion. Let's hope it
> works to remove this minor inconvenience :)

Razor, Pyzor and DCC detect whether a mail has been sent to more than
one recipient. They do not detect whether it is spam or not, so the
only thing a hit tells you for sure is that the mail was mass-mailed.

Bayes training also needs to see ham mails. Training on spam mails only
is the same as no training :=)
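
A toy Python sketch of why spam-only training does not help. This is not SpamAssassin's actual Bayes implementation; the `spam_probability` function and the sample messages are made up for illustration, and the scoring is a deliberately simple per-token ratio. As far as I know, SpamAssassin also refuses to use Bayes results until it has learned a minimum number of both ham and spam messages (`bayes_min_ham_num` and `bayes_min_spam_num`, 200 each by default), so `sa-learn --ham` is needed alongside `sa-learn --spam`.

```python
def spam_probability(token, spam_msgs, ham_msgs):
    """Toy per-token spam probability (NOT SpamAssassin's real Bayes formula)."""
    def fraction_containing(msgs):
        # Share of messages in this corpus that contain the token.
        if not msgs:
            return 0.0
        return sum(token in m.lower().split() for m in msgs) / len(msgs)

    p_spam = fraction_containing(spam_msgs)
    p_ham = fraction_containing(ham_msgs)
    if p_spam + p_ham == 0:
        return 0.5  # token never seen: no evidence either way
    return p_spam / (p_spam + p_ham)


# Hypothetical training corpora.
spam = ["win a free prize now", "free prize inside"]
ham = ["meeting notes for the team", "a free slot for the meeting"]

if __name__ == "__main__":
    for token in ("a", "free", "prize"):
        print(token,
              "spam-only:", spam_probability(token, spam, []),
              "spam+ham:", round(spam_probability(token, spam, ham), 3))
```

With spam-only training, every token that ever appeared in a spam message looks 100% spammy, including an ordinary word like "a". Only once ham is added can the classifier tell that "a" is neutral while "prize" remains suspicious.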

I do not use Razor, Pyzor or DCC anymore. The software is more or less
outdated on Gentoo, and I do not plan to make updated ebuilds for it;
it's too small a goal for me. On the other hand, I use fuglu, which is
doing well with Python 3.10 now that Gentoo Portage defaults to it.