Mailing List Archive

DMARC Aggregate reports - false positives
I find most DMARC reports I receive are flagged as spam by SA. 

How do people work around this? I've trained Bayes, and that is applying a -ve offset as expected, but they still end up at over 7.
 X-Spam-Status: Yes, score=7.215 tagged_above=-999 required=6.2
tests=[.BASE64_LENGTH_78_79=0.1, BASE64_LENGTH_79_INF=1.502,
BAYES_00=-1.9, DCC_CHECK=1.1, DIGEST_MULTIPLE=0.293,
DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
ENA_SUBJ_LONG_WORD=2.2, HTML_MESSAGE=0.001, LR_DMARC_PASS=-0.1,
MIME_BASE64_TEXT=1.741, MIME_HTML_MOSTLY=0.1, MPART_ALT_DIFF=0.79,
PYZOR_CHECK=1.392, RCVD_IN_DNSWL_NONE=-0.0001,
RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,
T_SCC_BODY_TEXT_LINE=-0.01, T_TVD_MIME_NO_HEADERS=0.01]
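
To see which rules carry most of that total, the tests list can be parsed and sorted. This is a minimal Python sketch, not part of SpamAssassin; the function name parse_spam_tests is made up for illustration, and the header text is copied from the message above.

```python
import re

# X-Spam-Status value copied from the message above. The stray "." before the
# first rule name is kept as posted; the parser below simply skips it.
HEADER = """Yes, score=7.215 tagged_above=-999 required=6.2
tests=[.BASE64_LENGTH_78_79=0.1, BASE64_LENGTH_79_INF=1.502,
BAYES_00=-1.9, DCC_CHECK=1.1, DIGEST_MULTIPLE=0.293,
DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
ENA_SUBJ_LONG_WORD=2.2, HTML_MESSAGE=0.001, LR_DMARC_PASS=-0.1,
MIME_BASE64_TEXT=1.741, MIME_HTML_MOSTLY=0.1, MPART_ALT_DIFF=0.79,
PYZOR_CHECK=1.392, RCVD_IN_DNSWL_NONE=-0.0001,
RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,
T_SCC_BODY_TEXT_LINE=-0.01, T_TVD_MIME_NO_HEADERS=0.01]"""


def parse_spam_tests(header):
    """Return {rule_name: score} from the tests=[...] part of the header."""
    match = re.search(r"tests=\[(.*?)\]", header, re.DOTALL)
    if match is None:
        return {}
    pairs = re.findall(r"([A-Z0-9_]+)=(-?\d+(?:\.\d+)?)", match.group(1))
    return {name: float(value) for name, value in pairs}


scores = parse_spam_tests(HEADER)
total = sum(scores.values())
# Five largest contributors, highest score first.
top = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:5]

print(f"sum of listed rules: {total:.3f}")
for name, value in top:
    print(f"{value:6.3f}  {name}")
```

For this header the five largest contributors are ENA_SUBJ_LONG_WORD, MIME_BASE64_TEXT, BASE64_LENGTH_79_INF, PYZOR_CHECK and DCC_CHECK, in that order.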

-- 
Simon Wilson
M: 0400 121 116
Re: DMARC Aggregate reports - false positives
> I find most DMARC reports I receive are flagged as spam by SA.
Which submitters? I looked at a bunch of my reports and they are all
MIME_GOOD.
Re: DMARC Aggregate reports - false positives
On Thursday, June 22, 2023 20:37 AEST, Damian <spamassassin@arcsin.de> wrote:
>> I find most DMARC reports I receive are flagged as spam by SA.
> Which submitters? I looked at a bunch of my reports and they are all MIME_GOOD.

That one was from Microsoft.

-- 
Simon Wilson
M: 0400 121 116
Re: DMARC Aggregate reports - false positives
>> Which submitters? I looked at a bunch of my reports and they are all MIME_GOOD.
>
> That one was from microsoft.

Ok, I see.

It seems to me that BASE64_LENGTH_79_INF is wrong. It is probably
motivated by RFC5322's "SHOULD be no more than 78 characters, excluding
the CRLF". My Microsoft reports trigger 79_INF even though they only
have 78 characters, excluding CRLF.
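
One way to check this on a saved report is to measure the encoded lines directly. The following is a rough Python sketch using the standard email package. The function name base64_line_lengths is hypothetical, and it only reports what is in the message; how the SA rule itself counts characters is not shown here.

```python
from email import message_from_bytes


def base64_line_lengths(raw_message):
    """Return [(content_type, longest_line)] for each base64-encoded part.

    Line lengths exclude the trailing CRLF.
    """
    msg = message_from_bytes(raw_message)
    results = []
    for part in msg.walk():
        if part.is_multipart():
            continue
        encoding = str(part.get("Content-Transfer-Encoding", "")).strip().lower()
        if encoding != "base64":
            continue
        # decode=False keeps the payload as the still-encoded text.
        payload = part.get_payload(decode=False)
        lengths = [len(line) for line in payload.splitlines() if line]
        results.append((part.get_content_type(), max(lengths, default=0)))
    return results
```

Calling it on the raw bytes of a saved report (for example, the contents of a file read in binary mode) returns one tuple per base64 part, so a result of 78 would support the observation above.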

Personally, I have lower PYZOR_CHECK and DKIMWL scores.
Re: DMARC Aggregate reports - false positives
On 2023-06-22 at 06:29:53 UTC-0400 (Thu, 22 Jun 2023 20:29:53 +1000)
Simon Wilson via users <simon@simonandkate.net>
is rumored to have said:

> I find most DMARC reports I receive are flagged as spam by SA. 
>
> How do people work around this? I've trained Bayes, and that is
> applying a -ve offset as expected, but they still end up at over 7.

The best solution for robot-generated mail to and from predictable
addresses is one of the welcomelist features. You can use more_spam_to or
all_spam_to for reporting addresses, or welcomelist_auth for senders.

Also, if you get a lot of robotic mail I would recommend that you not
use Pyzor, Razor, or DCC. All of those are engines for detecting
similarities in mail and they do that very well with regularly formatted
mail that looks much the same across many recipients.

>  X-Spam-Status: Yes, score=7.215 tagged_above=-999 required=6.2
> tests=[.BASE64_LENGTH_78_79=0.1, BASE64_LENGTH_79_INF=1.502,
> BAYES_00=-1.9, DCC_CHECK=1.1, DIGEST_MULTIPLE=0.293,
> DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
> ENA_SUBJ_LONG_WORD=2.2, HTML_MESSAGE=0.001, LR_DMARC_PASS=-0.1,
> MIME_BASE64_TEXT=1.741, MIME_HTML_MOSTLY=0.1, MPART_ALT_DIFF=0.79,
> PYZOR_CHECK=1.392, RCVD_IN_DNSWL_NONE=-0.0001,
> RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,
> T_SCC_BODY_TEXT_LINE=-0.01, T_TVD_MIME_NO_HEADERS=0.01]
>
> -- 
> Simon Wilson
> M: 0400 121 116


--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: DMARC Aggregate reports - false positives
On Thursday, June 22, 2023 23:05 AEST, Bill Cole <sausers-20150205@billmail.scconsult.com> wrote:
> On 2023-06-22 at 06:29:53 UTC-0400 (Thu, 22 Jun 2023 20:29:53 +1000)
> Simon Wilson via users <simon@simonandkate.net>
> is rumored to have said:
>
>> I find most DMARC reports I receive are flagged as spam by SA.
>>
>> How do people work around this? I've trained Bayes, and that is
>> applying a -ve offset as expected, but they still end up at over 7.
>
> The best solution for robot-generated mail to and from predictable
> addresses is one of the welcomelist features. You can use more_spam_to or
> all_spam_to for reporting addresses, or welcomelist_auth for senders.

Thanks Bill. I am using 3.4.6 on RHEL8, so I assume I will need to use the legacy terms instead of welcomelist_auth.

I'll start there.

Simon

-- 
Simon Wilson
M: 0400 121 116
Re: DMARC Aggregate reports - false positives
On 6/22/2023 6:29 AM, Simon Wilson via users wrote:
>
> How do people work around this? I've trained Bayes, and that is
> applying a -ve offset as expected, but they still end up at over 7.
> X-Spam-Status: Yes, score=7.215 tagged_above=-999 required=6.2
> tests=[.BASE64_LENGTH_78_79=0.1, BASE64_LENGTH_79_INF=1.502,
> BAYES_00=-1.9, DCC_CHECK=1.1, DIGEST_MULTIPLE=0.293,
> DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
> ENA_SUBJ_LONG_WORD=2.2, HTML_MESSAGE=0.001, LR_DMARC_PASS=-0.1,
> MIME_BASE64_TEXT=1.741, MIME_HTML_MOSTLY=0.1, MPART_ALT_DIFF=0.79,
> PYZOR_CHECK=1.392, RCVD_IN_DNSWL_NONE=-0.0001,
> RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,
> T_SCC_BODY_TEXT_LINE=-0.01, T_TVD_MIME_NO_HEADERS=0.01]
My Spam threshold is higher, so not a real problem for me.  But...

1) You might plead your case to KAM off-list and see if he can bump up
his regex length for ENA_SUBJ_LONG_WORD to something longer than 30,
like 33.
2) Lower the score for ENA_SUBJ_LONG_WORD
3) I don't run Pyzor; maybe lower the score a little bit also?
4) Create an off-setting rule; like:

    meta    DMARC_OFFSET    (ENA_SUBJ_LONG_WORD && DKIM_VALID)
    score   DMARC_OFFSET    -2.2

Yes, for sure, ALL Microsoft DMARC messages hit ENA_SUBJ_LONG_WORD.
docomo.ne.jp also hits (32 chars).  In the near-miss category, mail.ru
comes in OK at 29 characters.
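
The effect of options 2 to 4 on the message quoted above can be estimated with simple arithmetic. A small Python sketch follows, using only the numbers from that header; the function name adjusted_score and the choice to zero out Pyzor and DCC are illustrative assumptions, not recommendations.

```python
REQUIRED = 6.2          # "required" value from the quoted header
ORIGINAL_SCORE = 7.215  # "score" value from the quoted header


def adjusted_score(deltas):
    """Apply per-rule score changes (rule name -> delta) to the original total."""
    return round(ORIGINAL_SCORE + sum(deltas.values()), 3)


# Option 4: the DMARC_OFFSET meta rule adds -2.2 when both
# ENA_SUBJ_LONG_WORD and DKIM_VALID hit.
with_offset = adjusted_score({"DMARC_OFFSET": -2.2})

# Hypothetical alternative: score PYZOR_CHECK (1.392) and DCC_CHECK (1.1) at 0.
without_digests = adjusted_score({"PYZOR_CHECK": -1.392, "DCC_CHECK": -1.1})

for label, value in [("with DMARC_OFFSET", with_offset),
                     ("without Pyzor/DCC", without_digests)]:
    verdict = "spam" if value >= REQUIRED else "not spam"
    print(f"{label}: {value} ({verdict})")
```

Either change alone would bring this particular message under the 6.2 threshold.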


-- Jared Hall
Re: DMARC Aggregate reports - false positives
Mine don't get reported as spam. But I'm getting daily reports from
mimecast.org that claim to be "Content-Type: application/gzip" but have
file extension .zip. Examination finds that they're really PK zip files.
So the script I use to process them tosses them as malformed. The source
domain has no way to contact them to inform them of the error.
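
A processing script can sidestep this kind of mislabelling by looking at the first bytes of the attachment rather than the declared Content-Type or file extension. Here is a minimal Python sketch under that assumption; the function name extract_report is hypothetical, and it assumes the report XML is the first member of a zip archive.

```python
import gzip
import io
import zipfile


def extract_report(data):
    """Return the decompressed bytes of a report attachment, chosen by magic bytes."""
    if data[:2] == b"\x1f\x8b":      # gzip magic number
        return gzip.decompress(data)
    if data[:4] == b"PK\x03\x04":    # zip local file header ("PK")
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
            if not names:
                raise ValueError("zip archive has no members")
            # Assumption: the report is the first (usually only) member.
            return archive.read(names[0])
    raise ValueError("attachment is neither gzip nor zip")
```

With this approach a PK zip file labelled application/gzip would still be unpacked instead of being tossed as malformed.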