Mailing List Archive

Mail hits MISSING rules despite presence of headers
Hi,
I have emails from Wayfair and Dell that hit many of the MISSING_* rules
even though these headers are clearly present.

* 0.5 MISSING_MID Missing Message-Id: header
* 1.0 MISSING_FROM Missing From: header
* 1.8 MISSING_SUBJECT Missing Subject: header
* 1.4 MISSING_DATE Missing Date: header
* 2.3 EMPTY_MESSAGE Message appears to have no textual parts and no
* Subject: text

This also consequently causes DMARC/DKIM to fail.

https://pastebin.com/yFCRx76x

$ spamassassin --version
SpamAssassin version 4.0.0-r1904221
running on Perl version 5.36.0
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
On 2022-11-27 at 15:58:58 UTC-0500 (Sun, 27 Nov 2022 15:58:58 -0500)
Alex <mysqlstudent@gmail.com>
is rumored to have said:

> Hi,
> I have emails from wayfair and Dell that hit many of the MISSING_*
> rules
> but these headers are clearly displayed.
>
> * 0.5 MISSING_MID Missing Message-Id: header
> * 1.0 MISSING_FROM Missing From: header
> * 1.8 MISSING_SUBJECT Missing Subject: header
> * 1.4 MISSING_DATE Missing Date: header
> * 2.3 EMPTY_MESSAGE Message appears to have no textual parts and no
> * Subject: text
>
> This also consequently causes DMARC/DKIM to fail.
>
> https://pastebin.com/yFCRx76x
>
> $ spamassassin --version
> SpamAssassin version 4.0.0-r1904221
> running on Perl version 5.36.0

Cannot reproduce. Pasting a copy of that from the 'raw' view and feeding
it to 'spamassassin -t' doesn't result in hits on any of those rules.

How are you calling SA?

I have a theory about what might be happening, but it would require
using report_safe=1 and a flow that passes twice through SA...

--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
Hi,

> Cannot reproduce. Pasting a copy of that from the 'raw' view and feeding
> it to 'spamassassin -t' doesn't result in hits on any of those rules.
>
> How are you calling SA?
>
> I have a theory about what might be happening, but it would require
> using report_safe=1 and a flow that passes twice through SA...
>

I'm calling SA through amavis, but it happens even when running SA from the
command-line:

$ spamassassin -t < email.eml

I do actually notice it does print the rules that are triggered twice, but
I don't think the scores are duplicated.

report_safe=1 is set in 10_defaults.pref in the updates.spamassassin.org
ruleset.
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
Hi,

> I'm calling SA through amavis, but it happens even when running SA from
> the command-line:
>
> $ spamassassin -t < email.eml
>
> I do actually notice it does print the rules that are triggered twice, but
> I don't think the scores are duplicated.
>
> report_safe=1 is set in 10_defaults.pref in the updates.spamassassin.org
> ruleset.
>

It has something to do with this shortcircuit rule I added to my local.cf
some time ago:

shortcircuit RCVD_IN_VALIDITY_SAFE on

Commenting this out results in normal operation. Any idea how that could
possibly happen?!
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
Well, a short circuit rule kind of breaks things in the middle so I do not
think you should really spend too much time on rules that hit/didn't hit.

I like validity but I don't think it justifies a short circuit, FYI.

Regards,
KAM
--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


On Sun, Nov 27, 2022 at 8:19 PM Alex <mysqlstudent@gmail.com> wrote:

> Hi,
>
> It has something to do with this shortcircuit rule I added to my local.cf
> some time ago:
>
> shortcircuit RCVD_IN_VALIDITY_SAFE on
>
> Commenting this out results in normal operation. Any idea how that could
> possibly happen?!
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
Hi,

> Well, a short circuit rule kind of breaks things in the middle so I do not
> think you should really spend too much time on rules that hit/didn't hit.
>
> I like validity but I don't think it justifies a short circuit, FYI.
>

Okay, it's been removed. Somehow, though, the presence of that rule didn't
bypass any further checks; it actually caused the message to be classified
as spam and quarantined.
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
What's the score on that short circuit Validity rule? I think the
expectation is that it's a -100 type rule but I could be wrong. Did you
confirm with -D that the behavior is as you describe and more rules kept
running after the short circuit? I don't use the short circuit.

Also, would be helpful to know if this is different than 3.4.6's behavior.

Regards,
KAM
--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


On Mon, Nov 28, 2022 at 10:38 AM Alex <mysqlstudent@gmail.com> wrote:

> Hi,
>
>> Well, a short circuit rule kind of breaks things in the middle so I do
>> not think you should really spend too much time on rules that hit/didn't
>> hit.
>>
>> I like validity but I don't think it justifies a short circuit, FYI.
>>
>
> Okay, it's been removed, but somehow the presence of that didn't have the
> effect of bypassing any further checks, but actually causing it to be
> classified as spam and was quarantined.
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
On Mon, Nov 28, 2022 at 10:42 AM Kevin A. McGrail <kmcgrail@apache.org>
wrote:

> What's the score on that short circuit Validity rule?
>

-2.0 RCVD_IN_VALIDITY_SAFE RBL: Sender in Validity Safe - Contact
     certification@validity.com
     [Return Path SenderScore Safe List (formerly]
     [Habeas Safelist) - <http://www.senderscorecertified.com>]
-0.0 SHORTCIRCUIT Not all rules were run, due to a shortcircuited rule

So despite saying it's shortcircuiting rules, it still erroneously adds
enough points for it to be marked as spam when without the rule it wouldn't
have been marked as such.
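
As a rough check of the arithmetic, here is a minimal Python sketch that
totals only the scores quoted in this thread. The 5.0 threshold is an
assumption (it is SpamAssassin's stock required_score; the level amavis
applied here is not stated), and the real message may have hit other rules
that are not shown.

```python
from decimal import Decimal

# Scores exactly as quoted in the reports in this thread.
SPURIOUS_HITS = {
    "MISSING_MID": Decimal("0.5"),
    "MISSING_FROM": Decimal("1.0"),
    "MISSING_SUBJECT": Decimal("1.8"),
    "MISSING_DATE": Decimal("1.4"),
    "EMPTY_MESSAGE": Decimal("2.3"),
}
SHORTCIRCUITED = {
    "RCVD_IN_VALIDITY_SAFE": Decimal("-2.0"),
    "SHORTCIRCUIT": Decimal("-0.0"),
}

# Assumption: the stock required_score; the real threshold is not stated.
REQUIRED_SCORE = Decimal("5.0")


def total(*groups):
    """Sum the scores of every rule in the given groups."""
    return sum((s for g in groups for s in g.values()), Decimal("0"))


without_spurious = total(SHORTCIRCUITED)
with_spurious = total(SHORTCIRCUITED, SPURIOUS_HITS)

print(without_spurious, with_spurious, with_spurious >= REQUIRED_SCORE)
```

With these figures the spurious hits add 7.0 points, which cancels the -2.0
from RCVD_IN_VALIDITY_SAFE and lands on 5.0.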

> I think the expectation is that it's a -100 type rule but I could be
> wrong. Did you confirm with -D that the behavior is as you describe and
> more rules kept running after the short circuit? I don't use the short
> circuit.
>
> Also, would be helpful to know if this is different than 3.4.6's behavior.
>

Oh yes, I meant to mention that the behavior is different in 3.4.6. Same
score for the rule, but there it appears to actually shortcircuit the
processing of additional rules. At the least, it doesn't add those
MISSING_* rules.
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
On 2022-11-28 at 11:03:29 UTC-0500 (Mon, 28 Nov 2022 11:03:29 -0500)
Alex <mysqlstudent@gmail.com>
is rumored to have said:

> On Mon, Nov 28, 2022 at 10:42 AM Kevin A. McGrail
> <kmcgrail@apache.org>
> wrote:
[...]
>> Also, would be helpful to know if this is different than 3.4.6's
>> behavior.
>>
>
> Oh yes, I meant to mention that the behavior is different in 3.4.6.
> Same score for the rule, but there it appears to actually shortcircuit
> the processing of additional rules. At the least, it doesn't add those
> MISSING_* rules.

This is almost certainly a side-effect of recent reworking of the
housekeeping around which rules have been run.

As a temporary work-around, I think it would be wise to give any rule
that gets SHORTCIRCUITed an overwhelming score in whichever direction it
operates.
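
To illustrate why an overwhelming score helps, the following Python sketch
computes how many stray points a short-circuited rule can absorb before the
total reaches the threshold. The 5.0 threshold (the stock required_score)
and the -100 value are assumptions chosen for illustration; -2.0 and the
stray scores are the figures quoted earlier in the thread.

```python
from decimal import Decimal


def headroom(rule_score, required_score=Decimal("5.0")):
    """Points other rules may add before the total reaches the threshold.

    Assumes the short-circuited rule is the only intended hit and that
    required_score (5.0 here, an assumption) is the spam threshold.
    """
    return required_score - rule_score


# Score quoted in the thread versus a hypothetical overwhelming score.
quoted = headroom(Decimal("-2.0"))
overwhelming = headroom(Decimal("-100"))

# Sum of the MISSING_* and EMPTY_MESSAGE scores quoted in the thread.
spurious = (Decimal("0.5") + Decimal("1.0") + Decimal("1.8")
            + Decimal("1.4") + Decimal("2.3"))

print(spurious >= quoted)        # do stray hits reach the threshold at -2.0?
print(spurious >= overwhelming)  # and at -100?
```

In this sketch a -2.0 rule leaves only 7.0 points of headroom, which the
quoted stray hits use up entirely, while a -100 rule leaves 105.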


--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
Damn. Was hoping that wasn't the case. Can we get a bug open?

On Mon, Nov 28, 2022, 11:47 Bill Cole <
sausers-20150205@billmail.scconsult.com> wrote:

> This is almost certainly a side-effect of recent reworking of the
> housekeeping around which rules have been run.
>
> As a temporary work-around, I think it would be wise to give any rule
> that gets SHORTCIRCUITed an overwhelming score in whichever direction it
> operates.
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
On 11/28/22 17:47, Bill Cole wrote:
> This is almost certainly a side-effect of recent reworking of the housekeeping around which rules have been run.
>
> As a temporary work-around, I think it would be wise to give any rule that gets SHORTCIRCUITed an overwhelming score in whichever direction it operates.
>
>
Confirmed, r1904981 is the commit that is causing this behavior.
Giovanni
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8078 is now open on this
issue.
--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


On Tue, Nov 29, 2022 at 1:11 PM <giovanni@paclan.it> wrote:

> Confirmed, r1904981 is the commit that is causing this behavior.
> Giovanni
>
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
Fixed simply with some rule changes as described in the bug.


On Tue, Nov 29, 2022 at 05:28:00PM -0500, Kevin A. McGrail wrote:
> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8078 is now open on this
> issue.
> --
> Kevin A. McGrail
> Member, Apache Software Foundation
> Chair Emeritus Apache SpamAssassin Project
> https://www.linkedin.com/in/kmcgrail - 703.798.0171
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
I have not checked, but does the short circuiting actually work? The goal of
it is to lower the resource usage of the tool. If it continues to run and
generate hits, then we still have a problem.

On Sun, Dec 4, 2022, 08:50 Henrik K <hege@hege.li> wrote:

>
> Fixed simply with some rule changes as described in the bug.
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
Of course it does and processing doesn't need to stop into a brickwall when
it activates. It simply finishes metas which is not that expensive and
might provide some additional useful hits. No sense postponing 4.0.0 to try
to tweak this further.

On Sun, Dec 04, 2022 at 09:28:02AM -0500, Kevin A. McGrail wrote:
> I have not checked but does the short circuiting actually work? The goal of it
> is to lower the resource usage of the tool. If it continues to run and generate
> longer than we have a problem still.
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
I think that will have to go to discussion since if the rules don't short
circuit the way they used to, other rules outside of the ones we control
are going to act oddly. The one that was reported was with validity for
example.

What happens if I have a local rule that's high scoring and meta that would
have been short circuited prior? In 3.4 I would have expected to stop when
I hit the validity rule; now I continue running, hit another rule that's
very high scoring, and end up with a misclassification.

From what I understand, that is the real-world scenario of what is
occurring.
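
The scenario can be sketched as a toy model in Python. This is not
SpamAssassin's code: the Rule class, the evaluate function, the LOCAL_* rule
names and their scores are all hypothetical, and a meta rule is simply
treated as an already-decided hit. It only contrasts stopping at the first
short-circuiting hit with continuing to count meta rules afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    score: float
    is_meta: bool = False
    shortcircuit: bool = False


def evaluate(hits, finish_metas):
    """Total the scores of `hits`, in order, honouring short-circuiting.

    finish_metas=False: stop counting at the first short-circuiting hit.
    finish_metas=True:  after that hit, skip plain rules but still count
                        meta rules.
    """
    total = 0.0
    tripped = False
    for rule in hits:
        if tripped and not (finish_metas and rule.is_meta):
            continue
        total += rule.score
        if rule.shortcircuit:
            tripped = True
    return total


# Hypothetical hit list; the LOCAL_* names and scores are invented.
hits = [
    Rule("RCVD_IN_VALIDITY_SAFE", -2.0, shortcircuit=True),
    Rule("LOCAL_PLAIN_RULE", 3.0),
    Rule("LOCAL_HIGH_META", 10.0, is_meta=True),
]

stop_at_once = evaluate(hits, finish_metas=False)
finish = evaluate(hits, finish_metas=True)
print(stop_at_once, finish)
```

Under these assumptions the same hit list totals -2.0 when evaluation stops
at once and 8.0 when metas are still counted, which is the kind of
classification change described above.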

At a minimum we would have to announce this change for people to look at
their short circuit rules.

What are your thoughts?

On Sun, Dec 4, 2022, 09:36 Henrik K <hege@hege.li> wrote:

>
> Of course it does and processing doesn't need to stop into a brickwall when
> it activates. It simply finishes metas which is not that expensive and
> might provide some additional useful hits. No sense postponing 4.0.0 to
> try
> to tweak this further.
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
Feel free to reopen the bug if you want, I really have no time or desire to
work on these right now. I didn't analyze if skipping do_meta_tests for
shortcircuiting has any negative consequences, but if someone wants to prove
it doesn't, go for it and I'll vote on it. It's not enough to just post a
patch that is a "possible fix".


On Sun, Dec 04, 2022 at 09:42:59AM -0500, Kevin A. McGrail wrote:
> I think that will have to go to discussion since if the rules don't short
> circuit the way they used to, other rules outside of the ones we control are
> going to act oddly. The one that was reported was with validity for example.
>
> What happens if I have a local rule that's high scoring and meta that would
> have been short circuited prior? In 3.4 I would have expected to stop when I
> hit the validity rule; now I continue running, hit another rule that's very
> high scoring, and end up with a misclassification.
>
> From what I understand, that is the real-world scenario of what is occurring.
>
> At a minimum we would have to announce this change for people to look at their
> short circuit rules.
>
> What are your thoughts?
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
"Kevin A. McGrail" <kmcgrail@apache.org> writes:

> I think that will have to go to discussion since if the rules don't short
> circuit the way they used to, other rules outside of the ones we control
> are going to act oddly. The one that was reported was with validity for
> example.
>
> What happens if I have a local rule that's high scoring and meta that would
> have been short circuited prior? In 3.4 I would have expected to stop when
> I hit the validity rule, now I continue running and hit another rule that's
> very high scoring and end up with a mis classification.

Perspective from someone who does not deeply understand short
circuiting:

0) I have never had the impression that there were guarantees about the
order of rule evaluations. I do have the impression that network tests
are kicked off in parallel.

1) My impression has always been that short circuiting is about early
termination of scoring and skipping further tests for two reasons:

avoiding both CPU time and remote queries for further tests

avoiding the elapsed time that such tests will take, so that
short-circuited ham can be delivered in a few seconds rather than a
minute

I have always expected that short circuiting should be done for rules
that are -100 or +100, where when they hit you have made a decision.
It seems strange to me that someone would configure short circuiting for
a rule that does not have overwhelming weight.

2) It seems strange to me to have a situation where a message might hit a
+100 and a -100 rule both (on purpose) and further strange that one
might have a scheme where one is marked short circuit and the proper
classification relies on that happening before the others.



Putting on my CS pedant hat, I guess the big question is if there is a
violation of a previously published specification.


I am probably way off, but I hope this is helpful as a proxy for the
typical understanding of someone who does not really understand.
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
On 2022-12-04 at 09:57:09 UTC-0500 (Sun, 04 Dec 2022 09:57:09 -0500)
Greg Troxel <gdt@lexort.com>
is rumored to have said:

> Putting on my CS pedant hat, I guess the big question is if there is a violation of a previously published specification.

If not, it would only be a consequence of no definitive clear spec existing.

The logic around rule ordering, completion of meta rules, and shortcircuiting is mind-numbingly subtle. If there is a clear unified description of how it has worked in the past, I cannot find it. My sense from the 3-year odyssey that was Bug 7735 is that we've never worked out a complete flowchart or state diagram that covers the whole realm of possible situations. I wouldn't even bet on the existing relevant documentation spread around the project being 100% internally self-consistent.



--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: Mail hits MISSING rules despite presence of headers [ In reply to ]
Bill Cole <sausers-20150205@billmail.scconsult.com> writes:

> On 2022-12-04 at 09:57:09 UTC-0500 (Sun, 04 Dec 2022 09:57:09 -0500)
> Greg Troxel <gdt@lexort.com>
> is rumored to have said:
>
>> Putting on my CS pedant hat, I guess the big question is if there is a violation of a previously published specification.
>
> If not, it would only be a consequence of no definitive clear spec existing.
>
> The logic around rule ordering, completion of meta rules, and
> shortcircuiting is mind-numbingly subtle. If there is a clear unified
> description of how it has worked in the past, I cannot find it. My
> sense from the 3-year odyssey that was Bug 7735 is that we've never
> worked out a complete flowchart or state diagram that covers the whole
> realm of possible situations. I wouldn't even bet on the existing
> relevant documentation spread around the project being 100% internally
> self-consistent.

That's more or less what I was getting at. If there is not a clear
specification (i.e. the documentation says that it works like X) that
people can properly rely on, then the pedant in me says that behavior
changing slightly, but still within the swim lane implied by the
previous non-spec, is not a bug.
Re: Mial hits MISSING rules despite presence of headers [ In reply to ]
OK, so then we really have two choices:

#1 Accept that no code changes are needed: we've fixed the rule(s) we know
might trigger wrongly around the MISSING_* header rules, and we just
document in UPGRADE that shortcircuit may now continue to run more meta
rules to finish them out, which might not have occurred previously.

Users of shortcircuit would be the best people to weigh in on this,
because we could conceivably change the classification of mail in ways
that unexpectedly differ from the 3.4.6 shortcircuit behavior.

#2 Work on the code so that shortcircuiting, or at least the scoring,
behaves as it did in 3.4.6.

Regards,
KAM

On 12/4/2022 1:42 PM, Greg Troxel wrote:
> That's more or less what I was getting at. If there is not a clear
> specification (i.e. the documentation says that it works like X) that
> people can properly rely on, then the pedant in me says that behavior
> changing slightly, but still within the swim lane implied by the
> previous non-spec, is not a bug.

--
Kevin A. McGrail
KMcGrail@Apache.org

Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171
Re: Mial hits MISSING rules despite presence of headers [ In reply to ]
As someone who runs a large distributed SpamAssassin installation, I
depend on shortcircuit to quickly handle large amounts of mail that do
not need to be scored further. The change in behavior has the potential
for negative impact, which I will have to test carefully before moving
to v4.

On Sun, Dec 4, 2022 at 3:02 PM Kevin A. McGrail <kmcgrail@apache.org> wrote:

> OK, so then we have really two Choices:
>
> #1 accept that no code changes are needed, we've fixed a rule(s) we know
> might trigger wrong around MISSING HEADERS and we just document the
> change in the UPGRADE that shortcircuit may continue to run more meta
> rules to finish them out which might not have occurred previously.
>
> Some users using SHORT CIRCUIT would likely be best to weigh in on this
> because we are going to conceivably change the classification of mails
> unexpectedly different from 3.4.6 SHORT CIRCUIT behavior.
>
> #2 Work on the code so that short circuiting or at least the scoring
> behaves as with 3.4.6.
>
> Regards,
> KAM
>
> On 12/4/2022 1:42 PM, Greg Troxel wrote:
> > That's more or less what I was getting at. If there is not a clear
> > specification (i.e. the documentation says that it works like X) that
> > people can properly rely on, then the pedant in me says that behavior
> > changing slightly, but still within the swim lane implied by the
> > previous non-spec, is not a bug.
>
> --
> Kevin A. McGrail
> KMcGrail@Apache.org
>
> Member, Apache Software Foundation
> Chair Emeritus Apache SpamAssassin Project
> https://www.linkedin.com/in/kmcgrail - 703.798.0171
>
>
Re: Mial hits MISSING rules despite presence of headers [ In reply to ]
Following up on my previous note, I think we are working on #2. I see
that bug 8078 was reopened and that there are some improvements to, and
comments on, a patch from Giovanni that might resolve the issue too!

On 12/4/2022 3:02 PM, Kevin A. McGrail wrote:
> OK, so then we have really two Choices:
>
> #1 accept that no code changes are needed, we've fixed a rule(s) we
> know might trigger wrong around MISSING HEADERS and we just document
> the change in the UPGRADE that shortcircuit may continue to run more
> meta rules to finish them out which might not have occurred previously.
>
> Some users using SHORT CIRCUIT would likely be best to weigh in on
> this because we are going to conceivably change the classification of
> mails unexpectedly different from 3.4.6 SHORT CIRCUIT behavior.
>
> #2 Work on the code so that short circuiting or at least the scoring
> behaves as with 3.4.6.
>
> Regards,
> KAM
>
> On 12/4/2022 1:42 PM, Greg Troxel wrote:
>> That's more or less what I was getting at.  If there is not a clear
>> specification (i.e. the documentation says that it works like X) that
>> people can properly rely on, then the pedant in me says that behavior
>> changing slightly, but still within the swim lane implied by the
>> previous non-spec, is not a bug.
>
--
Kevin A. McGrail
KMcGrail@Apache.org

Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171
Re: Mial hits MISSING rules despite presence of headers [ In reply to ]
"Kevin A. McGrail" <kmcgrail@apache.org> writes:

> #2 Work on the code so that short circuiting or at least the scoring
> behaves as with 3.4.6.

As penance for ranting I went back and re-read everything more
carefully, but feel free to ignore me if I am being unhelpful.

I don't think a -2 shortcircuit rule makes any sense. It seems to me
that the idea of shortcircuit is "I can more or less prove that
skipping the rest won't change the classification in any meaningful
way, so save the resources", and -2 just isn't like that.

Reading the bz entry, I think the real bug is a meta rule evaluating
when the rules it refers to have not finished. It seems obvious (I
say knowing I probably don't understand something) that this leads to
wrong results, and they aren't structurally of the "skip processing"
type that's within "acceptable wrong results".

Wrong meta results seem to me to be outside the vague spec from before.

So I would lean to "do not allow meta rules to evaluate unless all of
the rules they refer to have completed", and if there's a new
special-case that they eval anyway after short circuit -- bypassing the
usual dependency, then don't do that.
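
The "do not evaluate a meta until its sub-rules have completed" idea can
be sketched as a toy in Python. This is a hedged illustration, not
SpamAssassin code: the function, the data layout, and the rule names
(suffixed _TOY) are all hypothetical, and the "naive" variant merely
assumes that an unfinished sub-rule is read as "did not hit".

```python
# Toy sketch only; names and structure are hypothetical, not SpamAssassin internals.

def evaluate_metas(metas, finished):
    """Evaluate each meta only once every rule it depends on has finished.

    metas:    {meta_name: (dependency_names, function(results) -> bool)}
    finished: {rule_name: bool} for rules that completed (True means "hit")
    Returns (results, skipped); skipped lists metas whose dependencies
    never finished and which were therefore left unevaluated.
    """
    results = dict(finished)
    pending = dict(metas)
    progress = True
    while pending and progress:
        progress = False
        for name, (deps, fn) in list(pending.items()):
            if all(dep in results for dep in deps):
                results[name] = fn(results)
                del pending[name]
                progress = True
    return results, sorted(pending)


def naive_missing_subject(results):
    # Naive variant: an unfinished sub-rule is silently treated as "no hit".
    return not results.get("__HAS_SUBJECT_TOY", False)


metas = {
    "MISSING_SUBJECT_TOY": (
        ["__HAS_SUBJECT_TOY"],
        lambda r: not r["__HAS_SUBJECT_TOY"],
    ),
}

# Sub-rule never finished: the naive meta fires, the gated one is skipped.
print(naive_missing_subject({}))           # True
print(evaluate_metas(metas, finished={}))  # ({}, ['MISSING_SUBJECT_TOY'])

# Sub-rule finished and hit: the gated meta evaluates and does not fire.
results, skipped = evaluate_metas(metas, finished={"__HAS_SUBJECT_TOY": True})
print(results["MISSING_SUBJECT_TOY"], skipped)  # False []
```

Under these assumptions, a negated meta evaluated too early reports a
header as missing even though the sub-rule would have hit, while the
gated version declines to answer.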

As always I may be confused.
Re: Mial hits MISSING rules despite presence of headers [ In reply to ]
On 11/27/22 21:58, Alex wrote:
> Hi,
> I have emails from wayfair and Dell that hit many of the MISSING_* rules but these headers are clearly displayed.
>
>  *  0.5 MISSING_MID Missing Message-Id: header
>  *  1.0 MISSING_FROM Missing From: header
>  *  1.8 MISSING_SUBJECT Missing Subject: header
>  *  1.4 MISSING_DATE Missing Date: header
>  *  2.3 EMPTY_MESSAGE Message appears to have no textual parts and no
>  *      Subject: text
>
> This also consequently causes DMARC/DKIM to fail.
>
> https://pastebin.com/yFCRx76x
>
Could you check whether the patch in bz 8078 (https://bz.apache.org/SpamAssassin/attachment.cgi?id=5863&action=diff) fixes the issue?
The sample is no longer available on Pastebin.
Thanks
Giovanni
