Mailing List Archive

Defining what the default welcomelist means
The de-welcomelisting of MS marketing raises the question: Why do we maintain a "default" welcomelist?

Based on the documentation, the original purpose of the def_welcomelist* (then whitelist) feature set was to protect a set of senders of purely legitimate mail from FPs, with a listing having reduced power relative to normal welcomelist entries *because they were widely phished by spammers*. This was before sender authentication (SPF & DKIM) was in broad use and before authentication-empowered welcomelist features existed. Having these senders weakly protected in SA was a preventative measure to keep frustrated admins (or poisoned Bayes DBs) from rashly overprotecting them and all the phishes or blocklisting them. Today there's much less risk in the welcomelist features because we recommend and use the authenticated forms, which means that if you like, you can use SA WL/BL rules to demand only authenticated mail from some senders. In that context, the purpose of the default welcomelist has wandered.

This explains to some degree the lack of clear relevance and meaning of listings. It is a remnant feature that has outlived its original justification. The original list included big names of respected companies that "everyone" (i.e. as warped by SA committer and vocal user non-diversity...) got occasional important mail from and would never want to block mail from. It has drifted into being a list of "good guys" who, based on our committers' experiences, get FPs that they do not deserve. We have drifted perilously close to being a maintainer of a low-visibility free reputation service with lax oversight. We also come near that peril in our explicit lists of TLDs which are objectively dominated by spammers, but in that case I think we have that risk contained because we have a methodology for validating TLD inclusion and removal: testing single-TLD rules in QA. The default welcomelist is unconfined, because we don't have a clear explicit standard or even a formal transparent mechanism for inclusion and removal. My understanding of some listings is that they were based on one mid-sized site's FPs from their wanted mailstream dominated by one-to-few "personal" B2B email. That is very hard to validate, and at this point the list is too big to trim it back to just the original concept, especially since that concept no longer has much real-world use. It needs more structure to keep it from becoming just "friends, employers, and extorters of SA committers" or being perceived as such.

I believe that part of a way to avoid that is an absolute zero-tolerance policy for spam from listees. We cannot support any standard that gets us bogged down into debates with senders over whether their spam is enough spam to justify risking the FPs they would get without a listing, because we cannot measure that. We cannot be subservient to sender business models that require them to take shortcuts in assuring that they do not ever send spam. We must not be telling our users that they should just eat their spam because some sender doesn't want to spend on confirming users and seems to have a working unsub link.

I ultimately want to document how we add and remove listings and what users should expect from the default welcomelist. I think some important elements are:

1. We serve our users: receivers, not senders. Senders claiming FPs need the support of a corroborating would-be receiver.

2. If senders have FPs on objectively legitimate mail, their first and most important step is to identify WHY SpamAssassin thinks it is spam and address that. Do you need the invisible text? Is the message embedded in a remotely-fetched image? The sea of "&zwnj" entities in your messages' HTML serves what purpose exactly? If there's a real FP problem with some rule that regularly is proved out by RuleQA, open a bug.

3. This is NOT a general-purpose reputation list. It exists to aid SA users who have FPs from SpamAssassin default rules for wanted mail, where we cannot determine any acceptable adjustment to rules which would avoid the problem. It is a "last resort" form of FP mitigation when we cannot find an acceptable general solution that isn't domain-specific to a widely accepted sender domain.

4. We should only add or remove listings based on specific requests backed by transparent evidence. Subversion commit messages are not enough; we need a bug report or a mailing list discussion.

5. Existing entries are presumed valid unless and until they cause a false "ham" classification of spam which can be shared publicly in a useful form.

6. New entries must pass prolonged RuleQA testing of sender-specific rules before being added to the default welcomelist.

As with everything SpamAssassin: input from users and other contributors is eagerly desired.

--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: Defining what the default welcomelist means
I see it very slightly differently, but mostly agree

Bill Cole <billcole@apache.org> writes:

> 1. We serve our users: receivers, not senders. Senders claiming FPs
> need the support of a corroborating would-be receiver.

Agreed. Or maybe we take requests to add only from receivers.

> 2. If senders have FPs on objectively legitimate mail, their first and
> most important step is to identify WHY SpamAssassin thinks it is
> spam and address that. Do you need the invisible text? Is the message
> embedded in a remotely-fetched image? The sea of "&zwnj" entities in
> your messages' HTML serves what purpose exactly? If there's a real FP
> problem with some rule that regularly is proved out by RuleQA, open a
> bug.

Sure, but if you serve receivers, often people will have misfiling and
the sender is opaque, even if not spam and dkim. So saying the sender
should fix is misaligned with serving receivers. Yes, they *should*,
but people shouldn't send html mail either :-)

I agree that requests from senders should be met with "make your mail
less spammy".

> 3. This is NOT a general-purpose reputation list. It exists to aid SA
> users who have FPs from SpamAssassin default rules for wanted mail,
> where we cannot determine any acceptable adjustment to rules which
> would avoid the problem. It is a "last resort" form of FP mitigation
> when we cannot find an acceptable general solution that isn't
> domain-specific to a widely accepted sender domain.

I see all spam classification as probabilistic and there is risk of FP.
If a domain emits *only ham* and is dkim signed, and we believe that
receivers want it, I think it makes sense to have it in.

I think of things like alerts from banks, airline saying your flight
time has changed, etc. where FPs are a real problem.

I am extremely skeptical of anything that smells of email marketing
here. I would expect only places sending transactional mail and alerts
to established customers.

> 4. We should only add or remove listings based on specific requests
> backed by transparent evidence. Subversion commit messages are not
> enough, we need a bug report or a mailing list discussion.

sure

> 5. Existing entries are presumed valid unless and until they cause a
> false "ham" classification of spam which can be shared publicly in a
> useful form.

I guess, or if someone makes an argument that they aren't right.

> 6. New entries must pass prolonged RuleQA testing of sender-specific
> rules before being added to the default welcomelist.

I don't follow this. Do you mean add 'def_welcomelist_dkim foo@bar' to
a testing ruleset and see if it's ok? That seems fine if so. If not, I
didn't follow you.


It might also make sense for each welcomelist rule to have a score.
Basically to bring the mail down to -2, to give it some headroom. But
that might be too complicated compared to benefit.
Re: Defining what the default welcomelist means
Also, I'm not sure you said this, but I would say:

default whitelist is dkim only

This means

All existing entries are converted to dkim as well as we can, not
worrying if they break. We'll prune ones that don't work as dkim,
and add a signing domain as we figure it out, as a lightweight
thing. But all non-dkim entries go away.

to consider a new entry, it must be dkim

or maybe that's already true
Re: Defining what the default welcomelist means
On 20240412 15:56:15, Greg Troxel wrote:
> I see it very slightly differently, but mostly agree
>
> Bill Cole<billcole@apache.org> writes:
>
>> 1. We serve our users: receivers, not senders. Senders claiming FPs
>> need the support of a corroborating would-be receiver.
> Agreed. Or maybe we take requests to add only from receivers.
>
>> 2. If senders have FPs on objectively legitimate mail, their first and
>> most important step is to identify WHY SpamAssassin thinks it is
>> spam and address that. Do you need the invisible text? Is the message
>> embedded in a remotely-fetched image? The sea of "&zwnj" entities in
>> your messages' HTML serves what purpose exactly? If there's a real FP
>> problem with some rule that regularly is proved out by RuleQA, open a
>> bug.
> Sure, but if you serve receivers, often people will have misfiling and
> the sender is opaque, even if not spam and dkim. So saying the sender
> should fix is misaligned with serving receivers. Yes, they *should*,
> but people shouldn't send html mail either :-)
>
> I agree that requests from senders should be met with "make your mail
> less spammy".
>
>> 3. This is NOT a general-purpose reputation list. It exists to aid SA
>> users who have FPs from SpamAssassin default rules for wanted mail,
>> where we cannot determine any acceptable adjustment to rules which
>> would avoid the problem. It is a "last resort" form of FP mitigation
>> when we cannot find an acceptable general solution that isn't
>> domain-specific to a widely accepted sender domain.
> I see all spam classification as probabilistic and there is risk of FP.
> If a domain emits *only ham* and is dkim signed, and we believe that
> receivers want it, I think it makes sense to have it in.
>
> I think of things like alerts from banks, airline saying your flight
> time has changed, etc. where FPs are a real problem.
>
> I am extremely skeptical of anything that smells of email marketing
> here. I would expect only places sending transactional mail and alerts
> to established customers.
>
>> 4. We should only add or remove listings based on specific requests
>> backed by transparent evidence. Subversion commit messages are not
>> enough, we need a bug report or a mailing list discussion.
> sure
>
>> 5. Existing entries are presumed valid unless and until they cause a
>> false "ham" classification of spam which can be shared publicly in a
>> useful form.
> I guess, or if someone makes an argument that they aren't right.
>
>> 6. New entries must pass prolonged RuleQA testing of sender-specific
>> rules before being added to the default welcomelist.
> I don't follow this. Do you mean add 'def_welcomelist_dkim foo@bar' to
> a testing ruleset and see if it's ok? That seems fine if so. If not, I
> didn't follow you.
>
>
> It might also make sense for each welcomelist rule to have a score.
> Basically to bring the mail down to -2, to give it some headroom. But
> that might be too complicated compared to benefit.

One pesky detail still exists. There is a very broad fuzzy area where my spam is
your ham and vice versa. You could probably drive yourself to an early grave
trying to get the perfect Bayes training plus perfect rule set.

{o.o}   Joanne
Re: Defining what the default welcomelist means
jdow <jdow@earthlink.net> writes:

> One pesky detail still exists. There is a very broad fuzzy area where
> my spam is your ham and vice versa. You could probably drive yourself
> to an early grave trying to get the perfect Bayes training plus
> perfect rule set.

spam is bulk and unsolicited. So yes the same message could be either,
but if a sender spams anyone, they are spammer, even if they send mail
that isn't spam.
Re: Defining what the default welcomelist means
On 20240412 16:14:44, Greg Troxel wrote:
> jdow<jdow@earthlink.net> writes:
>
>> One pesky detail still exists. There is a very broad fuzzy area where
>> my spam is your ham and vice versa. You could probably drive yourself
>> to an early grave trying to get the perfect Bayes training plus
>> perfect rule set.
> spam is bulk and unsolicited. So yes the same message could be either,
> but if a sender spams anyone, they are spammer, even if they send mail
> that isn't spam.

Ah, no, that way leads to disaster. Some people resign from lists by declaring
the sender spam. That could end up cutting access to all the people who want the
emails. (At various times this list and most 'ix lists were unusually difficult
to resign from. And, yes, I have been around that long. I'm just too politically
incorrect for most lists these days, {^_-}) It is wise to be careful about how
soon you pull the "spammer" trigger. YMMV and YAMV (Attitude).

{^_^}
Re: Defining what the default welcomelist means
On 2024-04-12 at 18:56:15 UTC-0400 (Fri, 12 Apr 2024 18:56:15 -0400)
Greg Troxel <gdt@lexort.com>
is rumored to have said:

> I see it very slightly differently, but mostly agree
>
> Bill Cole <billcole@apache.org> writes:
>
>> 1. We serve our users: receivers, not senders. Senders claiming FPs
>> need the support of a corroborating would-be receiver.
>
> Agreed. Or maybe we take requests to add only from receivers.

Effectively, yes. Senders won't refrain from requesting to be welcomed by default just because we say we don't accept those requests. Only receivers can corroborate the existence of any FP problem which would be solved by a default welcomelist entry, and this isn't a 'just find one example' sort of issue.

>> 2. If senders have FPs on objectively legitimate mail, their first and
>> most important step is to identify WHY SpamAssassin thinks it is
>> spam and address that. Do you need the invisible text? Is the message
>> embedded in a remotely-fetched image? The sea of "&zwnj" entities in
>> your messages' HTML serves what purpose exactly? If there's a real FP
>> problem with some rule that regularly is proved out by RuleQA, open a
>> bug.
>
> Sure, but if you serve receivers, often people will have misfiling and
> the sender is opaque, even if not spam and dkim. So saying the sender
> should fix is misaligned with serving receivers. Yes, they *should*,
> but people shouldn't send html mail either :-)

I don't see this as misaligned, but rather a way of saying that def_w* entries come behind site-local receiver mitigations and receiver/sender collaboration on fixing the shabby mail.

> I agree that requests from senders should be met with "make your mail
> less spammy".

Right. If SA is generating FPs, in nearly all cases this can be fixed without resorting to a global welcomelist entry. FP issues get solved through a balance of local rule mitigations, sender adjustments to lose spamsign patterns, and project-level rule tweaks which validate in RuleQA; def_wl entries really should be a last resort.

One reason I opened this topic is that many existing listings were nothing like last resorts to solve concrete problems but seem to be more prophylactically applied. I.e. to assure that generally (and vaguely) 'good' senders will get their mail through despite using pointless antipatterns that are predominantly used by spammers. Maybe there's a need for that, but it should not be part of SA proper.


>> 3. This is NOT a general-purpose reputation list. It exists to aid SA
>> users who have FPs from SpamAssassin default rules for wanted mail,
>> where we cannot determine any acceptable adjustment to rules which
>> would avoid the problem. It is a "last resort" form of FP mitigation
>> when we cannot find an acceptable general solution that isn't
>> domain-specific to a widely accepted sender domain.
>
> I see all spam classification as probabilistic and there is risk of FP.
> If a domain emits *only ham* and is dkim signed, and we believe that
> receivers want it, I think it makes sense to have it in.

I see no point in that if there is no *evidence* of actual FPs. I don't think the default rules should try to game local incidents of Bayes or AWL dis-learning that ends up hitting banking notifications. Or (at the risk of being misinterpreted...) by the use of 3rd-party rules like the KAM channel that are much tougher on the bad HTML practices of corporate email composers.

> I think of things like alerts from banks, airline saying your flight
> time has changed, etc. where FPs are a real problem.

Right. I think we basically have that covered with the legacy entries, which are extensive, undocumented, and generally banal.

> I am extremely skeptical of anything that smells of email marketing
> here. I would expect only places sending transactional mail and alerts
> to established customers.

I share the skepticism, but I have been working with business customers and their love of other businesspeople's email marketing (and random non-work-related email...) for long enough that I have stopped arguing with the nature of email that people eagerly desire in their mailboxes. I care that it is contextually safe, legal, and solidly consensual. There are marketers who stay inside the lines.

>> 4. We should only add or remove listings based on specific requests
>> backed by transparent evidence. Subversion commit messages are not
>> enough, we need a bug report or a mailing list discussion.
>
> sure

Important because it brings us more in line with the transparency norms that all ASF projects are expected to follow and because it reduces the likelihood of snowballing conflict to have a record of open discussion of how & why decisions are made.

>> 5. Existing entries are presumed valid unless and until they cause a
>> false "ham" classification of spam which can be shared publicly in a
>> useful form.
>
> I guess, or if someone makes an argument that they aren't right.

Defining the validity of "aren't right" arguments is important.

I believe that we are ethically (and perhaps as a result legally) safe as long as we are acting on rational judgment grounded in relevant facts and not hunches.


>> 6. New entries must pass prolonged RuleQA testing of sender-specific
>> rules before being added to the default welcomelist.
>
> I don't follow this. Do you mean add 'def_welcomelist_dkim foo@bar' to
> a testing ruleset and see if it's ok?

No. That's not a useful test, as it gets lost in the rest of the list.

> That seems fine if so. If not, I
> didn't follow you.

It's easy to write a rule that will identify mail from a specific sender pattern, passing DKIM and/or SPF. The general welcomelist/blocklist mechanism exists because it's easier and more manageable than having rules for each pattern, but testing has to be done with specific rules.

If you want to see an example, look for the bugs in the past couple of years opened against the various abused TLDs that we have in a suspicious domains list used in various rules. The test has been to create rules examining specific TLDs, e.g. the xyz, best, online, site, fun, pro, and btc TLDs all have test rules in the current default ruleset. (At the moment it looks like xyz needs to not be listed.)
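
For anyone who wants a concrete picture of such a sender-specific test, here is a minimal sketch. It assumes a placeholder domain (example.com) and made-up rule names; DKIM_VALID_AU and SPF_PASS are the stock rule names from the DKIM and SPF plugins, and the T_ prefix follows the usual naming for test rules. It is an illustration under those assumptions, not a rule taken from the actual ruleset.

```
# Hypothetical sender-specific test rules. example.com and every rule
# name containing EXAMPLE are placeholders for illustration only.

# Sub-rule: header From address is at example.com or a subdomain of it.
header   __EXAMPLE_FROM      From:addr =~ /\@(?:[a-z0-9-]+\.)*example\.com$/i

# Sender pattern plus a valid DKIM signature from the author's domain.
meta     T_EXAMPLE_DKIM_OK   __EXAMPLE_FROM && DKIM_VALID_AU
describe T_EXAMPLE_DKIM_OK   Test: example.com sender with valid author-domain DKIM
tflags   T_EXAMPLE_DKIM_OK   nice
score    T_EXAMPLE_DKIM_OK   -0.01

# Sender pattern plus an SPF pass. Note that SPF_PASS is about the
# envelope sender, which need not match the header From.
meta     T_EXAMPLE_SPF_OK    __EXAMPLE_FROM && SPF_PASS
describe T_EXAMPLE_SPF_OK    Test: example.com sender with SPF pass
tflags   T_EXAMPLE_SPF_OK    nice
score    T_EXAMPLE_SPF_OK    -0.01
```

The scores are deliberately negligible so that the rules change no verdicts and only their hit counts on ham and spam matter when judging the sender.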

> It might also make sense for each welcomelist rule to have a score.

Do you mean unique rules per domain? No, that's got a scaling problem.

> Basically to bring the mail down to -2, to give it some headroom. But
> that might be too complicated compared to benefit.


Anyone who feels it justified can (as I do) reduce the power of the default welcomelist:

score USER_IN_DEF_DKIM_WL -2
score USER_IN_DEF_SPF_WL -2

By default those each score -7.5 so a doubly-confirmed message gets the same insane -15 as a legacy listing (def_whitelist_from_rcvd) that doesn't require authentication. No such listings still exist in the default rules.
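
To make the arithmetic concrete, here is a sketch for a message that hits both rules. The 10.0 points from other rules is an invented figure; 5.0 is the stock required_score.

```
# local.cf sketch. The 10.0 from other rule hits is an invented figure.
#
# Stock scores, both rules hit:
#   10.0 + (-7.5) + (-7.5) = -5.0   -> under required_score, passes as ham
#
# With the two -2 overrides shown above in place:
#   10.0 + (-2.0) + (-2.0) =  6.0   -> over required_score, tagged as spam
required_score 5.0
```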



--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: Defining what the default welcomelist means
On 2024-04-12 at 19:01:21 UTC-0400 (Fri, 12 Apr 2024 19:01:21 -0400)
Greg Troxel <gdt@lexort.com>
is rumored to have said:

> Also, I'm not sure you said this, but I would say:
>
> default whitelist is dkim only

No. Existing practice is that we trust both DKIM and SPF, and I think that's fine.

There are no unauthenticated listings extant in the default rules and no new ones should ever be created.

> This means
>
> All existing entries are converted to dkim as well as we can, not
> worrying if they break. We'll prune ones that don't work as dkim,
> and add a signing domain as we figure it out, as a lightweight
> thing. But all non-dkim entries go away.
>
> to consider a new entry, it must be dkim
>
> or maybe that's already true


s/dkim/authenticated/ and it's already true.
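
For readers unfamiliar with the authenticated forms, a short sketch of what such entries look like in SA 4.0 syntax follows. The domains are placeholders and none of these lines is an actual default listing.

```
# Illustrative entries only; example.com and esp.example.net are placeholders.

# Welcomed if the sender address is authenticated by SPF or by DKIM.
def_welcomelist_auth       *@example.com

# Welcomed only with a valid DKIM signature from the author's own domain.
def_welcomelist_from_dkim  *@example.com

# Welcomed only with a valid DKIM signature from the named signing domain.
def_welcomelist_from_dkim  *@example.com  esp.example.net

# Welcomed only on an SPF pass.
def_welcomelist_from_spf   *@example.com
```

A message matching such an entry hits USER_IN_DEF_DKIM_WL, USER_IN_DEF_SPF_WL, or both, depending on which checks pass.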

This is part of how the default welcomelist has lost alignment with its origins. The original was a tactical mitigation against heavy phishing in a largely unauthenticated-sender world, deployed in part to forestall extreme responses to the problem of everyone claiming to send Paypal notifications to everyone.


--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: Defining what the default welcomelist means
On 2024-04-12 at 19:26:59 UTC-0400 (Fri, 12 Apr 2024 16:26:59 -0700)
jdow <jdow@earthlink.net>
is rumored to have said:

> On 20240412 16:14:44, Greg Troxel wrote:
>> jdow<jdow@earthlink.net> writes:
>>
>>> One pesky detail still exists. There is a very broad fuzzy area where
>>> my spam is your ham and vice versa. You could probably drive yourself
>>> to an early grave trying to get the perfect Bayes training plus
>>> perfect rule set.
>> spam is bulk and unsolicited. So yes the same message could be either,
>> but if a sender spams anyone, they are spammer, even if they send mail
>> that isn't spam.
>
> Ah, no, that way leads to disaster.

Not really. Maybe in theory, if you have a slightly wrong theory.

> Some people resign from lists by declaring the sender spam.

They are wrong, objectively, if they subscribed to the list without being deceived about what the list entailed.

There is definitely a tension between the recognition that in the ultimate analysis, 'spam' is in the eye of the beholder and the fact that there exist clear lines between the grey areas in which one can rationally debate whether a particular message is spam and the undebatable areas where the ham/spam discrimination is clear.

> That could end up cutting access to all the people who want the emails.

That's far outside the realm of possible effects of defining what the default welcomelist means and managing it transparently in line with that definition.

> (At various times this list and most 'ix lists were unusually difficult to resign from. And, yes, I have been around that long. I'm just too politically incorrect for most lists these days, {^_-}) It is wise to be careful about how soon you pull the "spammer" trigger. YMMV and YAMV (Attitude).

FWIW, we can't maintain SA to accommodate the obstinacy of gated BITNET LISTSERV nodes in '89. The only reasons for unsub difficulties in 2024 are technical failures and spammer excuses. Modern SpamAssassin is only supposed to deal with modern realities, not historical curiosities.



--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: Defining what the default welcomelist means
Bill Cole skrev den 2024-04-13 19:42:

> score USER_IN_DEF_DKIM_WL -2
> score USER_IN_DEF_SPF_WL -2
>
> By default those each score -7.5 so a doubly-confirmed message gets the
> same insane -15 as a legacy listing (def_whitelist_from_rcvd) that
> doesn't require authentication. No such listings still exist in the
> default rules.

one score set ? :)

score USER_IN_DEF_DKIM_WL (-2)
score USER_IN_DEF_SPF_WL (-2)

will dynamically reduce the corpus score by 2

your way is not dynamic
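
For reference, the two forms do different things. Going by the Mail::SpamAssassin::Conf documentation, a plain value replaces the score, while a value in parentheses is added to whatever score is already set. A sketch, assuming the stock score is -7.5:

```
# Alternatives for local.cf; pick one, they are not meant to be combined.

# Absolute: the score becomes -2 regardless of what the updates ship.
score USER_IN_DEF_DKIM_WL -2

# Relative: added to the existing score, so -7.5 + (-2) = -9.5.
# This makes the rule stronger, not weaker.
score USER_IN_DEF_DKIM_WL (-2)

# Relative adjustment that lands on -2 while the stock score is -7.5:
# -7.5 + 5.5 = -2.0. It follows any later change to the stock score.
score USER_IN_DEF_DKIM_WL (5.5)
```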

xpoint@tux ~ $ grep -r -i USER_IN_DEF /var/lib/spamassassin/4.000000/
/var/lib/spamassassin/4.000000/updates_spamassassin_org/50_scores.cf:#score USER_IN_DEF_WELCOMELIST -15.000 - Moved to 60_welcomelist.cf
/var/lib/spamassassin/4.000000/updates_spamassassin_org/50_scores.cf:score USER_IN_DEF_SPF_WL -7.500
/var/lib/spamassassin/4.000000/updates_spamassassin_org/50_scores.cf:score USER_IN_DEF_DKIM_WL -7.500
/var/lib/spamassassin/4.000000/updates_spamassassin_org/30_text_pt_br.cf:lang pt_BR describe USER_IN_DEF_WELCOMELIST Endereço do From: está na welcomelist padrão
/var/lib/spamassassin/4.000000/updates_spamassassin_org/30_text_pt_br.cf:lang pt_BR describe USER_IN_DEF_DKIM_WL Endereço do From: está na welcomelist de DKIM padrão
/var/lib/spamassassin/4.000000/updates_spamassassin_org/30_text_pt_br.cf:lang pt_BR describe USER_IN_DEF_SPF_WL Endereço do From: está na welcomelist de SPF padrão
/var/lib/spamassassin/4.000000/updates_spamassassin_org/30_text_pl.cf:lang pl describe USER_IN_DEF_WELCOMELIST Użytkownik jest wymieniony w domyślnej welcome-list (białej liście)
/var/lib/spamassassin/4.000000/updates_spamassassin_org/72_active.cf: meta HTML_TEXT_INVISIBLE_FONT __FONT_INVIS_MANY && !__HAS_ERRORS_TO && !__URI_DOTGOV && !__LYRIS_EZLM_REMAILER && !__ML3 && !__THREADED && !__DKIMWL_WL_HI && !USER_IN_DEF_DKIM_WL && !__MOZILLA_MSGID
/var/lib/spamassassin/4.000000/updates_spamassassin_org/30_text_de.cf:lang de describe USER_IN_DEF_WELCOMELIST Absenderadresse steht in der allgemeinen weißen Liste
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_dkim.cf: header USER_IN_DEF_DKIM_WL eval:check_for_def_dkim_welcomelist_from()
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_dkim.cf: describe USER_IN_DEF_DKIM_WL From: address is in the default DKIM welcome-list
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_dkim.cf: tflags USER_IN_DEF_DKIM_WL nice noautolearn net
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_dkim.cf: reuse USER_IN_DEF_DKIM_WL
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_dkim.cf: header USER_IN_DEF_DKIM_WL eval:check_for_def_dkim_whitelist_from()
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_dkim.cf: describe USER_IN_DEF_DKIM_WL From: address is in the default DKIM welcome-list
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_dkim.cf: tflags USER_IN_DEF_DKIM_WL nice noautolearn net
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_dkim.cf: reuse USER_IN_DEF_DKIM_WL
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_spf.cf: header USER_IN_DEF_SPF_WL eval:check_for_def_spf_welcomelist_from()
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_spf.cf: describe USER_IN_DEF_SPF_WL From: address is in the default SPF welcome-list
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_spf.cf: tflags USER_IN_DEF_SPF_WL userconf nice noautolearn net
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_spf.cf: reuse USER_IN_DEF_SPF_WL
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_spf.cf: header USER_IN_DEF_SPF_WL eval:check_for_def_spf_whitelist_from()
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_spf.cf: describe USER_IN_DEF_SPF_WL From: address is in the default SPF welcome-list
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_spf.cf: tflags USER_IN_DEF_SPF_WL userconf nice noautolearn net
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_spf.cf: reuse USER_IN_DEF_SPF_WL
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist_spf.cf:meta ENV_AND_HDR_SPF_MATCH (USER_IN_DEF_SPF_WL && __ENV_AND_HDR_FROM_MATCH)
/var/lib/spamassassin/4.000000/updates_spamassassin_org/30_text_fr.cf:lang fr describe USER_IN_DEF_WELCOMELIST Expéditeur dans la liste OK par défaut de SpamAssassin
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_shortcircuit.cf:priority USER_IN_DEF_WELCOMELIST -1000
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_shortcircuit.cf:priority USER_IN_DEF_WHITELIST -1000
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: header USER_IN_DEF_WELCOMELIST eval:check_from_in_default_welcomelist()
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: describe USER_IN_DEF_WELCOMELIST From: user is listed in the default welcome-list
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: tflags USER_IN_DEF_WELCOMELIST userconf nice noautolearn
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: score USER_IN_DEF_WELCOMELIST -15
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: meta USER_IN_DEF_WHITELIST (USER_IN_DEF_WELCOMELIST)
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: describe USER_IN_DEF_WHITELIST DEPRECATED: See USER_IN_WELCOMELIST
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: tflags USER_IN_DEF_WHITELIST userconf nice noautolearn
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: score USER_IN_DEF_WHITELIST -15
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: score USER_IN_DEF_WELCOMELIST -0.01
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: header USER_IN_DEF_WELCOMELIST eval:check_from_in_default_whitelist()
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: describe USER_IN_DEF_WELCOMELIST From: user is listed in the default welcome-list
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: tflags USER_IN_DEF_WELCOMELIST userconf nice noautolearn
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: score USER_IN_DEF_WELCOMELIST -0.01
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: meta USER_IN_DEF_WHITELIST (USER_IN_DEF_WELCOMELIST)
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: describe USER_IN_DEF_WHITELIST DEPRECATED: See USER_IN_DEF_WELCOMELIST
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: tflags USER_IN_DEF_WHITELIST userconf nice noautolearn
/var/lib/spamassassin/4.000000/updates_spamassassin_org/60_welcomelist.cf: score USER_IN_DEF_WHITELIST -15
/var/lib/spamassassin/4.000000/updates_spamassassin_org/local.cf:# shortcircuit USER_IN_DEF_WELCOMELIST on
/var/lib/spamassassin/4.000000/kam_sa-channels_mcgrail_com/KAM_deadweight3_meta.cf:score ENV_AND_HDR_SPF_MATCH 0 # (USER_IN_DEF_SPF_WL && __ENV_AND_HDR_FROM_MATCH)

chaos imho
Re: Defining what the default welcomelist means
Bill Cole <sausers-20150205@billmail.scconsult.com> writes:

> On 2024-04-12 at 18:56:15 UTC-0400 (Fri, 12 Apr 2024 18:56:15 -0400)
> Greg Troxel <gdt@lexort.com>
>
>> Bill Cole <billcole@apache.org> writes:
>>
>>> 1. We serve our users: receivers, not senders. Senders claiming FPs
>>> need the support of a corroborating would-be receiver.
>>
>> Agreed. Or maybe we take requests to add only from receivers.
>
> Effectively, yes. Senders won't refrain from requesting to be welcomed
> by default just because we say we don't accept those requests. Only
> receivers can corroborate the existence of any FP problem which would
> be solved by a default welcomelist entry, and this isn't a 'just find
> one example' sort of issue.

They won't refrain from writing, but it's fair to not let them open bugs
or have bugs open in the tracker. And to tell them

1) clean up your mail

2) we only take requests for defwl from actual receivers, so we're
done with this conversation. use of sock puppets is not ok.

That's what I meant by "not take requests from".

>>> 2. If senders have FPs on objectively legitimate mail, their first and
>>> most important step is to identify WHY SpamAssassin thinks it is
>>> spam, and address that. Do you need the invisible text? Is the message
>>> embedded in a remotely-fetched image? The sea of "&zwnj;" entities in
>>> your messages' HTML serves what purpose exactly? If there's a real FP
>>> problem with some rule that regularly is proved out by RuleQA, open a
>>> bug.
>>
>> Sure, but if you serve receivers, often people will have misfiling and
>> the sender is opaque, even if not spam and dkim. So saying the sender
>> should fix is misaligned with serving receivers. Yes, they *should*,
>> but people shouldn't send html mail either :-)
>
> I don't see this as misaligned, but rather a way of saying that def_w*
> entries come behind site-local receiver mitigations and
> receiver/sender collaboration on fixing the shabby mail.

What I was trying to express is that senders, even zero-spam
senders, are often enormous, opaque, and intractable. So while I agree
in theory, I guess the real question is whether we want to say to a
receiver:

your non-spam mail is spammy, and we aren't going to add a defwl
because first you need to get e.g. Bank of America to stop sending
html mail.

or

your non-spam mail is spammy and it's ok to add a defwl

I have occasionally complained to BigCorp and it has never been useful.
Sure, one can get the branch manager to reverse a fee, but I mean one
cannot get them to change their practices.

> One reason I opened this topic is that many existing listings were
> nothing like last resorts to solve concrete problems but seem to be
> more prophylactically applied. I.e. to assure that generally (and
> vaguely) 'good' senders will get their mail through despite using
> pointless antipatterns that are predominantly used by spammers. Maybe
> there's a need for that, but it should not be part of SA proper.

This is a slippery slope. We're trying to make correct classification
decisions for users. I can definitely see both sides.

But I don't mean generally/vaguely. I mean senders that are zero-spam
and likely important to receivers, in the bank/airline notification (and
similar) class. Meaning something with real-world consequences that is
timely. Not newsletters.

>> I see all spam classification as probabilistic and there is risk of FP.
>> If a domain emits *only ham* and is dkim signed, and we believe that
>> receivers want it, I think it makes sense to have it in.
>
> I see no point in that if there is no *evidence* of actual FPs. I
> don't think the default rules should try to game local incidents of
> Bayes or AWL dis-learning that ends up hitting banking
> notifications. Or (at the risk of being misinterpreted...) by the use
> of 3rd-party rules like the KAM channel that are much tougher on the
> bad HTML practices of corporate email composers.

FWIW, I have given up on the KAM rules. The scores are insanely high
for things that appear in ham, and I was having too-frequent
misclassification. Some of the scores were triggering on things which
are not even objectively spammy, e.g. a watch rule on a technical
discussion of clocks where it was on topic and I was subscribed.

Because of the probabilistic nature, I see it as sensible to defwl
things like bank notifications (that are 100% non-spam and dkim) to
reduce the odds that future rules will cause problems. This is partly
from my KAM ruleset experience where I wake up to misfiled mail because
there is a new overly aggressive rule. Much less likely in SA proper, but
still.
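
As a minimal sketch of what such an entry looks like (alerts.example.com and example.com are placeholders, not real listings), the DKIM-authenticated form takes an address pattern and, optionally, the signing domain:

```
# Hypothetical entries; example.com stands in for a real zero-spam sender.
# Both directives require the DKIM plugin to be loaded.

# Site-local, full-strength entry in local.cf:
welcomelist_from_dkim      *@alerts.example.com  example.com

# "Default"-style entry of the kind shipped in the ruleset, which hits
# USER_IN_DEF_DKIM_WL instead:
def_welcomelist_from_dkim  *@alerts.example.com  example.com
```

Either line only takes effect when the message carries a valid DKIM signature from the named domain, so a forged From: address alone gets no benefit.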

>> I am extremely skeptical of anything that smells of email marketing
>> here. I would expect only places sending transactional mail and alerts
>> to established customers.
>
> I share the skepticism, but I have been working with business
> customers and their love of other businesspeople's email marketing
> (and random non-work-related email...) for long enough that I have
> stopped arguing with the nature of email that people eagerly desire in
> their mailboxes. I care that it is contextually safe, legal, and
> solidly consensual. There are marketers who stay inside the lines.

If it's really 100% ok, fine. I just said that I'm skeptical and thus
require more convincing from an ESP than from bank alerts, to overcome
a presumption of "email marketing is rarely ok".

> It's easy to write a rule that will identify mail from a specific
> sender pattern, passing DKIM and/or SPF. The general
> welcomelist/blocklist mechanism exists because it's easier and more
> manageable than having rules for each pattern, but testing has to be
> done with specific rules.
>
> If you want to see an example, look for the bugs in the past couple of
> years opened against the various abused TLDs that we have in a
> suspicious domains list used in various rules. The test has been to
> create rules examining specific TLDs, e.g. the xyz, best, online,
> site, fun, pro, and btc TLDs all have test rules in the current
> default ruleset. (LAt the moment it looks like xyz need to not be
> listed.)
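
For illustration only, a single-TLD test rule of the kind described might look roughly like the sketch below. The rule name is made up and the real rules in the default ruleset may well be written differently; the tiny score just lets the hit rate show up in mass-check results without affecting classification.

```
# Hypothetical test rule, not copied from the default ruleset
header   LOCAL_T_FROM_TLD_XYZ  From:addr =~ /\.xyz$/i
describe LOCAL_T_FROM_TLD_XYZ  From address is in the .xyz TLD
score    LOCAL_T_FROM_TLD_XYZ  0.01
```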

So you mean

we can add a defwl line if it passes the "this really isn't spam and
we [have evidence of FP|concern of future FP]"

and

if it's a *rule*, not a defwl, then the bar is vastly higher

and that makes sense to me.

>> It might also make sense for each welcomelist rule to have a score.
>
> Do you mean unique rules per domain? No, that's got a scaling problem.

I mean

defwl foo.org -4
defwl bar.org -2

and I get it that you object.

However, I think it would be good if one could express that, for users
to configure, even if doctrine says that the default ruleset doesn't do
that. I realize that's out of scope for this discussion.
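
A sketch of one local approximation, assuming the list directives themselves accept no per-entry score: one small rule per sender, each with its own score. The foo.org and bar.org placeholders are the ones used above, the rule names are invented, and DKIM_VALID_AU is the stock DKIM plugin rule for a valid signature from the author's domain.

```
ifplugin Mail::SpamAssassin::Plugin::DKIM

# Hypothetical per-sender rules; names and domains are placeholders
header   __LOCAL_FROM_FOO  From:addr =~ /\@foo\.org$/i
meta     LOCAL_WL_FOO      (__LOCAL_FROM_FOO && DKIM_VALID_AU)
describe LOCAL_WL_FOO      From foo.org with valid author-domain DKIM
tflags   LOCAL_WL_FOO      nice
score    LOCAL_WL_FOO      -4

header   __LOCAL_FROM_BAR  From:addr =~ /\@bar\.org$/i
meta     LOCAL_WL_BAR      (__LOCAL_FROM_BAR && DKIM_VALID_AU)
describe LOCAL_WL_BAR      From bar.org with valid author-domain DKIM
tflags   LOCAL_WL_BAR      nice
score    LOCAL_WL_BAR      -2

endif
```

This is fine for a handful of senders at one site, but it is exactly the rule-per-domain approach that does not scale for a shared default ruleset.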

> Anyone who feels it justified can (as I do) reduce the power of the default welcomelist:
>
> score USER_IN_DEF_DKIM_WL -2
> score USER_IN_DEF_SPF_WL -2

Thanks, useful to know.

> By default those each score -7.5 so a doubly-confirmed message gets
> the same insane -15 as a legacy listing (def_whitelist_from_rcvd) that
> doesn't require authentication. No such listings still exist in the
> default rules.


I am slightly skeptical of SPF vs DKIM, and I wonder how much mail there
is that

belongs on the defwl
does not have dkim

I'd be inclined to

drop defwl spf rules if there is a dkim rule

score USER_IN_DEF_SPF_WL -2.5
(in published rules)

but these are tiny nits not really relevant to your major point.
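
A site that wants the effect of not stacking SPF on top of DKIM, without waiting for any change in the published rules, could approximate it locally with an offsetting meta rule. This is only a sketch: the rule name is invented, and it assumes both rules still carry the stock -7.5 mentioned earlier in the thread.

```
# Hypothetical local rule: when both default-welcomelist rules hit,
# add back 7.5 so the pair nets -7.5 instead of -15.
meta     LOCAL_DEF_WL_DOUBLE  (USER_IN_DEF_DKIM_WL && USER_IN_DEF_SPF_WL)
describe LOCAL_DEF_WL_DOUBLE  Offsets stacking of default DKIM and SPF welcomelist hits
score    LOCAL_DEF_WL_DOUBLE  7.5
```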
Re: Defining what the default welcomelist means
I believe we are in solid agreement, a few notes below explaining how...


On 2024-04-14 at 08:00:19 UTC-0400 (Sun, 14 Apr 2024 08:00:19 -0400)
Greg Troxel <gdt@lexort.com>
is rumored to have said:

> Bill Cole <sausers-20150205@billmail.scconsult.com> writes:
>
>> On 2024-04-12 at 18:56:15 UTC-0400 (Fri, 12 Apr 2024 18:56:15 -0400)
>> Greg Troxel <gdt@lexort.com>
>>
>>> Bill Cole <billcole@apache.org> writes:
>>>
>>>> 1. We serve our users: receivers, not senders. Senders claiming FPs
>>>> need the support of a corroborating would-be receiver.
>>>
>>> Agreed. Or maybe we take requests to add only from receivers.
>>
>> Effectively, yes. Senders won't refrain from requesting to be welcomed
>> by default just because we say we don't accept those requests. Only
>> receivers can corroborate the existence of any FP problem which would
>> be solved by a default welcomelist entry, and this isn't a 'just find
>> one example' sort of issue.
>
> They won't refrain from writing, but it's fair to not let them open bugs
> or have bugs open in the tracker. And to tell them
>
> 1) clean up your mail
>
> 2) we only take requests for defwl from actual receivers, so we're
> done with this conversation. use of sock puppets is not ok.
>
> That's what I meant by "not take requests from".

Right. Anyone can open a bug, but we enthusiastically close those that are invalid.


[...]
>> I don't see this as misaligned, but rather a way of saying that def_w*
>> entries come behind site-local receiver mitigations and
>> receiver/sender collaboration on fixing the shabby mail.
>
> What I was trying to express is that senders, even zero-spam
> senders, are often enormous, opaque, and intractable. So while I agree
> in theory, I guess the real question is whether we want to say to a
> receiver:
>
> your non-spam mail is spammy, and we aren't going to add a defwl
> because first you need to get e.g. Bank of America to stop sending
> html mail.
>
> or
>
> your non-spam mail is spammy and it's ok to add a defwl
>
> I have occasionally complained to BigCorp and it has never been useful.
> Sure, one can get the branch manager to reverse a fee, but I mean one
> cannot get them to change their practices.

Right. That's why there need to be alternatives to making the mail look less spammish. No one is required to persuade bank execs to behave differently...

[...]
> But I don't mean generally/vaguely. I mean senders that are zero-spam
> and likely important to receivers, in the bank/airline notification (and
> similar) class. Meaning something with real-world consequences that is
> timely. Not newsletters.

Right.


> FWIW, I have given up on the KAM rules. The scores are insanely high
> for things that appear in ham, and I was having too-frequent
> misclassification. Some of the scores were triggering on things which
> are not even objectively spammy, e.g. a watch rule on a technical
> discussion of clocks where it was on topic and I was subscribed.

That's a rabbithole of a different nature.
My point in mentioning the KAM channel was as an example of a local choice outside of the default deployment which has a radical effect on FPs. Akin to lowering the threshold to 3.0.
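
For reference, that comparison corresponds to a one-line local setting:

```
# local.cf: tag a message as spam at 3.0 points instead of the stock 5.0
required_score 3.0
```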

(FWIW: I think the KAM rules are fine if, like PCCC, you have a staff of antispam experts and a mature package of customer-facing and staff-facing tools and processes to minimize and mitigate FPs. I use them personally, but I have a robust warren of ways for mail to get around SA analysis... )

[...]

>>> I am extremely skeptical of anything that smells of email marketing
>>> here. I would expect only places sending transactional mail and alerts
>>> to established customers.
>>
>> I share the skepticism, but I have been working with business
>> customers and their love of other businesspeople's email marketing
>> (and random non-work-related email...) for long enough that I have
>> stopped arguing with the nature of email that people eagerly desire in
>> their mailboxes. I care that it is contextually safe, legal, and
>> solidly consensual. There are marketers who stay inside the lines.
>
> If it's really 100% ok, fine. I just said that I'm skeptical and thus
> require more convincing from an ESP than from bank alerts, to overcome
> a presumption of "email marketing is rarely ok".

Yes, I don't foresee ever seriously considering the addition of any marketing-oriented ESP per se to the default welcomelist. They all sometimes send spam.

The Microsoft case is an example. The entry I removed matched any subdomain of microsoft.com, triggered by spam from an address at email.microsoft.com which came to me from a Marketo IP address. Marketo sends a LOT of spam. Marketo generally has no listing of its own.
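
To make the scope problem concrete, here is a sketch with a placeholder domain (these are not the actual entries that were in the ruleset). Address patterns are file-glob style, so one wildcard entry covers every subdomain, including ones delegated to third-party senders:

```
# Broad: matches any address at any subdomain of example.com,
# e.g. news@email.example.com sent through an outside ESP
def_welcomelist_auth  *@*.example.com

# Narrow: covers only the one subdomain known to send wanted mail
def_welcomelist_auth  *@alerts.example.com
```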

--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire