Mailing List Archive

MIME_BASE64_TEXT only on us-ascii
Hi folks,

it's seems to be that spamassins dont check non ASCII Base64 decodes Mails.

Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: base64

[BAYES_99=3.5, BAYES_999=5, HTML_FONT_LOW_CONTRAST=0.001,
HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.723,
RCVD_IN_BL_SPAMCOP_NET=1.347, RCVD_IN_RP_RNBL=1.31]


Mails with:
Content-Type: text/html;
charset="us-ascii"


would get "MIME_BASE64_TEXT"

[.BAYES_99=3.5, BAYES_999=5, CK_HELO_GENERIC=0.001,
HELO_DYNAMIC_DHCP=0.206, HTML_IMAGE_ONLY_28=1.404, HTML_MESSAGE=0.001,
HTTP_EXCESSIVE_ESCAPES=1.572, KHOP_DYNAMIC=0.001,
MIME_BASE64_TEXT=1.741, MIME_HTML_ONLY=0.723,
RAZOR2_CF_RANGE_51_100=1.886, RAZOR2_CHECK=0.922,
RCVD_IN_RP_RNBL=1.31, T_REMOTE_IMAGE=0.01]


Is this a Bug?


Kind regards
Philipp

--
Philipp Ewald
Administrator

DigiOnline GmbH, Probsteigasse 15 - 19, 50670 Köln
Fax: +49 221 6500-690, E-Mail: philipp.ewald@digionline.de

AG Köln HRB 27711, St.-Nr. 5215 5811 0640
Geschäftsführer: Werner Grafenhain

Informationen zum Datenschutz: www.digionline.de/ds
Re: MIME_BASE64_TEXT only on us-ascii
On 2021-11-12 at 04:33:34 UTC-0500 (Fri, 12 Nov 2021 10:33:34 +0100)
Philipp Ewald <philipp.ewald@digionline.de>
is rumored to have said:

> Hi folks,
>
> it's seems to be that spamassins dont check non ASCII Base64 decodes
> Mails.

I cannot make that line of text into a coherent English sentence.


> Content-Type: text/html; charset="utf-8"
> Content-Transfer-Encoding: base64
>
> [BAYES_99=3.5, BAYES_999=5, HTML_FONT_LOW_CONTRAST=0.001,
> HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.723,
> RCVD_IN_BL_SPAMCOP_NET=1.347, RCVD_IN_RP_RNBL=1.31]
>
>
> Mails with:
> Content-Type: text/html;
> charset="us-ascii"
>
>
> would get "MIME_BASE64_TEXT"
>
> [.BAYES_99=3.5, BAYES_999=5, CK_HELO_GENERIC=0.001,
> HELO_DYNAMIC_DHCP=0.206, HTML_IMAGE_ONLY_28=1.404,
> HTML_MESSAGE=0.001,
> HTTP_EXCESSIVE_ESCAPES=1.572, KHOP_DYNAMIC=0.001,
> MIME_BASE64_TEXT=1.741, MIME_HTML_ONLY=0.723,
> RAZOR2_CF_RANGE_51_100=1.886, RAZOR2_CHECK=0.922,
> RCVD_IN_RP_RNBL=1.31, T_REMOTE_IMAGE=0.01]
>
>
> Is this a Bug?

Not until it's reproducible and described in a coherent manner.

If you can provide valid email messages (perhaps artificially
constructed) that do (or don't) hit the rules that you believe they
should (or should not,) please do so.

The purpose of MIME_BASE64_TEXT is to identify messages where a text
part (or the whole message) with pure US-ASCII content has been
Base64-encoded instead of being sent unencoded (or just QP-encoded to
protect overlong lines.)
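
To make that description concrete, here is a minimal Python sketch of the idea: find text parts that are Base64-encoded and test whether the decoded bytes are pure US-ASCII. It is an illustration only, not SpamAssassin's implementation. The function name is hypothetical, and the sketch inspects the decoded content, whereas the results reported in this thread suggest the declared charset is what matters in practice.

```python
from email import message_from_string


def base64_text_is_pure_ascii(raw_message: str) -> bool:
    """Return True if some Base64-encoded text/* part decodes to pure ASCII.

    Hypothetical helper for illustration; not SpamAssassin code.
    """
    msg = message_from_string(raw_message)
    for part in msg.walk():
        # Only text/* leaf parts are of interest; multipart containers are skipped.
        if part.get_content_maintype() != "text":
            continue
        cte = str(part.get("Content-Transfer-Encoding", "")).strip().lower()
        if cte != "base64":
            continue
        payload = part.get_payload(decode=True)  # bytes after Base64 decoding
        if payload and payload.isascii():
            return True
    return False
```

A part whose decoded body contains any byte above 0x7F (for example UTF-8 encoded "Köln") makes the function return False for that part.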


--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: MIME_BASE64_TEXT only on us-ascii
> I cannot make that line of text into a coherent English sentence.

May I pray for pardon my Lord. My english is not nativ.....



Here you can test it....


Mail with:

>> Content-Type: text/html;
>> charset="us-ascii"

getting "MIME_BASE64_TEXT=1.741"

Base64 generate with site:
https://www.base64encode.org/


Kind regards





--
Philipp Ewald
Re: MIME_BASE64_TEXT only on us-ascii
On 2021-11-15 at 05:53:43 UTC-0500 (Mon, 15 Nov 2021 11:53:43 +0100)
Philipp Ewald <philipp.ewald@digionline.de>
is rumored to have said:

>> I cannot make that line of text into a coherent English sentence.
>
> May I pray for pardon my Lord. My english is not nativ.....

We work with what we have. My German would be far worse.

I suspect that the only problem here is one of unclear language.

> Here you can test it....

I have no clue what to test. I do not understand what you think is not
working as intended.

> Mail with:
>
>>> Content-Type: text/html;
>>> charset="us-ascii"
>
> getting "MIME_BASE64_TEXT=1.741"

Which is correct, if the charset is actually us-ascii and Base64
encoding is used anyway. There is no circumstance where a formally
correct text/html document that is strictly us-ascii (i.e. all entities
HTML-encoded) must be Base64-encoded. MIME_BASE64_TEXT exists because it
is unusual to base64-encode pure us-ascii AND it is a strong (albeit
imperfect) indicator of the message being spam.

> Base64 generate with site:
> https://www.base64encode.org/

Or /usr/bin/base64 or 'openssl enc' :)
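
For a reproducible test case, a pair of otherwise identical messages that differ only in the declared charset can also be built with Python's standard library. This is a sketch under assumptions: the function name, addresses, and subject are made up, and the body is deliberately pure ASCII so that only the charset label changes.

```python
import base64
from email.message import Message


def build_test_message(charset: str, html: str) -> str:
    """Build a Base64-encoded text/html message with the given charset label.

    Hypothetical helper; addresses and subject are placeholders.
    """
    msg = Message()
    msg["From"] = "sender@example.com"
    msg["To"] = "recipient@example.com"
    msg["Subject"] = "MIME_BASE64_TEXT test"
    msg["MIME-Version"] = "1.0"
    msg["Content-Type"] = f'text/html; charset="{charset}"'
    msg["Content-Transfer-Encoding"] = "base64"
    # The body is pure ASCII, so encoding it as ASCII is safe for both labels.
    encoded = base64.encodebytes(html.encode("ascii")).decode("ascii")
    msg.set_payload(encoded)
    return msg.as_string()


HTML_BODY = "<html><body><p>Test message body</p></body></html>"
ascii_message = build_test_message("us-ascii", HTML_BODY)
utf8_message = build_test_message("utf-8", HTML_BODY)
```

Saving each string to a file and running it through the scanner (for example in SpamAssassin's test mode, `spamassassin -t < file`) should show whether the rule hits for one label and not the other.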






--
Bill Cole
Re: MIME_BASE64_TEXT only on us-ascii
My problem is that this rule is useless, because I can set the charset to utf-8 and SpamAssassin then ignores this rule....


I had many spam mails get through because they were missing 1 score point, because the charset was set to "utf-8".

>> Mail with:
>>
>>>> Content-Type: text/html;
>>>> charset="us-ascii"
>>
>> getting "MIME_BASE64_TEXT=1.741"
>
> Which is correct, if the charset is actually us-ascii and Base64 encoding is used anyway. There is no circumstance where a formally correct text/html document that is strictly us-ascii (i.e. all entities HTML-encoded) must be Base64-encoded. MIME_BASE64_TEXT exists because it is unusual to base64-encode pure us-ascii AND it is a strong (albeit imperfect) indicator of the message being spam.

This is correct. But why is us-ascii required for this rule? Are spammers only in the US?

You can easily trick SpamAssassin by setting charset="utf-8".

Kind regards


On 11/16/21 4:28 AM, Bill Cole wrote:
> I have no clue what to test. I do not understand what you think is not working as intended.

--
Philipp Ewald
Re: MIME_BASE64_TEXT only on us-ascii
On Tue, 2021-11-16 at 11:32 +0100, Philipp Ewald wrote:
> This is correct. But why is us-ascii required for this rule? Are
> spammers only in the US?
>
No, it's because the base character set for e-mail bodies is US-ASCII.

Base64 encoding is a way of making sure that attachments using other
charsets (UTF-8, and those using 16-bit encoding) will look just like
US-ASCII attachments to mail-handling programs, etc., and not cause those
programs to reject the mail message. As far as I know it has no
other common, legitimate use, but it does have the side effect of making
anything that's base64-encoded unreadable.

So, you can see that the ONLY effect of using base64 encoding on an
attachment containing US-ASCII text is to make it unreadable. This is why
spammers use it: they've worked out that SA will spot and score
malicious URLs, shorteners, etc. So, some spammers think that using
base64 encoding will hide those bad URLs from SA, which is quite true.
However their tiny minds don't see that using base64 encoding on a
US-ASCII attachment is a fairly reliable spam indicator all by itself.
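
The hiding effect can be shown with a few lines of Python. This is a small sketch with a made-up URL and a deliberately naive regular expression: scanning the encoded body finds nothing, while scanning the decoded body finds the link. It makes no claim about how any particular filter processes message bodies.

```python
import base64
import re
from typing import List

# Deliberately simple URL pattern, for illustration only.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def find_urls(text: str) -> List[str]:
    """Return every substring of text that looks like an http(s) URL."""
    return URL_RE.findall(text)


html = '<a href="http://spam.example/offer">click here</a>'  # made-up URL
encoded = base64.b64encode(html.encode("ascii")).decode("ascii")
decoded = base64.b64decode(encoded).decode("ascii")

urls_in_encoded = find_urls(encoded)  # scan of the body as transmitted
urls_in_decoded = find_urls(decoded)  # scan after undoing the Base64 layer
print(urls_in_encoded, urls_in_decoded)
```

The Base64 alphabet has no ":" character, so the encoded form can never contain "http://" literally.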

Martin
Re: MIME_BASE64_TEXT only on us-ascii
Why should a utf-8 base64-encoded mail contain less spam?



When a user account gets compromised, we look into the spam mails that were sent.

Many of those mails were UTF-8 base64-encoded, and some were us-ascii.

Guess which mails got through SpamAssassin?


RIGHT. The base64-encoded mails with charset utf-8, containing the same content....


I can understand the point of this rule, but IMO this rule has a bug and should be redesigned.


--
Philipp Ewald
Re: MIME_BASE64_TEXT only on us-ascii
We support utf-8 mails and we get mails that are utf-8 base64-encoded. This should also be a reason to assign a spam score.



Sorry, I don't get it. Have a nice day.


On 11/16/21 1:00 PM, Reindl Harald wrote:
>
>
> On 16.11.21 at 12:47, Philipp Ewald wrote:
>> Why should a utf-8 base64-encoded mail contain less spam?
>
> nobody said that!
>
> MIME_BASE64_TEXT is one of hundreds if not thousands of signs of spamminess and has its place in a *score based* classification
>
> its point is that there is no reason for base64 except to try to hide the intent

--
Philipp Ewald
Re: MIME_BASE64_TEXT only on us-ascii
On 11/16/2021 7:34 AM, Philipp Ewald wrote:
> We support utf-8 mails and we get mails that are utf-8 base64-encoded. This should
> also be a reason to assign a spam score.
>
>
>
> Sorry, I don't get it. Have a nice day.

The point is this:

UTF-8 emails SHOULD be base64 encoded.

ASCII emails SHOULD NOT be base64 encoded.

Therefore, an ASCII email that IS base64 encoded is unusual and is frequently seen in
spam, so it is scored in SA.

A UTF-8 email that is base64 encoded is normal and so is not scored simply for being
encoded.

--
Bowie
Re: MIME_BASE64_TEXT only on us-ascii
> UTF-8 emails SHOULD be base64 encoded.

Hmm, most of the mails we get are not base64-encoded (with charset UTF-8), but OK.

So any UTF-8 mail which is not base64 should get a spam rating because IT SHOULD be base64-encoded?

never mind....



--
Philipp Ewald
Re: MIME_BASE64_TEXT only on us-ascii
On 2021-11-17 at 03:59:38 UTC-0500 (Wed, 17 Nov 2021 09:59:38 +0100)
Philipp Ewald <philipp.ewald@digionline.de>
is rumored to have said:

>> UTF-8 emails SHOULD be base64 encoded.
>
> Hmm, most of the mails we get are not base64-encoded (with charset
> UTF-8), but OK.

Base64 encoding is only necessary if there are non-ASCII characters
used. UTF-8 is a superset of ASCII & it is normal for MUAs to not encode
more than needed. As a result, they often do no encoding or use
quoted-printable encoding for just a few non-ASCII characters instead of
base64, which encodes whole messages and has a 33% data size penalty.

So, to be more correct: MANY UTF-8 emails MUST be base64 encoded. No
US-ASCII emails MUST be encoded.
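
The size trade-off is easy to check with Python's standard library. The sketch below uses a made-up, mostly-ASCII sample sentence with a single non-ASCII character, and compares the raw UTF-8 size with the Base64 and quoted-printable encodings of the same bytes.

```python
import base64
import quopri

# Made-up sample: mostly ASCII, one non-ASCII character per sentence.
sentence = "The meeting in Köln is confirmed for Monday; please reply to this message. "
raw = (sentence * 20).encode("utf-8")

b64 = base64.b64encode(raw)     # always 4 output bytes per 3 input bytes
qp = quopri.encodestring(raw)   # only non-ASCII bytes and '=' get escaped

print("raw bytes:        ", len(raw))
print("base64 bytes:     ", len(b64))
print("quoted-printable: ", len(qp))
print("base64 overhead:   %.1f%%" % (100.0 * (len(b64) - len(raw)) / len(raw)))
```

For text like this, quoted-printable stays close to the raw size, while Base64 costs about a third more regardless of content (and slightly more again once line breaks are added for transport).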

> So any UTF-8 mail which is not base64 should get a spam rating because IT
> SHOULD be base64-encoded?
>
> never mind....

This reflects a widespread misunderstanding about how SpamAssassin works
and the rationale behind the rules.

SpamAssassin rules are not laws in any sense. They do not prescribe or
proscribe any action. They do not reflect any sort of moral or ethical
judgment. They do not express or define technical correctness. They were
created almost entirely by human guesses, across almost 20 years, judged
and validated by our RuleQA process which determines the publication and
scoring of individual rules within the whole set on a daily basis.

Whether a rule is published and what score it is assigned by default
depends solely on how that rule has *proven* itself useful in
discriminating between spam and ham. A rule does not exist because it
"should" but because it WORKS. We do not force-publish rules because we
think mail should or should not have particular attributes; we TEST, and
the testing determines what gets scored. No amount of rigorous logic
grounded in technical theory can override the judgment of RuleQA.
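
As a rough illustration of what "discriminating between spam and ham" can mean numerically, here is a small Python sketch that compares a rule's hit rate on a spam corpus with its hit rate on a ham corpus. The function name and all counts are invented for the example; this is one simple measure and is not claimed to be the computation RuleQA performs.

```python
from typing import Optional


def spam_share_of_hit_rate(spam_hits: int, spam_total: int,
                           ham_hits: int, ham_total: int) -> Optional[float]:
    """Return spam_rate / (spam_rate + ham_rate), or None if the rule never hits.

    Values near 1.0 mean the rule fires almost only on spam; values near
    0.5 mean it does not separate spam from ham at all.
    """
    spam_rate = spam_hits / spam_total
    ham_rate = ham_hits / ham_total
    if spam_rate + ham_rate == 0:
        return None
    return spam_rate / (spam_rate + ham_rate)


# Invented counts, purely for illustration.
useful_rule = spam_share_of_hit_rate(900, 10000, 10, 10000)
useless_rule = spam_share_of_hit_rate(500, 10000, 500, 10000)
print(useful_rule, useless_rule)
```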


--
Bill Cole
Re: MIME_BASE64_TEXT only on us-ascii
> On Nov 17, 2021, at 9:50 AM, Bill Cole <sausers-20150205@billmail.scconsult.com> wrote:
>
> SpamAssassin rules are not laws in any sense. They do not prescribe or proscribe any action. They do not reflect any sort of moral or ethical judgment. They do not express or define technical correctness.


Isn't that exactly what we're discussing here? "Technical correctness"?

Good internetworking implementations follow (to the extent they don't conflict with good security practices) Postel's Law, "be conservative in what you send, be liberal [but not naive] in what you accept".

The point earlier in the thread was that using more encoding than is strictly necessary is not being "conservative in what you send", since it puts extra burden on the receiver to have a robust and complete implementation, and creates more opportunity to have an interoperability failure.

Rereading:


> Base64 encoding is only necessary if there are non-ASCII characters used. UTF-8 is a superset of ASCII & it is normal for MUAs to not encode more than needed.


Exactly. Encoding is only used when and where necessary.

Properly encoded HTML uses HTML-Entity naming, which is also ASCII-friendly, i.e. &eacute; instead of Latin1 &#233; etc. or raw 8bit characters.
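
A minimal Python sketch of that approach: replace each non-ASCII character with its named entity where the standard table has one, and fall back to a numeric reference otherwise. The function name is hypothetical, and the sketch leaves characters such as "&" and "<" untouched, so it is not a general HTML escaper.

```python
from html.entities import codepoint2name


def to_ascii_html(text: str) -> str:
    """Rewrite non-ASCII characters as HTML entities so the result is pure ASCII.

    Hypothetical helper for illustration only.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # already ASCII, keep as-is
        elif cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")  # named entity
        else:
            out.append("&#%d;" % cp)  # numeric character reference
    return "".join(out)


print(to_ascii_html("café in Köln"))
```

The output contains only ASCII characters, so a text/html part built this way needs no content transfer encoding at all.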

-Philip
Re: MIME_BASE64_TEXT only on us-ascii
On Tue, Nov 30, 2021 at 12:03:15PM -0700, Philip Prindeville wrote:
> > On Nov 17, 2021, at 9:50 AM, Bill Cole <sausers-20150205@billmail.scconsult.com> wrote:
> > SpamAssassin rules are not laws in any sense. They do not prescribe or proscribe any action. They do not reflect any sort of moral or ethical judgment. They do not express or define technical correctness.
>
> Isn't that exactly what we're discussing here? "Technical correctness"?

Hm, no? An app encoding pure ASCII in Base64 is not breaking any RFC.
So it is behaving "technically correctly".

> Good internetworking implementations follow (to the extent they don't conflict with good security practices) Postel's Law, "be conservative in what you send, be liberal [but not naive] in what you accept".

Well, antispam efforts (as is security for important stuff) are
mostly exactly the OPPOSITE of good internetworking implementations
of the old Postel's law.

And for good reasons: in the internetworking implementations of
old, the vast majority of peers (if not all) you interacted with
were GOOD guys trying to do good things.

In today's e-mail (and security), the majority of the actors are
enemies trying to penetrate your defensive lines.

Also, see https://en.wikipedia.org/wiki/Robustness_principle#Criticism


> Rereading:
> > Base64 encoding is only necessary if there are non-ASCII characters used. UTF-8 is a superset of ASCII & it is normal for MUAs to not encode more than needed.
>
> Exactly. Encoding is only used when and where necessary.

...by legitimate users. Spammers on the other hand will sometimes
encode even when it is NOT needed, probably in an attempt to evade less
advanced antispam tools (or due to sheer laziness when writing the spam
tool).

The fact that such encoding is technically allowed does NOT change the
fact that the technique is used vastly more by spammers than by
innocent parties.

> Properly encoded HTML uses HTML-Entity naming, which is also ASCII-friendly, i.e. &eacute; instead of Latin1 &#233; etc. or raw 8bit characters.

There are several "proper" (i.e. allowed by different RFCs) ways to
encode that information in mail. Statistical analyses seem to say that
some of the ways are used much more by spammers than by legitimate
users. Hence, the score for those methods.

--
Opinions above are GNU-copylefted.
Re: MIME_BASE64_TEXT only on us-ascii
On Tue, 30 Nov 2021, Philip Prindeville wrote:

>> On Nov 17, 2021, at 9:50 AM, Bill Cole <sausers-20150205@billmail.scconsult.com> wrote:
>>
>> SpamAssassin rules are not laws in any sense. They do not prescribe or proscribe any action. They do not reflect any sort of moral or ethical judgment. They do not express or define technical correctness.
>
> Isn't that exactly what we're discussing here? "Technical correctness"?

The way I generally put it is: SpamAssassin is not an RFC-compliance audit
tool.

--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
The police of a state should never be stronger or better armed
than the citizenry. An armed citizenry, willing to fight, is the
foundation of civil freedom. -- Robert A. Heinlein, 1942
-----------------------------------------------------------------------
549 days since the first private commercial manned orbital mission (SpaceX)
Re: MIME_BASE64_TEXT only on us-ascii
> On Nov 30, 2021, at 1:10 PM, Matija Nalis <mnalis-sa-list@voyager.hr> wrote:
>
> On Tue, Nov 30, 2021 at 12:03:15PM -0700, Philip Prindeville wrote:
>>> On Nov 17, 2021, at 9:50 AM, Bill Cole <sausers-20150205@billmail.scconsult.com> wrote:
>>> SpamAssassin rules are not laws in any sense. They do not prescribe or proscribe any action. They do not reflect any sort of moral or ethical judgment. They do not express or define technical correctness.
>>
>> Isn't that exactly what we're discussing here? "Technical correctness"?
>
> Hm, no? An app encoding pure ASCII in Base64 is not breaking any RFC.
> So it is behaving "technically correctly".


Again, Postel's Rule.

Excessive and unnecessary encoding isn't behaving correctly.


>
>> Good internetworking implementations follow (to the extent they don't conflict with good security practices) Postel's Law, "be conservative in what you send, be liberal [but not naive] in what you accept".
>
> Well, antispam efforts (as is security for important stuff) are
> mostly exactly the OPPOSITE of good internetworking implementations
> of the old Postel's law.


Yeah, they date from a more innocent time. Unfortunately Jon passed away before he could adjust it for a more modern world. (He was one of my mentors and I miss him, along with Bob Braden.)


> And for good reasons: in the internetworking implementations of
> old, the vast majority of peers (if not all) you interacted with
> were GOOD guys trying to do good things.
>
> In today's e-mail (and security), the majority of the actors are
> enemies trying to penetrate your defensive lines.


That might be overstated.


> Also, see https://en.wikipedia.org/wiki/Robustness_principle#Criticism


I'm aware. Jon and I had a few arguments about this.

Including about how it weakened the effectiveness of Bake-Offs and stringency/conformance testing.



>> Rereading:
>>> Base64 encoding is only necessary if there are non-ASCII characters used. UTF-8 is a superset of ASCII & it is normal for MUAs to not encode more than needed.
>>
>> Exactly. Encoding is only used when and where necessary.
>
> ...by legitimate users. Spammers on the other hand will sometimes
> encode even when it is NOT needed, probably in an attempt to evade less
> advanced antispam tools (or due to sheer laziness when writing the spam
> tool).
>
> The fact that such encoding is technically allowed does NOT change the
> fact that the technique is used vastly more by spammers than by
> innocent parties.


I don't think anyone is arguing otherwise.

-Philip

