Mailing List Archive

SA seems powerless against marketing emails for SEO/web development
For whatever reason, solicitations from marketers for various web
development services are easily slipping through my defenses. I figured
Bayes filtering would eventually do the job, but after reporting them
for many days now, I'm still getting three to half a dozen a day. Here's
one example: https://paste.debian.net/1194735/

The report for this email:

Content analysis details: (-1.0 points, 5.0 required)

 pts rule name                  description
---- -------------------------- --------------------------------------------------
-0.0 RCVD_IN_DNSWL_NONE         RBL: Sender listed at https://www.dnswl.org/,
                                no trust
                                [209.85.210.44 listed in list.dnswl.org]
-1.0 BAYES_00                   BODY: Bayes spam probability is 0 to 1%
                                [score: 0.0000]
-0.0 SPF_PASS                   SPF: sender matches SPF record
 0.2 FREEMAIL_ENVFROM_END_DIGIT Envelope-from freemail username ends in digit
                                [margaretkelly866[at]gmail.com]
 0.0 FREEMAIL_FROM              Sender email is commonly abused enduser mail
                                provider
                                [margaretkelly866[at]gmail.com]
 0.0 SPF_HELO_NONE              SPF: HELO does not publish an SPF Record
-0.0 RCVD_IN_MSPIKE_H3          RBL: Good reputation (+3)
                                [209.85.210.44 listed in wl.mailspike.net]
 0.0 HTML_MESSAGE               BODY: HTML included in message
-0.1 DKIM_VALID_EF              Message has a valid DKIM or DK signature from
                                envelope-from domain
 0.1 DKIM_SIGNED                Message has a DKIM or DK signature, not
                                necessarily valid
-0.1 DKIM_VALID_AU              Message has a valid DKIM or DK signature from
                                author's domain
-0.1 DKIM_VALID                 Message has at least one valid DKIM or DK
                                signature
-0.0 RCVD_IN_MSPIKE_WL          Mailspike good senders

This email is a bit of an outlier, as most of these emails get flagged
with BAYES_99 and BAYES_999, but this one actually gets BAYES_00.


My bayes filter has been trained with about 2000 examples of spam and
ham.

Not sure what to do at this point. I'm thinking about scoring up emails
if they mention stuff like "SEO", "web design", etc., but I'm not sure if
this is the best approach. Feels like a finger-in-the-dike approach.
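
The keyword idea above could be sketched as a local body rule. This is only an illustration: the rule name LOCAL_SEO_PITCH, the phrase list, and the 1.0 score are assumptions rather than anything from the original message, and the pattern would need checking against real ham before use.

```conf
# Hypothetical local.cf fragment; rule name, phrases, and score are assumptions.
# Matches a few marketing phrases anywhere in the rendered message body.
body      LOCAL_SEO_PITCH  /\b(?:SEO|search engine optimi[sz]ation|web\s+(?:design|development))\b/i
describe  LOCAL_SEO_PITCH  Body mentions SEO or web design services
score     LOCAL_SEO_PITCH  1.0
```

Keeping the score well below the 5.0 threshold means the rule can only tip messages that other rules already consider suspicious.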
Re: SA seems powerless against marketing emails for SEO/web development
On 22.04.21 14:21, Steve Dondley wrote:
> [snip]
>
>This email is a bit of an outlier, as most of these emails get
>flagged with BAYES_99 and BAYES_999, but this one actually gets
>BAYES_00.

>My bayes filter has been trained with about 2000 examples of spam and
>ham.

Now train as needed: this one as spam.

--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
10 GOTO 10 : REM (C) Bill Gates 1998, All Rights Reserved!
Re: SA seems powerless against marketing emails for SEO/web development
On Thu, 22 Apr 2021, Matus UHLAR - fantomas wrote:

> On 22.04.21 14:21, Steve Dondley wrote:
>> pts rule name description
>> ---- ----------------------
>> --------------------------------------------------
>> -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/,
>> no trust
>> [209.85.210.44 listed in list.dnswl.org]
>> -1.0 BAYES_00 BODY: Bayes spam probability is 0 to 1%
>> [score: 0.0000]
[snip..]
>> -0.0 RCVD_IN_MSPIKE_WL Mailspike good senders
>>
>> This email is a bit of an outlier, as most of these emails get flagged
>> with BAYES_99 and BAYES_999, but this one actually gets BAYES_00.
>
>> My bayes filter has been trained with about 2000 examples of spam and ham.
>
> now, train as needed - this one as spam.

In that spam there was a tracking link at the bottom with a URL of the form:
https://name-company-track.appspot.com/Firebase?bunch-of-long-tracking-variables

How hard would it be to modify the uribl lookup code so that it did not truncate
host names, so we could create uribl entries of the form
"name-company-track.appspot.com"? Or would that be prohibitively expensive in
lookups?

I regularly see phish/spam that has URL hosts of the form some-name.blogspot.com
or other-name.webhosting.com and it would be nice to be able to slam those
things into a uribl list (I run my own).


--
Dave Funk University of Iowa
<dbfunk (at) engineering.uiowa.edu> College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center, 103 S Capitol St.
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
Re: SA seems powerless against marketing emails for SEO/web development
David B Funk wrote:
> How hard would it be to modify the uribl lookup code so that it did not
> truncate hosts names, so we could create uribl entries of the form
> "name-company-track.appspot.com" or would that be prohibitively
> expensive in lookups?

util_rb_2tld <domain>

For appspot.com specifically, according to my notes it's already in that
list. Check your URI extraction process.

I've had a subdomain on the 3tld version for some time:

util_rb_3tld r.appspot.com

Somewhat more recently I've also taken to creating dedicated rules for
some of the more popular domains I've added to these lists. These score
up all messages with matching FQDNs a little, whether or not the specific
name has come to my attention for listing on the local URI DNSBL.
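
Put together, the setup described here might look roughly like the following in local.cf. This is a sketch under assumptions: the zone uribl.example.org, the 127.0.0.2 return value, the LOCAL_* rule names, and all scores are placeholders, not the poster's actual configuration.

```conf
# Treat appspot.com like a second-level TLD so one more label is kept and
# "name-company-track.appspot.com" is looked up instead of "appspot.com".
# (Reportedly already in the stock list; shown here only for the form.)
util_rb_2tld  appspot.com

# Hypothetical local URI DNSBL; zone and return value are placeholders.
urirhssub  LOCAL_URIBL  uribl.example.org.  A  127.0.0.2
body       LOCAL_URIBL  eval:check_uridnsbl('LOCAL_URIBL')
describe   LOCAL_URIBL  URL host listed in the local URI DNSBL
tflags     LOCAL_URIBL  net
score      LOCAL_URIBL  3.0

# Dedicated rule: score any appspot.com subdomain a little, listed or not.
uri        LOCAL_URI_APPSPOT  /^https?:\/\/[^\/]+\.appspot\.com(?:[\/:?]|$)/i
describe   LOCAL_URI_APPSPOT  URL on an appspot.com subdomain
score      LOCAL_URI_APPSPOT  0.5
```

The small fixed score on the dedicated rule covers hosts that have not been listed yet, while the DNSBL rule carries the heavier score for names that have.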

-kgd
Re: SA seems powerless against marketing emails for SEO/web development
On 2021-04-22 02:31 PM, Matus UHLAR - fantomas wrote:
> On 22.04.21 14:21, Steve Dondley wrote:
>> [snip]
>>
>> This email is a bit of an outlier, as most of these emails get
>> flagged with BAYES_99 and BAYES_999, but this one actually gets
>> BAYES_00.
>
>> My bayes filter has been trained with about 2000 examples of spam and
>> ham.
>
> now, train as needed - this one as spam.

OK, so I fixed my configuration issue. So now the bayes filtering is
working when I flag an email as spam in my mail client:

Content analysis details: (4.5 points, 5.0 required)

 pts rule name                  description
---- -------------------------- --------------------------------------------------
<snip>
 1.0 BAYES_999                  BODY: Bayes spam probability is 99.9 to 100%
                                [score: 1.0000]
 3.5 BAYES_99                   BODY: Bayes spam probability is 99 to 100%
                                [score: 1.0000]
<snip>

But as you can see, the email is still not hitting the 5.0 threshold.

I could add another point between the BAYES_999 and BAYES_99 scores, but
that seems reactionary. Is there a better way? Should I throw in another
point for certain keywords in marketing emails like these?
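
For reference, adding a point on the Bayes side would be a one-line score override. The value below is an assumption, chosen so that the two Bayes rules alone clear the threshold; it is not a recommendation.

```conf
# Illustrative override: raises BAYES_999 from the 1.0 shown in the report.
# With BAYES_99 at 3.5 this gives 3.5 + 2.0 = 5.5, above the 5.0 threshold.
score BAYES_999 2.0
```

The trade-off is that any ham Bayes misclassifies at 99.9% or higher would then be tagged as spam with no other evidence needed.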
Re: SA seems powerless against marketing emails for SEO/web development
On 2021-04-22 23:54, Steve Dondley wrote:

> Content analysis details: (4.5 points, 5.0 required)
>
> pts rule name description
> ---- ----------------------
> --------------------------------------------------
> <snip>
> 1.0 BAYES_999 BODY: Bayes spam probability is 99.9 to
> 100%
> [score: 1.0000]
> 3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%
> [score: 1.0000]
> <snip>
>
> But as you can see, the email is still not hitting the 5.0 threshold.
>
> I could add another point between the BAYES_999 and BAYES_99 scores, but
> that seems reactionary. Is there a better way? Should I throw in
> another point for certain keywords in marketing emails like these?

Add score to the rules that currently hit at a positive 0.0,

until the total reaches 5.0 or above.
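
One way to read this suggestion, using rules that showed 0.0 in the first report. The values are assumptions, and whether these rules also hit the later message is unknown because that report was snipped.

```conf
# Illustrative overrides for rules that scored 0.0 in the first report.
# Values are guesses; raise them gradually and watch for false positives.
score FREEMAIL_FROM  0.3
score SPF_HELO_NONE  0.2
```

HTML_MESSAGE is left alone in this sketch, since it also hits most legitimate HTML mail.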
Re: SA seems powerless against marketing emails for SEO/web development
> I could add another point between the BAYES_999 and BAYES_99 scores, but that
> seems reactionary. Is there a better way? Should I throw in another point
> for certain keywords in marketing emails like these?

For this specific message I might be inclined to add a rule to check for a
URL in the subject and add a point for that. I can't think of very many
legit mails I've ever received with a URL in the subject. A point or two for
that should be safe enough if it isn't spam, but could trip it over the edge
if it is.

header URI_IN_SUBJECT Subject =~ /\b(?:[-\w.]+\@)?(?:[-\w]+\.)+(?:com|org|biz|cloud)\b/i
score URI_IN_SUBJECT 1.5
describe URI_IN_SUBJECT A URI in the subject of the message

Something like that, maybe.

Loren
Re: SA seems powerless against marketing emails for SEO/web development
>> I could add another point between the BAYES_999 and BAYES_99 scores, but
>> that seems reactionary. Is there a better way? Should I throw in
>> another point for certain keywords in marketing emails like these?
>
> add score to tags that score positive 0.0
>
> until it gives 5.0 and above

I like this idea. Seems reasonable. Thanks.
Re: SA seems powerless against marketing emails for SEO/web development
On 2021-04-23 17:43, Steve Dondley wrote:
>>> I could add another point between the BAYES_999 and BAYES_99 scores, but
>>> that seems reactionary. Is there a better way? Should I throw in
>>> another point for certain keywords in marketing emails like these?
>>
>> add score to tags that score positive 0.0
>>
>> until it gives 5.0 and above
>
> I like this idea. Seems reasonable. Thanks.

Also check for whitelist entries with very low scores that push spam
into negative territory, and change those whitelist scores to be more
neutral. Do not remove the whitelist (oops, welcomelist :=)).

Currently it's a mess with welcomelist / whitelist; I hope blacklist /
whitelist is removed in SpamAssassin 4.x.x.

At that time end users will need to change many config files without
any payback :(
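
A sketch of softening whitelist credit rather than removing the entries. The rule names are the 3.4-era ones and may differ under the welcomelist naming; the values are assumptions to be tuned locally.

```conf
# Illustrative: reduce the bonus from default whitelist hits, keep the entries.
# Rule names assume SpamAssassin 3.4.x; values are placeholders.
score USER_IN_DEF_WHITELIST  -5.0
score USER_IN_DEF_DKIM_WL    -3.0
```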
Re: SA seems powerless against marketing emails for SEO/web development
On Thu, 22 Apr 2021 14:21:05 -0400
Steve Dondley wrote:

> I'm still getting like 3 to half
> dozen a day. Here's one example: https://paste.debian.net/1194735/

Apparently it already expired.