URIDNSBL full message checking
I’m noticing that check_uridnsbl() seems only to check the message body. Is there some way to make it check the headers as well?

In 25_uribl.cf, I have:

urirhssub URIBL_BLACK multi.uribl.com. A 2
body URIBL_BLACK eval:check_uridnsbl('URIBL_BLACK')
describe URIBL_BLACK Contains an URL listed in the URIBL blacklist
tflags URIBL_BLACK net
reuse URIBL_BLACK

First obvious thing I tried was changing ‘body’ to ‘full’ in the above. It continues to check only the body. In fact, changing it to ‘header’, it continues to check the body. I then read through the man page on URIDNSBL and it does clearly state a ‘body’ rule.

Is there some clever way to have a URIDNSBL rule check the header of a message as well? Or is there something else I can use separately that would look up a domainname in the header section of an email?

Michael Grant
Re: URIDNSBL full message checking
On 2023-02-06 at 12:50:29 UTC-0500 (Mon, 6 Feb 2023 17:50:29 +0000)
Michael Grant via users <mgrant@grant.org>
is rumored to have said:

> I’m noticing that check_uridnsbl() seems only to check the message
> body. Is there some way to make it check the headers as well?

No. Which is fine, because there are usually no URIs in headers, and
when there are, they are likely to be standard List-* headers, which are
unlikely to be useful.


> In 25_uribl.cf, I have:
>
> urirhssub URIBL_BLACK multi.uribl.com. A 2
> body URIBL_BLACK eval:check_uridnsbl('URIBL_BLACK')
> describe URIBL_BLACK Contains an URL listed in the URIBL
> blacklist
> tflags URIBL_BLACK net
> reuse URIBL_BLACK
>
> First obvious thing I tried was changing ‘body’ to ‘full’ in
> the above. It continues to check only the body. In fact, changing it
> to ‘header’, it continues to check the body. I then read through
> the man page on URIDNSBL and it does clearly state a ‘body’ rule.

Predictable. :)

> Is there some clever way to have a URIDNSBL rule check the header of a
> message as well? Or is there something else I can use separately that
> would look up a domainname in the header section of an email?

Nothing comes to mind.

You can obviously use 'full' or the 'all' pseudo-header and look for
specific domains, but identifying everything in the header that COULD be
a domain and just testing that against a DNSBL designed for domains
found in URIs could have very bad failure modes.
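
A rough illustration of that failure mode, as a minimal Python sketch. The sample headers, the regex, and the function name candidate_domains are all invented for this example; none of it is something SpamAssassin does. It only shows how a naive "anything that looks like a domain" scan of a header block picks up strings that were never meant to be looked up in a domain list.

```python
import re

# Hypothetical header block, invented for illustration only.
SAMPLE_HEADERS = """\
Received: from mail.example.net (mail.example.net [192.0.2.10]) by mx.example.org; Mon, 6 Feb 2023 17:50:29 +0000
From: Sender <someone@example.com>
Message-ID: <20230206.abc123@mail.example.net>
X-Mailer: FooMailer 2.1.beta
"""

# Naive pattern: dot-separated labels ending in an alphabetic label.
DOMAIN_LIKE = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)


def candidate_domains(headers: str) -> set[str]:
    """Return every domain-looking string found anywhere in the headers."""
    return {m.group(0).lower() for m in DOMAIN_LIKE.finditer(headers)}


if __name__ == "__main__":
    for name in sorted(candidate_domains(SAMPLE_HEADERS)):
        print(name)
```

Besides the From domain, the scan also returns relay hostnames from Received and a version string from X-Mailer, which is the kind of noise a URI-oriented list was not built to answer for.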


--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: URIDNSBL full message checking
On Mon, Feb 06, 2023 at 04:16:46PM -0500, Bill Cole wrote:
> On 2023-02-06 at 12:50:29 UTC-0500 (Mon, 6 Feb 2023 17:50:29 +0000)
> Michael Grant via users <mgrant@grant.org>
> is rumored to have said:
>
> > I’m noticing that check_uridnsbl() seems only to check the message body.
> > Is there some way to make it check the headers as well?
>
> No. Which is fine, because there are usually no URIs in headers, and when
> there are, they are likely to be standard List-* headers, which are unlikely
> to be useful.

It's actually just a domain name. This uridnsbl keys off domain names
in the body too, so I was kinda hoping it would look at the domain names
in the headers the same way. Guess not.

> You can obviously use 'full' or the 'all' pseudo-header and look for
> specific domains, but identifying everything in the header that COULD be a
> domain and just testing that against a DNSBL designed for domains found in
> URIs could have very bad failure modes.

How about just, say, the From or Received headers? Is there something
like check_rbl that would look up a domain name rather than an ip
address that I could look up the domain in that URIBL list?

I played with check_rbl() but this seems only to look up numeric ip
addresses.

Michael Grant
Re: URIDNSBL full message checking
Hello Michael,

>> No. Which is fine, because there are usually no URIs in headers, and when
>> there are, they are likely to be standard List-* headers, which are unlikely
>> to be useful.

Don't agree with that. We see many use cases for header checks...

We see many spams with a from domain inside SURBL.
And we have been testing this on our corpus for over 18 months now.

>> You can obviously use 'full' or the 'all' pseudo-header and look for
>> specific domains, but identifying everything in the header that COULD be a
>> domain and just testing that against a DNSBL designed for domains found in
>> URIs could have very bad failure modes.

I think we passed that point some years ago tbh.

> How about just say the from or received headers? Is there something
> like check_rbl that would look up a domain name rather than an ip
> address that I could look up the domain in that URIBL list?
>
> I played with check_rbl() but this seems only to look up numeric ip
> addresses.

You can test with:

header SURBL_MULTI_HDR eval:check_hashbl_emails('multi.surbl.org', 'raw/max=10/shuffle/host', 'ALLFROM/Reply-To', '^127\.0\.0\.\d+$')
priority SURBL_MULTI_HDR -100
describe SURBL_MULTI_HDR Domain in email headers found in surbl multi

And score accordingly.

You could also check Reply-To, From, and so on separately.
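
For anyone wondering what a rule like this boils down to, here is a minimal Python sketch under my own assumptions: that 'raw' plus 'host' means the unhashed domain part of each address is queried under the zone, and that the last argument is a regex applied to the returned A record. The function names are hypothetical, no DNS query is actually made, and this is not the HashBL plugin's code.

```python
import re
from email.utils import parseaddr


def header_domain(header_value: str) -> str:
    """Return the lowercased domain of the address in a header value, or ''."""
    _name, addr = parseaddr(header_value)
    _local, sep, domain = addr.rpartition("@")
    return domain.lower().rstrip(".") if sep else ""


def dnsbl_query_name(domain: str, zone: str) -> str:
    """Name that would be looked up: the domain prepended to the list zone."""
    return f"{domain}.{zone.strip('.')}"


def answer_matches(answer: str, pattern: str = r"^127\.0\.0\.\d+$") -> bool:
    """Apply the rule's response filter to one returned A record."""
    return re.search(pattern, answer) is not None


if __name__ == "__main__":
    domain = header_domain('"Some Sender" <promo@Mail.Example.com>')
    # The name a resolver would be asked for (no lookup is performed here).
    print(dnsbl_query_name(domain, "multi.surbl.org"))
    print(answer_matches("127.0.0.8"))
```

A real lookup may normalize the name further (for example down to the registered domain); that step is left out here.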

Have fun,
Raymond Dijkxhoorn - SURBL
Re[2]: URIDNSBL full message checking
>It's actually just a domain name. This uridnsbl keys off domain names
>in the body too, I was kinda hoping it would look at the domain names
>in the headers like the body, guess not.

So there's an interesting history here. Back in the early/mid 2000s,
when SURBL, URIBL, and invaluement's URI lists were just starting (I was
there!) - we didn't have reliable and universally-used/established
domain authentication tools like SPF and DKIM and even ESPs were either
non-existent or just beginning. Therefore, the vast majority of spammers
were sending from their own servers (or bots!) - and both the mail
header from and the SMTP-envelope FROM - in spams - were 99+% of the time
forged. So trying to run a DNSBL that listed the domains found in the
headers was a horrible idea because a massive percentage of spam used
forged domains. That was then a losing game of whack-a-mole that would
only add much useless one-off data to a dnsbl, as well as providing
spammers with intel they could use to find DNSBL spamtrap addresses.

Today, so much is radically different since now many spams have their
domains authenticated with things like SPF and DKIM. Therefore, SURBL
and URIBL and Spamhaus's DBL have since moved more towards purposely
including those header and SMTP-envelope domains (as well as the domain
at the end of the PTR record) as things that they specifically target
with their domain/URI lists. But these are things that are "consumed" by SA
with OTHER rules, not with URIDNSBL. (Also, postfix has some good rules
for this too which don't require callouts to content filters like SA.
Exim and others probably do, too?)

At invaluement - we're very very late to this game - and we're going a
different route - choosing to target these with a separate list, not our
URI list - this will be our SED list, which is currently under
development - although, in the meantime, many of our subscribers use our
existing URI list in this way, outside of our recommendations, and are
happy with those results.

The main takeaways are:
(1) these require different rules than the URIDNSBL module (since
URIDNSBL is for checking domains/IPs inside the clickable links in the
body of the message)
(2) Any DNSBL trying to do this should pay attention to authentication,
and not just throw every such domain into the list without being sure
it really is them and not a forged domain.

I hope this helps!

Rob McEwen, invaluement
Re: Re[2]: URIDNSBL full message checking
You could also use check_rbl_headers

Add this to init.pre or in your favorite .pre file:
loadplugin Mail::SpamAssassin::Plugin::DNSEval

Then add this rule:
if (version >= 3.004003)
ifplugin Mail::SpamAssassin::Plugin::DNSEval
header HEADERBL_URIBL eval:check_rbl_headers('hdrbl-uribl', 'multi.uribl.com.', '127.0.0.2')
describe HEADERBL_URIBL Header contains domain listed in URIBL
tflags HEADERBL_URIBL net
endif
endif

You can define in which headers it should look for domains using "rbl_headers". Have a look at the documentation with:
perldoc Mail::SpamAssassin::Plugin::DNSEval

Good luck,
Laurent S.
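
One thing to keep in mind with a fixed '127.0.0.2' subtest, given that the list's answer is described elsewhere in this thread as a bitmap: if the subtest is compared as an exact address (an assumption here, not something checked against the DNSEval source), an answer with additional bits set would not match. A small Python sketch of the difference, with hypothetical helper names:

```python
import ipaddress


def exact_match(answer: str, expected: str = "127.0.0.2") -> bool:
    """True only when the returned address equals the expected one."""
    return ipaddress.ip_address(answer) == ipaddress.ip_address(expected)


def mask_match(answer: str, mask: int = 2) -> bool:
    """True when the last octet of the answer shares any bit with the mask."""
    last_octet = int(ipaddress.ip_address(answer)) & 0xFF
    return bool(last_octet & mask)


for answer in ("127.0.0.2", "127.0.0.6", "127.0.0.4"):
    print(answer, exact_match(answer), mask_match(answer))
```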
Re: URIDNSBL full message checking
>On 2023-02-06 at 12:50:29 UTC-0500 (Mon, 6 Feb 2023 17:50:29 +0000)
>Michael Grant via users <mgrant@grant.org>
>is rumored to have said:
>
>>I’m noticing that check_uridnsbl() seems only to check the message
>>body. Is there some way to make it check the headers as well?

On 06.02.23 16:16, Bill Cole wrote:
>No. Which is fine, because there are usually no URIs in headers, and
>when there are, they are likely to be standard List-* headers, which
>are unlikely to be useful.

I got a few spams containing List-Id: so I'm going through my archive.

I remember receiving many spams from Google (groups?) containing this header.
Unfortunately I don't have any samples right now.

However, looking at my spam
- there are many bogus list-id headers:

List-Id: b07285867v11317517
List-Id: "0" <1018.14c124b8eb050d06d6f466cb3e9.localhost>
List-Id: lm555vqc6z9b6 <linkedin>
List-Id: <spc-88419-0>

and I can already see a rule to catch these.
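
A possible shape for such a check, sketched in Python rather than as an actual SpamAssassin rule. The heuristic is an assumption of this sketch: a plausible List-Id ends in an angle-bracketed, dot-separated identifier whose last label is alphabetic and is not "localhost". RFC 2919 does permit a .localhost namespace for unmanaged lists, so treating it as suspicious is a local policy choice, and the function name is hypothetical.

```python
import re

BRACKETED_ID = re.compile(r"<([^<>\s]+)>\s*$")
LABEL = re.compile(r"[A-Za-z0-9-]+")


def list_id_looks_plausible(value: str) -> bool:
    """Heuristic: bracketed, multi-label identifier ending in a real-looking TLD."""
    m = BRACKETED_ID.search(value)
    if not m:
        return False  # no <...> identifier at all
    labels = m.group(1).split(".")
    if len(labels) < 2 or labels[-1].lower() == "localhost":
        return False  # single label, or the unmanaged .localhost namespace
    return all(LABEL.fullmatch(label) for label in labels) and labels[-1].isalpha()


samples = [
    'b07285867v11317517',
    '"0" <1018.14c124b8eb050d06d6f466cb3e9.localhost>',
    'lm555vqc6z9b6 <linkedin>',
    '<spc-88419-0>',
]
for value in samples:
    print(list_id_looks_plausible(value), value)
```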


And, there are some lists I repeatedly got such spam from (I haven't
subscribed), so using them at least with my local BL could help.


>>In 25_uribl.cf, I have:
>>
>>urirhssub URIBL_BLACK multi.uribl.com. A 2
>>body URIBL_BLACK eval:check_uridnsbl('URIBL_BLACK')
>>describe URIBL_BLACK Contains an URL listed in the URIBL
>>blacklist
>>tflags URIBL_BLACK net
>>reuse URIBL_BLACK
>>
>>First obvious thing I tried was changing ‘body’ to ‘full’ in the
>>above. It continues to check only the body. In fact, changing it
>>to ‘header’, it continues to check the body. I then read through
>>the man page on URIDNSBL and it does clearly state a ‘body’ rule.
>
>Predictable. :)
>
>>Is there some clever way to have a URIDNSBL rule check the header of
>>a message as well? Or is there something else I can use separately
>>that would look up a domainname in the header section of an email?
>
>Nothing comes to mind.
>
>You can obviously use 'full' or the 'all' pseudo-header and look for
>specific domains, but identifying everything in the header that COULD
>be a domain and just testing that against a DNSBL designed for domains
>found in URIs could have very bad failure modes.

Perhaps adding some headers like List-Id: to the body like Subject: ?

--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I don't have lysdexia. The Dog wouldn't allow that.
Re: URIDNSBL full message checking
On 2023-02-07 at 05:07:36 UTC-0500 (Tue, 07 Feb 2023 10:07:36 +0000)
Laurent S. <110ef9e3086d8405c2929e34be5b4340@protonmail.ch>
is rumored to have said:

> You could also use check_rbl_headers

THANK YOU!

I had not recalled that feature when I wrote my reply. I'm glad there
are people here whose brains are younger and less leaky.

The best feature of SpamAssassin is the user community.



--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: URIDNSBL full message checking
> You can test with:
>
> header SURBL_MULTI_HDR eval:check_hashbl_emails('multi.surbl.org',
> 'raw/max=10/shuffle/host', 'ALLFROM/Reply-To', '^127\.0\.0\.\d+$')
> priority SURBL_MULTI_HDR -100
> describe SURBL_MULTI_HDR Domain in email headers found in
> surbl multi

Raymond, thank you! This works.

But I'm having an issue using this with multi.surbl.org and
multi.uribl.com. The response address needs to be bit-masked. The \d+
in 127.0.0.\d+ is in fact a bitmap.

If I want to assign different scores for different entries in their
databases, I'd need to mask the \d+. Is there an easier way to do
this than the following?

header URIBL_BLACK eval:check_hashbl_emails('multi.uribl.com', 'raw/max=10/shuffle/host', 'ALLFROM/Reply-To', '^127\.0\.0\.(2|3|6|7|10|11|14|15|18|19|22|23|26|27|30|31|34|35|38|39|42|43|46|47|50|51|54|55|58|59|62|63|66|67|70|71|74|75|78|79|82|83|86|87|90|91|94|95|98|99|102|103|106|107|110|111|114|115|118|119|122|123|126|127|130|131|134|135|138|139|142|143|146|147|150|151|154|155|158|159|162|163|166|167|170|171|174|175|178|179|182|183|186|187|190|191|194|195|198|199|202|203|206|207|210|211|214|215|218|219|222|223|226|227|230|231|234|235|238|239|242|243|246|247|250|251|254)$')

check_uridnsbl() handles bitmaps with the urirhssub parameter (the "2" below):

urirhssub URIBL_BLACK multi.uribl.com. A 2

Is there something like the mask arg in urirhssub with check_hashbl?
I did have a look at the source of check_hashbl but I couldn't spot it
right off. I get the feeling there's got to be a more straightforward
way than the above!

Michael Grant
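
In case it is useful, the long alternation does not have to be typed by hand. A minimal Python sketch (the helper names are made up, and nothing here is part of SpamAssassin) that enumerates the last-octet values sharing a bit with a given mask and prints the corresponding regex for the last argument of the rule:

```python
import re


def octets_with_bits(mask: int) -> list[int]:
    """Every last-octet value 1..255 that has at least one bit of `mask` set."""
    return [n for n in range(1, 256) if n & mask]


def hashbl_pattern(mask: int) -> str:
    """Regex for 127.0.0.x answers whose last octet shares a bit with `mask`."""
    alternatives = "|".join(str(n) for n in octets_with_bits(mask))
    return r"^127\.0\.0\.(?:" + alternatives + r")$"


pattern = hashbl_pattern(2)
print(pattern)

# Quick self-check of the generated pattern.
assert re.match(pattern, "127.0.0.2")
assert not re.match(pattern, "127.0.0.4")
```

For mask 2 this yields 128 values, starting 2, 3, 6, 7 as in the hand-written list, and it also includes 255, which that list leaves out. Other masks (4, 8, and so on) work the same way.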