Mailing List Archive: using spamassassin to classify spam

mgrant at grant

Mar 24, 2022, 4:00 PM

I would like to write a rule that checks if a header has a domain name that doesn't resolve.

For example this header:

List-Unsubscribe: <mailto:leave-abcefgh.1.2.3.4@mumble.aidemxwzlwt.bwbibibi.edu>

I want to extract the mumble.aidemxwzlwt.bwbibibi.edu and run it
through AskDNS and if I get an NXDOMAIN, I want to score it.

Is it possible to do this within a cf file?

I can easily extract the domain name with a regex. Is there a way to
save that value in a variable in a cf file such that I can then call
AskDNS?
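
The extraction step described above could look roughly like the following. This is a minimal sketch in Python, used purely for illustration; it is not SpamAssassin rule syntax, and the pattern and the function name unsubscribe_domain are assumptions, not part of any plugin.

```python
import re

# Illustrative pattern (an assumption): capture what follows "@" in a
# mailto: target, stopping at ">", whitespace, "," or a "?" that starts
# mailto parameters.
_MAILTO_DOMAIN = re.compile(r"mailto:[^@>\s]+@([^>\s?,]+)", re.IGNORECASE)


def unsubscribe_domain(header_value):
    """Return the lowercased domain of the first mailto: target, or None."""
    match = _MAILTO_DOMAIN.search(header_value)
    if match is None:
        return None
    # Drop a trailing root dot and normalise case before any DNS lookup.
    return match.group(1).rstrip(".").lower()


header = "<mailto:leave-abcefgh.1.2.3.4@mumble.aidemxwzlwt.bwbibibi.edu>"
print(unsubscribe_domain(header))  # mumble.aidemxwzlwt.bwbibibi.edu
```

A header that only carries an https: unsubscribe link yields None here, so the caller has to decide whether that case should be skipped or handled separately.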

Re: using spamassassin to classify spam

gtaylor at tnetconsulting

Mar 24, 2022, 5:34 PM

On 3/24/22 5:00 PM, Michael Grant wrote:
> List-Unsubscribe:
> <mailto:leave-abcefgh.1.2.3.4@mumble.aidemxwzlwt.bwbibibi.edu>
>
> I want to extract the mumble.aidemxwzlwt.bwbibibi.edu and run it
> through AskDNS and if I get an NXDOMAIN, I want to score it.

Remember, there are historic mechanisms for an MX for parent domains to
handle child domains even if the child domain in question doesn't have
its own MX record.

I don't recall the current state of support for this, so don't rely on
it without testing it.

> Is it possible to do this within a cf file?

I don't know. Someone else with more knowledge of SpamAssassin will
need to speak to this.

--
Grant. . . .
unix || die

Re: using spamassassin to classify spam

uhlar at fantomas

Mar 25, 2022, 1:57 AM

>On 3/24/22 5:00 PM, Michael Grant wrote:
>>List-Unsubscribe:
>><mailto:leave-abcefgh.1.2.3.4@mumble.aidemxwzlwt.bwbibibi.edu>
>>
>>I want to extract the mumble.aidemxwzlwt.bwbibibi.edu and run it
>>through AskDNS and if I get an NXDOMAIN, I want to score it.

On 24.03.22 18:34, Grant Taylor wrote:
>Remember, there are historic mechanisms for an MX for parent domains
>to handle child domains even if the child domain in question doesn't
>have its own MX record.

Which ones, besides wildcard DNS?

OP, also remember that mumble.aidemxwzlwt.bwbibibi.edu may have no A/MX
record and still not produce NXDOMAIN.
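
To make that distinction concrete, here is a small sketch in Python (for illustration only, not SpamAssassin code). It assumes the caller already has the numeric DNS response code and the number of answer records; the function name classify_response and the returned labels are hypothetical.

```python
NOERROR = 0   # DNS RCODE: the query succeeded
NXDOMAIN = 3  # DNS RCODE: the queried name does not exist


def classify_response(rcode, answer_count):
    """Label a DNS response so NXDOMAIN and "exists but empty" stay distinct."""
    if rcode == NXDOMAIN:
        return "nxdomain"
    if rcode == NOERROR and answer_count == 0:
        # The name exists, but has no record of the queried type.
        return "nodata"
    if rcode == NOERROR:
        return "answer"
    return "error"
```

A rule that only fires on NXDOMAIN will therefore miss names that exist but have neither MX nor A records.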

>I don't recall the current state of support for this, so don't rely on
>it without testing it.
>
>>Is it possible to do this within a cf file?
>
>I don't know. Someone else with more knowledge of SpamAssassin will
>need to speak to this.

--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
"Two words: Windows survives." - Craig Mundie, Microsoft senior strategist
"So does syphillis. Good thing we have penicillin." - Matthew Alton

Re: using spamassassin to classify spam

mgrant at grant

Mar 25, 2022, 3:01 AM

> On 24.03.22 18:34, Grant Taylor wrote:
> > Remember, there are historic mechanisms for an MX for parent domains to
> > handle child domains even if the child domain in question doesn't have
> > its own MX record.
>
> which, besides wildcard DNS?
>
> OP, also remember that mumble.aidemxwzlwt.bwbibibi.edu may have no A/MX
> record and still not produce NXDOMAIN.
>

Right, good points! So for each subdomain there, do an MX and A
record lookup, and stop before getting to the TLD itself (.edu). Of
course, there has got to be a list somewhere of what the TLDs and gTLDs are.
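
The walk up the name could be sketched like this, in Python for illustration only. The suffix set below is a hand-written stand-in and an assumption; a real check would more likely load a maintained list such as the Public Suffix List. The function name candidate_domains is hypothetical.

```python
def candidate_domains(hostname, public_suffixes):
    """Yield hostname and each parent domain, stopping before a public suffix."""
    labels = hostname.rstrip(".").lower().split(".")
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in public_suffixes:
            # Reached the TLD (or another registry suffix): stop here.
            break
        yield candidate


# Hypothetical, hand-written suffix set used only for this example.
SUFFIXES = {"edu", "com", "uk", "co.uk"}

for name in candidate_domains("mumble.aidemxwzlwt.bwbibibi.edu", SUFFIXES):
    print(name)
# mumble.aidemxwzlwt.bwbibibi.edu
# aidemxwzlwt.bwbibibi.edu
# bwbibibi.edu
```

Each yielded name would then get its own MX and A lookup.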

Unless there's an existing function in some plugin to do this, I'll
have to write my own. It's a little surprising that there isn't; this
seems like an obvious check!

However, my question still has another part regarding doing this in a
cf file like local.cf. If it were simply 2 lines in a local.cf to do
this, I'd rather do it there than cobble together a plugin which is
another order of magnitude more complicated.

I have seen things like _VARIABLE_ in .cf files, and they seem to get
there from the Perl side by doing something like this:

$pms->set_tag('VARIABLE', $value);

I was wondering if there was a way to set such a variable from the
output of something within the cf side.

This does not work:

header LIST_UNSUB_DOM List-Unsubscribe =~ /\@(.+)/ VARIABLE
askdns LIST_UNSUB_DOM _VARIABLE_

Even if askdns did do the correct thing, I couldn't find a way to pass
it the domain name to look up. Is there a syntax to do this in the
context of a cf file such as local.cf? Sure seems like it'd be
useful!

Otherwise, I'll have to write a plugin but it seems a shame.

Michael Grant

p.s. The subject of my original post really could have been clearer!
I'm classifying spam by putting things into buckets with certain
attributes, and this is one of them: things with domain names that are
bogus. But it's still just SpamAssassin. Sorry about any confusion!

Re: using spamassassin to classify spam

Mar 25, 2022, 3:17 AM

On 2022-03-25 11:01, Michael Grant wrote:

> Otherwise, I'll have to write a plugin but it seems a shame.

If you are good at writing plugins, why is it a shame then?

I can only confirm that writing rules for SpamAssassin is not for
beginners, but even trying it in rspamd is a waste and harder, unless
it is clear what SpamAssassin and rspamd are missing.

What is your input data? What is the wanted output data?

Re: using spamassassin to classify spam

Mar 25, 2022, 5:27 AM

On Fri, Mar 25, 2022 at 06:01:43AM -0400, Michael Grant wrote:
>
> Unless there's an existing function in some plugin to do this, I'll
> have to write my own. Little surprising that there isn't, this seems
> like an obvious check!

Very basic HEADER() template support has already been added in
trunk/4.0.0; this would generally work:

askdns UNSUB_NXDOMAIN _HEADER(List-Unsubscribe:host)_ MX [NXDOMAIN]

It just tries to find something resembling a hostname (having a valid TLD) in
the header, preferring to match @(.*) first. So it doesn't differentiate
between http, mailto, etc.

Re: using spamassassin to classify spam

Mar 25, 2022, 6:34 AM

On 2022-03-25 13:27, Henrik K wrote:
> On Fri, Mar 25, 2022 at 06:01:43AM -0400, Michael Grant wrote:
>>
>> Unless there's an existing function in some plugin to do this, I'll
>> have to write my own. Little surprising that there isn't, this seems
>> like an obvious check!
>
> There is already very basic HEADER() template support added in
> trunk/4.0.0, this would generally work:
>
> askdns UNSUB_NXDOMAIN _HEADER(List-Unsubscribe:host)_ MX [NXDOMAIN]
>
> It just tries to find something resembling a hostname (having valid
> TLD) in the header, preferring to match @(.*) first. So it doesn't
> differentiate between http, mailto etc.

Does MX here include A/AAAA?

It's a common mistake in many places: if there is no MX for a domain
name, most checkers say the domain handles no mail, ignoring A/AAAA,
doh :=)

Re: using spamassassin to classify spam

uhlar at fantomas

Mar 25, 2022, 6:41 AM

>>On Fri, Mar 25, 2022 at 06:01:43AM -0400, Michael Grant wrote:
>>>Unless there's an existing function in some plugin to do this, I'll
>>>have to write my own. Little surprising that there isn't, this seems
>>>like an obvious check!

>On 2022-03-25 13:27, Henrik K wrote:
>>There is already very basic HEADER() template support added in
>>trunk/4.0.0, this would generally work:
>>
>>askdns UNSUB_NXDOMAIN _HEADER(List-Unsubscribe:host)_ MX [NXDOMAIN]
>>
>>It just tries to find something resembling a hostname (having valid
>>TLD) in the header, preferring to match @(.*) first. So it doesn't
>>differentiate between http, mailto etc.

On 25.03.22 14:34, Benny Pedersen wrote:
>is MX here include A/AAAA ?
>
>common mistake many places, if there is no MX on a domainname, then
>most checkers say domain send no mail from A/AAAA, doh :=)

Simply put, mail is deliverable if:
- an MX exists and its destination is not "."
- or an A/AAAA record exists.

That would require a plugin or a few meta rules.
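
The rule above can be written down as a small decision function. This is only a sketch in Python over DNS data that has already been fetched, not SpamAssassin plugin code. The function name is_deliverable and its parameters are hypothetical, and it assumes one particular reading of the rule: address records are consulted only when the domain has no MX records at all, so a lone "." MX (the null MX convention) means no mail even if an A record exists.

```python
def is_deliverable(mx_targets, has_address_record):
    """Apply the deliverability rule to already-fetched DNS data.

    mx_targets: list of MX exchange names for the domain (empty if none).
    has_address_record: True if the domain has an A or AAAA record.
    """
    if mx_targets:
        # A target of "." is the null MX convention: no mail is accepted.
        return any(target.rstrip(".") != "" for target in mx_targets)
    # Assumption: A/AAAA only counts as a fallback when no MX exists.
    return has_address_record


print(is_deliverable(["mail.example.com."], False))  # True
print(is_deliverable(["."], True))                   # False
print(is_deliverable([], True))                      # True
```

Under the more literal reading of the two bullets, the second call would return True instead; which reading is wanted is a policy choice for whoever writes the rule.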

--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
10 GOTO 10 : REM (C) Bill Gates 1998, All Rights Reserved!

Re: using spamassassin to classify spam

martin at gregorie

Mar 25, 2022, 6:45 AM

On Thu, 2022-03-24 at 18:34 -0600, Grant Taylor wrote:
> On 3/24/22 5:00 PM, Michael Grant wrote:
> > List-Unsubscribe:
> > <mailto:leave-abcefgh.1.2.3.4@mumble.aidemxwzlwt.bwbibibi.edu>
> >
> > I want to extract the mumble.aidemxwzlwt.bwbibibi.edu and run it
> > through AskDNS and if I get an NXDOMAIN, I want to score it.
>
> Remember, there are historic mechanisms for an MX for parent domains
> to handle child domains even if the child domain in question doesn't
> have its own MX record.
>
> I don't recall the current state of support for this, so don't rely on
> it without testing it.
>
> > Is it possible to do this within a cf file?
>
Yes. You'll need to write a Perl plugin and a rule to trigger it. The
rule should extract the domain name from the 'mailto:' string and pass
it to the Perl plugin, which in turn calls AskDNS with the string as a
parameter and either returns a positive score or zero depending on
whether AskDNS returned NXDOMAIN or not.

It's all simple enough and requires only a few lines of Perl. I haven't
needed a plugin to do what you want, but did write one that searches a
PostgreSQL database and whitelists e-mail from anybody that I've
previously sent mail to.

Get a copy of the 'Camel' book if you don't have one ("Programming Perl"
by Wall, Christiansen & Orwant, pub: O'Reilly).

The requirements for writing plugins are on the SA website.

Martin

Re: using spamassassin to classify spam

mgrant at grant

Mar 25, 2022, 8:29 AM

On Fri, Mar 25, 2022 at 02:27:09PM +0200, Henrik K wrote:
> On Fri, Mar 25, 2022 at 06:01:43AM -0400, Michael Grant wrote:
> >
> > Unless there's an existing function in some plugin to do this, I'll
> > have to write my own. Little surprising that there isn't, this seems
> > like an obvious check!
>
> There is already very basic HEADER() template support added in trunk/4.0.0,
> this would generally work:
>
> askdns UNSUB_NXDOMAIN _HEADER(List-Unsubscribe:host)_ MX [NXDOMAIN]
>
> It just tries to find something resembling a hostname (having valid TLD) in
> the header, preferring to match @(.*) first. So it doesn't differentiate
> between http, mailto etc.

Fantastic, thank you!

I'm trying to test this with the Debian experimental 4.0.0~0.0svn1896439-1
package.

Running an email through this version seems to be working (as in
spamassassin < test.eml). However, when I test just a narrow set of
rules in my own cf file, I get this:

$ spamassassin -t -C test.cf < tests/test1.eml
config: no rules were found! Do you need to run 'sa-update'? at /usr/bin/spamassassin line 417.

This works fine on SpamAssassin 3.x, by the way.

I have tried reducing test.cf to something simple, for example:

full DKIM_SIGNED eval:check_dkim_signed()
describe DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid
score DKIM_SIGNED 5.0

To be clear, I really don't want to run all the tests for this, only
specific ones, which is why I tried the -C option, which works
with 3.x. Is there a correct way to do this with 4.x?

Michael Grant