Mailing List Archive

Slipping through the cracks
Hi folks,

I've spent a lot of time tuning our spamassassin setup over the
years: channels, RBLs, pyzor, DCC, bayes, KAM rules, some home-spun
rules, etc. Things do work fairly well and the catch rate is very high,
but the ones that get through are the ones designed to get around the
defenses before they are shut down. I get the feeling the scores from
many rules are too low, and I'm looking for the right way to move
forward.

The reason I say this is that I've got a spamtrap account, made up of
several addresses that are heavily targeted by spam lists. These
addresses seem to get the fast flux, rapid zone updates, IP reputation
burns (and other techniques) that are used for initial spam flooding
before the network tests catch up. Once pyzor, DCC, and the RBLs pick
these up, the messages usually score high enough to get flagged for
everyone else, but without the RBLs the scoring is too low to reach the
spam threshold[0]. Of course I "learn" these messages when they come in.

I've been trying to work out which techniques they use so I can come
up with rules that will stop them, but so far it has been hard to come
up with anything manually. I've taken several of the messages that got
through and, a day later, checked them again with network tests: by
then they are all scored very high by the various lists, fuzzers, and
checksums. At delivery time, though, they often don't hit any RBLs at
all, and the ones that do don't hit enough of them to be caught.
In general, if an RBL is hit, the message gets marked as spam, because
most of the time several RBLs fire at once. But if the sender is not on
any RBLs, the regular rules alone don't flag the message as spam.

So, what can I do to tweak these rules to score things up more,
specifically the rules that have a low false positive rate[1]? This
seems like something that should be done programmatically, not
manually. It seems like this may be what 'masscheck' does generically
for all rules for all installations, but can I use that to adjust just
our rules for the particular breed of spam that comes through here?

Thanks for any ideas,
micah


0. with some notable exceptions, like KAM_DMARC_REJECT and
HELO_DYNAMIC_SPLIT_IP

1. KAM_DMARC_STATUS and HTML_NO_CHARSET are possible ones; also, mails
that do not have a To: only get a score of 0.1

--
micah
Re: Slipping through the cracks
On Fri, 19 Jun 2020, micah anderson wrote:

> So, what can I do to tweak these rules to score things up more,
> specifically the rules that provide a low false positive rate[1]. This
> seems something that should be done programmatically, and not
> manually. It seems like what 'masscheck' maybe does generically for all
> rules for all installations, but can I use that to just adjust our rules
> for our particular breed of spam that comes through?

How about: analyze your spamtrap for recent source IP addresses on a
quick schedule (hourly?) and drive a local DNSBL from IPs seen more than
2-3 times in the last 24-48 hours?

Potentially relax it a bit by collecting on /30 or /28 netblocks instead
of individual /32 IP addresses.
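
As a rough illustration of the counting step only, here is a minimal
Python sketch. It assumes the spamtrap deliveries have already been
reduced to (timestamp, source IP) pairs and that everything is IPv4;
the function name, its parameters, and the "3 hits in 48 hours" defaults
are made up for the example, not anything SpamAssassin or a DNSBL
server provides. Publishing the resulting blocks through a local DNSBL
is left out.

```python
from collections import Counter
from datetime import datetime, timedelta
import ipaddress


def netblocks_to_list(sightings, now, window_hours=48, min_hits=3, prefix_len=32):
    """Return IPv4 netblocks seen at least min_hits times inside the window.

    sightings: iterable of (datetime, ip_string) pairs from spamtrap mail.
    prefix_len: 32 counts individual addresses; 30 or 28 counts per netblock.
    All names and defaults here are illustrative assumptions.
    """
    cutoff = now - timedelta(hours=window_hours)
    counts = Counter()
    for seen_at, ip in sightings:
        if seen_at < cutoff:
            continue  # older than the window, ignore
        # strict=False masks off the host bits: 192.0.2.5/30 -> 192.0.2.4/30
        block = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[block] += 1
    return sorted(block for block, hits in counts.items() if hits >= min_hits)


if __name__ == "__main__":
    now = datetime(2020, 6, 19, 12, 0)
    demo = [
        (now - timedelta(hours=1), "192.0.2.5"),
        (now - timedelta(hours=2), "192.0.2.6"),
        (now - timedelta(hours=3), "192.0.2.7"),
    ]
    # Three neighbouring addresses only add up when collected per /30.
    print(netblocks_to_list(demo, now, prefix_len=30))
    print(netblocks_to_list(demo, now, prefix_len=32))
```

Run hourly from cron, something like this would give the list of blocks
to publish; the threshold and window are knobs to tune against your own
trap traffic.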


--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Britain used to be the most powerful empire in the world.
Now they're terrified of pocketknives.
How the mighty have fallen. -- Matt Walsh
-----------------------------------------------------------------------
138 days until the Presidential Election
Re: Slipping through the cracks
John Hardin <jhardin@impsec.org> writes:

> On Fri, 19 Jun 2020, micah anderson wrote:
>
>> So, what can I do to tweak these rules to score things up more,
>> specifically the rules that provide a low false positive rate[1]. This
>> seems something that should be done programmatically, and not
>> manually. It seems like what 'masscheck' maybe does generically for all
>> rules for all installations, but can I use that to just adjust our rules
>> for our particular breed of spam that comes through?
>
> How about: analyze your spamtrap for recent source IP addresses on a
> quick schedule (hourly?) and drive a local DNSBL from IPs seen more than
> 2-3 times in the last 24-48 hours?

Interesting possibility... but if I look at the current batch that made
it through, I see:

1. amazon aws
2. gmail (amusingly saying my amazon prime membership is going to
expire)
3. mailchimp
4. yahoo.com

all of those would not be good to block :(

It's not always like that, but it does happen.

--
micah
Re: Slipping through the cracks
On Fri, 2020-06-19 at 13:54 -0400, micah anderson wrote:

> 2. gmail (amusingly saying my amazon prime membership is going to
> expire)
>
That would make an obvious local rule if you're continuing to see
messages like that, since a Prime expiry notice that's NOT from Amazon
is unlikely to be valid:

Score 5+ if:
- body or subject mention amazon prime
and
- sender and/or Message-ID do not contain a valid Amazon host name.

Remember to keep 2-3 example messages for testing your new rule before
you add it to your live system.
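
In SpamAssassin itself this would be written as a local rule, but the
logic can be modelled and tried against saved example messages first.
The Python sketch below is only that model, under loud assumptions: the
list of "valid Amazon" domains is a placeholder you would have to fill
in yourself, "and/or" is read as "either the sender or the Message-ID
fails the check", and only the Subject and a plain single-part body are
looked at. The function names are made up for the example.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Assumption: which domains count as "a valid Amazon host name" is a local
# decision. This list is illustrative and certainly incomplete.
AMAZON_DOMAINS = ("amazon.com",)
PRIME_RE = re.compile(r"amazon\s+prime", re.IGNORECASE)


def _host_is_amazon(host):
    """True if host equals, or is a subdomain of, a listed Amazon domain."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in AMAZON_DOMAINS)


def looks_like_fake_prime_notice(raw_message):
    """Model of the rule: mentions Amazon Prime but is not from Amazon."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    payload = msg.get_payload()
    # Multipart messages return a list here; this sketch skips their bodies.
    body = payload if isinstance(payload, str) else ""
    if not (PRIME_RE.search(subject) or PRIME_RE.search(body)):
        return False
    sender_host = parseaddr(msg.get("From", ""))[1].rpartition("@")[2]
    msgid_host = msg.get("Message-ID", "").strip().strip("<>").rpartition("@")[2]
    # Fires when either the sender or the Message-ID host is not Amazon.
    return not (_host_is_amazon(sender_host) and _host_is_amazon(msgid_host))


if __name__ == "__main__":
    sample = (
        "From: Someone <someone@gmail.com>\n"
        "Message-ID: <abc@mail.gmail.com>\n"
        "Subject: Your Amazon Prime membership is expiring\n"
        "\n"
        "Renew now.\n"
    )
    print(looks_like_fake_prime_notice(sample))
```

Running the kept example messages, plus a few genuine Amazon mails,
through a check like this shows quickly whether the domain list is too
narrow before anything is scored at 5+.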

Martin
Re: Slipping through the cracks
On Fri, 19 Jun 2020, micah anderson wrote:

> John Hardin <jhardin@impsec.org> writes:
>
>> On Fri, 19 Jun 2020, micah anderson wrote:
>>
>>> So, what can I do to tweak these rules to score things up more,
>>> specifically the rules that provide a low false positive rate[1]. This
>>> seems something that should be done programmatically, and not
>>> manually. It seems like what 'masscheck' maybe does generically for all
>>> rules for all installations, but can I use that to just adjust our rules
>>> for our particular breed of spam that comes through?
>>
>> How about: analyze your spamtrap for recent source IP addresses on a
>> quick schedule (hourly?) and drive a local DNSBL from IPs seen more than
>> 2-3 times in the last 24-48 hours?
>
> Interesting possibility... but if I look at the current batch that made
> it through, I see:
>
> 1. amazon aws
> 2. gmail (amusingly saying my amazon prime membership is going to
> expire)
> 3. mailchimp
> 4. yahoo.com
>
> all of those would not be good to block :(

Amazon AWS, if not using a "real" (non-AWS) domain name, might be safe
to reject - there's been some discussion about that on the list lately.

> It's not always like that, but it does happen.

Hm. Perhaps you'd need whitelists too, to avoid some known mixed sources.
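
A whitelist pass could be as simple as dropping any candidate block
that overlaps a network you never want to list. A minimal Python
sketch, assuming candidates and whitelist entries are plain CIDR
strings; the function name is made up, and the addresses in the demo
are documentation ranges standing in for real provider ranges, which
you would have to look up yourself.

```python
import ipaddress


def drop_whitelisted(candidates, whitelist):
    """Return the candidate CIDR strings that overlap no whitelisted network."""
    allowed = [ipaddress.ip_network(net) for net in whitelist]
    kept = []
    for cand in candidates:
        # strict=False tolerates entries written with host bits set
        block = ipaddress.ip_network(cand, strict=False)
        if not any(block.overlaps(net) for net in allowed):
            kept.append(cand)
    return kept


if __name__ == "__main__":
    # Placeholder ranges only; not real mail-provider networks.
    whitelist = ["192.0.2.0/24"]
    candidates = ["192.0.2.4/30", "203.0.113.9/32"]
    print(drop_whitelisted(candidates, whitelist))
```

The first candidate falls inside the whitelisted /24 and is dropped; the
second is kept.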


--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/