Mailing List Archive

New grubby med spammer sneaking through
There is a new guy that a trained Bayes will catch, but no current rules.
It looks like a simple body rule looking for his href tag will catch the
ones I've seen so far. He has a pattern that will let more complex rules
match on the body if needed.

Below is the cheap rule for him, plus some pattern development in case he
changes the URL text and someone needs to make a more complex rule.

Cheap body rule:

body LW_SELCYCDC /http\:\/\/selcydc\.com/


Possible body rules to catch his stuff if needed:

/, (?:looking|searching) for a (?:place|site) to get medi\w+\?/
/(?:Quality|Cheap) Viag\w+ and Cial\w+\./
/Best deals, 80 p[rcen]+t off!/
/We (?:are able to ship|can send)[\w\s]+wide/
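
For anyone who wants to sanity-check those patterns before wiring them up,
here is a minimal sketch in plain Python (standard library only); the rule
names and the sample body below are made up for illustration, not taken from
a real spam:

import re

# Candidate patterns from the post above; the rule names are invented for illustration.
CANDIDATE_RULES = {
    "LW_MED_QUESTION": r", (?:looking|searching) for a (?:place|site) to get medi\w+\?",
    "LW_MED_BRANDS":   r"(?:Quality|Cheap) Viag\w+ and Cial\w+\.",
    "LW_MED_DISCOUNT": r"Best deals, 80 p[rcen]+t off!",
    "LW_MED_SHIPPING": r"We (?:are able to ship|can send)[\w\s]+wide",
}

def matching_rules(body):
    """Return the names of the candidate rules whose pattern is found in the body text."""
    return [name for name, pattern in CANDIDATE_RULES.items()
            if re.search(pattern, body)]

if __name__ == "__main__":
    # A made-up sample body, purely to exercise the patterns.
    sample = ("Hi, looking for a place to get medications? "
              "Quality Viagra and Cialis. We can send world wide.")
    print(matching_rules(sample))   # -> the question, brands and shipping rules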


Loren
Re: New grubby med spammer sneaking through [ In reply to ]
Would it be possible to do this kind of URL blacklisting via a DNS-based
lookup service?

Has anyone discussed it? Maybe the likes of spamcop.net would be in a good
position to extract the webserver IPs that are hosting spamvertised content
and list their IPs in a reverse blacklist zone.

Just a thought.

Rob

Loren Wilton wrote:
> There is a new guy that a trained Bayes will catch, but no current rules.
> It looks like a simple body rule looking for his href tag will catch the
> ones I've seen so far. He has a pattern that will let more complex rules
> match on the body if needed.
>
> Below is the cheap rule for him, and pattern development if he changes the
> url text and someone needs to make a more complex rule.
>
> Cheap body rule:
>
> body LW_SELCYCDC /http\:\/\/selcydc\.com/
>
>
> Possible body rules to catch his stuff if needed:
>
> /, (?:looking|searching) for a (?:place|site) to get medi\w+\?/
> /(?:Quality|Cheap) Viag\w+ and Cial\w+\./
> /Best deals, 80 p[rcen]+t off!/
> /We (?:are able to ship|can send)[\w\s]+wide/
>
>
> Loren
>
>


--
Robert Brooks, Network Manager, Hyperlink Interactive Ltd
<robb@hyperlink-interactive.co.uk> http://hyperlink-interactive.co.uk/
Tel: +44 (0)20 7240 8121 Fax: +44 (0)20 7240 8098
- Help Microsoft stamp out piracy. Give Linux to a friend today! -
RE: New grubby med spammer sneaking through [ In reply to ]
I was the loudest voice screaming for this. SA devs opened a bug on it.
(Wish I could find it again!) Then I realised the 1 SERIOUS flaw with it.

With a DNSRBL and email there is one sender to check. With URLs there is no
limit to how many one could put in a spam. A spammer could simply flood the
spam with good and bad URLs. This would cause a timeout and simply skip the
test. You could limit the number of URLs, but the spammers would simply add
that many good ones in.

I've even spoken to RBL hosts. Some fear what this would do to their
servers. The number of lookups would skyrocket.

So the only thing I can think of is a local RBL. :(
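
To make the flooding problem concrete, here is a minimal sketch in plain
Python of the naive approach; the cap of 20 and the lookup callable are
placeholders, not anything SA actually does. Because the budget is spent on
the first domains found, a spammer who front-loads the message with that many
harmless domains pushes the spamvertised one past the cut-off:

import re

URL_RE = re.compile(r"""https?://([^/\s"'>]+)""", re.IGNORECASE)

def domains_in(body):
    """Extract unique host names from http/https URLs, in order of appearance."""
    seen = []
    for host in URL_RE.findall(body):
        host = host.lower().split(":")[0]   # drop any :port
        if host not in seen:
            seen.append(host)
    return seen

def naive_uri_check(body, lookup, budget=20):
    """Check at most `budget` distinct domains against `lookup`, a callable
    that returns True for a blacklisted domain."""
    for host in domains_in(body)[:budget]:
        if lookup(host):
            return True
    return False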

--Chris

> -----Original Message-----
> From: Robert Brooks [mailto:robb@hyperlink-interactive.co.uk]
> Sent: Friday, March 05, 2004 8:39 AM
> To: SpamAssassin Mailing List
> Subject: Re: New grubby med spammer sneaking through
>
>
> Would it be possible to do this kind of url blacklisting via
> a dns based lookup
> service?
>
> Has anyone discussed it? Maybe the likes of spamcop.net
> would be in a good
> position to extract the webserver ips that are hosting
> spamvertised content and
> list their ips in a reverse blacklist zone.
>
> Just a thought.
>
> Rob
>
> Loren Wilton wrote:
> > There is a new guy that a trained Bayes will catch, but no
> current rules.
> > It looks like a simple body rule looking for his href tag
> will catch the
> > ones I've seen so far. He has a pattern that will let more
> complex rules
> > match on the body if needed.
> >
> > Below is the cheap rule for him, and pattern development if
> he changes the
> > url text and someone needs to make a more complex rule.
> >
> > Cheap body rule:
> >
> > body LW_SELCYCDC /http\:\/\/selcydc\.com/
> >
> >
> > Possible body rules to catch his stuff if needed:
> >
> > /, (?:looking|searching) for a (?:place|site) to get medi\w+\?/
> > /(?:Quality|Cheap) Viag\w+ and Cial\w+\./
> > /Best deals, 80 p[rcen]+t off!/
> > /We (?:are able to ship|can send)[\w\s]+wide/
> >
> >
> > Loren
> >
> >
>
>
> --
> Robert Brooks, Network Manager, Hyperlink Interactive Ltd
> <robb@hyperlink-interactive.co.uk> http://hyperlink-interactive.co.uk/
> Tel: +44 (0)20 7240 8121 Fax: +44 (0)20 7240 8098
> - Help Microsoft stamp out piracy. Give Linux to a friend today! -
>
Re: New grubby med spammer sneaking through [ In reply to ]
From: "Robert Brooks" <robb@hyperlink-interactive.co.uk>
| Has anyone discussed it? Maybe the likes of spamcop.net would be in a good
| position to extract the webserver ips that are hosting spamvertised content and
| list their ips in a reverse blacklist zone.
|

You might look at
list.dsbl.org:
sbl.spamhaus.org:
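
For reference, the usual way to query IP-based zones like those two is to
reverse the octets of the address, append the zone name, and do an ordinary
A-record lookup; any answer means "listed", NXDOMAIN means "not listed".
A minimal sketch in plain Python (the 192.0.2.1 address is only a
documentation placeholder):

import socket

def dnsbl_listed(ip, zone):
    """Return True if `ip` is listed in the DNSBL `zone` (e.g. sbl.spamhaus.org)."""
    query = ".".join(reversed(ip.split("."))) + "." + zone
    try:
        socket.gethostbyname(query)   # any A record in the zone means "listed"
        return True
    except socket.gaierror:           # NXDOMAIN (or other failure): not listed
        return False

# e.g. dnsbl_listed("192.0.2.1", "sbl.spamhaus.org")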

Spamcop is full of FPs (false positives).

I know of some Canadian friends that blacklist
each other for kicks (it just recently happened,
during a weekend party session). Spamcop
just eats it up.

Also, people that forgot they subscribed to lists
use Spamcop to stop mail instead of unsubscribing
(first-hand knowledge); again, Spamcop eats it up.

my 2cents

Greg
Re: New grubby med spammer sneaking through [ In reply to ]
Greg Cirino - Cirelle Enterprises wrote:
> From: "Robert Brooks" <robb@hyperlink-interactive.co.uk>
> | Has anyone discussed it? Maybe the likes of spamcop.net would be in a good
> | position to extract the webserver ips that are hosting spamvertised content and
> | list their ips in a reverse blacklist zone.
> |
>
> You might look at
> list.dsbl.org:
> sbl.spamhaus.org:

Yes, I use these in my mailserver config, but as I understand it these list
the IP addresses of suspect mail servers, not webhosts.

What I'm talking about is the webservers, such that when some link
http://www.example.com/foo turns up in an email, SA can look up the host
and see if it's being used in other spamvertisements.

Regards,

Rob

--
Robert Brooks, Network Manager, Hyperlink Interactive Ltd
<robb@hyperlink-interactive.co.uk> http://hyperlink-interactive.co.uk/
Tel: +44 (0)20 7240 8121 Fax: +44 (0)20 7240 8098
- Help Microsoft stamp out piracy. Give Linux to a friend today! -
Re: New grubby med spammer sneaking through [ In reply to ]
From: "Robert Brooks" <robb@hyperlink-interactive.co.uk>
| What I'm talking about is the webservers such that when some link
| http://www.example.com/foo turns up in an email then SA can look up the host and
| see if it's being used in other spamvertisements.

That's different, never mind.

If I now understand correctly, you want a body search for URL-looking
text, and then a lookup on the URLs found?

If that is the case, I think Chris' Big Evil list is as close as you might
come for the time being. Though I may be wrong, not being all-knowing.

Regards
Greg
Re: New grubby med spammer sneaking through [ In reply to ]
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


"Greg Cirino - Cirelle Enterprises" writes:
>From: "Robert Brooks" <robb@hyperlink-interactive.co.uk>
>| What I'm talking about is the webservers such that when some link
>| http://www.example.com/foo turns up in an email then SA can look up
>| the host and see if it's being used in other spamvertisements.
>
>That's different, never mind
>
>If I Now understand correctly, you want a body search for url looking
>text and do a lookup on found urls?

This is already in SVN trunk -- the URIBL plugin. It works *great*.

Regarding avoiding DoS attacks with hundreds of URLs -- it selects
up to 20 at random and tests just those, and we have another rule
which matches messages with too many URLs, and that is getting
good results.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFASLrBQTcbUG5Y7woRAhFXAJwKBAZJRmmdVwhzYalxOnsC0aMnoACgpFIo
7DHC7yBQPcgbkHl1BL1AoQY=
=3SUt
-----END PGP SIGNATURE-----
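
For anyone curious what that looks like, here is a rough sketch in plain
Python; the zone name uribl.example.net is a placeholder, not the zone the
plugin actually queries, and the real plugin very likely does more (for
instance trimming hosts down to registered domains), which this skips. The
idea is simply: pull the hosts out of the body, sample at most 20 at random,
and query each against the list the same way an ordinary DNSBL is queried:

import random
import re
import socket

URL_RE = re.compile(r"""https?://([^/\s"'>]+)""", re.IGNORECASE)

def uribl_hits(body, zone="uribl.example.net", max_lookups=20):
    """Return the hosts from `body` listed in the (placeholder) URI blacklist `zone`."""
    hosts = sorted({h.lower().split(":")[0] for h in URL_RE.findall(body)})
    if len(hosts) > max_lookups:            # cap the work a URL-flooded message can cause
        hosts = random.sample(hosts, max_lookups)
    hits = []
    for host in hosts:
        try:
            socket.gethostbyname("%s.%s" % (host, zone))   # an answer means "listed"
            hits.append(host)
        except socket.gaierror:                            # NXDOMAIN: not listed
            pass
    return hits

The random sample means a spammer cannot know in advance which 20 of his
padded URLs will be checked, which is what defeats the flooding trick Chris
describes.
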
Re: New grubby med spammer sneaking through [ In reply to ]
Greg Cirino - Cirelle Enterprises wrote:
> That's different, never mind
>
> If I Now understand correctly, you want a body search for url looking
> text and do a lookup on found urls?

exactly

> If that is the case, I think Chris' Big Evil list is as close as you might
> come for the time being. Though I may be wrong, not being all knowing.

Yes, this I have in my SA setup.

My thinking is that URLs are often a lot easier to discard than IP addresses,
and DNS is possibly a relatively easy way to distribute this information to SA
clients.

It also has the advantage of being faster than daily file updates, and it can
be cached. It may also encourage webhosts to boot spammy businesses off their
webservers.

Rob

--
Robert Brooks, Network Manager, Hyperlink Interactive Ltd
<robb@hyperlink-interactive.co.uk> http://hyperlink-interactive.co.uk/
Tel: +44 (0)20 7240 8121 Fax: +44 (0)20 7240 8098
- Help Microsoft stamp out piracy. Give Linux to a friend today! -
Re: New grubby med spammer sneaking through [ In reply to ]
Justin Mason wrote:
> This is already in SVN trunk -- the URIBL plugin. It works *great*.

Excellent -- another great idea someone else got to first. I'll get back to
my perpetual motion machine plans then :-)

--
Robert Brooks, Network Manager, Hyperlink Interactive Ltd
<robb@hyperlink-interactive.co.uk> http://hyperlink-interactive.co.uk/
Tel: +44 (0)20 7240 8121 Fax: +44 (0)20 7240 8098
- Help Microsoft stamp out piracy. Give Linux to a friend today! -
Re: New grubby med spammer sneaking through [ In reply to ]
On Friday 05 March 2004 01:39 pm, Robert Brooks wrote:
> Would it be possible to do this kind of url blacklisting via a dns based
> lookup service?

This has already been implemented in the development version of SA, and will
be released in version 3.0.0.

--
Give a man a match, and he'll be warm for a minute, but set him on
fire, and he'll be warm for the rest of his life.

Advanced SPAM filtering software: http://spamassassin.org
RE: New grubby med spammer sneaking through [ In reply to ]
On Fri, 5 Mar 2004, Chris Santerre wrote:

> I was the loudest voice screaming for this. SA devs opened a bug on it.
> (Wish I could find it again!) Then I realised the 1 SERIOUS flaw with it.
>
> With a DNSRBL and email there is one sender to check. With URLs there is no
> limit to how many one could put in a spam. A spammer could simply flood the
> spam with good and bad URLs. This would cause a timeout and simply skip the
> test. You could limit the number of URLs, but the spammers would simply add
> that many good ones in.
>
> I've even spoken to RBL hosts. Some fear what this would do to their
> servers. The number of lookups would skyrocket.
>
> So the only thing I can think of is a local RBL. :(
>
> --Chris

It may be a problem, but probably not. When you do -any- DNS lookup
your local DNS server will cache the answer and locally hand it out on
repeated lookups. So if that spam has 20 URLs there would be 1 hit
against the remote DNSRBL server and 19 local repeats,
unless the spammer registers 20 different domains and puts them all
into their spam. But then all 20 answers would be cached for the
next instance of that spam.
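
The caching effect Dave describes happens in the local name server, but the
arithmetic is easy to sketch in-process; here functools.lru_cache just stands
in for the resolver's cache, and the name being looked up is a made-up
placeholder:

from functools import lru_cache

REMOTE_QUERIES = 0

@lru_cache(maxsize=None)
def cached_lookup(name):
    """Stand-in for a cached DNS query: only the first lookup of a name is 'remote'."""
    global REMOTE_QUERIES
    REMOTE_QUERIES += 1
    return None                  # placeholder answer

# Twenty URLs that all point at the same spamvertised domain cost one remote query:
for _ in range(20):
    cached_lookup("selcydc.com.rbl.example.net")
print(REMOTE_QUERIES)            # -> 1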

One other way to approach this question: use either 'whois' or DNS-NS
records as a spam indicator.
Spammers can register bunches of domain names but each name will cost
them something, so they tend to look for registrars that offer bulk
discounts. I'll bet that if you check the registration info for many of
those spamdomains, you'll find that they're registered with a few
specific sites (several of those being in China).

Only problem is that 'whois' lookups tend to be slow and can time out.
DNS-NS record lookups are quick and easy. Often legit businesses will
have their own DNS servers or use their ISP's. The bulk registration
customers (spammers) are more likely to use the DNS services provided
by the bulk registrars, rather than bothering with finding their own.

I'll bet that if you check the DNS-NS records for all those spam-domains
you see in those fake-meds spams, you'll find that the majority of them
are listed as one of a half-dozen specific servers.

EG: nicsimple.com, domain2004.com, arxcom.com, xinnet.cn, namelite.com,
nsfornothing.biz

Another spam indication would be to look at the number of NS records
registered for a given domain. A legit business cares about the
reliability of its net presence and will register 2 or more DNS servers.
Spam-domains will sometimes have only 1 DNS server of record.
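
Both heuristics are easy to experiment with; below is a minimal sketch in
plain Python that shells out to the 'dig' tool (which has to be installed),
fetches the NS records for a domain, and flags it when it has fewer than two
name servers or when any of them sit under one of the providers named above.
The 'suspect' set is just the examples from this post, not a vetted list:

import subprocess

# Example name-server providers from the post above -- illustrative only, not a vetted list.
SUSPECT_NS = ("nicsimple.com", "domain2004.com", "arxcom.com",
              "xinnet.cn", "namelite.com", "nsfornothing.biz")

def ns_records(domain):
    """Return the NS host names for `domain`, using 'dig +short'."""
    out = subprocess.run(["dig", "+short", domain, "NS"],
                         capture_output=True, text=True, timeout=10)
    return [line.rstrip(".").lower() for line in out.stdout.splitlines() if line.strip()]

def looks_spammy(domain):
    """Apply the two heuristics: too few name servers, or name servers at a suspect provider."""
    servers = ns_records(domain)
    too_few = len(servers) < 2
    suspect = any(ns.endswith(provider) for ns in servers for provider in SUSPECT_NS)
    return too_few or suspect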

--
Dave Funk University of Iowa
<dbfunk (at) engineering.uiowa.edu> College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
RE: New grubby med spammer sneaking through [ In reply to ]
On Fri, 5 Mar 2004, David B Funk wrote:

> On Fri, 5 Mar 2004, Chris Santerre wrote:
>
> > I was the loudest voice screaming for this. SA devs opened a bug on it.
> > (Wish I could find it again!) Then I realised the 1 SERIOUS flaw with it.
> >
> > With a DNSRBL and email there is one sender to check. With URLs there is no
> > limit to how many one could put in a spam. A spammer could simply flood the
> > spam with good and bad URLs. This would cause a timeout and simply skip the
> > test. You could limit the number of URLs, but the spammers would simply add
> > that many good ones in.

[ edit ]

> One other way to approach this question, use either 'whois' or DNS-NS
> records as a spam indicator.
> Spammers can register bunches of domain names but each name will cost
> them something, so they tend to look for registrars that offer bulk
> discounts. I'll bet that if you check the registration info for many of
> those spamdomains, you'll find that they're registered with a few
> specific sites (several of those being in China).

That's an interesting hypothesis, and one I just put to a quick-and-dirty
(*very* quick-and-dirty) test. I used this week's corpus of SA-identified
spam, dug out all the URLs I could find, and passed the host names through
a "dig $hostname ANY | grep '<TAB>NS'". There is _some_ commonality of NS
records but not a whole lot that I can find.

The winners are "pharm45454dns.info" (try telling me that *that* isn't
a spam domain), "network-dns.biz", and "name2004.com". "THEBESTMAIL.US"
gets an honourable mention.

Most interesting, however, is the number of hostnames that _didn't_
return anything other than an NXDOMAIN error or a timeout. I think
that _those_ can be used effectively as scoring hits.

> Another spam indication would be to look at the number of NS records
> registered for a given domain. A legit business cares about the
> reliability of its net presence and will register 2 or more DNS servers.
> Spam-domains will sometimes have only 1 DNS server of record.

That didn't seem to be a good indicator. I only saw four domains
that had a single NS record out of 47 lookups (of the 47, 12 returned
timeouts); the rest returned two or more.
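
For anyone who wants to repeat the quick-and-dirty test, here is a small
sketch in plain Python (again shelling out to 'dig'; the hosts.txt file name
and the 10-second timeout are arbitrary) that reports, for each host name in
a list, the DNS status and the number of NS records it returned:

import subprocess

def dig_ns(hostname):
    """Return (status, ns_count) for one host, parsed from full 'dig' output."""
    try:
        out = subprocess.run(["dig", hostname, "NS"],
                             capture_output=True, text=True, timeout=10).stdout
    except subprocess.TimeoutExpired:
        return "timeout", 0
    status = "unknown"
    for line in out.splitlines():
        if "status:" in line:                  # e.g. "... status: NXDOMAIN, id: ..."
            status = line.split("status:")[1].split(",")[0].strip()
            break
    ns_count = sum(1 for line in out.splitlines()
                   if "\tNS\t" in line and not line.startswith(";"))
    return status, ns_count

if __name__ == "__main__":
    # hosts.txt: one spamvertised host name per line (arbitrary file name).
    for host in open("hosts.txt").read().split():
        print(host, *dig_ns(host))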

+------------------------------------------------+---------------------+
| Carl Richard Friend (UNIX Sysadmin) | West Boylston |
| Minicomputer Collector / Enthusiast | Massachusetts, USA |
| mailto:crfriend@rcn.com +---------------------+
| http://users.rcn.com/crfriend/museum | ICBM: 42:22N 71:47W |
+------------------------------------------------+---------------------+
RE: New grubby med spammer sneaking through [ In reply to ]
> -----Original Message-----
> From: David B Funk [mailto:dbfunk@engineering.uiowa.edu]
> Sent: Friday, March 05, 2004 10:27 PM
> To: Chris Santerre
> Cc: 'Robert Brooks'; SpamAssassin Mailing List
> Subject: RE: New grubby med spammer sneaking through
>
>
> On Fri, 5 Mar 2004, Chris Santerre wrote:
>
> > I was the loudest voice screaming for this. SA devs opened
> a bug on it.
> > (Wish I could find it again!) Then I realised the 1 SERIOUS
> flaw with it.
> >
> > With a DNSRBL and email there is one sender to check. With
> URLs there is no
> > limit to how many one could put in a spam. A spammer could
> simply flood the
> > spam with good and bad URLs. This would cause a timeout and
> simply skip the
> > test. You could limit the number of URLs, but the spammers
> would simply add
> > that many good ones in.
> >
> > I've even spoken to RBL hosts. Some fear what this would do to their
> > servers. The number of lookups would skyrocket.
> >
> > So the only thing I can think of is a local RBL. :(
> >
> > --Chris
>
> It may be a problem but probably not. When you do -any- dns lookup
> your local DNS server will cache the answer and locally hand it out on
> repeated lookups. So if that spam has 20 URLs there would be 1 hit
> against the remote DNSRBL server and 19 local repeats,
> unless the spammer registered 20 different domains and puts them all
> into their spam. But then all 20 answers would be cached for the
> next instance of that spam.
>
> One other way to approach this question, use either 'whois' or DNS-NS
> records as a spam indicator.
> Spammers can register bunches of domain names but each name will cost
> them something, so they tend to look for registrars that offer bulk
> discounts. I'll bet that if you check the registration info
> for many of
> those spamdomains, you'll find that they're registered with a few
> specific sites (several of those being in China).
>
> Only problem is that 'whois' lookups tend to be slow and can timeout.
> DNS-NS record lookups are quick and easy. Often legit businesses will
> have their own DNS servers or use their ISPs. The bulk registration
> customers (spammers) are more likely to use the DNS services provided
> by the bulk registrars, rather than bothering with finding their own.
>
> I'll bet that if you check the DNS-NS records for all those
> spam-domains
> you see in those fake-meds spams you'll find that the majority of them
> are listed as one of half-dozen specific servers.
>
> EG: nicsimple.com, domain2004.com, arxcom.com, xinnet.cn,
> namelite.com,
> nsfornothing.biz
>
> Another spam indication would be to look at the number of NS records
> registered for a given domain. A legit business cares about the
> reliability of its net presence and will register 2 or more
> DNS servers.
> Spam-domains will sometimes have only 1 DNS server of record.
>
> --
> Dave Funk University of Iowa

Maybe I'm confused. I'm not talking about DNS lookups, but RBL lookups.
Those don't get cached. I've even talked to a few RBL providers and they
also don't like this idea. I've seen spam with all sorts of URL fodder in
it. These hijack small images (like bullets, diamonds, etc.) from legit
sites. I've seen spam with about 20-30 different domain URLs to confuse
both Bayes and my Bigevil scripts. A spammer could simply put in 30
different URLs and the URI RBL lookup would
1) Time out
2) Dramatically increase the load on an RBL server.

Unless this can be solved, I'm against it. :(

--Chris
RE: New grubby med spammer sneaking through [ In reply to ]
> -----Original Message-----
> From: jm@jmason.org [mailto:jm@jmason.org]
> Sent: Friday, March 05, 2004 12:37 PM
> To: Greg Cirino - Cirelle Enterprises
> Cc: Robert Brooks; SpamAssassin Mailing List
> Subject: Re: New grubby med spammer sneaking through
>
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> "Greg Cirino - Cirelle Enterprises" writes:
> >From: "Robert Brooks" <robb@hyperlink-interactive.co.uk>
> >| What I'm talking about is the webservers such that when some link
> >| http://www.example.com/foo turns up in an email then SA can look up
> >| the host and see if it's being used in other spamvertisements.
> >
> >That's different, never mind
> >
> >If I Now understand correctly, you want a body search for url looking
> >text and do a lookup on found urls?
>
> This is already in SVN trunk -- the URIBL plugin. It works *great*.
>
> Regarding avoiding DOS attacks with hundreds of URLs -- it selects
> up to 20 randomly and tests just those, and we have another rule
> which matches messages with too many URLs in the message which is
> getting good results.
>
> - --j.

In a word... "cool!" :-)

I should have known you guys would work this out! This would greatly reduce
the size of my Bigevil. Not eliminate it, as the numbers have shown, but it
would cut it by at least 40%-60%! Until someone comes up with a pure URIRBL
server. Then I can move on to other projects ;)

--Chris