Mailing List Archive

bad href rule....
Seeing a new run of spam with:
{a hrefstringhref=http://bogus.url href="http://real.url"}

I think they are hoping to fool a primitive scan for 'href=' but it
just makes for a really unambiguous spamsign. I'm scoring it high.
We'll probably see some variations on this soon, with other things in
front of href.....

rawbody LOC_HTMLBADHREF /href[a-z]*href/i
describe LOC_HTMLBADHREF href(string)href in link
score LOC_HTMLBADHREF 2.5
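
For anyone who wants to try the pattern outside SpamAssassin, here is a minimal sketch in Python. It assumes Python's `re` module treats this particular pattern the same way as the Perl regex in the rule (it uses only syntax common to both). The `hits` helper and both sample strings are made up for illustration, with the braces from the example above swapped for angle brackets.

```python
import re

# Same pattern as the LOC_HTMLBADHREF rawbody rule, with /i mapped to re.IGNORECASE.
BAD_HREF = re.compile(r"href[a-z]*href", re.IGNORECASE)

def hits(raw_body: str) -> bool:
    """Return True if the pattern matches anywhere in the raw body text."""
    return BAD_HREF.search(raw_body) is not None

# Hypothetical samples: the first mimics the spam, the second is an ordinary link.
spam_like = '<a hrefstringhref=http://bogus.url href="http://real.url">'
ordinary = '<a href="http://real.url">'

print(hits(spam_like))  # True: "href" + "string" + "href"
print(hits(ordinary))   # False: only one "href" in the tag
```

Because `[a-z]*` cannot cross the `=` sign, a tag that merely carries two separate href attributes should not trigger the pattern; only letters jammed between two "href" strings do.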

- Charles
Re: bad href rule....
Hello Charles,

Monday, February 16, 2004, 8:38:10 AM, you wrote:

CG> Seeing a new run of spam with:
CG> {a hrefstringhref=http://bogus.url href="http://real.url"}

CG> I think they are hoping to fool a primitive scan for 'href=' but it
CG> just makes for a really unambiguous spamsign. I'm scoring it high.
CG> We'll probably see some variations on this soon, with other things in
CG> front of href.....

CG> rawbody LOC_HTMLBADHREF /href[a-z]*href/i
CG> describe LOC_HTMLBADHREF href(string)href in link
CG> score LOC_HTMLBADHREF 2.5

LOC_HTMLBADHREF -- 433s/0h of 100794 corpus (82099s/18695h) 02/16/04
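
To put that corpus line in proportion, here is a small sketch of the arithmetic. It assumes "s" stands for spam and "h" for ham, which is my reading of the notation rather than something stated in the thread; the variable names are illustrative only.

```python
# Figures from the corpus line above; "s" is read as spam and "h" as ham (an assumption).
spam_hits, ham_hits = 433, 0
spam_total, ham_total = 82099, 18695

# Sanity check: the two halves add up to the stated corpus size.
assert spam_total + ham_total == 100794

spam_rate = spam_hits / spam_total
ham_rate = ham_hits / ham_total

print(f"spam hit rate: {spam_rate:.2%}")
print(f"ham hit rate:  {ham_rate:.2%}")
```

Under that reading the rule hits roughly half a percent of the spam in this corpus and none of the ham.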


Bob Menschel
Re: [spa] Re: bad href rule....
(smug look) Gee, I'm getting good at this..... :-)
You've already got 433 of those in your corpus? They only started
a couple of days ago.... (shake head)

- Charles

Re: [spa] Re: bad href rule....
On Tue, 17 Feb 2004, Charles Gregory wrote:

>
> (smug look) Gee, I'm getting good at this..... :-)
> You've already got 433 of those in your corpus? They only started
> a couple of days ago.... (shake head)
>
> - Charles

No, I've been seeing that junk for several weeks now.
I wrote a similar rule that is a little less discriminating but
seems to work for me.

rawbody L_FAKE_HREF /\w\whref=http:/i
describe L_FAKE_HREF Faked href to hide spammer URLs
score L_FAKE_HREF 1.7
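
A rough way to see how the two rules in this thread differ is to run both patterns over a few strings. This is only a sketch: it assumes Python's `re` behaves like Perl for these two patterns, and the `matching_rules` helper and all three sample strings are hypothetical, not taken from real mail.

```python
import re

# The two rules from this thread, with /i mapped to re.IGNORECASE.
RULES = {
    "LOC_HTMLBADHREF": re.compile(r"href[a-z]*href", re.IGNORECASE),
    "L_FAKE_HREF": re.compile(r"\w\whref=http:", re.IGNORECASE),
}

def matching_rules(raw_body: str) -> list:
    """Return the names of the rules whose pattern matches the text."""
    return [name for name, rx in RULES.items() if rx.search(raw_body)]

# Hypothetical samples for illustration only.
samples = [
    '<a hrefstringhref=http://bogus.url href="http://real.url">',  # the spam pattern
    '<a xyhref=http://bogus.url href="http://real.url">',          # other junk before href
    '<a href="http://real.url">',                                  # ordinary link
]

for s in samples:
    print(matching_rules(s), s)
```

On these samples both rules catch the first string, only L_FAKE_HREF catches the second (any two word characters in front of an unquoted `href=http:` are enough), and neither fires on the ordinary link.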


--
Dave Funk University of Iowa
<dbfunk (at) engineering.uiowa.edu> College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{