Mailing List Archive

How would one write a rule to match this?
Got a new spam-sign today that I don't know how to write a rule for. This
is just another token/word-breaking method; however, it uses valid HTML. In
this case it repeats the same font tag over and over again. The message did get
tagged as spam (barely), but only as a result of RBLs and the fact that there
was no text/plain part.

---spam sample---
<html>
<body bgcolor=3D"#FFFFFF">
<p><font face=3D"Arial, Helvetica, sans-serif" size=3D"4" color=3D"#FF3333=
"> </font><font size=3D2 ptsize=3D"10"><b><font face=3D"Arial, Helvetica, =
sans-serif" color=3D"#ff0000" size=3D"4">
Fr</font><font face=3D"Arial, Helvetica, sans-serif" color=3D"#ff0000" s=
ize=3D"4">om
D</font><font face=3D"Arial, Helvetica, sans-serif" size=3D"4" color=3D"=
#FF3333">iet
To The Blue Pill</font><br>
</b></font><font face=3D"Arial, Helvetica, sans-serif" size=3D"3">A</fon=
t><font face=3D"Arial, Helvetica, sans-serif" size=3D"3">ll
The</font><font face=3D"Arial, Helvetica, sans-serif" size=3D"3"> Popula=
r Pr</font><font face=3D"Arial, Helvetica, sans-serif" size=3D"3">escr</fo=
nt><font face=3D"Arial, Helvetica, sans-serif" size=3D"3">iption
M</font><font face=3D"Arial, Helvetica, sans-serif" size=3D"3">eds<br>
</font> <font face=3D"Arial, Helvetica, sans-serif" size=3D"3"><b><font =
size=3D"4"><a href=3D"http://allquickmeds4u.com">ClickHere</a></font></b><=
br>
No Fees<br>
</font></p>
<p>&nbsp;</p>
<p>&nbsp; </p>
<p><font size=3D"1" face=3D"Arial, Helvetica, sans-serif"><a href=3D"http:=
//allquickmeds4u.com/evm.htm">ListExclusionHere<br>
</a></font><font size=3D1 ptsize=3D"10" face=3D"Arial, Helvetica, sans-s=
erif">Easylink
Suite1483 9 Tanbark Circuit Werrington Downs NSW 2747 AU</font></p>
</body>
</html>
---spam sample---
Re: How would one write a rule to match this?
At 11:36 AM 2/10/2004, Brian Godette wrote:
>Got a new spam-sign today that I don't know how to make a rule match on. This
>is just another token/word breaking method, however it uses valid html, in
>this case it's using the same font over and over again.

I know it's always good to try to have rules for every word-break tactic,
however, let's face it, this particular obfuscation tactic shouldn't be
effective against spamassassin in the first place.

Remember, SA strips out HTML tags before it runs body rules.

Rules like this:
body LOCAL_MEDS /\bmeds\b/i
score LOCAL_MEDS 0.1

Should hit on that mail just fine, despite the gapping stuck in between the
letters.
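
To illustrate the point, here is a rough Python sketch (not SpamAssassin's
actual HTML renderer, just a crude regex stand-in) showing that once the tags
are dropped, the split word is whole again and the LOCAL_MEDS pattern matches.
FRAGMENT is a hand-decoded piece of the quoted-printable sample above; the
function names are made up for this example.

```python
import re

# Hand-decoded (quoted-printable) fragment of the spam sample:
# "=3D" becomes "=" and the soft line breaks are removed.
FRAGMENT = (
    '<font face="Arial, Helvetica, sans-serif" size="3">iption\n'
    'M</font><font face="Arial, Helvetica, sans-serif" size="3">eds<br>'
)


def strip_tags(html: str) -> str:
    """Crude stand-in for HTML rendering: drop anything that looks like a tag."""
    return re.sub(r"<[^>]*>", "", html)


def local_meds_hits(html: str) -> bool:
    """Apply the LOCAL_MEDS pattern to the tag-stripped text."""
    return re.search(r"\bmeds\b", strip_tags(html), re.IGNORECASE) is not None


if __name__ == "__main__":
    print(repr(strip_tags(FRAGMENT)))
    print(local_meds_hits(FRAGMENT))
```

The same pattern run against the raw, unstripped fragment finds nothing,
because "M" and "eds" are separated by the tags there.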

Really, it strikes me as more of a gap in your Bayes training, and a gap in
the default ruleset.
Re: How would one write a rule to match this?
However, the word-breaking is a far stronger and more reliable (fewer FPs)
spam-sign than any of the fairly generic and contextual words that were used
in the body of the spam. This spam hit NO body rules other than the
HTML-related ones. It didn't get any Bayes score at all, even with Bayes
turned on, something I've noticed happens every now and then.

To match this sort of word break, one would have to backreference the prior
font tag and compare face (if used), size (if used), and color (if used).
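
A rough sketch of that comparison, written in Python rather than as a
SpamAssassin rule, since the attributes can appear in a different order from
one tag to the next and that is awkward to express as a regex backreference.
It counts places where a letter is followed directly by </font><font ...> with
the same attributes and then another letter. The class and function names are
hypothetical, SAMPLE is hand-decoded from the quoted-printable spam above, and
this is not an existing SpamAssassin plugin or API.

```python
from html.parser import HTMLParser


class FontBreakCounter(HTMLParser):
    """Count words split by </font><font ...> carrying identical attributes."""

    def __init__(self):
        super().__init__()
        self.stack = []        # attribute dicts of currently open <font> tags
        self.last_char = ""    # last character of text seen so far
        self.closed = None     # attrs of a <font> closed right after a letter
        self.pending = False   # identical <font> just reopened, awaiting text
        self.breaks = 0

    def handle_starttag(self, tag, attrs):
        reopened = False
        if tag == "font":
            # Compare as a dict so attribute order does not matter.
            attr_map = {k: (v or "").lower() for k, v in attrs}
            reopened = self.closed is not None and attr_map == self.closed
            self.stack.append(attr_map)
        self.pending = reopened
        self.closed = None

    def handle_endtag(self, tag):
        self.pending = False
        self.closed = None
        if tag == "font" and self.stack:
            attr_map = self.stack.pop()
            if self.last_char.isalpha():
                self.closed = attr_map

    def handle_data(self, data):
        if self.pending and data[:1].isalpha():
            self.breaks += 1
        self.pending = False
        self.closed = None
        if data:
            self.last_char = data[-1]


def count_font_breaks(html: str) -> int:
    parser = FontBreakCounter()
    parser.feed(html)
    parser.close()
    return parser.breaks


# Hand-decoded fragment of the sample ("Popular Prescription Meds").
SAMPLE = (
    '<font face="Arial, Helvetica, sans-serif" size="3"> Popular Pr</font>'
    '<font face="Arial, Helvetica, sans-serif" size="3">escr</font>'
    '<font face="Arial, Helvetica, sans-serif" size="3">iption\nM</font>'
    '<font face="Arial, Helvetica, sans-serif" size="3">eds<br>\n</font>'
)

if __name__ == "__main__":
    print(count_font_breaks(SAMPLE))
```

A break where the spammer also changes the color (as in "D</font>...iet" in
the sample) is deliberately not counted here, since the attributes differ. A
score threshold on the count would still need tuning against real mail.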

On Tuesday 10 February 2004 09:51 am, Matt Kettler wrote:
> At 11:36 AM 2/10/2004, Brian Godette wrote:
> >Got a new spam-sign today that I don't know how to make a rule match on.
> > This is just another token/word breaking method, however it uses valid
> > html, in this case it's using the same font over and over again.
>
> I know it's always good to try to have rules for every word-break tactic,
> however, let's face it, this particular obfuscation tactic shouldn't be
> effective against spamassassin in the first place.
>
> Remember, SA strips out HTML tags before it runs rules.
>
> Rules like this:
> body LOCAL_MEDS /\bmeds\b/i
> score LOCAL_MEDS 0.1
>
> Should hit on that mail just fine, despite the gapping stuck in between the
> letters.
>
> Really it strikes me as more of a lacking in your bayes training, and a
> lacking in the default ruleset.