Mailing List Archive

[Bug 2415] TheBat mail wrongly classified as spam
http://bugzilla.spamassassin.org/show_bug.cgi?id=2415





------- Additional Comments From pietsch@coli.uni-sb.de 2004-03-17 15:35 -------
Hi developers,

looks like I just joined the SA developer community (if you don't mind).

We need to re-open this bug. I'm pretty sure that Malte's quick fix is based on
false assumptions.

I tried to find out why GMX blocked, on the basis of header (that is, SA)
analysis, an email that was sent via a Mailman list. Running my own SA 2.63 on
this email, I found the culprit:
4.1 FORGED_MUA_THEBAT_BOUN Mail pretending to be from The Bat! (boundary)
(On my Linux box, this was offset by "-4.9 BAYES_00".)
Using the latest nightly build, the results are the same.

Here are the crucial header lines:

X-Mailer: The Bat! (v1.62 Christmas Edition) Business
Content-Type: multipart/mixed; boundary="===============0088136400=="
X-Mailman-Version: 2.2a0

The original bug reporter showed us a Mailman-processed email created by The
Bat! (v2.00.6). This led Malte to believe that The Bat's boundary pattern had
changed in version 2. I looked through my email folders and found that this is
not the case. Some real-world examples:
X-Mailer: The Bat! (v2.00.6) Business
Content-Type: multipart/mixed; boundary="----------E229B084B5FA09D"

X-Mailer: The Bat! (v2.00.5) Business
Content-Type: multipart/mixed; boundary="----------8E92687AE93EE84"

X-Mailer: The Bat! (v2.00)
Content-Type: multipart/mixed; boundary="----------3E1CB50391145C0"

This is just the same pattern as in The Bat v1.x:
X-Mailer: The Bat! (v1.62 Christmas Edition) Business
Content-Type: multipart/mixed; boundary="----------629C1451646CE54"
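
The comparison above can be checked mechanically. The following is a minimal Python sketch, not SA code: it transcribes the `__BAT_BOUNDARY` pattern used by the rule set (`boundary=\"?-{10}`) into Python's `re` syntax and applies it to the Content-Type values quoted in this comment. The dictionary keys are labels chosen here for illustration only.

```python
import re

# Python transcription of the __BAT_BOUNDARY pattern (an approximation of
# how SA applies the Perl regex to the header value).
BAT_BOUNDARY = re.compile(r'boundary="?-{10}')

# Content-Type values copied from the headers quoted in this comment.
samples = {
    "v2.00.6": 'multipart/mixed; boundary="----------E229B084B5FA09D"',
    "v2.00.5": 'multipart/mixed; boundary="----------8E92687AE93EE84"',
    "v2.00": 'multipart/mixed; boundary="----------3E1CB50391145C0"',
    "v1.62": 'multipart/mixed; boundary="----------629C1451646CE54"',
    "mailman": 'multipart/mixed; boundary="===============0088136400=="',
}

# True where the boundary looks like one generated by The Bat!
matches = {name: bool(BAT_BOUNDARY.search(value)) for name, value in samples.items()}

for name, hit in matches.items():
    print(f"{name:8} {'matches' if hit else 'does not match'} __BAT_BOUNDARY")
```

Only the boundary taken from the Mailman-processed message fails the pattern; the v1.x and v2.x boundaries are indistinguishable to it.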

So where did the equal signs that triggered the forged-boundary rule come from?
Luckily, I could pull a less mangled version of the email from the Pipermail web
archive of that list (using a "download mbox" link). It turned out that the
original email had no MIME boundary at all:
Content-Type: text/plain; charset=ISO-8859-15

Mailman (in this configuration) converts text/plain emails to multipart/mixed in
order to attach a footer containing helpful list information. In doing so, it
adds its own MIME boundaries.
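
The effect can be reproduced with a short Python sketch. It assumes that Mailman builds the multipart message with Python's standard `email` package (Mailman 2.x is written in Python); this is not Mailman's actual code path, and the `wrap_with_footer` helper and the footer text are hypothetical. In current Python, the boundary the library generates has the same shape as the one in the reported header: a run of equal signs, digits, then `==`.

```python
import re
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def wrap_with_footer(body: str, footer: str) -> MIMEMultipart:
    """Wrap a plain-text body and a list footer in one multipart/mixed message."""
    outer = MIMEMultipart("mixed")
    # The sender's X-Mailer header survives the rewrap unchanged.
    outer["X-Mailer"] = "The Bat! (v1.62 Christmas Edition) Business"
    outer.attach(MIMEText(body))
    outer.attach(MIMEText(footer))
    return outer


msg = wrap_with_footer("Hello list,\n", "-- \nhypothetical list footer\n")
msg.as_string()  # serialising makes the library choose a boundary
boundary = msg.get_boundary()
content_type = msg["Content-Type"]

# The same check that __BAT_BOUNDARY performs on the header value.
looks_like_bat = re.search(r'boundary="?-{10}', content_type) is not None

print(content_type)
print("looks like a Bat boundary:", looks_like_bat)
```

The resulting message still claims The Bat! in X-Mailer but carries a boundary that The Bat! never produced, which is exactly the combination the forged-boundary rule looks for.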

Here's my patch against the latest nightly build:

--- 20_ratware.cf.orig 2004-03-17 19:20:20.000000000 +0100
+++ 20_ratware.cf 2004-03-17 23:49:36.000000000 +0100
@@ -159,13 +159,11 @@ describe FORGED_MUA_AOL_FROM Forged mail

# From private mail with developers. Some top tips here!
header __THEBAT_MUA X-Mailer =~ /The Bat!/
-header __THEBAT_MUA_V1 X-Mailer =~ /^\QThe Bat! (v1.\E/
-header __THEBAT_MUA_V2 X-Mailer =~ /^\QThe Bat! (v2.\E/
header __CTYPE_CHARSET_QUOTED Content-Type =~ /charset=\"/i
header __CTYPE_HAS_BOUNDARY Content-Type =~ /boundary/i
header __BAT_BOUNDARY Content-Type =~ /boundary=\"?-{10}/
meta FORGED_MUA_THEBAT_CS (__THEBAT_MUA && __CTYPE_CHARSET_QUOTED)
-meta FORGED_MUA_THEBAT_BOUN (__THEBAT_MUA && !__THEBAT_MUA_V2 && __CTYPE_HAS_BOUNDARY && !__BAT_BOUNDARY)
+meta FORGED_MUA_THEBAT_BOUN (__THEBAT_MUA && __CTYPE_HAS_BOUNDARY && !__BAT_BOUNDARY && !__KNOWN_MAILING_LIST)
describe FORGED_MUA_THEBAT_CS Mail pretending to be from The Bat! (charset)
describe FORGED_MUA_THEBAT_BOUN Mail pretending to be from The Bat! (boundary)

So instead of checking The Bat's version, I suggest checking whether mailing
list software has processed the message.

Does this open the door for spammers? I don't know. This is the first time I am
hacking SA. Basically, I think that SA should give way if an email has been
processed by mailing list software. Its recognition rates for spam that comes
through unmoderated mailing lists are already bad, and I don't think SA should
put more effort into analysing list-processed email. It's the list owners'
responsibility to spam-check emails before distributing them, right?

Would an experienced SA developer please apply my patch? -- unless there are
objections, of course.

Cheers,
Christian




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 2415] TheBat mail wrongly classified as spam [ In reply to ]
http://bugzilla.spamassassin.org/show_bug.cgi?id=2415





------- Additional Comments From pietsch@coli.uni-sb.de 2004-03-17 19:02 -------
Created an attachment (id=1857)
--> (http://bugzilla.spamassassin.org/attachment.cgi?id=1857&action=view)
the patch you saw in my previous comment

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hm, bugzilla folded some lines of the patch I submitted.
Meanwhile I've found the ``create new attachment'' button :-)
Who would like to be the committer?

Cheers,
Christian
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQFAWQ5Pad9zPtQoQIgRAhMGAJ0YTelkBeA6+00vp+rjRRTuF9GXTQCdFGik
f6DxMmk0zR1rHVkawfMWyHE=
=3I6Y
-----END PGP SIGNATURE-----




[Bug 2415] TheBat mail wrongly classified as spam [ In reply to ]
http://bugzilla.spamassassin.org/show_bug.cgi?id=2415

pietsch@coli.uni-sb.de changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #1857 is|0 |1
obsolete| |



------- Additional Comments From pietsch@coli.uni-sb.de 2004-03-21 18:09 -------
Created an attachment (id=1865)
--> (http://bugzilla.spamassassin.org/attachment.cgi?id=1865&action=view)
disable forged TheBat-boundary detection for mailing lists

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


This patch against the current svn code base
solves the false positive problem with TheBat
(both version 1.x and 2.x) in mailing list emails
cleanly -- provided that my patch for bug 3201 is
in place.

Please tell me what you think of it, and if it
will appear in the 3.0.0 release.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQFAXkfMad9zPtQoQIgRArpJAJ0Y+CabqubRXt6liSTdxEYM+I2YgwCfcAoD
mh2FGNoxRrN3zd/KVi3kQrc=
=syCv
-----END PGP SIGNATURE-----




[Bug 2415] TheBat mail wrongly classified as spam [ In reply to ]
http://bugzilla.spamassassin.org/show_bug.cgi?id=2415

pietsch@coli.uni-sb.de changed:

What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |3201





[Bug 2415] TheBat mail wrongly classified as spam [ In reply to ]
http://bugzilla.spamassassin.org/show_bug.cgi?id=2415

pietsch@coli.uni-sb.de changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|LATER |



------- Additional Comments From pietsch@coli.uni-sb.de 2004-03-23 22:14 -------
I think this patch should be integrated into 3.0.0.



[Bug 2415] TheBat mail wrongly classified as spam [ In reply to ]
http://bugzilla.spamassassin.org/show_bug.cgi?id=2415





------- Additional Comments From pietsch@coli.uni-sb.de 2004-03-28 18:50 -------
Subject: Re: TheBat mail wrongly classified as spam

Hi,

after almost two weeks, I have not received any comment on my proposal to
re-open this bug. In the meantime, I have provided two patches that I think
someone should commit to svn.

If not, why not?

Bye,
Christian







[Bug 2415] TheBat mail wrongly classified as spam [ In reply to ]
http://bugzilla.spamassassin.org/show_bug.cgi?id=2415

jm@jmason.org changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|REOPENED |RESOLVED
Resolution| |FIXED



------- Additional Comments From jm@jmason.org 2004-03-28 21:24 -------
hi Christian -- sorry about the delay! we're busy! ;)

I think you're right. however, KNOWN_MAILING_LIST is gone, so we'll have to do
it another way instead. I've just added this:

# replacement FORGED_MUA_THEBAT_BOUN
# bug 2415
header __MAILMAN_21 X-Mailman-Version =~ /\d/
meta T_FORGED_MUA_THEBAT_BOUN (__THEBAT_MUA && !__THEBAT_MUA_V2 && __CTYPE_HAS_BOUNDARY && !__BAT_BOUNDARY && !__MAILMAN_21)

that should do it.
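
For illustration, the replacement rule can be replayed outside SA. The sketch below is a rough Python transcription under stated assumptions, not SA code: each sub-rule regex is copied from the rule snippets in this bug and applied to the header value, a missing header is treated as an empty string, and the helper names `hits` and `t_forged_mua_thebat_boun` are made up here.

```python
import re

# Sub-rules transcribed from the rule snippets in this bug: (header, regex).
RULES = {
    "__THEBAT_MUA": ("X-Mailer", re.compile(r"The Bat!")),
    "__THEBAT_MUA_V2": ("X-Mailer", re.compile(r"^" + re.escape("The Bat! (v2."))),
    "__CTYPE_HAS_BOUNDARY": ("Content-Type", re.compile(r"boundary", re.I)),
    "__BAT_BOUNDARY": ("Content-Type", re.compile(r'boundary="?-{10}')),
    "__MAILMAN_21": ("X-Mailman-Version", re.compile(r"\d")),
}


def hits(headers):
    """Evaluate every sub-rule; a missing header is treated as an empty value."""
    return {
        name: bool(rx.search(headers.get(header, "")))
        for name, (header, rx) in RULES.items()
    }


def t_forged_mua_thebat_boun(headers):
    """The meta expression of T_FORGED_MUA_THEBAT_BOUN."""
    r = hits(headers)
    return (r["__THEBAT_MUA"] and not r["__THEBAT_MUA_V2"]
            and r["__CTYPE_HAS_BOUNDARY"] and not r["__BAT_BOUNDARY"]
            and not r["__MAILMAN_21"])


# The header lines from the original report.
reported = {
    "X-Mailer": "The Bat! (v1.62 Christmas Edition) Business",
    "Content-Type": 'multipart/mixed; boundary="===============0088136400=="',
    "X-Mailman-Version": "2.2a0",
}
fires_on_list_mail = t_forged_mua_thebat_boun(reported)

# The same message without the Mailman header, as a control.
without_mailman = {k: v for k, v in reported.items() if k != "X-Mailman-Version"}
fires_without_list_header = t_forged_mua_thebat_boun(without_mailman)

print(fires_on_list_mail, fires_without_list_header)
```

With the X-Mailman-Version header present the rule stays quiet on the reported message; without it, the same headers would still be flagged.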



Re: [Bug 2415] TheBat mail wrongly classified as spam [ In reply to ]
Hi Justin,

thanks for looking into this. I'm not quite convinced yet.

Is your rule meant to work together with Malte's inappropriate THEBAT
rules, or is it a replacement?

After building with today's svn code and copying 70_testing.cf to
/etc/mail/spamassassin/, I still get a FP by Malte's rule:
4.1 FORGED_MUA_THEBAT_BOUN Mail pretending to be from The Bat! (boundary)

Perhaps I should have presented my sample email in the first place.
You will find it attached (it's from a public mailing list). Meanwhile I've
found 35 more false positives like this one in my mail folders.

On Sun, Mar 28, 2004 at 09:24:09PM -0800, Justin wrote:
> http://bugzilla.spamassassin.org/show_bug.cgi?id=2415
...
> I think you're right. however, KNOWN_MAILING_LIST is gone, so

Are you sure it's gone? I can still see it in the current svn code,
and it is installed as /usr/share/spamassassin/20_meta_tests.cf .

If you apply the two patches of mine, all will be well -- I promise!
:-)

> we'll ahve to do it another way instead. I've just added this:
>
> # replacement FORGED_MUA_THEBAT_BOUN
> # bug 2415
> header __MAILMAN_21 X-Mailman-Version =~ /\d/
> meta T_FORGED_MUA_THEBAT_BOUN (__THEBAT_MUA && !__THEBAT_MUA_V2 && __CTYPE_HAS_BOUNDARY && !__BAT_BOUNDARY && !__MAILMAN_21)

We should never test for __THEBAT_MUA_V2 because there is no need to
distinguish between The Bat v1 and v2.
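
To make the effect of the version term concrete, here is a small Python sketch comparing the rule with and without `!__THEBAT_MUA_V2`. It is an approximation rather than SA code: the sub-rule regexes are transcribed from this bug, while the `boun_rule` helper, its `exempt_v2` switch, and the sample message (a v2 X-Mailer with a non-Bat boundary and no Mailman header) are hypothetical.

```python
import re


def sub_rules(headers):
    """Evaluate the sub-rules on a dict of header values (missing = empty)."""
    x_mailer = headers.get("X-Mailer", "")
    ctype = headers.get("Content-Type", "")
    return {
        "bat": bool(re.search(r"The Bat!", x_mailer)),
        "bat_v2": x_mailer.startswith("The Bat! (v2."),
        "has_boundary": bool(re.search(r"boundary", ctype, re.I)),
        "bat_boundary": bool(re.search(r'boundary="?-{10}', ctype)),
        "mailman": bool(re.search(r"\d", headers.get("X-Mailman-Version", ""))),
    }


def boun_rule(headers, exempt_v2):
    """Forged-boundary meta rule; exempt_v2 toggles the !__THEBAT_MUA_V2 term."""
    r = sub_rules(headers)
    fires = (r["bat"] and r["has_boundary"]
             and not r["bat_boundary"] and not r["mailman"])
    return fires and not (exempt_v2 and r["bat_v2"])


# Hypothetical message: claims The Bat! v2 but carries a boundary that
# The Bat! does not generate, and was not processed by Mailman.
suspect = {
    "X-Mailer": "The Bat! (v2.00.6) Business",
    "Content-Type": 'multipart/mixed; boundary="xyz-hypothetical-boundary"',
}

with_version_term = boun_rule(suspect, exempt_v2=True)
without_version_term = boun_rule(suspect, exempt_v2=False)

print(with_version_term, without_version_term)
```

Since v1 and v2 generate the same boundary pattern, the version term only exempts messages that claim v2; the variant without it treats both versions alike.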

Have fun with the email I attached (it might even be of some relevance
to people living in central Europe:)

Regards,
Christian

--
Christian Pietsch
http://purl.org/NET/pietsch/
Re: [Bug 2415] TheBat mail wrongly classified as spam [ In reply to ]
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Christian Pietsch writes:
> Hi Justin,
>
> thanks for looking into this. I'm not quite convinced yet.
>
> Is your rule meant to work together with Malte's inappropriate THEBAT
> rules, or is it a replacement?

It's a replacement -- and the version in testing is called
T_FORGED_MUA_THEBAT_BOUN.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFAad9kQTcbUG5Y7woRAtWXAJ97H8s6wzx3Gk3LoBHwsW6Aa9r14QCfUxDE
pOMtyauEl6Q5ja74zKpbZMU=
=X5Fo
-----END PGP SIGNATURE-----