http://bugzilla.spamassassin.org/show_bug.cgi?id=2415
------- Additional Comments From pietsch@coli.uni-sb.de 2004-03-17 15:35 -------
Hi developers,
looks like I just joined the SA developer community (if you don't mind).
We need to re-open this bug. I'm pretty sure that Malte's quick fix is based on
false assumptions.
I have tried to find out why GMX blocked an email that was sent via a Mailman
list due to header (that is, SA) analysis results. Running my own SA 2.63 on
this email, I found the culprit:
4.1 FORGED_MUA_THEBAT_BOUN Mail pretending to be from The Bat! (boundary)
(On my Linux box, this was offset by "-4.9 BAYES_00".)
Using the latest nightly build, the results are the same.
Here are the crucial header lines:
X-Mailer: The Bat! (v1.62 Christmas Edition) Business
Content-Type: multipart/mixed; boundary="===============0088136400=="
X-Mailman-Version: 2.2a0
The original bug reporter showed us a Mailman-processed email created by The
Bat! (v2.00.6). This led Malte to believe that The Bat's boundary
pattern had changed in version 2. I looked into my email folders and found out
that this is not the case. Some real world examples:
X-Mailer: The Bat! (v2.00.6) Business
Content-Type: multipart/mixed; boundary="----------E229B084B5FA09D"
X-Mailer: The Bat! (v2.00.5) Business
Content-Type: multipart/mixed; boundary="----------8E92687AE93EE84"
X-Mailer: The Bat! (v2.00)
Content-Type: multipart/mixed; boundary="----------3E1CB50391145C0"
This is just the same pattern as in The Bat v1.x:
X-Mailer: The Bat! (v1.62 Christmas Edition) Business
Content-Type: multipart/mixed; boundary="----------629C1451646CE54"
So where did the equal signs come from that triggered the forged boundary rule?
Luckily, I could pull a less mangled version of the email from the Pipermail web
archive of that list (using a "download mbox" link). It turned out that the
original email had no boundary header line at all:
Content-Type: text/plain; charset=ISO-8859-15
Mailman (in this configuration) converts text/plain emails to multipart/mixed in
order to attach a footer containing helpful list information. Thereby, MIME
boundaries are added.
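A minimal Python sketch of this conversion, assuming Mailman builds the wrapped message with Python's standard email package (the helper name wrap_with_footer and the sample texts are made up for illustration). It shows that the boundary generated when the message is flattened starts with a run of equal signs, not the ten dashes The Bat! uses:

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def wrap_with_footer(body, footer):
    """Hypothetical helper: wrap a text/plain body and a list footer
    into one multipart/mixed message, roughly as a list manager might."""
    outer = MIMEMultipart("mixed")
    outer.attach(MIMEText(body, "plain", "iso-8859-15"))
    outer.attach(MIMEText(footer, "plain", "us-ascii"))
    # Flattening the message makes the email package pick a boundary
    # and store it in the Content-Type header.
    outer.as_string()
    return outer


msg = wrap_with_footer("Hello list", "_______\nlist footer")
boundary = msg.get_boundary()
print(msg["Content-Type"])
print(boundary.startswith("-" * 10))  # False: not The Bat!'s pattern
```

The digits in the boundary are random, so the exact header differs on every run; only the surrounding equal signs are stable.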
Here's my patch against the latest nightly build:
--- 20_ratware.cf.orig 2004-03-17 19:20:20.000000000 +0100
+++ 20_ratware.cf 2004-03-17 23:49:36.000000000 +0100
@@ -159,13 +159,11 @@ describe FORGED_MUA_AOL_FROM Forged mail
# From private mail with developers. Some top tips here!
header __THEBAT_MUA X-Mailer =~ /The Bat!/
-header __THEBAT_MUA_V1 X-Mailer =~ /^\QThe Bat! (v1.\E/
-header __THEBAT_MUA_V2 X-Mailer =~ /^\QThe Bat! (v2.\E/
header __CTYPE_CHARSET_QUOTED Content-Type =~ /charset=\"/i
header __CTYPE_HAS_BOUNDARY Content-Type =~ /boundary/i
header __BAT_BOUNDARY Content-Type =~ /boundary=\"?-{10}/
meta FORGED_MUA_THEBAT_CS (__THEBAT_MUA && __CTYPE_CHARSET_QUOTED)
-meta FORGED_MUA_THEBAT_BOUN (__THEBAT_MUA && !__THEBAT_MUA_V2 &&
__CTYPE_HAS_BOUNDARY && !__BAT_BOUNDARY)
+meta FORGED_MUA_THEBAT_BOUN (__THEBAT_MUA && __CTYPE_HAS_BOUNDARY &&
!__BAT_BOUNDARY && !__KNOWN_MAILING_LIST)
describe FORGED_MUA_THEBAT_CS Mail pretending to be from The Bat! (charset)
describe FORGED_MUA_THEBAT_BOUN Mail pretending to be from The Bat! (boundary)
So instead of checking The Bat's version, I suggest checking whether mailing
list software has come into play.
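To illustrate how the patched meta rule would behave on the headers quoted above, here is a rough Python sketch. The three regular expressions are copied from the rules in the patch; known_mailing_list is only a hypothetical stand-in for __KNOWN_MAILING_LIST (it just looks for an X-Mailman-Version header), and the "forged" sample is invented for the example:

```python
import re

# Patterns taken from the rules in the patch.
THEBAT_MUA = re.compile(r"The Bat!")
CTYPE_HAS_BOUNDARY = re.compile(r"boundary", re.IGNORECASE)
BAT_BOUNDARY = re.compile(r'boundary="?-{10}')


def known_mailing_list(headers):
    # Assumption: stand-in for __KNOWN_MAILING_LIST, which SA defines elsewhere.
    return "X-Mailman-Version" in headers


def forged_thebat_boundary(headers):
    """Mirror of the patched FORGED_MUA_THEBAT_BOUN meta rule."""
    mailer = headers.get("X-Mailer", "")
    ctype = headers.get("Content-Type", "")
    return bool(
        THEBAT_MUA.search(mailer)
        and CTYPE_HAS_BOUNDARY.search(ctype)
        and not BAT_BOUNDARY.search(ctype)
        and not known_mailing_list(headers)
    )


list_mail = {
    "X-Mailer": "The Bat! (v1.62 Christmas Edition) Business",
    "Content-Type": 'multipart/mixed; boundary="===============0088136400=="',
    "X-Mailman-Version": "2.2a0",
}
genuine = {
    "X-Mailer": "The Bat! (v2.00.6) Business",
    "Content-Type": 'multipart/mixed; boundary="----------E229B084B5FA09D"',
}
# Invented sample: claims The Bat!, wrong boundary, no list header.
forged = {
    "X-Mailer": "The Bat! (v2.00.6) Business",
    "Content-Type": 'multipart/mixed; boundary="=====abc"',
}

print(forged_thebat_boundary(list_mail))  # False
print(forged_thebat_boundary(genuine))    # False
print(forged_thebat_boundary(forged))     # True
```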
Does this open the door for spammers? I don't know. This is the first time I am
hacking SA. Basically I think that SA should give way if an email has been
processed by mailing list software. Its recognition rates for spam mail that
comes through unmoderated mailing lists are already bad. And I don't think SA
should put more effort into analysing list-processed email. It's the list
owners' responsibility to spam-check email before distributing it, right?
Would an experienced SA developer please apply my patch? -- unless there are
objections, of course.
Cheers,
Christian
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.