Mailing List Archive

[Bug 3226] bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3226





------- Additional Comments From jm@jmason.org 2004-03-29 11:28 -------
could we get a full copy of the message, as an attachment, in message/rfc822
format? MAILTO_TO_SPAM_ADDR is a body test.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 3226] bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3226





------- Additional Comments From moore+spamassassin@cs.utk.edu 2004-03-29 11:47 -------
Created an attachment (id=1875)
--> (http://bugzilla.spamassassin.org/attachment.cgi?id=1875&action=view)
message similar to the one that caused the bug

okay, this is the message as *I* received it - not as it was received by the
guy who sent us the complaint. The headers will differ slightly due to
differences in sendmail configs, received fields, etc. but the message bodies
should be the same in either case.



[Bug 3226] bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3226

jm@jmason.org changed:

What       |Removed     |Added
----------------------------------------------------------------------------
Status     |NEW         |RESOLVED
Resolution |            |WORKSFORME



------- Additional Comments From jm@jmason.org 2004-03-29 12:04 -------
ok, thanks -- here's what the current dev version says:

Content analysis details: (-4.6 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.0 TO_HAS_SPACES          To: address contains spaces
 0.3 MAILTO_TO_SPAM_ADDR    URI: Includes a link to a likely spammer email
-4.9 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
                            [score: 0.0000]

ignoring BAYES_00 (my trained data) it's clearly fine -- both TO_HAS_SPACES and
MAILTO_TO_SPAM_ADDR are "noise" rules -- informational, rather than "deciding"
rules, providing tiny scores.

OPT_HEADER was the real culprit. It has been removed, and the message is scored
*well* inside the ham range at 0.3 points.

That's as good as any mail gets, really ;) -- a very non-spam score.
marking WORKSFORME.
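
The arithmetic in that report can be checked with a small sketch. The rule names and scores are copied from the report above; the helper names (total_score, is_spam) are made up for illustration and are not SpamAssassin APIs, and the "total reaches the required score" comparison is an assumption about how the threshold is applied.

```python
# Rule hits and scores copied from the report quoted above.
hits = {
    "TO_HAS_SPACES": 0.0,
    "MAILTO_TO_SPAM_ADDR": 0.3,
    "BAYES_00": -4.9,
}
REQUIRED = 5.0  # "5.0 required" in the report


def total_score(hits, ignore=()):
    """Sum the scores of all rules that hit, skipping any names in `ignore`."""
    return sum(score for name, score in hits.items() if name not in ignore)


def is_spam(score, required=REQUIRED):
    """Assumption: a message counts as spam once its total reaches `required`."""
    return score >= required


full = total_score(hits)
without_bayes = total_score(hits, ignore={"BAYES_00"})

# Round for display only; the sums are ordinary floating-point values.
print(round(full, 1), round(without_bayes, 1), is_spam(without_bayes))
```

With or without BAYES_00, the total stays far below the 5.0 threshold, which is the point being made above.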



[Bug 3226] bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3226





------- Additional Comments From moore+spamassassin@cs.utk.edu 2004-03-29 12:08 -------
I don't get what you mean when you say "WORKSFORME".

Do you mean that none of the tests give false positive results when you try it?




[Bug 3226] bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3226

moore+spamassassin@cs.utk.edu changed:

What       |Removed     |Added
----------------------------------------------------------------------------
Status     |RESOLVED    |CLOSED



------- Additional Comments From moore+spamassassin@cs.utk.edu 2004-03-29 12:09 -------
okay, sorry I missed your last comment. now I've seen it.

thanks for looking into this.

Keith




[Bug 3226] bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3226

moore+spamassassin@cs.utk.edu changed:

What       |Removed     |Added
----------------------------------------------------------------------------
Status     |CLOSED      |REOPENED
Resolution |WORKSFORME  |



------- Additional Comments From moore+spamassassin@cs.utk.edu 2004-03-29 12:12 -------
actually could you edit the message to put a space between : and ; on the To
line and see if that still trips one of the tests? because of sendmail bugs
a lot of people receive the message with a space between : and ;.





[Bug 3226] bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3226





------- Additional Comments From felicity@kluge.net 2004-03-29 12:24 -------
Subject: Re: bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests

On Mon, Mar 29, 2004 at 12:04:59PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> 0.0 TO_HAS_SPACES To: address contains spaces
>
> ignoring BAYES_00 (my trained data) it's clearly fine -- both TO_HAS_SPACES and
> MAILTO_TO_SPAM_ADDR are "noise" rules -- informational, rather than "deciding"
> rules, providing tiny scores.

Just an FYI: according to the RFC, the To header in question ("na-digest
list:;") is parsed as:

display-name : ;

where display-name is defined as either a quoted string or a single
"word" (ie: no whitespace), and the blank list between : and ; is the
list of addresses. so the header is, actually, invalid (need to put
the display-name with the space in quotes).


but from our POV -- shouldn't To:addr ignore the display name and in this
case return a blank To address? right now we return "na-digest list:;"
for the above...

IMO, the code should convert the "display-name: mailbox ;" bit to be just
"mailbox", then the rest of our code can rip out the appropriate bits.

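A rough sketch of that conversion, in Python rather than SA's own Perl, and not the actual SpamAssassin code. It assumes a deliberately simplified view of the group syntax ("display-name : mailbox-list ;") and does not handle ":" or ";" inside quoted strings or comments. The names GROUP_RE and strip_groups are hypothetical.

```python
import re

# Simplified pattern for group syntax: display-name ":" [mailbox-list] ";"
# Assumption: the display name is either a quoted string or a run of
# characters containing no ':', ';', ',', '<', '>' or '"'.
GROUP_RE = re.compile(r'(?:"[^"]*"|[^:;,<>"]+)\s*:\s*([^;]*);')


def strip_groups(header_value):
    """Replace each 'display-name: mailbox-list ;' group with its mailbox list."""
    stripped = GROUP_RE.sub(lambda m: m.group(1).strip(), header_value)
    return stripped.strip()


# The header from this bug has an empty mailbox list, so nothing is left over.
print(repr(strip_groups("na-digest list:;")))
# A group that does carry addresses keeps only those addresses.
print(repr(strip_groups("friends: a@example.com, b@example.com;")))
```

With the group wrapper gone, a To:addr lookup on the header from this bug would see an empty address rather than "na-digest list:;".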




[Bug 3226] bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3226

jm@jmason.org changed:

What       |Removed     |Added
----------------------------------------------------------------------------
Status     |REOPENED    |RESOLVED
Resolution |            |WORKSFORME



------- Additional Comments From jm@jmason.org 2004-03-29 12:27 -------
'actually could you edit the message to put a space between : and ; on the To
line and see if that still trips one of the tests? because of sendmail bugs
a lot of people receive the message with a space between : and ;.'

That's fine. it has that space, and the "TO_HAS_SPACES" rule is picking that
up, but it's not providing any worrisome quantity of points.

the SA ruleset has several "levels" of scores --

1. really serious rules that are high accuracy. OPT_HEADER was meant to be such a
rule, but it threw up quite a few false positives, so we've thrown it out now.

2. not-serious-at-all rules that are low accuracy, but when combined with one or
more big-hitter rules can push a borderline spam just over the threshold.
TO_HAS_SPACES is a good example. it's got such a low score, it's displayed as
0.0 in that report ;) these are very low accuracy, and just provide a little
nudge -- so a hit from these on nonspam is most definitely not a problem, and
frequently happens.

3. others which are in between. irrelevant to this discussion.

so, seriously, no need to worry about the TO_HAS_SPACES hit, it will not affect
anything -- no message will be marked as spam if it hits that, unless it hits
some other, more serious rules, in which case *those* are the hits to worry about.
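
To make the "nudge" idea concrete, here is a toy sketch. The scores below are hypothetical and chosen only for illustration, BIG_HITTER_RULE is an invented rule name, and classify is not a SpamAssassin function.

```python
REQUIRED = 5.0  # same threshold as in the report earlier in this thread

# Hypothetical scores, not taken from any real ruleset.
SCORES = {
    "BIG_HITTER_RULE": 4.98,  # stands in for a "really serious" rule
    "TO_HAS_SPACES": 0.04,    # small enough to be displayed as 0.0
}


def classify(rule_names, scores=SCORES, required=REQUIRED):
    """Sum the scores of the rules that hit and compare against the threshold."""
    total = sum(scores[name] for name in rule_names)
    return "spam" if total >= required else "ham"


print(classify(["TO_HAS_SPACES"]))                     # noise rule alone
print(classify(["BIG_HITTER_RULE"]))                   # borderline, just under
print(classify(["BIG_HITTER_RULE", "TO_HAS_SPACES"]))  # nudged over the line
```

The noise rule only matters in the third case, where the serious rule has already done nearly all of the work.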





[Bug 3226] bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3226





------- Additional Comments From jm@jmason.org 2004-03-29 12:31 -------
'but from our POV -- shouldn't To:addr ignore the display name and in this
case return a blank To address? right now we return "na-digest list:;"
for the above... IMO, the code should convert the "display-name: mailbox ;" bit
to be just "mailbox", then the rest of our code can rip out the appropriate bits.'

theo -- I think you're right. we might be able to get better quality results
from TO_HAS_SPACES that way... that should probably be opened as an RFE bug
separately.

And regarding the quoting around "na-digest list" as required by RFC-2822,
that's probably correct for total RFC compliance! But I don't think it'll make
a diff to SA so not a big deal for us ;)





[Bug 3226] bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3226





------- Additional Comments From moore+spamassassin@cs.utk.edu 2004-03-29 12:33 -------
okay, the message sent by the guy who was complaining about his mail being
blocked showed TO_HAS_SPACES as being assigned 2.4 points. maybe he tweaked
the weight, or maybe this has also changed in the development version of SA?



[Bug 3226] bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3226





------- Additional Comments From felicity@kluge.net 2004-03-29 19:52 -------
Subject: Re: bogus MAILTO_TO_SPAM_ADDR, TO_HAS_SPACES, OPT_HEADER tests

On Mon, Mar 29, 2004 at 12:31:42PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> theo -- I think you're right. we might be able to get better quality results
> from TO_HAS_SPACES that way... that should probably be opened as an RFE bug
> separately.

submitted as 3227.




