Hello Chris,
Friday, February 13, 2004, 11:49:39 AM, you wrote:
CS> I'm really not worried. I'm surprised it took them this long to
CS> figure this part out.
Actually, this is just a variation on the KWS_SPAMSENTENCE fodder posted
here four or five months ago by CAB. That's a series of 79 sentences
taken from someone's web page, with one sentence picked at random and
appended to the end of each spam generated by that spammer tool.
What I see in this new batch is a Mad Libs exercise in sentence creation
(take one verb, two nouns, add an adjective and/or adverb or two, maybe a
preposition, and stir).
I agree with Justin's recent post -- randomly constructed sentences
cannot poison Bayes -- they will just feed Bayes with tokens that do not
exist in normal ham for my domains, so those messages will get themselves
identified as spam. A few may slip through (a very few, since the basic
rules do catch almost all spam content), but that window is already
shrinking.
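
To make that argument concrete, here is a rough sketch of why tokens that
never show up in ham end up counting against the message. This is not
SpamAssassin's actual Bayes code; it is a simplified per-token estimate
with add-one smoothing, and the function names, the sample messages, and
the smoothing constants are all made up for illustration.

```python
from collections import Counter


def train(messages):
    """Count, for each token, how many messages contain it at least once."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts


def spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Rough P(spam | token) with add-one smoothing (illustrative only)."""
    spam_rate = (spam_counts[token] + 1) / (n_spam + 2)
    ham_rate = (ham_counts[token] + 1) / (n_ham + 2)
    return spam_rate / (spam_rate + ham_rate)


# Hypothetical training data: the spam carries randomly assembled filler.
HAM = [
    "meeting moved to friday afternoon",
    "please review the attached patch",
    "lunch on friday",
]
SPAM = [
    "cheap pills online zebra quietly paints umbrella",
    "cheap pills online walrus loudly paints ladder",
    "cheap pills online zebra loudly eats umbrella",
]

ham_counts = train(HAM)
spam_counts = train(SPAM)

# Filler words seen only in spam score above 0.5, words from real mail
# score below 0.5, and a never-seen word stays neutral at 0.5.
for token in ("paints", "zebra", "friday", "giraffe"):
    p = spam_probability(token, spam_counts, ham_counts, len(SPAM), len(HAM))
    print(f"{token:8s} {p:.2f}")
```

Under these assumptions the filler words become spam indicators as soon
as they repeat across a few messages, and a filler word seen for the first
time is merely neutral, so it does nothing to pull the score toward ham.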
Bob Menschel