Mailing List Archive

Empty Subject Suspected as Spam?
I just received a spam that went through the filters on my system with a
score of 6.4. The subject line was completely empty but no rule was
triggered for it. Is this OK?
It seems to me that an empty subject is a reason to score something.
I know it happens a lot that people forget to write a subject but even
before I started using SA, I used to classify emails missing subjects as
spam candidates.

--ilan
Re: Empty Subject Suspected as Spam? [ In reply to ]
"Ilan Aisic" <ilan@pointer.co.il> writes:

> I just received a spam that went through the filters on my system with a
> score of 6.4. The subject line was completely empty but no rule was
> triggered for it. Is this OK?

Heck, yeah. 6.4 means spam!

> It seems to me that an empty subject is a reason to score something.
> I know it happens a lot that people forget to write a subject but even
> before I started using SA, I used to classify emails missing subjects as
> spam candidates.

Doesn't work very well as a spam sign, it's more common in ham than
spam. It could be worth feeding into Bayes, but I doubt it would help
an appreciable amount.

Daniel

--
Daniel Quinlan anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/ and open source consulting
RE: Empty Subject Suspected as Spam? [ In reply to ]
See below:

> > I just received a spam that went through the filters on my system with a
> > score of 6.4. The subject line was completely empty but no rule was
> > triggered for it. Is this OK?
>
> Heck, yeah. 6.4 means spam!

I tuned my system to reject anything above 12 but to mark anything above 5
as *SPAM*.
So far, I've been receiving a lot of ham with scores > 5. These cases are
declining though because I've added more and more whitelist entries.


> > It seems to me that an empty subject is a reason to score something. I
> > know it happens a lot that people forget to write a subject but even
> > before I started using SA, I used to classify emails missing subjects
> > as spam candidates.
>
> Doesn't work very well as a spam sign, it's more common in
> ham than spam. It could be worth feeding into Bayes, but I
> doubt it would help an appreciable amount.

An empty subject doesn't necessarily mean spam, but it should, IMHO, add
something to the overall score.
Many other rules are also very common in ham. For example, all those
HTML-related rules: many MUA programs produce HTML messages.

--ilan
Re: Empty Subject Suspected as Spam? [ In reply to ]
Well, if you want it, something like the following should do the job for
you. Personally I think you shouldn't want it:

header __HAS_SUBJECT Subject =~ /./
meta NO_SUBJECT (!__HAS_SUBJECT)
score NO_SUBJECT .01
describe NO_SUBJECT Subject-less message

> So far, I've been receiving a lot of ham with scores > 5. These cases are

Something is broken, would be my guess. Either that or you are running a
porn site or something else that will take some rather special rules, and
the normal rules are bad for you.

(Note: I'm not making any comparative comments on the worth of porn sites.
In general I consider them to be reasonably legit businesses, with the
exception of some of those ^&*( hydra sites in Florida that have 5 million
non-porn-sounding redirects and then keep opening more and more browser
windows until you have to kill the whole process tree.)

(Hum, I wonder what a set of valid SA rules for a porn site would look
like...)

Loren
Re: Empty Subject Suspected as Spam? [ In reply to ]
> So far, I've been receiving a lot of ham with scores > 5. These cases are
> declining though because I've added more and more whitelist entries.

I think whitelisting is probably the wrong approach. You should probably be
adjusting scores on rules that are triggering high, or maybe making
alternate versions of some of these rules.

Loren
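
A minimal sketch of the score-adjustment approach suggested above, assuming
a site-wide local.cf. HTML_MESSAGE is only a stand-in for whichever rule is
firing on the legitimate mail, and the value shown is illustrative, not a
recommendation. The rules that actually hit are listed under tests= in the
X-Spam-Status header of the misclassified message.

```
# local.cf (site-wide overrides)
# Assumption: HTML_MESSAGE is the rule that over-scores legitimate mail here;
# substitute the rule names reported for your own false positives.
score HTML_MESSAGE 0.1

# A score of 0 disables a rule entirely.
# score HTML_MESSAGE 0
```

Unlike a whitelist entry, this changes how every message is scored, so it is
worth rechecking known spam after each adjustment.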
RE: Empty Subject Suspected as Spam? [ In reply to ]
Tx Loren,
I've already put this rule to use.
Anyway, most of the ham I got that scored > 5 were various legit newsletters
I subscribe to like Java Developer's Journal.
My site (www.pointer.co.il) is relatively small. We only have about 20
active mail users and very little web traffic.
Checking exim's reject log I can see that in the last week, 54 messages were
rejected by SA because they scored > 12 points.
However, 309 were rejected because they were blacklisted by the exim MTA
before they had the chance to be examined by SA.
I regard the blacklist that is used by the MTA as the first line of defense
and I try to update this blacklist from time to time.
I'm hesitant to lower the reject threshold because I want to avoid false
positives (FP). FPs almost never happen to me anymore because of my
whitelist, but I still have to educate the other mail users. Some of my
users receive a lot of mail in Russian and we all receive mail in Hebrew.
Most of the regexp rules that check content are written for English
messages. I'm starting to write some rules for Hebrew but what I have so
far is really very rudimentary and immature.

--ilan

Re: Empty Subject Suspected as Spam? [ In reply to ]
BTW, are you trying to filter out the empty-body spams that seem to be
cropping up? If so, I think you will find they are missing a To: line as
well as a subject. Filtering for no To and no body seems safer to me than
filtering for an empty subject, if that is what you are trying to catch.

Loren
RE: Empty Subject Suspected as Spam? [ In reply to ]
>
> BTW, are you trying to filter out the empty-body spams that
> seem to be cropping up? If so, I think you will find they
> are missing a To: line as well as a subject. Filtering for
> no To and no body seems safer to me than filtering for an
> empty subject, if that is what you are trying to catch.
>
> Loren
>

I only decided that an empty subject means a possible spam. If it's ham,
it's an unprofessional one...
It's a good point though that an empty "To:" and an empty body should also
trigger rules. IMHO, they should each trigger a separate rule. However, if
both conditions are present the message merits more points. You can probably
argue that in such a case you wouldn't want to see the message anyway.

Also, we often receive messages that are addressed to "Undisclosed
Recipients". At least half of them are spam. I'd add this as a rule too.

--ilan
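
The scheme proposed above (one rule each for a missing To: and an empty
body, plus extra points when both occur) could be sketched roughly as
follows. The LOCAL_ rule names and all scores are hypothetical, and the
patterns simply follow the style of the rules already posted in this thread.

```
# Sub-rules: names starting with __ are not scored on their own
header   __LOCAL_HAS_TO       To =~ /\S/
body     __LOCAL_HAS_BODY     /\S/

meta     LOCAL_NO_TO          (!__LOCAL_HAS_TO)
describe LOCAL_NO_TO          To: header missing or empty
score    LOCAL_NO_TO          0.5

meta     LOCAL_NO_BODY        (!__LOCAL_HAS_BODY)
describe LOCAL_NO_BODY        No text in the message body
score    LOCAL_NO_BODY        0.5

# Extra points when both conditions hold (added on top of the two above)
meta     LOCAL_NO_TO_NO_BODY  (!__LOCAL_HAS_TO && !__LOCAL_HAS_BODY)
describe LOCAL_NO_TO_NO_BODY  No To: header and no body text
score    LOCAL_NO_TO_NO_BODY  2.0
```

One caveat, as far as I know: SpamAssassin body rules also see the Subject
text, so LOCAL_NO_BODY as written would only fire on messages whose subject
is empty as well.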
Re: [spa] Empty Subject Suspected as Spam? [ In reply to ]
On Sat, 6 Mar 2004, Ilan Aisic wrote:
> It seems to me that an empty subject is a reason to score something.

On its own, perhaps not, but I get a pretty good hit rate with
a check for 'undisclosed recipients' in the 'To:' header, plus a missing
or blank subject line.....

header __UNDISCLOSED To =~ /undisclosed.recipients/i
header __NOSUBJECT Subject !~ /\w/i [if-unset: %]
meta LOC_UNDISCLNOSUBJ (__UNDISCLOSED && __NOSUBJECT)
describe LOC_UNDISCLNOSUBJ To undisclosed recipients with no subject
score LOC_UNDISCLNOSUBJ 1.5

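Note that the '[if-unset: %]' tag catches the case where there is NO Subject
header at all: the rule then tests the string '%', which contains no word
character, so __NOSUBJECT still fires.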
Re: [spa] Empty Subject Suspected as Spam? [ In reply to ]
On Sat, 2004-03-06 at 14:39, Charles Gregory wrote:
> On Sat, 6 Mar 2004, Ilan Aisic wrote:
> > It seems to me that an empty subject is a reason to score something.
>
> On its own, perhaps not, but I get a pretty good hit rate with
> a check for 'undisclosed recipients' in the 'To:' header, plus a missing
> or blank subject line.....
>


how about nothing in the body.... lately i have been getting some spam
with nothing in the body
Re: [spa] Empty Subject Suspected as Spam? [ In reply to ]
> how about nothing in the body.... lately i have been getting some spam
> with nothing in the body

This works for me

# 1.141 1.4142 0.0000 1.000 0.98 3.00 LW_BLANK_MSG
header __TO_HEADER_EXISTS To =~ /./
body __MSG_BODY_EXISTS /./
meta LW_BLANK_MSG (! __TO_HEADER_EXISTS && ! __MSG_BODY_EXISTS)
score LW_BLANK_MSG 5

Watch the wrap!

Loren