Mailing List Archive

SURBL inclusion threshold lowered
We have lowered the threshold for inclusion of spam URI reports
into SURBL from 20 down to 10. This has increased the number
of domains in SURBL from about 250 to about 400. The expiration
time of reports is currently unchanged at 4 days. (I had said
the threshold was 24 earlier, but when I looked at the
thresholding code it said 20.)
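
For readers who want to see the idea concretely, here is a minimal sketch of the thresholding described above. It is not the actual SURBL code: the input shape (an iterable of (domain, timestamp) pairs), the function name `listed_domains`, and the use of "at least the threshold" as the inclusion test are all assumptions made for illustration. Only the numbers (threshold 10, formerly 20, and a 4-day expiration) come from the message.

```python
from collections import Counter
from datetime import datetime, timedelta

THRESHOLD = 10                   # reports needed for inclusion (previously 20)
EXPIRATION = timedelta(days=4)   # reports older than this no longer count


def listed_domains(reports, now, threshold=THRESHOLD, expiration=EXPIRATION):
    """Return the set of domains with at least `threshold` unexpired reports.

    `reports` is an iterable of (domain, reported_at) pairs. That shape is
    hypothetical; the real report data may look quite different.
    """
    cutoff = now - expiration
    # Count only reports that have not yet expired, case-insensitively.
    counts = Counter(
        domain.lower()
        for domain, reported_at in reports
        if reported_at >= cutoff
    )
    return {domain for domain, n in counts.items() if n >= threshold}
```

With a function like this, a domain that has 18 recent reports is missed at a threshold of 20 but listed at a threshold of 10.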

We may tweak this further later. The false positive (FP) rate
should still be low, and I am checking the resulting data by hand
now to see if the whitelist needs any additions. BTW, SpamCop
also appears to
whitelist the domains or URIs they get in their users' reports,
which offers some additional protection to downstream users
of their data such as our SURBL effort.

One impetus for this change is my personal anecdote from
yesterday:

http://sc.surbl.org/

> We have changed the threshold for inclusion in SURBL from 24 to
> 10, as of 30 March 2004. The main reason is that *I* (! ;-)
> got a spam for a domain med6547.biz that at the time had 18
> hits in our data: a little under the count of 24 previously
> needed. Right after I (and presumably other spam victims)
> reported it to SC, the count went up to 38 or so.
> (Interestingly it's up to 129 about half a day later, so more
> people got spammed and reported it after I did.) A threshold of
> 10 or 12 would probably have caught the spam before it got to
> me.

Thus the change. :-)

Cheers,

Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org-nospam
http://sc.surbl.org/
Re: SURBL inclusion threshold lowered
On Tuesday, March 30, 2004, 3:58:13 PM, Jeff Chan wrote:
> We have lowered the threshold for inclusion of spam URI reports
> into SURBL from 20 down to 10. This has increased the number
> of domains in SURBL from about 250 to about 400. The expiration
> time of reports is currently unchanged at 4 days. (I had said
> the threshold was 24 earlier, but when I looked at the
> thresholding code it said 20.)

FWIW, lowering the threshold did let a few more FPs creep
through. I have therefore added a few more legitimate domains
like terra.es and viola.fr to the whitelist, which grew from
about 25 to 40 entries. (Adding these to the internal whitelist
prevents them from getting into SURBL and so prevents FPs on them
by client programs.)

http://spamcheck.freeapp.net/whitelist-domains.sort
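
As a rough illustration of how an internal whitelist can keep legitimate domains out of the published list, here is a minimal sketch. The function name `apply_whitelist`, the one-domain-per-line input, and the choice to also exempt subdomains of a whitelisted domain are assumptions for the example, not a description of the real SURBL processing.

```python
def apply_whitelist(candidates, whitelist_lines):
    """Return the candidate domains that are not covered by the whitelist.

    `whitelist_lines` is assumed to hold one domain per line. A candidate is
    dropped if it, or any parent domain of it, appears in the whitelist
    (the subdomain handling is an assumption made for this sketch).
    """
    whitelist = {line.strip().lower() for line in whitelist_lines if line.strip()}

    def is_whitelisted(domain):
        labels = domain.lower().rstrip(".").split(".")
        # Check the domain itself and each parent, e.g. www.terra.es, terra.es, es.
        return any(".".join(labels[i:]) in whitelist for i in range(len(labels)))

    return {domain for domain in candidates if not is_whitelisted(domain)}
```

Filtering before publication, rather than in each client, means every downstream user benefits from a whitelist addition without changing anything on their side.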

I would appreciate hearing if anyone knows of any good sources
of domain whitelists.

Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org-nospam
http://sc.surbl.org/
Re: SURBL inclusion threshold lowered
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Jeff Chan writes:
> On Tuesday, March 30, 2004, 3:58:13 PM, Jeff Chan wrote:
> > We have lowered the threshold for inclusion of spam URI reports
> > into SURBL from 20 down to 10. This has increased the number
> > of domains in SURBL from about 250 to about 400. The expiration
> > time of reports is currently unchanged at 4 days. (I had said
> > the threshold was 24 earlier, but when I looked at the
> > thresholding code it said 20.)
>
> FWIW Lowering the threshold did increase a few of the FPs creeping
> through. I have therefore added a few more legitimate domains
> like terra.es and viola.fr to the whitelist, which grew from
> about 25 to 40 entries. (Adding these to the internal whitelist
> prevents them from getting into SURBL and so prevents FPs on them
> by client programs.)
>
> http://spamcheck.freeapp.net/whitelist-domains.sort
>
> I would appreciate hearing if anyone knows of any good sources
> of domain whitelists.

I think your best bet is to collect them yourself -- build a ham
corpus...

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFAaii0QTcbUG5Y7woRApENAJ90/o83oLovyb4wE/mC0VBGl5yWFgCfaUkN
WDmWPuyZBydrIDfz4ojrvVw=
=N9sP
-----END PGP SIGNATURE-----
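
One possible reading of the "build a ham corpus" suggestion, sketched under stated assumptions: scan known-good messages for URLs and treat hostnames that show up in several distinct ham messages as whitelist candidates for manual review. The function names, the simple URL regex, and the `min_messages` cutoff are all hypothetical, and the sketch works on full hostnames rather than reducing them to registered domains.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Deliberately simple URL matcher; real mail would need more careful parsing.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def ham_host_counts(messages):
    """Count, per hostname, how many ham message bodies link to that host."""
    counts = Counter()
    for body in messages:
        hosts = set()  # count each host at most once per message
        for url in URL_RE.findall(body):
            try:
                # Trim sentence punctuation that the regex may have swallowed.
                host = urlsplit(url.rstrip(".,;:!?)")).hostname
            except ValueError:
                continue  # malformed URL, e.g. unbalanced brackets
            if host:
                hosts.add(host)
        counts.update(hosts)
    return counts


def whitelist_candidates(messages, min_messages=5):
    """Return hosts seen in at least `min_messages` ham messages, sorted."""
    counts = ham_host_counts(messages)
    return sorted(host for host, n in counts.items() if n >= min_messages)
```

The output is only a list of candidates; checking them by hand before adding them to a whitelist would still be needed.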