FWIW, here's a du -sk directory size summary of the reports
SURBL grabbed from SpamCop Spamvertised sites over the past
4 days or so, stored by TLD or first octet of a numeric URI:
KBytes TLD or first octet of numeric address
====== =====================================
7 140
7 163
7 196
7 199
34 200
13 202
8 203
7 204
37 205
25 207
3 208
14 209
1 210
67 211
19 213
7 216
7 217
31 218
41 219
7 220
13 24
11 61
7 63
31 64
27 66
13 68
33 69
13 80
7 82
1 ae
1 an
5 ar
5 aspa
9 au
5 be
5550 biz
5 bogeyme
60 br
5 bz
1 ca
38 cc
3 celer
9 ch
21 cl
57 cn
7653 com
57 de
5 edu
9 es
3 f
17 fr
5 gg
9 gr
3 grand
9 hk
1 hostingp
11 il
3 imabigpimp
5 in
5798 info
21 it
9 jp
25 kr
5 mx
5 name
946 net
21 nl
5 no
3 nort
5 nu
305 org
5 pe
75 ph
1 pl
9 pt
21 ro
51 ru
5 se
1 sg
1 sk
1 st
1 st1
3 tabletswh
29 tc
5 thesed
5 tk
11 to
5 tr
51 tv
50 tw
9 ua
32 uk
1880 us
5 whole
69 ws
13 za
Looks like .com is the top spam site TLD reported to SpamCop,
followed by .info and .biz, then .us. And 211 is the top first
octet among numeric URIs.
The obviously wrong TLDs like "grand" and "tabletswh" are either
sloppy URIs or an attempt to take advantage of the implicit .com
that some browsers apparently add when no TLD is specified in
a URI. If the latter, it could be an attempt to get around
message body scanning: sort of "obfuscation by underspecification".
We could counter this by appending ".com" before processing any
domain lacking a legitimate-looking TLD.
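As a rough sketch of that countermeasure (not what SURBL currently does), something like the following could run before a domain is looked up. The function name normalize_host and the short KNOWN_TLDS set are made up for illustration; a real list would need every valid TLD.

```python
# Illustrative subset only (an assumption); a real list would hold every valid TLD.
KNOWN_TLDS = {"com", "net", "org", "info", "biz", "us", "edu", "name",
              "uk", "de", "ws", "tv", "cc"}

def normalize_host(host, known_tlds=KNOWN_TLDS):
    """Append ".com" when the host lacks a legitimate-looking TLD."""
    host = host.strip().strip(".").lower()
    if not host:
        return host
    labels = host.split(".")
    # Leave dotted-quad numeric addresses alone.
    if len(labels) == 4 and all(label.isdigit() for label in labels):
        return host
    # A recognized final label means the host is already fully specified.
    if labels[-1] in known_tlds:
        return host
    # Otherwise assume the browser would have supplied an implicit .com.
    return host + ".com"
```

With this, "grand" would be processed as "grand.com", while "example.info" and a numeric host like "211.1.2.3" pass through unchanged.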
Individual record lines vary in size somewhat, so something like a
record count (line count) would be a more accurate way to measure
the number of minute-unique spam reports, but as a general
estimate of reported activity, directory size is probably pretty good.
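A line count per TLD could be gathered with something like the sketch below. It assumes the first directory level under the data directory is the TLD or first octet, and that each file below it holds one report per line; the function name count_records is made up for illustration.

```python
import os
from collections import Counter

def count_records(root):
    """Count lines in every file under root, grouped by first-level directory."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        if rel == os.curdir:
            continue  # ignore files sitting directly in root
        # First path component is assumed to be the TLD or first octet.
        top = rel.split(os.sep)[0]
        for name in filenames:
            with open(os.path.join(dirpath, name), "rb") as fh:
                counts[top] += sum(1 for _ in fh)
    return counts
```

Calling count_records("domains").most_common(10) would then list the ten busiest TLDs or octets by report count rather than by kilobytes.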
Source data is the "domains" directory SURBL uses as a text
database of reports, stored into a tree of domain levels:
http://spamcheck.freeapp.net/domains/
Hope this kind of info is not too redundant; I'm new here...
Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org-nospam
http://sc.surbl.org/