svn commit: rev 20548 - in incubator/spamassassin/trunk: lib/Mail/SpamAssassin rules
Author: quinlan
Date: Fri May 28 12:39:04 2004
New Revision: 20548

Modified:
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Conf.pm
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Dns.pm
incubator/spamassassin/trunk/rules/20_dnsbl_tests.cf
incubator/spamassassin/trunk/rules/70_testing.cf
Log:
revise and update the DNSBL documentation
change SenderBase tests to be prefixed with "sb:" instead of relying on
  the set name to distinguish them
lower magnitude cut-off for SB_NSP_VOLUME_SPIKE based on test rule results
  and remove those test rules


Modified: incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Conf.pm
==============================================================================
--- incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Conf.pm (original)
+++ incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Conf.pm Fri May 28 12:39:04 2004
@@ -2128,50 +2128,62 @@
a method on the C<Mail::SpamAssassin::EvalTests> object. C<arguments>
are optional arguments to the function call.

-=item header SYMBOLIC_TEST_NAME eval:check_rbl('set', 'zone')
+=item header SYMBOLIC_TEST_NAME eval:check_rbl('set', 'zone' [, 'sub-test'])

-Check a DNSBL (DNS blacklist), also known as RBLs (realtime blacklists). This
-will retrieve Received headers from the mail, parse the IP addresses, select
-which ones are 'untrusted' based on the C<trusted_networks> logic, and query
-that blacklist. There's a few things to note:
+Check a DNSBL (a DNS blacklist or whitelist). This will retrieve Received:
+headers from the message, extract the IP addresses, select which ones are
+'untrusted' based on the C<trusted_networks> logic, and query that DNSBL
+zone. There are a few things to note:

=over 4

-=item Duplicated or reserved IPs
+=item duplicated or reserved IPs

-These are stripped, and the DNSBLs will not be queried for them. Reserved IPs
-are those listed in <http://www.iana.org/assignments/ipv4-address-space>,
-<http://duxcw.com/faq/network/privip.htm>, or
-<http://duxcw.com/faq/network/autoip.htm>.
+Duplicated IPs are only queried once and reserved IPs are not queried.
+Reserved IPs are those listed in
+<http://www.iana.org/assignments/ipv4-address-space>,
+<http://duxcw.com/faq/network/privip.htm>,
+<http://duxcw.com/faq/network/autoip.htm>, or
+<ftp://ftp.rfc-editor.org/in-notes/rfc3330.txt>

-=item The first argument, 'set'
+=item the 'set' argument

-This is used as a 'zone ID'. If you want to look up a multi-meaning zone like
-relays.osirusoft.com, you can then query the results from that zone using it;
+This is used as a 'zone ID'. If you want to look up a multiple-meaning zone
+like NJABL or SORBS, you can then query the results from that zone using it;
but all check_rbl_sub() calls must use that zone ID.

-Also, if an IP gets a hit in one lookup in a zone using that ID, any further
-hits in other rules using that zone ID will *not* be added to the score.
+Also, if more than one IP address gets a DNSBL hit for a particular rule, it
+does not affect the score because rules only trigger once per message.

-=item Selecting all IPs except for the originating one
+=item the 'zone' argument

-This is accomplished by naming the set 'foo-notfirsthop'. Useful for querying
-against DNS lists which list dialup IP addresses; the first hop may be a
-dialup, but as long as there is at least one more hop, via their outgoing
-SMTP server, that's legitimate, and so should not gain points. If there
-is only one hop, that will be queried anyway, as it should be relaying
-via its outgoing SMTP server instead of sending directly to your MX.
+This is the root zone of the DNSBL, ending in a period.

-=item Selecting IPs by whether they are trusted
+=item the 'sub-test' argument
+
+This optional argument behaves the same as the sub-test argument in
+C<check_rbl_sub()> below.
+
+=item selecting all IPs except for the originating one
+
+This is accomplished by placing '-notfirsthop' at the end of the set name.
+This is useful for querying against DNS lists which list dialup IP
+addresses; the first hop may be a dialup, but as long as there is at least
+one more hop, via their outgoing SMTP server, that's legitimate, and so
+should not gain points. If there is only one hop, that will be queried
+anyway, as it should be relaying via its outgoing SMTP server instead of
+sending directly to your MX (mail exchange).
+
+=item selecting IPs by whether they are trusted

When checking a 'nice' DNSBL (a DNS whitelist), you cannot trust the IP
-addresses in Received headers that were not added by trusted relays. To test
-the first IP address that can be trusted, name the set 'foo-firsttrusted'.
-That should test the IP address of the relay that connected to the most remote
-trusted relay.
+addresses in Received headers that were not added by trusted relays. To
+test the first IP address that can be trusted, place '-firsttrusted' at the
+end of the set name. That should test the IP address of the relay that
+connected to the most remote trusted relay.

-In addition, you can test all untrusted IP addresses by naming the set
-'foo-untrusted'.
+In addition, you can test all untrusted IP addresses by placing '-untrusted'
+at the end of the set name.

Note that this requires that SpamAssassin know which relays are trusted. For
simple cases, SpamAssassin can make a good estimate. For complex cases, you
@@ -2192,7 +2204,12 @@
using the zone ID from the original query. The sub-test may either be an
IPv4 dotted address for RBLs that return multiple A records or a
non-negative decimal number to specify a bitmask for RBLs that return a
-single A record containing a bitmask of results.
+single A record containing a bitmask of results, a SenderBase test
+beginning with "sb:", or (if none of the preceding options seem to fit) a
+regular expression.
+
+Note: the set name must be exactly the same as for the main query rule,
+including selections like '-notfirsthop' appearing at the end of the set name.

=cut
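
For readers following the sub-test forms listed in the POD above (IPv4
dotted address, decimal bitmask, "sb:" SenderBase test, regular expression),
here is a minimal Perl sketch of how one might tell them apart and apply the
simple ones to a returned A record. It is not the code in Dns.pm: the helper
names classify_subtest, ip_to_int and subtest_matches are hypothetical, and
both the order of the checks and the choice to apply the bitmask to the whole
32-bit address are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical classifier for the sub-test forms described in the POD.
# The order of the checks is an assumption, not taken from Dns.pm.
sub classify_subtest {
    my ($subtest) = @_;
    return 'ipv4'       if $subtest =~ /^\d{1,3}(?:\.\d{1,3}){3}$/;
    return 'senderbase' if $subtest =~ /^sb:/;
    return 'bitmask'    if $subtest =~ /^\d+$/;
    return 'regex';     # fallback when nothing else fits
}

# Convert a dotted IPv4 address to a 32-bit integer (big-endian).
sub ip_to_int {
    my ($ip) = @_;
    return unpack('N', pack('C4', split(/\./, $ip)));
}

# Apply an IPv4, bitmask or regex sub-test to an A record string.
# SenderBase sub-tests work on TXT data and are not handled here.
sub subtest_matches {
    my ($subtest, $rdata) = @_;
    my $kind = classify_subtest($subtest);
    return $rdata eq $subtest ? 1 : 0                        if $kind eq 'ipv4';
    return (ip_to_int($rdata) & int($subtest)) ? 1 : 0       if $kind eq 'bitmask';
    return $rdata =~ /$subtest/ ? 1 : 0                      if $kind eq 'regex';
    return undef;
}

print classify_subtest('127.0.0.2'), "\n";       # ipv4
print classify_subtest('4'), "\n";               # bitmask
print subtest_matches('4', '127.0.0.6'), "\n";   # 1 (6 has the 4 bit set)
```

The bitmask form suits zones that encode several listings in one A record,
while the dotted form suits zones that return one A record per listing.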


Modified: incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Dns.pm
==============================================================================
--- incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Dns.pm (original)
+++ incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Dns.pm Fri May 28 12:39:04 2004
@@ -206,7 +206,7 @@
$self->dnsbl_hit($rule, $question, $answer);
}
# senderbase
- elsif ($set =~ /^senderbase/) {
+ elsif ($subtest =~ s/^sb://) {
$rdatastr =~ s/^"?\d+-//;
$rdatastr =~ s/"$//;
my %sb = ($rdatastr =~ m/(?:^|\|)(\d+)=([^|]+)/g);
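
The prefix test and the record parsing in the SenderBase branch above can be
tried in isolation. The sketch below wraps them in two hypothetical helpers,
strip_sb_prefix and parse_senderbase_txt. The sample TXT string is made up to
fit the regular expressions in the hunk and is not a real SenderBase
response.

```perl
use strict;
use warnings;

# s/^sb:// both tests for the prefix and removes it in one step.
# Returns the sub-test without the prefix, or undef if it has none.
sub strip_sb_prefix {
    my ($subtest) = @_;
    return $subtest =~ s/^sb:// ? $subtest : undef;
}

# Same three statements as in the hunk: drop a leading quote and numeric
# "N-" prefix, drop the trailing quote, then split "key=value|key=value"
# pairs into a hash keyed by the numeric field ID.
sub parse_senderbase_txt {
    my ($rdatastr) = @_;
    $rdatastr =~ s/^"?\d+-//;
    $rdatastr =~ s/"$//;
    my %sb = ($rdatastr =~ m/(?:^|\|)(\d+)=([^|]+)/g);
    return \%sb;
}

# Hypothetical record, shaped only to match the patterns above.
my $sb = parse_senderbase_txt('"0-5=NSP|40=4.5|41=4.1"');
print join(', ', map { "$_=$sb->{$_}" } sort { $a <=> $b } keys %$sb), "\n";
```

Keying the branch on the sub-test prefix rather than on the set name means
the set can be called anything, which is why the rules file can shorten
'senderbase' to 'sb'.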

Modified: incubator/spamassassin/trunk/rules/20_dnsbl_tests.cf
==============================================================================
--- incubator/spamassassin/trunk/rules/20_dnsbl_tests.cf (original)
+++ incubator/spamassassin/trunk/rules/20_dnsbl_tests.cf Fri May 28 12:39:04 2004
@@ -184,12 +184,12 @@
# http://www.senderbase.org/dnsresponses.html
# sa.senderbase.org for SpamAssassin queries
# query.senderbase.org for other queries
-header __SENDERBASE eval:check_rbl_txt('senderbase', 'sa.senderbase.org.')
+header __SENDERBASE eval:check_rbl_txt('sb', 'sa.senderbase.org.')
tflags __SENDERBASE net

# S23 = domain daily magnitude
# S25 = date of first message from this domain
-header SB_NEW_BULK eval:check_rbl_sub('senderbase', 'S23 > 6.2 && (time - S25 < 120*86400)')
+header SB_NEW_BULK eval:check_rbl_sub('sb', 'sb:S23 > 6.2 && (time - S25 < 120*86400)')
describe SB_NEW_BULK Sender domain is new and very high volume
tflags SB_NEW_BULK net

@@ -197,14 +197,14 @@
# S40 = IP daily magnitude
# S41 = IP monthly magnitude
# note: accounting for rounding, "> 0.3" means at least a 59% volume spike
-header SB_NSP_VOLUME_SPIKE eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.2 && S40 - S41 > 0.3')
+header SB_NSP_VOLUME_SPIKE eval:check_rbl_sub('sb', 'sb:S5 =~ /NSP/ && S41 > 3.8 && S40 - S41 > 0.3')
describe SB_NSP_VOLUME_SPIKE Sender IP hosted at NSP has a volume spike
tflags SB_NSP_VOLUME_SPIKE net

# S2 = organization daily magnitude
# S9 = IP addresses used by this organization
# note: this rule does not work as well on older mail
-header SB_HIGH_VOLUME_PER_IP eval:check_rbl_sub('senderbase', '(S2 / S9) > 5.00')
+header SB_HIGH_VOLUME_PER_IP eval:check_rbl_sub('sb', 'sb:(S2 / S9) > 5.00')
describe SB_HIGH_VOLUME_PER_IP Sender organization has high volume per IP
tflags SB_HIGH_VOLUME_PER_IP net
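
This diff does not show how a sub-test such as '(S2 / S9) > 5.00' is
evaluated once the "sb:" prefix has been removed. One plausible approach,
sketched here as an assumption rather than as the Dns.pm implementation, is
to replace each S<N> token with a lookup into the parsed field hash and run
the result through a string eval. The helper name eval_sb_subtest and all
field values are hypothetical.

```perl
use strict;
use warnings;

# Evaluate an "sb:" sub-test against a hash of SenderBase fields.
# Returns 1 on a match, 0 on no match, and undef if the sub-test lacks
# the "sb:" prefix or refers to a field that is not in the hash.
sub eval_sb_subtest {
    my ($subtest, $sb) = @_;
    return undef unless $subtest =~ s/^sb://;

    # Every S<N> token must have a value before we evaluate anything.
    my @wanted = $subtest =~ /\bS(\d+)\b/g;
    return undef if grep { !defined $sb->{$_} } @wanted;

    # Rewrite S<N> as $sb->{N}; \b keeps the S in /NSP/ untouched.
    (my $expr = $subtest) =~ s/\bS(\d+)\b/\$sb->{$1}/g;

    # String eval: acceptable only because rule text is trusted input.
    my $result = eval $expr;
    return $result ? 1 : 0;
}

# SB_HIGH_VOLUME_PER_IP with made-up field values.
print eval_sb_subtest('sb:(S2 / S9) > 5.00', { 2 => 6.1, 9 => 1 }), "\n";  # 1
```

Because the expression is executed as Perl, a scheme like this should only
ever be fed rule text from a trusted configuration source.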


Modified: incubator/spamassassin/trunk/rules/70_testing.cf
==============================================================================
--- incubator/spamassassin/trunk/rules/70_testing.cf (original)
+++ incubator/spamassassin/trunk/rules/70_testing.cf Fri May 28 12:39:04 2004
@@ -71,50 +71,6 @@

########################################################################

-# possible replacements for SB_NSP_VOLUME_SPIKE
-# accounting for rounding, "> 0.3" means at least a 59% volume spike
-header T_NSP_S41_38_03 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 3.8 && S40 - S41 > 0.3')
-header T_NSP_S41_39_03 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 3.9 && S40 - S41 > 0.3')
-header T_NSP_S41_40_03 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.0 && S40 - S41 > 0.3')
-header T_NSP_S41_41_03 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.1 && S40 - S41 > 0.3')
-header T_NSP_S41_42_03 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.2 && S40 - S41 > 0.3')
-header T_NSP_S41_43_03 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.3 && S40 - S41 > 0.3')
-header T_NSP_S41_44_03 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.4 && S40 - S41 > 0.3')
-header T_NSP_S41_45_03 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.5 && S40 - S41 > 0.3')
-header T_NSP_S41_46_03 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.6 && S40 - S41 > 0.3')
-
-# accounting for rounding, "> 0.4" means at least a 251% volume spike
-header T_NSP_S41_38_04 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 3.8 && S40 - S41 > 0.4')
-header T_NSP_S41_39_04 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 3.9 && S40 - S41 > 0.4')
-header T_NSP_S41_40_04 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.0 && S40 - S41 > 0.4')
-header T_NSP_S41_41_04 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.1 && S40 - S41 > 0.4')
-header T_NSP_S41_42_04 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.2 && S40 - S41 > 0.4')
-header T_NSP_S41_43_04 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.3 && S40 - S41 > 0.4')
-header T_NSP_S41_44_04 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.4 && S40 - S41 > 0.4')
-header T_NSP_S41_45_04 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.5 && S40 - S41 > 0.4')
-header T_NSP_S41_46_04 eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.6 && S40 - S41 > 0.4')
-
-tflags T_NSP_S41_38_03 net
-tflags T_NSP_S41_39_03 net
-tflags T_NSP_S41_40_03 net
-tflags T_NSP_S41_41_03 net
-tflags T_NSP_S41_42_03 net
-tflags T_NSP_S41_43_03 net
-tflags T_NSP_S41_44_03 net
-tflags T_NSP_S41_45_03 net
-tflags T_NSP_S41_46_03 net
-tflags T_NSP_S41_38_04 net
-tflags T_NSP_S41_39_04 net
-tflags T_NSP_S41_40_04 net
-tflags T_NSP_S41_41_04 net
-tflags T_NSP_S41_42_04 net
-tflags T_NSP_S41_43_04 net
-tflags T_NSP_S41_44_04 net
-tflags T_NSP_S41_45_04 net
-tflags T_NSP_S41_46_04 net
-
-########################################################################
-
# let's see how this works
meta T_SPF_PASS_NO_SBL (SPF_PASS && !RCVD_IN_SBL)
meta T_SPF_HELO_PASS_NO_SBL (SPF_HELO_PASS && !RCVD_IN_SBL)