[Spamassassin Wiki] Update of "RescoreMassCheck" by JustinMason
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Spamassassin Wiki" for change notification.

The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/RescoreMassCheck

The comment on the change is:
this page needs fixage

------------------------------------------------------------------------------
= Procedure =

Here's the process for generating the score-set.
+
+ TODO: this is no longer accurate -- iirc we can do all mass-checks in one sitting. Daniel, can you update this?

== 1. heads-up ==

@@ -83, +85 @@


See RunningPerceptron.

-
[Spamassassin Wiki] Update of "RescoreMassCheck" by JustinMason

The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/RescoreMassCheck

------------------------------------------------------------------------------

== 1. heads-up ==

- Inform everyone in advance on the -users and -dev lists that we will be starting mass-checks shortly, and they should get their corpora nice and clean.
+ Inform everyone in advance on the -users and -dev lists that we will be starting mass-checks shortly, and they should get their corpora nice and clean (see CorpusCleaning).

== 2. announce mass-check ==
[Spamassassin Wiki] Update of "RescoreMassCheck" by JustinMason

The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/RescoreMassCheck

------------------------------------------------------------------------------

== 1. heads-up ==

- Inform everyone in advance on the -users and -dev lists that we will be starting mass-checks shortly, and they should get their corpora nice and clean (see CorpusCleaning).
+ Inform everyone in advance on the -users and -dev lists that we will be starting mass-checks shortly, and they should get their corpora nice and clean (see CorpusCleaning) and sign up for RsyncAccounts.

== 2. announce mass-check ==
[Spamassassin Wiki] Update of "RescoreMassCheck" by JustinMason

The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/RescoreMassCheck

------------------------------------------------------------------------------

== 2. announce mass-check ==

+ RescoreDetails is the full announcement text (and instructions) for this phase.
- See MassCheck. The mass-check for both scoresets can be done in one command, e.g.
-
- {{{
- cd masses
- mkdir -p spamassassin
- rm -f spamassassin/*
-
- cat > spamassassin/user_prefs
- bayes_auto_learn 0
- lock_method flock
- bayes_store_module Mail::SpamAssassin::BayesStore::SDBM
- use_auto_whitelist 0
- [hit Control-D]
-
- mass-check -j 4 --bayes --net --restart=400 --learn=35 --reuse [all targets]
-
- (note if a --after flag is part of the announcement, please add it as well)
- }}}
-
- Here's the full announcement text for this phase: RescoreDetails

We then take the log files rsync'd up to the server and use those logs for all 4 score sets. The initial logs are for score set 3 (the fourth); sets 0, 1, and 2 can be generated from set 3 by stripping out the network tests and/or the Bayes tests.
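
The score-set derivation described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual tooling: it assumes the usual numbering (set 0 = neither, set 1 = network only, set 2 = Bayes only, set 3 = both), assumes a simplified log line of the form `class score path rule1,rule2,...`, and assumes Bayes rules can be recognised by a `BAYES_` prefix. The rule names, the `net_rules` set, and both function names are hypothetical. The score field is left untouched, so it would still need recomputing for the derived set.

```python
# Sketch only: the log-line layout and the rule names below are assumptions
# made for illustration, not the exact mass-check log format.

# Assumed score-set numbering: bit 0 = network tests on, bit 1 = Bayes on.
NET_BIT = 1
BAYES_BIT = 2


def strip_rules(hits, scoreset, net_rules, bayes_prefix="BAYES_"):
    """Return the rule hits that would remain for the given score set."""
    keep = []
    for rule in hits:
        if not scoreset & NET_BIT and rule in net_rules:
            continue  # network tests are disabled in sets 0 and 2
        if not scoreset & BAYES_BIT and rule.startswith(bayes_prefix):
            continue  # Bayes tests are disabled in sets 0 and 1
        keep.append(rule)
    return keep


def derive_line(line, scoreset, net_rules):
    """Rewrite one log line, assumed to be 'class score path rule1,rule2,...'."""
    fields = line.split(" ")
    if line.startswith("#") or len(fields) < 4:
        return line  # pass comments and unrecognised lines through untouched
    hits = fields[3].split(",") if fields[3] else []
    fields[3] = ",".join(strip_rules(hits, scoreset, net_rules))
    return " ".join(fields)


if __name__ == "__main__":
    net_rules = {"EXAMPLE_NET_RULE"}  # hypothetical rule name
    line = "Y 12 /corpus/spam/1 BAYES_99,EXAMPLE_NET_RULE,EXAMPLE_BODY_RULE"
    for scoreset in range(4):
        print(scoreset, derive_line(line, scoreset, net_rules))
```

Running the script prints the same example line once per score set, which makes it easy to see which hits drop out when network and/or Bayes tests are stripped.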
[Spamassassin Wiki] Update of "RescoreMassCheck" by JustinMason

The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/RescoreMassCheck

The comment on the change is:
take a note for future ref

------------------------------------------------------------------------------

== 2. announce mass-check ==

- RescoreDetails is the full announcement text (and instructions) for this phase.
+ RescoreDetails is the full announcement text (and instructions) for this phase. It's sufficient to send out a mail like the one we used for 3.1.0:
+
+ {{{
+ To: users
+ Cc: dev
+ Subject: NOTICE: 3.1.0 rescoring mass-checks
+
+ OK, if you're planning to send us mass-check logs for the 3.1.0
+ rescoring, now's the time!
+
+ http://wiki.apache.org/spamassassin/RescoreDetails has all the
+ details.
+
+ cheers!
+
+ --j.
+ }}}

We then take the log files rsync'd up to the server and use those logs for all 4 score sets. The initial logs are for score set 3 (the fourth); sets 0, 1, and 2 can be generated from set 3 by stripping out the network tests and/or the Bayes tests.
[Spamassassin Wiki] Update of "RescoreMassCheck" by JustinMason

The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/RescoreMassCheck

The comment on the change is:
add note about rule enabling, so we don't forget that again

------------------------------------------------------------------------------
== 1. heads-up ==

Inform everyone in advance on the -users and -dev lists that we will be starting mass-checks shortly, and they should get their corpora nice and clean (see CorpusCleaning) and sign up for RsyncAccounts.
+
+ Enable all rules using the helper script:
+ {{{
+ masses/enable-all-evolved-rules < rules/50_scores.cf \
+ > rules/51_newscores.cf
+ mv rules/51_newscores.cf rules/50_scores.cf
+ svn diff [and ensure it looks sane]
+ svn commit [or create a new bug attachment for review if in R-T-C mode]
+ }}}
+
+ Build a prerelease tarball using {{{build/update_stable}}}. See {{{build/README}}} for details on the build process.

== 2. announce mass-check ==