Mailing List Archive

Re: [SAdev] Bayesian Information - How does it work?
On Thu, Jan 01, 2004 at 09:21:25AM -0800, Marc Perkel wrote:
> Is there any info on how Bayes is implemented in SA? Where is the source
> code file that controls it? I'm trying to understand how it works and
> what tokens in the dump mean.

1) Don't post to spamassassin-devel anymore.
2) lib/Mail/SpamAssassin/Bayes.pm is the front-end code.
lib/Mail/SpamAssassin/BayesStore.pm is all the back-end (i.e., storage)
related code.

The non-magic tokens are just straight tokens. A few have been compressed
(H*, etc.); those are visible in the Bayes.pm module.

The magic tokens are used for enabling Bayes (# of ham/spam) and mostly
for expiry (oldest token atime, atime delta info, # of tokens, etc.).
Newest token atime actually isn't used anywhere yet. I added it to
the DB figuring that if we want to change the expiry algorithm in the
future, we may need the newest token atime to work out deltas, and this
way we won't have to redo the DB again.
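
To make the split between straight tokens and magic tokens a bit more
concrete, here is a rough sketch in Python (not the actual Perl from
BayesStore.pm). Every name in it (Token, Store, MIN_HAM, MIN_SPAM, expire)
is made up for illustration, the thresholds of 200 are an assumption, and
the expiry rule is a plain "drop anything older than max_age" stand-in
rather than the real algorithm. The real DB stores the magic values; the
sketch derives them on the fly to stay short.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds: Bayes is only used once enough ham and spam
# messages have been learned. The value 200 is an assumption.
MIN_HAM = 200
MIN_SPAM = 200


@dataclass
class Token:
    """A straight (non-magic) token: per-token counts plus last access time."""
    spam_count: int
    ham_count: int
    atime: int  # last access time, seconds since the epoch


@dataclass
class Store:
    """Toy stand-in for the token DB; not the BayesStore.pm schema."""
    tokens: dict = field(default_factory=dict)
    nspam: int = 0  # number of spam messages learned
    nham: int = 0   # number of ham messages learned

    def magic(self):
        """Return the bookkeeping values the post calls magic tokens."""
        atimes = [t.atime for t in self.tokens.values()]
        return {
            "nspam": self.nspam,
            "nham": self.nham,
            "ntokens": len(self.tokens),
            "oldest_atime": min(atimes, default=0),
            "newest_atime": max(atimes, default=0),  # kept, but unused
        }

    def bayes_enabled(self):
        """Bayes enabling: both message counts must reach their minimum."""
        return self.nham >= MIN_HAM and self.nspam >= MIN_SPAM

    def expire(self, now, max_age):
        """Drop tokens not accessed within max_age seconds; return how many."""
        cutoff = now - max_age
        before = len(self.tokens)
        self.tokens = {
            name: tok for name, tok in self.tokens.items()
            if tok.atime >= cutoff
        }
        return before - len(self.tokens)
```

The point of the sketch is only the shape of the data: the per-token
entries carry counts and an atime, while the magic values summarize the
whole DB so that enabling and expiry decisions don't need a full scan.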

--
Randomly Generated Tagline:
"Jab, Jab, Oooh. O(n log n)! Ha! Tail recursion! Thrust! Parry! <BOING>"
- Jim Flanagan