Mailing List Archive

bayes sql postgresql
After using it this way for a long time, I have come to think it could
be better optimized in how Bayes user ids are handled. Currently it
creates a new id each time there is a new user, but it does not reuse
old ids that are no longer in use. After sa-learn --username
foo@example.org, for example, that id is not used anymore, yet the next
new user will always get a higher number, hmm :=)

The next problem I found is that Bayes usernames are case-sensitive in
SQL, so Bar@example.org and bar@example.org are two different Bayes
users :(

bayes_ignore_from and bayes_ignore_to could be extended to know about
local domains, e.g. don't store Bayes data if the envelope from or to
is not in a local domain.

For completeness: I use fuglu 0.10.6 installed on Gentoo with pre-queue
proxy scanning, so I can reject high-scoring spam.

I am asking here first, before opening a ticket, whether it would be a
good idea to make these changes to Bayes.
Re: bayes sql postgresql
On Sat, 18 Jan 2020 13:31:10 +0100
Benny Pedersen wrote:

> After using it this way for a long time, I have come to think it
> could be better optimized in how Bayes user ids are handled.
> Currently it creates a new id each time there is a new user, but it
> does not reuse old ids that are no longer in use. After sa-learn
> --username foo@example.org, for example, that id is not used anymore,
> yet the next new user will always get a higher number, hmm :=)

Do you think you might run out of 32-bit numbers? If it really bothers
you, you could use 64-bit.
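
As a rough back-of-the-envelope check, here is a small sketch of how long it would take to run out. It assumes the id column is a signed 32-bit integer (maximum 2147483647); the rate of new users is a purely hypothetical figure, not measured data.

```python
# Back-of-the-envelope: how long until an integer id column is exhausted?
# Assumption: ids are handed out sequentially and never reused.

MAX_INT32 = 2**31 - 1  # 2147483647, upper bound of a signed 32-bit integer
MAX_INT64 = 2**63 - 1  # upper bound of a signed 64-bit integer


def years_until_exhausted(max_id: int, new_ids_per_day: int) -> float:
    """Return roughly how many years pass before ids above max_id are needed."""
    return max_id / new_ids_per_day / 365.0


if __name__ == "__main__":
    rate = 1000  # hypothetical: 1000 new Bayes users every single day
    print(f"32-bit: {years_until_exhausted(MAX_INT32, rate):.0f} years")
    print(f"64-bit: {years_until_exhausted(MAX_INT64, rate):.3g} years")
```

Even at that made-up rate the 32-bit range lasts for thousands of years, so reusing ids would mostly be a cosmetic change.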

> The next problem I found is that Bayes usernames are case-sensitive
> in SQL, so Bar@example.org and bar@example.org are two different
> Bayes users :(

Domains are case-insensitive, the local-part may or may not be.

I don't think this has anything to do with SpamAssassin; shouldn't this
be handled by whatever is passing these usernames?
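
A minimal sketch of what that could look like in the layer that passes the usernames. The function name and the fold_local_part switch are hypothetical and not part of SpamAssassin; whether folding the local part is safe depends on how the receiving mail system treats it.

```python
def normalize_username(address: str, fold_local_part: bool = True) -> str:
    """Lowercase the domain, and optionally the local part, of an address."""
    address = address.strip()
    # Split on the last "@" so the domain is everything after it.
    local, sep, domain = address.rpartition("@")
    if not sep:
        # No "@" at all: treat the whole string as a plain account name.
        return address.lower() if fold_local_part else address
    if fold_local_part:
        local = local.lower()
    # Domains are case-insensitive, so they are always folded.
    return f"{local}@{domain.lower()}"


if __name__ == "__main__":
    print(normalize_username("Bar@example.org"))
    print(normalize_username("Bar@Example.ORG", fold_local_part=False))
```

The normalized value would then be what gets handed on as the Bayes username, for example to sa-learn --username, so that both spellings end up as the same user.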


> bayes_ignore_from and bayes_ignore_to could be extended to know about
> local domains, e.g. don't store Bayes data if the envelope from or to
> is not in a local domain.

Usually Bayes users are connected to local accounts. It sounds like you
are just passing unvalidated email addresses to SA as virtual users.
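
A validation step of that kind, done before anything is handed to SA, could be sketched roughly as follows. The domain set and function names are hypothetical, and the rule is an assumed reading of the request: store Bayes data only when the envelope sender or the envelope recipient belongs to a local domain.

```python
# Hypothetical set of domains this mail system is responsible for.
LOCAL_DOMAINS = frozenset({"example.org"})


def is_local(address: str, local_domains=LOCAL_DOMAINS) -> bool:
    """Return True if the address has a domain listed in local_domains."""
    _, sep, domain = address.strip().rpartition("@")
    # Addresses without "@" have no domain and are never treated as local here.
    return bool(sep) and domain.lower() in local_domains


def should_store_bayes(env_from: str, env_to: str,
                       local_domains=LOCAL_DOMAINS) -> bool:
    """Store Bayes data only if the envelope sender or recipient is local."""
    return is_local(env_from, local_domains) or is_local(env_to, local_domains)


if __name__ == "__main__":
    print(should_store_bayes("alice@example.org", "bob@elsewhere.test"))
    print(should_store_bayes("x@spam.test", "y@elsewhere.test"))
```

With a check like this in front, addresses outside the local domains never become Bayes users in the first place.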
Re: bayes sql postgresql
Quoting Benny Pedersen <me@junc.eu>:

> After using it this way for a long time, I have come to think it
> could be better optimized in how Bayes user ids are handled.
> Currently it creates a new id each time there is a new user, but it
> does not reuse old ids that are no longer in use. After sa-learn
> --username foo@example.org, for example, that id is not used anymore,
> yet the next new user will always get a higher number, hmm :=)

From my own experience I can only say this: you don't want
SpamAssassin to use a full-blown RDBMS as the store for Bayes tokens,
because it is simply way too slow when learning spam/ham or checking
messages against it.

You also normally don't want a separate store for every user, but a
single global one instead, because most users are dumb and will
typically poison their store rather than improve it.

Instead you should go with BerkeleyDB and one global store only, which
performs much better.