Mailing List Archive

sa-learn on an Exchange public folder
Hello all.

I've set up SA at $WORK and now want to train the Bayesian classifier.
To that end, a public folder has been set up on our Exchange server and
I want to run sa-learn on any email that is transferred to it.

I'm guessing this is a popular thing to do and that there would already
be a wrapper around sa-learn on GitHub, but my Google-fu seems to be
off today.

Is there such a wrapper or do I have to write my own script?

Regards,
Emmanuel
Re: sa-learn on an Exchange public folder
Emmanuel Seyman wrote:
>
> Hello all.
>
> I've set up SA at $WORK and now want to train the Bayesian classifier.
> To that end, a public folder has been set up on our Exchange server and
> I want to run sa-learn on any email that is transferred to it.
>
> I'm guessing this is a popular thing to do and that there would already
> be a wrapper around sa-learn on GitHub, but my Google-fu seems to be
> off today.
>
> Is there such a wrapper or do I have to write my own script?

Have a look at http://deepnet.cx/~kdeugau/spamtools/imap-learn. It
looks like the original script that I mangled to create it has moved
to https://dmzs.com/tools/files/spam.php.

Fair warning, I gave up on using IMAP for feeding Bayes locally because
it started to glitch out and fail for no reason I could see. But the
mailboxes I'm learning from are maildir on a *nix platform, not whatever
black box Exchange hides things in.
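
For anyone who would rather write their own, the general idea (fetch
each message over IMAP, pipe it to sa-learn) can be sketched in Python
roughly as below. This is a minimal sketch, not the imap-learn script
itself. It assumes IMAP is enabled on the server and that the folder is
reachable over it; the host, account, folder name and the function names
are all made up for illustration.

```python
import imaplib
import subprocess


def learn_message(raw, mode="--spam"):
    """Pipe one raw RFC 822 message (bytes) to sa-learn on stdin."""
    return subprocess.run(["sa-learn", mode], input=raw).returncode


def learn_folder(conn, folder, learn=learn_message):
    """Feed every message in `folder` to `learn`; return how many were fed."""
    status, _ = conn.select(folder, readonly=True)  # read-only: leave flags alone
    if status != "OK":
        raise RuntimeError(f"cannot select {folder!r}")
    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("search failed")
    count = 0
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        if status != "OK":
            continue  # skip messages the server refuses to hand over
        # parts[0] is a (response header, raw message bytes) tuple
        learn(parts[0][1])
        count += 1
    return count


def main():
    # Hypothetical host, credentials and folder name; adjust for your site.
    with imaplib.IMAP4_SSL("mail.example.com") as conn:
        conn.login("user@example.com", "TopSecret")
        print(learn_folder(conn, "Spam"), "messages fed to sa-learn")
```

Calling main() from cron would be the obvious way to run it. Passing
the learner in as a parameter keeps the IMAP loop testable without a
live server or a Bayes database.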

-kgd
Re: sa-learn on an Exchange public folder
Kris Deugau wrote on 2023-12-04 18:23:

> Fair warning, I gave up on using IMAP for feeding Bayes locally because
> it started to glitch out and fail for no reason I could see. But the
> mailboxes I'm learning from are maildir on a *nix platform, not
> whatever black box Exchange hides things in.

+1

https://gitlab.com/isbg/isbg (I just prefer Python :)

That said, Dovecot Sieve via Roundcube integration might kill Exchange
once and for all.
Re: sa-learn on an Exchange public folder
On 2023-12-03 at 14:58:36 UTC-0500 (Sun, 3 Dec 2023 20:58:36 +0100)
Emmanuel Seyman <emmanuel@seyman.fr>
is rumored to have said:

> Hello all.
>
> I've set up SA at $WORK and now want to train the Bayesian classifier.
> To that end, a public folder has been set up on our Exchange server and
> I want to run sa-learn on any email that is transferred to it.
>
> I'm guessing this is a popular thing to do and that there would already
> be a wrapper around sa-learn on GitHub, but my Google-fu seems to be
> off today.
>
> Is there such a wrapper or do I have to write my own script?


I am aware of no such script. The overwhelming majority of sites using
SA use operating systems other than Windows and mail servers using open
format standards like mbox and Maildir. Last I knew, Exchange folders
were binary blobs in a format (PST?) that MS either does not document or
documents poorly, but that could be a decade or more out of date.

SpamAssassin understands the standard format of Internet mail messages
as defined in RFC822 and its successors. It also understands a few
simple ways that RFC822 messages are packaged together (mbox, mbx,
bsmtp) but Exchange only uses that format for sending mail over the
Internet, while it uses its own proprietary formats internally.
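
To illustrate the packaging point: once the messages have been
extracted as raw RFC822 text by whatever means, they can be written into
an mbox file and handed to sa-learn in one run. The following is only a
sketch under that assumption; the function names and the file path are
invented for the example, and getting the raw messages out of Exchange
is the part it does not solve.

```python
import mailbox
import subprocess


def pack_mbox(raw_messages, path):
    """Append raw RFC 822 messages (bytes) to an mbox file; return the count."""
    box = mailbox.mbox(path)  # created if it does not exist yet
    box.lock()
    count = 0
    try:
        for raw in raw_messages:
            box.add(raw)
            count += 1
    finally:
        box.close()  # flushes, unlocks and closes the file
    return count


def learn_mbox(path, mode="--spam"):
    """Run sa-learn over a whole mbox file; return sa-learn's exit code."""
    return subprocess.run(["sa-learn", mode, "--mbox", path]).returncode
```

For example, pack_mbox(messages, "/tmp/spam.mbox") followed by
learn_mbox("/tmp/spam.mbox") would train on the whole batch as spam.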

--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: sa-learn on an Exchange public folder
Is there any argument against just fetching the messages from said
folder (e.g. with getmail or fetchmail) and feeding them to sa-learn?
With getmail, at least, one can define a filter section that calls
sa-learn and passes it each message for learning. I use a getmail
config like this:

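# Fetch messages from the IMAP folder that holds the spam samples
[retriever]
type = SimpleIMAPRetriever
server = MyServer
port = 143
username = user@example.com
password = TopSecret
timeout = 180
mailboxes = ("INBOX.Spam",)

# Pipe each retrieved message to sa-learn on stdin and learn it as spam
[filter-salearn]
type = Filter_classifier
path = /usr/bin/sa-learn
arguments = ("--spam",)
user = spamassassin
group = spamassassin
ignore_stderr = True

# Keep a copy of every retrieved message in a local Maildir
[destination]
type = Maildir
path = /sa-learn/spam/
user = spamassassin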


Have a good one

tobi

On 04/12/2023 18:23, Kris Deugau wrote:
> Emmanuel Seyman wrote:
>>
>> Hello all.
>>
>> I've set up SA at $WORK and now want to train the Bayesian classifier.
>> To that end, a public folder has been set up on our Exchange server and
>> I want to run sa-learn on any email that is transferred to it.
>>
>> I'm guessing this is a popular thing to do and that there would already
>> be a wrapper around sa-learn on GitHub, but my Google-fu seems to be
>> off today.
>>
>> Is there such a wrapper or do I have to write my own script?
>
> Have a look at http://deepnet.cx/~kdeugau/spamtools/imap-learn. It
> looks like the original script that I mangled to create it has moved
> to https://dmzs.com/tools/files/spam.php.
>
> Fair warning, I gave up on using IMAP for feeding Bayes locally
> because it started to glitch out and fail for no reason I could see. 
> But the mailboxes I'm learning from are maildir on a *nix platform,
> not whatever black box Exchange hides things in.
>
> -kgd