Mailing List Archive

[Bug 7993] Plugin HashBL.pm: allow usage of HBL from Spamhaus
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7993

Henrik Krohns <apache@hege.li> changed:

What             |Removed    |Added
----------------------------------------------------------------------------
Target Milestone |Undefined  |4.0.0
CC               |           |apache@hege.li

--- Comment #1 from Henrik Krohns <apache@hege.li> ---
Is there a Perl/Apache license conflict in directly copying a single function
from CPAN?

I'd hate to require one more obscure module, so we should write our own
Util/encode_base32 from scratch if required. ;-)

Re: [Bug 7993] Plugin HashBL.pm: allow usage of HBL from Spamhaus
On 2022-05-17 07:44, bugzilla-daemon@spamassassin.apache.org wrote:
> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7993
>
> Henrik Krohns <apache@hege.li> changed:
>
> What       |Removed |Added
> ----------------------------------------------------------------------------
> Resolution |---     |FIXED
> Status     |NEW     |RESOLVED
>
> --- Comment #5 from Henrik Krohns <apache@hege.li> ---
> The base-family encoding is really simple when you understand it. I committed
> a super simple unoptimized Util/base32_encode (Revision 1900976).
>
> sha256 option added:
>
> Committed revision 1900977.

Yes, it is really simple and I can understand it now :-), thanks.
However, I would still prefer to be able to use the optimized version in
MIME::Base32 if that module is installed. For us, installation is no
problem; it would just be one statement in our Puppet class for
SpamAssassin to install it on all servers running SpamAssassin.

Ah, I just saw that we still need to use the SH.pm plugin, since
HashBL.pm doesn't support attachment hashes. That's where the
performance of encode_base32 makes the biggest difference. Good thing I
rewrote the whole SH.pm plugin to support caching.

Michael
Re: [Bug 7993] Plugin HashBL.pm: allow usage of HBL from Spamhaus
On Wed, May 18, 2022 at 11:29:40AM +0200, Michael Storz wrote:
>
> Yes, it is really simple and I can understand it now :-), thanks. However, I
> would still prefer to be able to use the optimized version of MIME::Base32
> if this module is installed. For us the installation is no problem, it would
> just be a statement in our puppet class for SpamAssassin to install it on
> all servers with SpamAssassin.
>
> Ah, I just saw that we still need to use the SH.pm plugin, since HashBL.pm
> doesn't support attachment hashes. That's where the performance of
> encode_base32 makes the biggest difference. Good thing I rewrote the whole
> SH.pm plugin to support caching.

But don't all the calls to encode_base32 in SH.pm only encode the result of a
sha256() call? Performance should make no difference, as it's a tiny string.
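
To put a number on "tiny": a SHA-256 digest is always 32 bytes, whatever the size of the hashed input. The sketch below uses Python's standard library rather than the Perl code under discussion, and the input bytes are a made-up placeholder.

```python
import base64
import hashlib

# Hypothetical input; the digest size does not depend on it.
data = b"example attachment body"

digest = hashlib.sha256(data).digest()              # raw digest bytes
encoded = base64.b32encode(digest).decode("ascii")  # RFC 4648 base32, padded

print(len(digest))               # 32 bytes = 256 bits
print(len(encoded.rstrip("=")))  # 52 characters (ceil(256 / 5))
print(len(encoded))              # 56 characters including "=" padding
```

So each encode call only ever handles 32 bytes of input.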

I'd rather spend my time actually implementing the attachment hashes. :-)
Re: [Bug 7993] Plugin HashBL.pm: allow usage of HBL from Spamhaus
On Wed, May 18, 2022 at 12:40:42PM +0300, Henrik K wrote:
> But don't all the calls to encode_base32 in SH.pm only encode the result of a
> sha256() call? Performance should make no difference, as it's a tiny string.
>
> I'd rather spend my time actually implementing the attachment hashes. :-)

Actually, now that I've run a quick benchmark of a million rounds, my lousy
code is _faster_ than MIME::Base32: 10 seconds vs. 17 seconds.

I think mine might use more memory, as it handles the bits as a string,
but it's faster. :-D
Re: [Bug 7993] Plugin HashBL.pm: allow usage of HBL from Spamhaus
On Wed, May 18, 2022 at 12:50:48PM +0300, Henrik K wrote:
> Actually, now that I've run a quick benchmark of a million rounds, my lousy
> code is _faster_ than MIME::Base32: 10 seconds vs. 17 seconds.
>
> I think mine might use more memory, as it handles the bits as a string,
> but it's faster. :-D

And before you question anything: I of course wrote a script that generates
random strings of different sizes and ran a million loops to compare my
output with MIME::Base32's. There were no differences. I didn't bother to do
a separate benchmark at that time.
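
A check of that kind might look roughly like the sketch below. It is written in Python, not Perl, and makes several assumptions: the standard library's `base64.b32encode` stands in for MIME::Base32 as the reference, `base32_encode` here is a stand-in for the code under test, and the names `reference_encode` and `count_mismatches` are hypothetical. Stripping the "=" padding from the reference output is also an assumption about how the two outputs were compared.

```python
import base64
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

def base32_encode(data: bytes) -> str:
    """Unpadded base32 via a bit string (stand-in for the code under test)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 5)  # zero-fill the final 5-bit group
    return "".join(ALPHABET[int(bits[i:i + 5], 2)]
                   for i in range(0, len(bits), 5))

def reference_encode(data: bytes) -> str:
    """Reference output from the standard library, padding stripped."""
    return base64.b32encode(data).decode("ascii").rstrip("=")

def count_mismatches(rounds: int, max_len: int, seed: int = 0) -> int:
    """Encode random byte strings of varying length both ways; count differences."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(rounds):
        length = rng.randint(0, max_len)
        data = bytes(rng.getrandbits(8) for _ in range(length))
        if base32_encode(data) != reference_encode(data):
            mismatches += 1
    return mismatches

print(count_mismatches(rounds=10_000, max_len=64))
```

Varying the length matters because the zero-filled final group is the only place the two encoders could plausibly disagree.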
Re: [Bug 7993] Plugin HashBL.pm: allow usage of HBL from Spamhaus
On 2022-05-18 11:40, Henrik K wrote:
> But don't all the calls to encode_base32 in SH.pm only encode the result of a
> sha256() call? Performance should make no difference, as it's a tiny string.

Oops, you are absolutely right; forget it.

> I'd rather spend my time actually implementing the attachment hashes. :-)

Uhh, that would be great :-)

Michael
Re: [Bug 7993] Plugin HashBL.pm: allow usage of HBL from Spamhaus
On 2022-05-18 11:54, Henrik K wrote:
> On Wed, May 18, 2022 at 12:50:48PM +0300, Henrik K wrote:
>> Actually, now that I've run a quick benchmark of a million rounds, my lousy
>> code is _faster_ than MIME::Base32: 10 seconds vs. 17 seconds.
>>
>> I think mine might use more memory, as it handles the bits as a string,
>> but it's faster. :-D
>
> And before you question anything: I of course wrote a script that generates
> random strings of different sizes and ran a million loops to compare my
> output with MIME::Base32's. There were no differences. I didn't bother to do
> a separate benchmark at that time.

Even better. Add a comment in the code that your version is faster
than the 'official' version.

Michael
Re: [Bug 7993] Plugin HashBL.pm: allow usage of HBL from Spamhaus
On Wed, May 18, 2022 at 12:50:48PM +0300, Henrik K wrote:
>
> I think mine might use more memory, as it handles the bits as a string,
> but it's faster. :-D

Now that I look at MIME::Base32, it does the same unpack "B*" to a string of
bits, so it even uses the same amount of memory. The rest of its code looks
unnecessarily complicated compared to just looking up the bits in a hash
table.
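
The "bit string plus hash table" approach described above can be sketched as follows. This is Python, not the committed Perl, so it illustrates the technique and does not reproduce Util/base32_encode. The names `BITS_TO_CHAR` and `base32_encode` are chosen here for illustration, and leaving off the "=" padding is an assumption of this sketch.

```python
from itertools import product

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # RFC 4648 base32 alphabet

# Lookup table from each 5-character bit string to its base32 character,
# e.g. "00000" -> "A" and "11111" -> "7".
BITS_TO_CHAR = {
    "".join(bits): ALPHABET[index]
    for index, bits in enumerate(product("01", repeat=5))
}

def base32_encode(data: bytes) -> str:
    # Rough equivalent of Perl's unpack("B*", ...): one "0"/"1" char per bit.
    bit_string = "".join(f"{byte:08b}" for byte in data)
    out = []
    for start in range(0, len(bit_string), 5):
        chunk = bit_string[start:start + 5].ljust(5, "0")  # zero-fill the tail
        out.append(BITS_TO_CHAR[chunk])
    return "".join(out)  # no "=" padding in this sketch

print(base32_encode(b"foobar"))  # MZXW6YTBOI (RFC 4648 vector, padding stripped)
```

The bit string costs eight characters per input byte, which is the memory overhead mentioned earlier in the thread. For a 32-byte digest that is only 256 characters.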