Mailing List Archive

Why is my `any` keyword so slow?
While we're all sat twiddling our thumbs waiting for 5.34, I thought
I'd distract everyone with a bit of a performance puzzle.

Attached is a module tarball, not yet on CPAN, though I had hoped it
would be sometime soon. I was intending to make some native keyword
implementations of the List::Util functions - beginning with any and
all.

My plan was to create a keyword that acts exactly like List::Util::any
but is instead implemented in a pair of core ops much like grep is,
meaning it should run nicely fast, yes? Or at least, faster than the
XSUB that is List::Util::any.

Except, according to my benchmark it isn't. It is in fact significantly
slower than List::Util::any. I manage to be faster than grep, which is
something. But it's very much slower than the List::Util version. Here
are numbers from a typical run:

$ perl -Mblib benchmark.pl
                       Rate CORE::grep List::Keywords/any List::Util::any
CORE::grep          50238/s         --               -41%            -78%
List::Keywords/any  84611/s        68%                 --            -63%
List::Util::any    230967/s       360%               173%              --
Counts:
grep: cmp=30952700 call=309527 (100.000 cmp/call)
lka: cmp=26850327 call=526477 (51.000 cmp/call)
lua: cmp=71686161 call=1405611 (51.000 cmp/call)
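
The cmp/call column is consistent with grep testing every element while
both any implementations stop at the first match. A minimal C sketch of
that difference follows. It assumes a 100-element list holding 1..100 and
the predicate $_ > 50; benchmark.pl is not shown in this message, so that
is inferred from the counts. The function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative predicate mirroring { $_ > 50 }; counts each comparison. */
static int gt50(int x, long *cmp) {
    ++*cmp;
    return x > 50;
}

/* grep-like: tests every element and returns the number of matches. */
long grep_like(const int *v, size_t n, long *cmp) {
    assert(v != NULL && cmp != NULL);
    long hits = 0;
    for (size_t i = 0; i < n; i++)
        if (gt50(v[i], cmp))
            hits++;
    return hits;
}

/* any-like: returns 1 as soon as one element matches, 0 if none do. */
int any_like(const int *v, size_t n, long *cmp) {
    assert(v != NULL && cmp != NULL);
    for (size_t i = 0; i < n; i++)
        if (gt50(v[i], cmp))
            return 1;
    return 0;
}
```

Under that assumption grep_like performs 100 comparisons per call and
any_like performs 51, so the two any variants do the same amount of
predicate work and the gap between them has to come from per-item overhead.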


If anyone happens to find themselves with some spare time, I'd
appreciate any assistance or suggestions of something to look into
here, to see why my version is being so slow. Or is it the case that
List::Util's version just really is *that* much more efficient, given
as it uses dMULTICALL - at which point maybe we can adopt some of its
performance tricks into perl core's way of doing grep/etc... and make
them faster too?

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/
Re: Why is my `any` keyword so slow?
On Fri, Apr 23, 2021 at 10:11:16PM +0100, Paul "LeoNerd" Evans wrote:
> If anyone happens to find themselves with some spare time, I'd
> appreciate any assistance or suggestions of something to look into
> here, to see why my version is being so slow. Or is it the case that
> List::Util's version just really is *that* much more efficient, given
> as it uses dMULTICALL - at which point maybe we can adopt some of its
> performance tricks into perl core's way of doing grep/etc... and make
> them faster too?

I profiled the lua and lka cases:

    for my $x (1 .. 100_000) {
        my $ret = any { $_ > 50 } @nums;
    }

The lua version did 103k calls to Perl_push_scope/pop_scope.

The lka version did 5.2M calls to Perl_push_scope/pop_scope.

Presumably from the ENTER/LEAVEs done in pp_anywhile, while L::U::any
doesn't have that overhead.
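
Those totals fit a simple model in which the keyword enters and leaves a
scope once per item tested, while the XSUB does so about once per call. A
rough C sketch of that arithmetic follows. It assumes 51 items tested per
call (the cmp/call figure from the original benchmark), assumes one extra
scope per call, and ignores interpreter startup. The function name is
hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical model of scope push/pop pairs for `calls` invocations that
 * each test `items_per_call` elements.
 *   per_item != 0: one pair per call plus one pair around every item
 *   per_item == 0: one pair per call only
 */
long scope_pairs(long calls, long items_per_call, int per_item) {
    assert(calls >= 0 && items_per_call >= 0);
    long per_call = 1 + (per_item ? items_per_call : 0);
    return calls * per_call;
}
```

For 100,000 calls this gives 5,200,000 pairs in the per-item case and
100,000 in the per-call case, close to the 5.2M and 103k seen in the
profiles.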

Profiled with:

    valgrind --tool=callgrind --callgrind-out-file=lka.out ~/perl/5.32.0-dbg/bin/perl -Mblib lka.pl

and examined with kcachegrind.

The lka version warns:

    $ret never introduced at lka.pl line 13.

Tony
Re: Why is my `any` keyword so slow?
On Tue, 27 Apr 2021 10:33:04 -0400
"Matthew Horsfall (alh)" <wolfsage@gmail.com> wrote:

> After some experimenting I found this worked:
>
> +    OP **startp = &(anywhile->op_other);
> +    PL_peepp(*startp);
>
>      *out = (OP *)anywhile;
>      return KEYWORD_PLUGIN_EXPR;

Adding the peephole optimiser helped a bit, yes.

> You call op_scope() on the block which DTRT to make sure the code gets
> wrapped properly to clean things up (ENTER/LEAVE if needed, pp_scope
> otherwise, etc)
> But then your code is also doing:
>
>     SAVETMPS;
>     ENTER_with_name("any_item");
>
>     ...
>     FREETMPS;
>     LEAVE_with_name("any_item");
>
> I'm... not entirely sure these are necessary.
>
> And removing them does give a bit of a speed boost.

Removing those did help a bit more.


A third thing which was discussed on #p5p was to add

#define PERL_NO_GET_CONTEXT

In summary: the effect of this is to stop aTHX from looking the
interpreter context up in thread-local storage (which is slow) and
instead use the pTHX parameters being passed around all the functions.
Adding this gives quite the speed bump - and crucially, now makes the
thing run faster than the equivalent List::Util function.
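
The difference can be pictured without perl's own headers. The following
is a self-contained C sketch, not perl API code: interp_t, get_context()
and the helper names are made up, standing in for the interpreter struct,
the thread-local fetch, and functions declared with pTHX_. It only counts
how often the thread-local fetch happens in each style.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up stand-in for the interpreter struct. */
typedef struct { long counter; } interp_t;

static _Thread_local interp_t *current_interp; /* per-thread context slot */
static long tls_lookups;                       /* fetches made, for illustration */

void set_context(interp_t *in) { current_interp = in; }

/* Stand-in for fetching the context from thread-local storage. */
static interp_t *get_context(void) {
    tls_lookups++;
    return current_interp;
}

/* Implicit style: every helper fetches the context again. */
void bump_implicit(void) {
    interp_t *in = get_context();
    assert(in != NULL);
    in->counter++;
}

/* Explicit style: the caller passes the context down as a parameter. */
void bump_explicit(interp_t *in) {
    assert(in != NULL);
    in->counter++;
}

/* Each runner returns how many thread-local fetches n helper calls cost. */
long run_implicit(int n) {
    tls_lookups = 0;
    for (int i = 0; i < n; i++)
        bump_implicit();
    return tls_lookups;
}

long run_explicit(int n) {
    tls_lookups = 0;
    interp_t *in = get_context(); /* fetched once, then handed around */
    for (int i = 0; i < n; i++)
        bump_explicit(in);
    return tls_lookups;
}
```

After set_context() has been called, run_implicit(n) performs n fetches
while run_explicit(n) performs one. As far as I understand it, this only
matters on perls built with threads or multiplicity; elsewhere there is no
context to fetch.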

Having applied all three of those, I have now made a real CPAN release.
Well, three in fact - at time of writing the latest version is 0.03 and
implements `first`, along with the specialisations `any`, `all`,
`none`, `notall`.

https://metacpan.org/pod/List::Keywords

Performance varies from machine to machine, but consistently I'm seeing
at least a 10% speedup on any of the smoke testers. Quite often at
least 15 to 20% in fact. On my machine I now get

# List::Util took 0.309sec, this was 21% faster at 0.254sec

Now that this seems to be working reliably, the next steps will be
adding more keywords to it, as per the TODO notes.


Thanks all,

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/
Re: Why is my `any` keyword so slow?
On Thu, Apr 29, 2021 at 4:00 PM Paul "LeoNerd" Evans <leonerd@leonerd.org.uk>
wrote:


That's absolutely awesome, Paul!
I've wanted array functions in core so badly since I started using Perl 20
years ago.

Thanks!


