A question in two parts:
1. A specific question,
2. ... which leads on to a broader generalisation
-----
I've been looking lately into the details of how the optree is formed
around regular assignments into lexical variables. Assignments into
scalar lexicals aren't too bad; they look like:
my $lex = (some RHS scalar value);
OP_SASSIGN=BINOP:
(whatever ops for the RHS value)
OP_PADSV [targ = $lex + OPf_REF|OPf_MOD flags]
Assignments into arrays or hashes look a little more complex:
my @lex = (some RHS list value);
OP_AASSIGN=BINOP:
OP_LIST
OP_PUSHMARK
(whatever ops for the RHS value)
OP_LIST
OP_PUSHMARK
OP_PADAV [targ = $lex + OPf_REF|OPf_MOD flags]
with a similar shape for hashes, except using OP_PADHV.
I can't help thinking that this puts quite a bit of churn on the value
stack, and on the markstack too in the list-assignment case. I wonder
whether, in these relatively common cases, it might make sense for the
peephole optimiser to emit a set of three specialised ops that use
op_targ to store the pad index of the variable being assigned into,
turning these cases into:
OP_PADSV_STORE=UNOP [targ = $lex]
(whatever ops for the RHS value)
OP_PADAV_STORE=LISTOP [targ = @lex]
OP_PUSHMARK
(whatever ops for RHS value)
(plus OP_PADHV_STORE which would look similar)
To this end I might have a go at making a little CPAN module for doing
this.
It would also be useful to measure whether this actually yields any
performance benefit. If so, it might become a useful core addition.
-----
Except now this leads me on to the larger question: there's nothing
*particularly* specific to lexical assignments about this. I'm sure
similar optimisations could be argued for in many other situations
in core perl.
Right now, we have a few bits of core already that do things like this;
e.g. OP_MULTICONCAT or OP_AELEMFAST_LEX, which are just high-speed
optimisations of common optree shapes. They rely on the observation that
running a few larger ops ends up being faster overall than running lots
of small ones.
It's a nice pattern - I've already written a module for doing similar
things to MULTICONCAT with maths operations:
https://metacpan.org/pod/Faster::Maths
This is also loosely inspired by Zefram's
https://metacpan.org/pod/Devel::GoFaster
I wonder if there is scope for a general way to create this sort of
optimisation, along with a way to measure the performance boost it gives
on any reasonable workload, so as to judge whether each one is worth
doing.
--
Paul "LeoNerd" Evans
leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/