Mailing List Archive

Any outstanding branches of op.c? / Split into smaller pieces
Lately I'm trying to make some edits on op.c, and finding that the file
has become a giant sprawling monstrosity of way too much code in one
file; 630k in size and 18k lines of C source code.

I intend to split it into a few smaller pieces.

My plan is to create two new .c files to contain subsections of what is
currently in op.c:

peep.c: to contain the finalizer and peephole optimiser

ck.c: to contain the opchecker functions (Perl_ck_*)

This will hopefully end up with a much smaller op.c that contains
mostly just the lowlevel machinery of being an optree, with the other
specialised parts of it moved to their own files.

I bring this to people's attention mostly as a way to point out I
intend to move it, and to see if anyone has any outstanding code on
that file in branches. Depending on what or where that is, it might
cause a merge conflict, so I'd like to see what we can do to minimise
the impact of that.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/
Re: Any outstanding branches of op.c? / Split into smaller pieces [ In reply to ]
On Sat, 4 Dec 2021 14:16:37 +0000
"Paul \"LeoNerd\" Evans" <leonerd@leonerd.org.uk> wrote:

> My plan is to create two new .c files to contain subsections of what
> is currently in op.c:
>
> peep.c: to contain the finalizer and peephole optimiser

I now have an (as-yet unpushed) branch which does this part; it doesn't
seem too bad. I had to make a few static functions into internal API,
but that didn't seem too bad.

Sizes now:

> ck.c: to contain the opchecker functions (Perl_ck_*)

This part seems like it might go roughly the same way, requiring some
more currently-static functions to be exposed. But here I find more of
a question: it's intended that custom operators can provide their own
opchecker functions. Several of the currently-static functions used by
existing opcheckers feel like they would be useful to custom checkers
too. Perhaps for at least some of these, a case might be made to make
them exposed API so that custom ops can use them too?

All the functions needed are:

new_entersubop
modkids
set_has_eval
op_varname_subscript [.I already partly exposed op_varname as part of
the peephole optimiser change]
op_sibling_newUNOP
handle_constructor
scalarkids
force_list
ref_array_or_hash
newONCEOP
fold_constants
op_integerize
op_std_init

If (some of) these became public API that would help move the
opcheckers to their own file, as well as allowing custom code in CPAN
modules to make use of them as well.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/
Re: Any outstanding branches of op.c? / Split into smaller pieces [ In reply to ]
On Sun, 5 Dec 2021 10:45:58 +0000
"Paul \"LeoNerd\" Evans" <leonerd@leonerd.org.uk> wrote:

> Sizes now:
(oops I forgot to paste it)

$ ls -l op.c peep.c
-rw-r--r-- 1 leo leo 492K Dec 5 10:46 op.c
-rw-r--r-- 1 leo leo 139K Dec 4 14:08 peep.c

$ wc -l op.c peep.c
15014 op.c
3884 peep.c

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/
Re: Any outstanding branches of op.c? / Split into smaller pieces [ In reply to ]
On Sun, Dec 05, 2021 at 10:45:58AM +0000, Paul "LeoNerd" Evans wrote:
> On Sat, 4 Dec 2021 14:16:37 +0000
> "Paul \"LeoNerd\" Evans" <leonerd@leonerd.org.uk> wrote:
>
> > My plan is to create two new .c files to contain subsections of what
> > is currently in op.c:
> >
> > peep.c: to contain the finalizer and peephole optimiser
>
> I now have an (as-yet unpushed) branch which does this part; it doesn't
> seem too bad. I had to make a few static functions into internal API,
> but that didn't seem too bad.
>
> Sizes now:
>
> > ck.c: to contain the opchecker functions (Perl_ck_*)
>
> This part seems like it might go roughly the same way, requiring some
> more currently-static functions to be exposed. But here I find more of
> a question: it's intended that custom operators can provide their own
> opchecker functions. Several of the currently-static functions used by
> existing opcheckers feel like they would be useful to custom checkers
> too. Perhaps for at least some of these, a case might be made to make
> them exposed API so that custom ops can use them too?

Few to none of these were written with any intent of being a coherent,
public, supportable API. So as-is there's risk in exposing and ossifying
interfaces that constrain implementation details and prevent refactoring
or optimisation. (That's not "no" - that's "tread carefully")

Also, more pragmatically, it's foolish to "declare" them API unless what
they do can be documented, and what they guarantee can be tested. Else
we're just setting ourselves up for more pain in the future.

For example, fold_constants *must* be called with PL_curcop == &PL_compiling
That sort of pre-condition belongs in documentation, and I don't know
what other traps lurk in these various functions.

Quite often the names are deceptive - for example, I don't know if
fold_constants actually does *mandatory* fixups - ie despite having a name
that shouts "I'm an optimisation step" it's actually essential.

> If (some of) these became public API that would help move the
> opcheckers to their own file, as well as allowing custom code in CPAN
> modules to make use of them as well.

I don't see why it's necessary for *any* of these to be a public API to move
the opcheckers to a different file - we have plenty of non-static functions
that are not public (and not exported on platforms with decent linker
hygiene).

I don't see any risk in marking anything 'p' rather than 's' in embed.fnc
- ie turning functions non-static. That's not making any promises that might
be hard to keep, or we might regret with hindsight.

We can think about what APIs *should* be independent of that change, without
rushing it. Otherwise we're just assuming that our 25-year old internal
spaghetti makes sense as exposed APIs, which I very much doubt.


Nicholas Clark
Re: Any outstanding branches of op.c? / Split into smaller pieces [ In reply to ]
On Sun, 5 Dec 2021 11:31:40 +0000
Nicholas Clark <nick@ccl4.org> wrote:

> I don't see any risk in marking anything 'p' rather than 's' in
> embed.fnc
> - ie turning functions non-static. That's not making any promises
> that might be hard to keep, or we might regret with hindsight.
>
> We can think about what APIs *should* be independent of that change,
> without rushing it. Otherwise we're just assuming that our 25-year
> old internal spaghetti makes sense as exposed APIs, which I very much
> doubt.

I guess that's fair, yes. I can just mark them all internal, and
perhaps in a later step we can review what new functions I added in
that way, with a view to deciding "can this be publicised?".

But you're right - that decision should be a separate step and not
rushed as part of this internal shuffling-of-deckchairs.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/
Re: Any outstanding branches of op.c? / Split into smaller pieces [ In reply to ]
On Sat, 4 Dec 2021 14:16:37 +0000
"Paul \"LeoNerd\" Evans" <leonerd@leonerd.org.uk> wrote:

> I intend to split it into a few smaller pieces.
>
> My plan is to create two new .c files to contain subsections of what
> is currently in op.c:

Quick update: This is still my intention but I haven't made any headway
at all due to a combination of machine breakages over Christmas, and
other more urgent changes wanting to get done before Jan 20th.

I'll likely get around to doing this once the change-freeze for 5.36
happens, as that'll give a nice quiet time to work on it on a branch to
merge as soon as we open for 5.37.x development.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/
Re: Any outstanding branches of op.c? / Split into smaller pieces [ In reply to ]
On Sat, 4 Dec 2021 14:16:37 +0000
"Paul \"LeoNerd\" Evans" <leonerd@leonerd.org.uk> wrote:

> Lately I'm trying to make some edits on op.c, and finding that the
> file has become a giant sprawling monstrosity of way too much code in
> one file; 630k in size and 18k lines of C source code.
>
> I intend to split it into a few smaller pieces.
>
> My plan is to create two new .c files to contain subsections of what
> is currently in op.c:
>
> peep.c: to contain the finalizer and peephole optimiser
>
> ck.c: to contain the opchecker functions (Perl_ck_*)
>
> This will hopefully end up with a much smaller op.c that contains
> mostly just the lowlevel machinery of being an optree, with the other
> specialised parts of it moved to their own files.

Since we're at the start of the 5.37 development cycle and hopefully
all of the remaining "little edits" that didn't get into 5.36 are done,
I'm going to do this this week.

I have a number of new things I want to introduce into the peephole
optimiser in the coming weeks, so now's probably the best time to make
a start on it.

If anyone does have any outstanding changes to basically any of the
code in op.c, please shout ASAP so we can find a way to not cause too
much merge conflict.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/
Re: Any outstanding branches of op.c? / Split into smaller pieces [ In reply to ]
2021-12-4 23:17 Paul "LeoNerd" Evans <leonerd@leonerd.org.uk> wrote:

> Lately I'm trying to make some edits on op.c, and finding that the file
> has become a giant sprawling monstrosity of way too much code in one
> file; 630k in size and 18k lines of C source code.
>
> I intend to split it into a few smaller pieces.
>
> My plan is to create two new .c files to contain subsections of what is
> currently in op.c:
>
> peep.c: to contain the finalizer and peephole optimiser
>
> ck.c: to contain the opchecker functions (Perl_ck_*)
>
>
I think this is a good cleanup.

1. op.c (build op. These functions are called from perly.y)
2. ck.c (traversal the tree of ops(AST) to check them)

If ck.c is too big, peep.c is a good idea.

I feel the name "ck.c" is a little difficult to understand. Alternative
proposal "check.c", "op_check.c"
Re: Any outstanding branches of op.c? / Split into smaller pieces [ In reply to ]
On Mon, 6 Jun 2022 12:05:34 +0100
"Paul \"LeoNerd\" Evans" <leonerd@leonerd.org.uk> wrote:

> Since we're at the start of the 5.37 development cycle and hopefully
> all of the remaining "little edits" that didn't get into 5.36 are
> done, I'm going to do this this week.
>
> I have a number of new things I want to introduce into the peephole
> optimiser in the coming weeks, so now's probably the best time to make
> a start on it.

I've now done the first bit, by splitting the optimizers and optree
finalizer into their own file "peep.c". PR ready for review at

https://github.com/Perl/perl5/pull/19835

Once that's merged I can continue my work actually adding stuff to the
optimizer, but concurrently I'll do the second part of this, which was
to move the opchecker functions also into their own file. My original
suggestion was "ck.c" but perhaps that's a bit of a short name; I'll
likely call it "opcheck.c" instead.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/
Re: Any outstanding branches of op.c? / Split into smaller pieces [ In reply to ]
On Wed, 8 Jun 2022 15:27:10 +0100
"Paul \"LeoNerd\" Evans" <leonerd@leonerd.org.uk> wrote:

> I've now done the first bit, by splitting the optimizers and optree
> finalizer into their own file "peep.c". PR ready for review at
>
> https://github.com/Perl/perl5/pull/19835

This is now merged to blead.

> Once that's merged I can continue my work actually adding stuff to the
> optimizer, but concurrently I'll do the second part of this, which was
> to move the opchecker functions also into their own file. My original
> suggestion was "ck.c" but perhaps that's a bit of a short name; I'll
> likely call it "opcheck.c" instead.

That's somewhat next on my queue.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/