Mailing List Archive

Re extension
Hello,

Would it be possible to have something like

(?&&sub)

which would keep the groups matched inside the sub?

I know I've asked this before, but back then I wasn't subscribed, so I have
no idea what you answered me. Since then I've made some modifications to the
re engine myself and have more insight, though I'm still trying to achieve this.

Thanks in advance,
Alexander Nikolov
Re: Re extension
That sounds really powerful now that I'm looking at it. I'm assuming
that it would be triggered each time the (?&&sub) is hit going
forward, like a positive zero-width assertion, and that the groups
matched at that time would be passed as arguments to the sub.

Ed

On Fri, Dec 17, 2021 at 7:03 PM sasho648 <sasho648@gmail.com> wrote:
>
> Hello,
>
> Would it be possible to have something like
>
> (?&&sub)
>
> which would keep the groups matched inside the sub?
>
> I know I've asked this before, but back then I wasn't subscribed, so I have
> no idea what you answered me. Since then I've made some modifications to the
> re engine myself and have more insight, though I'm still trying to achieve this.
>
> Thanks in advance,
> Alexander Nikolov
Re: Re extension
OK, so I was able to partially implement this: GOSUB extensions ·
6a4h8/perl5@cdca990 (github.com)
<https://github.com/6a4h8/perl5/commit/cdca990a3b00656a617edd00c0acfc93de743d1c>

It needs the previous patch to compile initially; then you apply this one
and build again. It is also WIP; the issues are explained below.

Why is the stack from 0 to maxopenparen saved on a sub call? Shouldn't we
save only the parens that the sub will occupy instead? I'm starting to see
that this handles the most basic scenario, where we recurse into the whole
pattern, but in a more complex one we can run entirely different code, and
then saving everything from 0 to maxopenparen wouldn't make sense.
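
To make the question concrete, here is a toy model in Python. It is only a
sketch under stated assumptions: captures are a plain list indexed by group
number, the sub is assumed to own groups 2 and 3 and to write nowhere else,
and every name in it is hypothetical rather than taken from regexec.c.

```python
# Toy model only; none of these names come from regexec.c.
# Captures are a list indexed by group number: (start, end) or None.

def save_range(captures, first, last):
    """Snapshot the capture slots first..last (inclusive)."""
    return first, captures[first:last + 1]

def restore_range(captures, snapshot):
    """Write a snapshot back over the slots it was taken from."""
    first, saved = snapshot
    captures[first:first + len(saved)] = saved

def run_sub(captures):
    """Hypothetical sub that owns groups 2 and 3 and sets only those."""
    captures[2] = (0, 2)
    captures[3] = (2, 4)

def call_with_save(captures, first, last):
    """Save slots first..last, run the sub, then restore them."""
    result = list(captures)
    snapshot = save_range(result, first, last)
    run_sub(result)
    restore_range(result, snapshot)
    return result

before = [(0, 4), (0, 4), None, None]   # groups 0..3, so maxopenparen is 3
whole = call_with_save(before, 0, 3)    # save 0..maxopenparen
narrow = call_with_save(before, 2, 3)   # save only the sub's own groups
print(whole == narrow == before)
```

In this model the two strategies agree only because the toy sub touches
nothing outside its own groups. When the whole pattern is recursed, the sub
owns every group, so the narrow range and the full range coincide.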

A comment I commonly see is:

perl5/regexec.c:4249 at blead · Perl/perl5 (github.com)
<https://github.com/Perl/perl5/blob/blead/regexec.c#L4249>

perl5/regexec.c:389 at blead · Perl/perl5 (github.com)
<https://github.com/Perl/perl5/blob/blead/regexec.c#L389>

Which leads me to believe lastparen and maxopenparen aren't doing it for
other complex scenarios as well.

Anyway, I'm working on this together with my extensions, which are basically:

(?&&sub) - leaves the matches made inside the sub intact
(?&&&sub) - like (?&sub), restores the matches at exit, but on entry also
hides the matches of the outer scope

There is also (?&+sub), which will save the sub's match group regardless of
the invocation type down the recursion stack.
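
The intended difference between the three call forms can be sketched with a
toy model in Python. This is an illustration under assumptions, not the
patch itself: captures are modelled as a dict of name to matched text, the
sub is a plain function standing in for (?<sub>(?<te>TE)(?<st>ST)), and
`call_sub`, `sub` and the `outer` capture are hypothetical names. (?&+sub)
is left out of the model.

```python
# Toy model only: captures are a dict of name -> matched text, and a "sub"
# is a plain function. Nothing here mirrors the real regexec.c code.

def sub(captures):
    """Stands in for (?<sub>(?<te>TE)(?<st>ST)); returns the names it could see."""
    visible = sorted(captures)
    captures["te"] = "TE"
    captures["st"] = "ST"
    return visible

def call_sub(outer, mode):
    """Call sub under one of the modes; return (captures after, names sub saw)."""
    captures = dict(outer)
    if mode == "&&&":
        captures.clear()               # hide the outer scope's matches on entry
    visible = sub(captures)
    if mode in ("&", "&&&"):
        captures = dict(outer)         # restore the matches at exit
    return captures, visible

outer = {"outer": "X"}
for mode in ("&", "&&", "&&&"):
    print(mode, call_sub(outer, mode))
```

In this model only the "&&" mode leaves `te` and `st` set after the call, and
only the "&&&" mode runs the sub without seeing `outer`.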

My question here (if anyone can help) is how to get the closing paren + 1
of a GOSUB in regexec and, if that isn't possible, whether there are any
suggestions for how to store it (from regcomp.c).

For example, I see that reg2Lanode can only store 2 arguments. Can I sneak
in a third one somehow?


P.S.
I learned today that my previous email wasn't sent to the perl5-porters
list, only to the person who replied to me. It looks like I don't have a
reply-all button in Gmail.

Alexander Nikolov



On Sat, Dec 18, 2021 at 5:43 AM sasho648 <sasho648@gmail.com> wrote:

> I'm mostly looking for the logistics of how this could be implemented.
>
> I didn't consider arguments. My idea was that something like this:
>
> (?(DEFINE)(?<sub>(?<te>TE)(?<st>ST)))(?&&sub)
>
> will not pop the matches made inside sub, so we will have the named matches
> `te` and `st` at the end of this regular expression.
>
> On Fri, Dec 17, 2021 at 9:21 PM Ed Peschko <horos22@gmail.com> wrote:
>
>> That sounds really powerful now that I'm looking at it. I'm assuming
>> that it would be triggered each time the (?&&sub) is hit going
>> forward, like a positive zero-width assertion, and that the groups
>> matched at that time would be passed as arguments to the sub.
>>
>> Ed
Re: Re extension
By "not doing it" I mean depending on lastparen to determine which matches
are defined. (Sorry, maxopenparen is actually accurate, since it gets
incremented on each OPEN.)

perl5/regexec.c:7822 at blead · Perl/perl5 (github.com)
<https://github.com/Perl/perl5/blob/blead/regexec.c#L7822> (one example)

On Sun, Jan 9, 2022 at 12:12 PM sasho648 <sasho648@gmail.com> wrote:

> Which leads me to believe lastparen and maxopenparen aren't doing it for
> other complex scenarios as well.