I'm finally getting around to implementing PPC 0019:
https://github.com/Perl/PPCs/blob/main/ppcs/ppc0019-qt-string.md
First interesting question: should qt() strings be sub-lexed, or not?
To explain this question, I'll first need to draw attention to an
annoying quirk of how existing strings like q() and qq() work.
When the lexer encounters a quote-start operator like q or qq, the
first thing it does is look at what the delimiting characters are, and
then it scans ahead looking for the end marker. While looking, it knows
how to count nested pairs *of that marker* and ignore escaped versions,
but it doesn't know anything else. Once it has found the bounds of that
string quoting form, it goes off into a separate parse phase to
understand the inner contents of it, which then get inserted at the
parse point.

  q(this is the contents) and now we are outside
  q(we can count (inner) parentheses) and now this is outside
  q(we ignore \( escaped parens) and now this is outside

but that's as far as it goes. Note that it *does not* understand perl
code inside qq() strings.

  eval: qq(This ${\ somefunc ')' } is not valid)
  Compile error: Can't find string terminator "'" anywhere before EOF
  at (eval 7) line 1.

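For anyone who wants to reproduce that locally, something like the
following string eval should do it. This is only a sketch: somefunc is
a stand-in defined just for the example, and the eval number in the
message will differ from the one above.

```perl
use strict;
use warnings;

# Stand-in for whatever function the interpolated expression would call.
sub somefunc { return $_[0] }

# The source text to compile, held in a q<> literal so that nothing in
# it is interpreted until the string eval below.
my $code = q<qq(This ${\ somefunc ')' } is not valid)>;

# Compile and run it; on a compile error eval returns undef and sets $@.
my $result = eval $code;
my $err    = $@;

print defined $result ? "compiled: $result\n" : "failed: $err";
```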
What went wrong here?
Remember - the lexer first looks at the quoting marker, and then tries
to find the end. It found the end.

  qq(This ${\ somefunc ')
     ################## ^-- Oh look here's the end.

That inside part then gets passed to a sub-lexer to parse, and the
result gets inserted back into the original syntax:

  qq(###################)' } is not valid)

Oops. Well, that definitely doesn't look like valid perl code - offhand
I don't know if the parse error comes from the sub-lex inside or the
main parse outside, but either way, it failed.
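
To make the scanning step concrete, here is a rough sketch in plain
Perl of the kind of delimiter search described above. The name
find_closing_delim is made up for illustration, and this is not the
real lexer logic: it only approximates it, assuming that a backslash
escapes exactly the next character.

```perl
use strict;
use warnings;

# Hypothetical helper: given text that begins at an opening delimiter,
# return the offset of the matching closing delimiter, or nothing if the
# text runs out first. It counts nested pairs and skips escaped
# characters, and knows nothing else about what is inside.
sub find_closing_delim {
    my ($src, $open, $close) = @_;
    my $depth = 0;
    my $pos   = 0;
    while ($pos < length $src) {
        my $ch = substr $src, $pos, 1;
        if ($ch eq "\\") {
            $pos += 2;    # skip the backslash and the character it escapes
            next;
        }
        if ($ch eq $open) {
            $depth++;
        }
        elsif ($ch eq $close) {
            $depth--;
            return $pos if $depth == 0;
        }
        $pos++;
    }
    return;
}

# The text following "qq" in the failing example.
my $src  = q<(This ${\ somefunc ')' } is not valid)>;
my $end  = find_closing_delim($src, '(', ')');
my $body = substr $src, 1, $end - 1;
my $rest = substr $src, $end + 1;

print "body: [$body]\n";
print "rest: [$rest]\n";
```

Run on the failing example, this reports the body as
[This ${\ somefunc '] and the rest as [' } is not valid)], which is the
same split as in the diagram above.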
So with that in mind - what do we feel about the new qt() string syntax?
I.e. what do people feel *should* be the behaviour of a construction
like

  sub f { ... }
  say qt(Is this { f(")") } valid syntax?);

Should it:
1) Yield a parse error similar to the ones given in the example above?
2) Parse as valid perl code yielding a similar result to:
     say 'Is this ', f(")"), ' valid syntax?';
3) Something else?
I feel that interpretation 2 might be the most useful and powerful, but
it would be inconsistent with the behaviour of the existing quoting
operators.
Interpretation 1 is certainly easier to achieve as it reüses existing
parser structures, but given the whole point is to interpolate code
inside the {braces} it might lead to weird annoying cases that don't
work so well.
Does anyone have any good examples one way or the other from other
languages that have a similar construction?
(Cross-posted to https://github.com/Perl/PPCs/issues/47)
--
Paul "LeoNerd" Evans
leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/