Mailing List Archive

xsubpp, %v global var and broken example
TL;DR: a facility was added to xsubpp around 5.005 to allow global state
to be kept in the %v global variable during the work of translating a XS
file to a C file. A confusing example was added to perlxs.pod (with more
explanatory text added later by Ilya), which AFAICT is, and always was,
broken. Then in 5.10.0, the %v facility broke (%v became a compile error
under 'use strict'). I need to decide whether to fix %v (and if so, how?)
and/or fix the docs or remove the entry from the docs.

Now in detail.

First, the background:

XSUB parameter declarations by default inject the code from the relevant
typemap into the C src file to do the conversion from SVs to C types. For
example, this entry in an XS file:

int
foo(a, b)
int a
char *b
...

generates C code which looks roughly like:

int a = (int)SvUV(ST(0));
char* b = (char *)SvPV_nolen(ST(1));

However this can overridden - you can specify a snippet of C code to use
directly instead of the typemap, and/or a piece of code to add to the end
in addition to the typemap by the presence of a [=+;] character after the
variable name:

foo(a, b)
int a = (int)PTR2IV($arg);
char *b + if (!b) { b = "NULL"; };

becomes

int a = (int)PTR2IV(ST(0));
char *b = (char *)SvPV_nolen(ST(1));
....
if (!b) { b = "NULL"; };

The '=' has caused the text to its right to be used instead of the
standard typemap; the '+' has caused the standard typemap to be used, but
the additional text to be inserted into the C function after the argument
processing.

The text to the right of a [=+;] is wrapped in double quotes, then passed
through eval by xsubpp. So for example, to process the 'int a = ...' line,
xsubpp does does the rough equivalent of:

{
# the raw text that was extracted from the XS line after the '=':
my $text = '(int)PTR2IV($arg);';
# 3 documented vars set by xsubpp before each eval:
my $arg = 'ST(0)';
my $val = ...;
my $type = ...;
$text = eval '"' . $text . '"';
# ... output $text to C file...
}

The commit perl-5.004_03-1637-g7ad6fb0b94 by Tye McQueen back in 1998
added the [;+] syntax in addition to the already existing [=], and added
"use vars '%v'" to the top of xsubpp. This meant that it was possible to
include perl code in XSUB parameter declarations (via tricks like
"@{[ $v{foo} = bar() ])") which could preserve state between such evals by
storing things in %v. The commit included adding an obscure example of the
use of %v to perlxs.pod (more on which later). I happen to think that the
example is broken.

Then with perl-5.8.0-4925-g6b09c16010 by Yitzchak Scott-Thoennes in 2005,
the guts of xsubpp was ripped out and put into a separate module,
ExtUtils::ParseXS. As a (probably inadvertent) side effect, the "use vars
'%v'" line got removed.

This means that from 5.10.0 onwards, any attempt to use the %v variable
within evaled code in an XS file will fail with a

Global symbol "%v" requires explicit package name

error. Note that it is still possible to make it work by using %::v (or
indeed any other fully-qualified package variable name).

But the fact that the intended usage of %v has been broken since 5.10.0,
and that no *.xs file on CPAN appears to match /[$%]v\b/, leads me to
believe that probably no-one is using this obscure facility.

So I'm tempted to remove the "facility" by simply removing the documentation
from perlxs.pod which mentions it.

Now onto the example in perlxs.pod:

bool_t
rpcb_gettime(host,timep)
time_t &timep; /* \$v{timep}=@{[$v{timep}=$arg]} */
char *host + SvOK($v{timep}) ? SvPVbyte_nolen($arg) : NULL;
OUTPUT:
timep

(Note that the parameters are declared in the order (host,timep), but then
defined in the opposite order - i.e. timep first.

The intent of the first declaration (for timep) appears to be to skip the
automatic generation of initialisation code for timep (which is what the
non-terminal ';' does), but as a side-effect while evaling the text to the
right of the ';' set $v{{timep} to the current value of $arg. The result
of the eval will be harmlessly injected as a code comment at the end of
the initialisation section of the C function.

The intent of the second declaration (for host) appears to be: initialise
host as normal, but in addition, inject the extra code at the end, where
the previous value of $arg can be used.

If in all the above, $::v is substituted for $v so that it actually runs,
then xsubpp outputs something like this to the C file:

time_t timep;
char * host = (char *)SvPV_nolen(ST(0)) ;
....

/* $::v{timep}=ST(1) */

SvOK(ST(1)) ? SvPVbyte_nolen(ST(0)) : NULL;

The two big problems with this are:

- the '$::v{timep}=' in the code comment doesn't serve any purpose, unless
its intended for human debugging purposes (in which case the docs should
explain this).

- the 'SvOK(ST(1)) ? ...' is just sitting in void context; it is not
assigned to anything.

It's really difficult to guess the original intent of the example in order
to know how to fix it. Was the result of that tertiary conditional
supposed to have been assigned to host? So is it trying to say:
'if timep is undef, set host to NULL? The man page for rpcb_gettime()
says that a NULL host implies localhost. But I don't see why passing an
undef timep SV should make it default to localhost and override the host
variable.

So, questions:

What should I do to fix this; i.e. what to do about %v?

If we want to keep it, we need to:

- either re-add "use vars '%v'" to the code, or update the docs to tell
people to use %::v rather than %v. For the former, it means that any XS
code using %v wont work between 5.10.0 and 5.38.0.

- We will also need to fix the code example involving rpcb_gettime() and
%v. To do that, we need to agree on what the code example was actually
%trying to do, or come up with a different code example.

The other option is to remove mentions of %v from perlxs.pod - both the
stating of what it is, and the example code. (It doesn't involve actually
removing the facility, since the facility already isn't there, apart from
in the general sense that any code in the XS file which gets evaled can in
principle get or set any global var, %v or otherwise).

My vote is to get rid of it.

--
"I do not resent criticism, even when, for the sake of emphasis,
it parts for the time with reality".
-- Winston Churchill, House of Commons, 22nd Jan 1941.
Re: xsubpp, %v global var and broken example [ In reply to ]
Hi there,

On Wed, 10 Jan 2024, Dave Mitchell wrote:

> ... a facility was added to xsubpp around 5.005 ...
> ... Then in 5.10.0, the %v facility broke ...
> ... I need to decide whether to fix %v ...
> ... or remove the entry from the docs. ...

> ... no *.xs file on CPAN appears to match /[$%]v\b/ ...

Did you also look for /[$%]::v\b/ ? :)

> My vote is to get rid of it.

Given how long it's been broken I think you're probably right.

Outstanding analysis, thank you.

--

73,
Ged.
Re: xsubpp, %v global var and broken example [ In reply to ]
On Wed, Jan 10, 2024 at 10:34:57PM +0000, Dave Mitchell wrote:
> But the fact that the intended usage of %v has been broken since 5.10.0,
> and that no *.xs file on CPAN appears to match /[$%]v\b/, leads me to
> believe that probably no-one is using this obscure facility.
>
> So I'm tempted to remove the "facility" by simply removing the documentation
> from perlxs.pod which mentions it.
>
> Now onto the example in perlxs.pod:
>
> bool_t
> rpcb_gettime(host,timep)
> time_t &timep; /* \$v{timep}=@{[$v{timep}=$arg]} */
> char *host + SvOK($v{timep}) ? SvPVbyte_nolen($arg) : NULL;
> OUTPUT:
> timep
>
> (Note that the parameters are declared in the order (host,timep), but then
> defined in the opposite order - i.e. timep first.
>
> The intent of the first declaration (for timep) appears to be to skip the
> automatic generation of initialisation code for timep (which is what the
> non-terminal ';' does), but as a side-effect while evaling the text to the
> right of the ';' set $v{{timep} to the current value of $arg. The result
> of the eval will be harmlessly injected as a code comment at the end of
> the initialisation section of the C function.
>
> The intent of the second declaration (for host) appears to be: initialise
> host as normal, but in addition, inject the extra code at the end, where
> the previous value of $arg can be used.
>
> If in all the above, $::v is substituted for $v so that it actually runs,
> then xsubpp outputs something like this to the C file:
>
> time_t timep;
> char * host = (char *)SvPV_nolen(ST(0)) ;
> ....
>
> /* $::v{timep}=ST(1) */
>
> SvOK(ST(1)) ? SvPVbyte_nolen(ST(0)) : NULL;
>
> The two big problems with this are:
>
> - the '$::v{timep}=' in the code comment doesn't serve any purpose, unless
> its intended for human debugging purposes (in which case the docs should
> explain this).

I think it's just intended to explain the action - ie. that $v{timep}
has been set to ST(1), also possibly as an example of how to debug
this type of %v usage (which I've never used).

>
> - the 'SvOK(ST(1)) ? ...' is just sitting in void context; it is not
> assigned to anything.

I think it's just a bug in the example.

> It's really difficult to guess the original intent of the example in order
> to know how to fix it. Was the result of that tertiary conditional
> supposed to have been assigned to host? So is it trying to say:
> 'if timep is undef, set host to NULL? The man page for rpcb_gettime()
> says that a NULL host implies localhost. But I don't see why passing an
> undef timep SV should make it default to localhost and override the host
> variable.

I suspect here they're taking the function used in several examples
(rcpb_gettime()) and tried to build a new example for %v out of it,
even though the example doesn't match how rcpb_gettime() is used.

>
> So, questions:
>
> What should I do to fix this; i.e. what to do about %v?
>
> If we want to keep it, we need to:
>
> - either re-add "use vars '%v'" to the code, or update the docs to tell
> people to use %::v rather than %v. For the former, it means that any XS
> code using %v wont work between 5.10.0 and 5.38.0.

EU::PXS is dual-life, if we do re-allow %v they can depend on a
sufficently recent release of EU::PXS.

> - We will also need to fix the code example involving rpcb_gettime() and
> %v. To do that, we need to agree on what the code example was actually
> %trying to do, or come up with a different code example.
>
> The other option is to remove mentions of %v from perlxs.pod - both the
> stating of what it is, and the example code. (It doesn't involve actually
> removing the facility, since the facility already isn't there, apart from
> in the general sense that any code in the XS file which gets evaled can in
> principle get or set any global var, %v or otherwise).
>
> My vote is to get rid of it.

I think so too.

Tony
Re: xsubpp, %v global var and broken example [ In reply to ]
Dave Mitchell <davem@iabyn.com> wrote:
:My vote is to get rid of it.

Thanks for the analysis, +1 from me.

The point someone comes up with an actual need for it is early enough
to think about whether this is the solution we'd actually want to support.

It does feel like a lightweight addition that offers quite powerful
functionality. But Perl code hidden in C comments for the side effects
of a string eval on a global variable feels like quite an icky way
to take advantage of it.

Hugo
Re: xsubpp, %v global var and broken example [ In reply to ]
On Wed, Jan 10, 2024 at 11:35?PM Dave Mitchell <davem@iabyn.com> wrote:

> - either re-add "use vars '%v'" to the code, or update the docs to tell
> people to use %::v rather than %v. For the former, it means that any XS
> code using %v wont work between 5.10.0 and 5.38.0.
>

No it would, it would just mean that one needs to add a build-time
dependency on a version of ExtUtils::ParseXS that restores the feature.

That said, if it hasn't been missed for all those years we can probably
safely remove it.

Leon