refaliasing list assignment in list context
TL;DR: refaliasing list assignment in list context is neither documented
nor tested, and the semantics are not obvious and need to be agreed and
possibly altered.

Background

In normal list assignments, it's possible for the list assignment itself
to be in list context, either rvalue or lvalue. For example:

First, there's void context:

($a, @b) = ($x, $y, $z); # does the obvious

Then there's rvalue list context:

@m = ( ($a, @b) = ($x, $y, $z) );

which is about equivalent to:

($a, @b) = ($x, $y, $z);
@m = ($a, $b[0], $b[1]);

and finally, lvalue list context:

( ($a, @b) = ($x, $y, $z) ) = @m;

which is about equivalent to:

($a, @b) = ($x, $y, $z);
($a, $b[0], $b[1]) = @m;

All straightforward, documented and tested. Note that generally speaking,
the return value of list assign in list context is all the LHS elements,
with aggregates being flattened.
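
As a minimal sketch of the rvalue and lvalue cases above (the concrete values 1, 2, 3 and 10, 20, 30 are illustrative assumptions, not part of the original examples):

```perl
use strict;
use warnings;

my ($x, $y, $z) = (1, 2, 3);
my ($a, @b);

# rvalue list context: the assignment yields its LHS elements, flattened
my @m = ( ($a, @b) = ($x, $y, $z) );
print "@m\n";        # 1 2 3

# lvalue list context: the outer assignment writes to $a, $b[0], $b[1]
( ($a, @b) = ($x, $y, $z) ) = (10, 20, 30);
print "$a @b\n";     # 10 20 30
```

Nothing here needs an experimental feature; this is plain list assignment.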

But then there's experimental reference aliasing, which aliases elements
rather than assigning them:

use feature 'refaliasing';
no warnings 'experimental';

(\$a, \(@b)) = (\$x, \$y, \$z);

This aliases $a to $x, $b[0] to $y and $b[1] to $z. This is documented
and tested.
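
A minimal sketch of what that aliasing means in practice, assuming perl 5.22 or later with the experimental feature enabled (the values and the writes through the aliases are illustrative):

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

my ($x, $y, $z) = (1, 2, 3);
my ($a, @b);

(\$a, \(@b)) = (\$x, \$y, \$z);

$a = 'changed';      # writes through the alias to $x
print "$x\n";        # changed

$y = 'y2';           # visible through the aliased element $b[0]
print "$b[0]\n";     # y2

# $b[1] and $z are the same scalar, so their addresses compare equal
print \$b[1] == \$z ? "same\n" : "different\n";   # same
```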

But there is nothing in the test suite or elsewhere in core which
exercises these:

@m = ( (\$a, \(@b)) = (\$x, \$y, \$z) );
nor
( (\$a, \(@b)) = (\$x, \$y, \$z) ) = @m;
nor
foo( (\$a, \(@b)) = (\$x, \$y, \$z) );

and as far as I can tell, their behaviour isn't documented. So the question
for this thread is, what should be the expected behaviour?

For example in:

( (\$a, \(@b)) = (\$x, \$y, \$z) ) = @m;

should the outer list assign be equivalent to:

(1)
$a = $m[0];
$b[0] = $m[1];
$b[1] = $m[2];
(2)
\$a = $m[0]; # croaks if $m[i] isn't a ref
\$b[0] = $m[1];
\$b[1] = $m[2];
(3)
\$a = \$m[0];
\$b[0] = \$m[1];
\$b[1] = \$m[2];

Or something else?

My gut feeling is that it should probably be (1). Part of me thinks that
the parser should be clever enough to mark the outer assign as also doing
refaliasing, so it expects its LHS and RHS to be refs, so (2). But this is
problematic when the lvalue context is provided by a sub call:

foo( (\$a, \(@b)) = (\$x, \$y, \$z) );

in that case I would expect the args passed to foo() to be $a, $b[0],
$b[1] rather than \$a etc. So (1) it is.

However, as currently implemented, it does none of the above. $a and $x
remain as two aliases to the same unchanged SV, and similarly for $b[0]
and $y, $b[1] and $z. The values in @m don't end up getting assigned to or
aliased to anything.

Technically what is happening is that the inner list assignment in list
context returns a list of temporary refs to the aliased SVs. In rvalue
contexts, these temp RVs can be assigned, e.g.:

@m = ( (\$a, \(@b)) = ($rx, $ry, $rz) );
# @m is now a list of refs.

${$m[0]} = 1; # equivalent to $a = 1;

In lvalue list context, these temporary RVs get the value assigned to
them, rather than to the thing they refer to. Which seems pointless.
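
A small probe for checking the behaviour described above on a local perl (5.22 or later assumed; the variable values are illustrative). Since this is exactly the undocumented area under discussion, results may differ between releases, so no expected output is given:

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

my ($x, $y, $z) = (1, 2, 3);
my ($a, @b);

# rvalue list context: inspect what the aliasing assignment returns
my @m = ( (\$a, \(@b)) = (\$x, \$y, \$z) );
print scalar(@m), " element(s) returned\n";
print "element 0 is ", (ref $m[0] ? "a " . ref($m[0]) . " ref" : "not a ref"), "\n";

# lvalue list context: see whether the outer RHS reaches $a, $x or @b
( (\$a, \(@b)) = (\$x, \$y, \$z) ) = (10, 20, 30);
print "a=$a x=$x b=@b\n";
```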

So my vague proposal is that we should change the behaviour of refaliasing
in list lvalue/rvalue context so that the returned list is a list of
the aliased SVs, not a list of temporary refs to aliased SVs.

Alternatively we could just croak in list context (either at compile time,
or at runtime for something like sub f { ...; \(@a) = .... }). Or we could
just decree that aassign returns the empty list in list context, or ... ?

But I'm mostly confused by the whole thing at the moment, and I've been
staring at the code in Perl_pp_aassign for days now.

--
I don't want to achieve immortality through my work...
I want to achieve it through not dying.
-- Woody Allen
Re: refaliasing list assignment in list context
Hi there,

On Thu, 21 Sep 2023, Dave Mitchell wrote:

> ... in:
>
> ( (\$a, \(@b)) = (\$x, \$y, \$z) ) = @m;
>
> should the outer list assign be equivalent to:
> (1)...
> (2)...
> (3)...
> Or something else?

For the above, personally I'd prefer a compile time error.

With syntax so debatable I'll never do anything like that.

The intent of code should be both obvious and unambiguous.

I can't see the point of condensing something into one line if you
have to spend twenty minutes figuring out what it's supposed to do,
when you could do it in a few lines and make sure that there could
never be the slightest doubt about it.

--

73,
Ged.
Re: refaliasing list assignment in list context
On Thu, Sep 21, 2023 at 08:15:06PM +0100, G.W. Haywood via perl5-porters wrote:
> Hi there,
>
> On Thu, 21 Sep 2023, Dave Mitchell wrote:
>
> > ... in:
> >
> > ( (\$a, \(@b)) = (\$x, \$y, \$z) ) = @m;
> >
> > should the outer list assign be equivalent to:
> > (1)...
> > (2)...
> > (3)...
> > Or something else?
>
> For the above, personally I'd prefer a compile time error.
>
> With syntax so debatable I'll never do anything like that.
>
> The intent of code should be both obvious and unambiguous.
>
> I can't see the point of condensing something into one line if you
> have to spend twenty minutes figuring out what it's supposed to do,
> when you could do it in a few lines and make sure that there could
> never be the slightest doubt about it.

Note that the nested list assigns are just one way of applying list
context - there are others. And with some, you won't know what the context
is until runtime.

foo( \(@b) = .... ); # refalias list assign in lvalue list context


sub bar {
    .....;
    \(@b) = ...; # last statement in sub - context supplied by caller
}
bar(); # ok?
my $nelems = bar(); # ok?
my @a = bar(); # runtime error???

We could make all these compile-time errors, but should we?
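
A minimal sketch of why the context of a sub's last statement is only known at run time. It uses wantarray; the @seen log is a hypothetical helper added for illustration:

```perl
use strict;
use warnings;

my @seen;   # hypothetical log of the contexts bar() was called in

sub bar {
    # wantarray reports the context supplied by the caller:
    # undef for void, false for scalar, true for list
    push @seen, !defined(wantarray) ? 'void'
              : wantarray           ? 'list'
              :                       'scalar';
    return;
}

bar();                 # void context
my $nelems = bar();    # scalar context
my @a      = bar();    # list context

print "@seen\n";       # void scalar list
```

The same sub body sees three different contexts, so a check keyed on list context cannot always be made when bar() itself is compiled.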

--
"Strange women lying in ponds distributing swords is no basis for a system
of government. Supreme executive power derives from a mandate from the
masses, not from some farcical aquatic ceremony."
-- Dennis, "Monty Python and the Holy Grail"
Re: refaliasing list assignment in list context
On 2023-09-22 00:16, Dave Mitchell wrote:
> On Thu, Sep 21, 2023 at 08:15:06PM +0100, G.W. Haywood via perl5-porters wrote:
>> On Thu, 21 Sep 2023, Dave Mitchell wrote:
>>
>>> ... in:
>>>
>>> ( (\$a, \(@b)) = (\$x, \$y, \$z) ) = @m;
>>>
>>> should the outer list assign be equivalent to:
>>> (1)...
>>> (2)...
>>> (3)...
>>> Or something else?

Thanks for addressing this.


>> For the above, personally I'd prefer a compile time error.
>> With syntax so debatable I'll never do anything like that.
>> The intent of code should be both obvious and unambiguous.

Yes please.


>> I can't see the point of condensing something into one line if you
>> have to spend twenty minutes figuring out what it's supposed to do,
>> when you could do it in a few lines and make sure that there could
>> never be the slightest doubt about it.
> Note that the nested list assigns are just one way of applying list
> context - there are others. And with some, you won't know what the context
> is until runtime.
>
> foo( \(@b) = .... ); # refalias list assign in lvalue list context
>
>
> sub bar {
> .....;
> \(@b) = ...; # last statement in sub - context supplied by caller
> }
> bar(); # ok?
> my $nelems = bar(); # ok?
> my @a = bar(); # runtime error???
>
> We could make all these compile-time errors, but should we?

I also prefer to make it a compile error. The "not until runtime" hit a
final nail.


-- Ruud

P.S. Shouldn't perltrap also mention the ones that still bite after many
years?
Re: refaliasing list assignment in list context
On Fri, Sep 22, 2023 at 10:49:40AM +0200, Ruud H.G. van Tol via perl5-porters wrote:
>
> On 2023-09-22 00:16, Dave Mitchell wrote:
> > Note that the nested list assigns are just one way of applying list
> > context - there are others. And with some, you won't know what the context
> > is until runtime.
> >
> > foo( \(@b) = .... ); # refalias list assign in lvalue list context
> >
> >
> > sub bar {
> > .....;
> > \(@b) = ...; # last statement in sub - context supplied by caller
> > }
> > bar(); # ok?
> > my $nelems = bar(); # ok?
> > my @a = bar(); # runtime error???
> >
> > We could make all these compile-time errors, but should we?
>
> I also prefer to make it a compile error. The "not until runtime" hit a
> final nail.

I'm not sure I follow you - that's the standard behaviour of the last
statement in a function, where the context is determined by the caller.
That is the behaviour of all last lines, not just ref aliasing.
I included it as an example because it's easy to say "make it a compile-
time error if it's in list context", but one needs to be careful to think
through all the implications of such a stance.

So it may well be that you and others are proposing the following:

That assignments to refs (the 'refaliasing' experimental facility) such as

\$x = \$y;
\@a = \@b;
(\$x, \@a) = (\$y, \@b);
(\$x, \(@a)) = (\$y, \$p, \$q, \$r);

should all become compile-time errors if they appear in anything other
than compile-time void or scalar context? So these would all be
compile-time errors:

@m = (\$x = \$y);

(\$x = \$y) = @m;

foo(\$x = \$y);

sub bar {
    ....;
    \$x = \$y;
}

Is that what you intend?

> P.S. Shouldn't perltrap also mention the ones that still bite after many
> years?

What are you referring to here? The quirks of refaliasing? Or the fact
that the context of the last statement in a sub is determined at runtime?
Or what?

--
This is a great day for France!
-- Nixon at Charles De Gaulle's funeral
Re: refaliasing list assignment in list context
On Thu, 21 Sep 2023 19:39:38 +0100 Dave Mitchell <davem@iabyn.com> wrote:

> (1)
> $a = $m[0];
> $b[0] = $m[1];
> $b[1] = $m[2];

should be written as:

( (\$a, \(@b)) = (\$x, \$y, \$z) ) = @m; # it is your example

> (2)
> \$a = $m[0]; # croaks if $m[i] isn't a ref
> \$b[0] = $m[1];
> \$b[1] = $m[2];

should be written as:

\( (\$a, \(@b)) = (\$x, \$y, \$z) ) = @m;

> (3)
> \$a = \$m[0];
> \$b[0] = \$m[1];
> \$b[1] = \$m[2];

should be written as:

\( (\$a, \(@b)) = (\$x, \$y, \$z) ) = \(@m);

Perl has the \(...) construct to specify reference assignment semantics for
the elements it surrounds, and it lacks a construct to remove
reference assignment semantics. So there is no choice whatsoever. The inner
assignment (which is to references) returns aliased variables (not
references). When one wants to apply reference assignment to the result,
one uses the \(...) construct. This also makes explicit what the
programmer wants.

> foo( (\$a, \(@b)) = (\$x, \$y, \$z) );
>
> in that case I would expect the args passed to foo() to be $a, $b[0],
> $b[1] rather than \$a etc. So (1) it is.

Agree with that.

> So my vague proposal is that we should change the behaviour of refaliasing
> in list lvalue/rvalue context so that the returned list is a list of
> the aliased SVs, not a list of temporary refs to aliased SVs.

Agree with that.

> Alternatively we could just croak in list context (either at compile time,
> or at runtime for something like sub f { ...; \(@a) = .... }). Or we could
> just decree that aassign returns the empty list in list context, or ... ?

If it could be implemented, I would rather have the behavior described above.

--
Ivan Vorontsov <ivrntsv@yandex.ru>
Re: refaliasing list assignment in list context
On Wed, 27 Sep 2023 09:05:50 +0300 Ivan Vorontsov <ivrntsv@yandex.ru> wrote:

> On Thu, 21 Sep 2023 19:39:38 +0100 Dave Mitchell <davem@iabyn.com> wrote:
>
> > (1)
> > $a = $m[0];
> > $b[0] = $m[1];
> > $b[1] = $m[2];
>
> should be written as:
>
> ( (\$a, \(@b)) = (\$x, \$y, \$z) ) = @m; # it is your example
>
> > (2)
> > \$a = $m[0]; # croaks if $m[i] isn't a ref
> > \$b[0] = $m[1];
> > \$b[1] = $m[2];
>
> should be written as:
>
> \( (\$a, \(@b)) = (\$x, \$y, \$z) ) = @m;
>
> > (3)
> > \$a = \$m[0];
> > \$b[0] = \$m[1];
> > \$b[1] = \$m[2];
>
> should be written as:
>
> \( (\$a, \(@b)) = (\$x, \$y, \$z) ) = \(@m);
>
> Perl has the \(...) construct to specify reference assignment semantics for
> the elements it surrounds, and it lacks a construct to remove
> reference assignment semantics. So there is no choice whatsoever. The inner
> assignment (which is to references) returns aliased variables (not
> references). When one wants to apply reference assignment to the result,
> one uses the \(...) construct. This also makes explicit what the
> programmer wants.
>
> > foo( (\$a, \(@b)) = (\$x, \$y, \$z) );
> >
> > in that case I would expect the args passed to foo() to be $a, $b[0],
> > $b[1] rather than \$a etc. So (1) it is.
>
> Agree with that.
>
> > So my vague proposal is that we should change the behaviour of refaliasing
> > in list lvalue/rvalue context so that the returned list is a list of
> > the aliased SVs, not a list of temporary refs to aliased SVs.
>
> Agree with that.
>
> > Alternatively we could just croak in list context (either at compile time,
> > or at runtime for something like sub f { ...; \(@a) = .... }). Or we could
> > just decree that aassign returns the empty list in list context, or ... ?
>
> If it could be implemented, I would rather have the behavior described above.

I was wrong here. My examples were too narrow: \(...) doesn't work when
references and non-references are mixed.

A more intricate example:

use feature 'refaliasing';
no warnings 'experimental';
use Data::Dumper ();

my ($x, $y, $z);
my ($v1, $v2, $v3);
my ($v4, $v5, $v6) = qw(a b c);
my ($v7, $v8, $v9) = (1, 2, 3);
( ($x, \$v1, \$v2, $y, \$v3, $z) = (11, \$v4, \$v5, 12, \$v6, 13) ) = (21, \$v7, \$v8, 22, \$v9, 23);
print(Data::Dumper->new(
    [$x, $y, $z, \$v1, \$v2, \$v3, \$v4, \$v5, \$v6, \$v7, \$v8, \$v9],
    [qw(x y z v1 v2 v3 v4 v5 v6 v7 v8 v9)],
)->Dump);

Result:

$x = 21;
$y = 22;
$z = 23;
$v1 = \'a';
$v2 = \'b';
$v3 = \'c';
$v4 = $v1; # they are aliases, that's how Dumper depicts it
$v5 = $v2;
$v6 = $v3;
$v7 = \1;
$v8 = \2;
$v9 = \3;

What I would expect:

$x = 21;
$y = 22;
$z = 23;
$v1 = \1;
$v2 = \2;
$v3 = \3;
$v4 = $v1;
$v5 = $v2;
$v6 = $v3;
$v7 = $v1;
$v8 = $v2;
$v9 = $v3;

The inner assignment should return references as references, and the outer
assignment should then demand references and alias them in turn.
Reference assignment is "poisonous": it is references all the way.

In function call context:

> foo( (\$a, \(@b)) = (\$x, \$y, \$z) );

foo gets \$a, \$b[0], \$b[1]. If one wants different behavior, two
statements will work:

(\$a, \(@b)) = (\$x, \$y, \$z);
foo($a, @b);

--
Ivan Vorontsov <ivrntsv@yandex.ru>
Re: refaliasing list assignment in list context
On Thu, Sep 28, 2023 at 09:08:48AM +0300, Ivan Vorontsov wrote:
> Inner assignment should return references as references, then outer
> assignment should demand references to assign to them to alias.
> Reference assignment is "poisonous". There are references all the way.

I've been thinking about this some more. For an aliasing list assignment
in *rvalue* list context, I think the current behaviour is good. In lvalue
context, it's weird - but I think it's a rare enough case not to put in the
extra work to make it have a 'useful' behaviour (for some definition of
'useful'). So we should just document the current behaviour.

In more detail:

The general principle of a list assignment in list context is that it
returns the LHS, as if you'd written it out separately. So as a simple
example,

foo( ($a, @b) = ($c, $d, $e) );

could be written equivalently as:

($a, @b) = ($c, $d, $e);
foo($a, @b);

For aliasing in rvalue context, this principle still works:

@x = ( (\$a, \(@b)) = ($c, $d, $e) );

could be written equivalently as:

(\$a, \(@b)) = ($c, $d, $e);
@x = (\$a, \(@b)); # or @x = (\$a, \$b[0], \$b[1]);

Note that in the second assign, there's no aliasing behaviour, instead @x
just gets copies of a bunch of refs.

The current behaviour is that aliases in list context just return a
temporary reference to each aliased element. This behaviour works well
with the rvalue example above.

Note however that \ on the LHS of a list assign *doesn't* literally
generate a temporary ref - it's just syntax that tells the parser this is
an aliasing operation. The actual code behind the scenes uses the lvref
and lvavref ops to generate special magic values or push a special NULL
value on the stack to tell pp_aassign to handle things specially.

Thus in *lvalue* context, the aliasing behaviour isn't propagated. So in:

( (\$a, \(@b)) = ($c, $d, $e) ) = @x;

the elements of @x are just (uselessly) assigned to temporary references.
It's equivalent to:

(\$a, \(@b)) = ($c, $d, $e);
{
    my $t1 = \$a;
    my $t2 = \$b[0];
    my $t3 = \$b[1];
    $t1 = $x[0];
    $t2 = $x[1];
    $t3 = $x[2];
}

This behaviour is useless but understandable once you accept the principle
that the list assign returns a list of temporary references.

I think trying to make aliasing cascade across nested list assignments
would be hard. It would create an inconsistency when a function is called.
In:

foo( (\$x, \$y) = (\$a, \$b));
sub foo {
    $_[0] = \$c;
    $_[1] = \$d;
}

should the assignments to @_ elements just (uselessly) assign values to
temporary refs (current behaviour), or is some sort of aliasing occurring?
I think it should be useless assignment - making it do aliasing would
involve some sort of complex and spooky action-at-a-distance.

If foo() gets just plain references with useless assignments, I think the
nested assignment should too.

--
The Enterprise successfully ferries an alien VIP from one place to another
without serious incident.
-- Things That Never Happen in "Star Trek" #7
Re: refaliasing list assignment in list context
On Wed, 18 Oct 2023 21:16:23 +0100 Dave Mitchell <davem@iabyn.com> wrote:

> I've been thinking about this some more. For an aliasing list assignment
> in *rvalue* list context, I think the current behaviour is good. In lvalue
> context, it's weird - but I think it's a rare enough case not to put in the
> extra work to make it have a 'useful' behaviour (for some definition of
> 'useful'). So we should just document the current behaviour.

I agree with that.

> I think trying to make aliasing cascade across nested list assignments
> would be hard. It would create an inconsistency when a function is called.
> In:
>
> foo( (\$x, \$y) = (\$a, \$b));
> sub foo {
> $_[0] = \$c;
> $_[1] = \$d;
> }
>
> should the assignments to @_ elements just (uselessly) assign values to
> temporary refs (current behaviour), or is some sort of aliasing occurring?
> I think it should be useless assignment - making it do aliasing would
> involve some sort of complex and spooky action-at-a-distance.

@_ should alias only scalars (as now), not mixing references. I prefer
explicit actions. I can imagine something like:

foo( (\$x, \$y) = (\$a, \$b));
sub foo {
    \${$_[0]} = \$c; # aliasing through supplied reference
    \${$_[1]} = \$d;
}

or:

foo( (\$x, \$y) = (\$a, \$b));
sub foo {
    $_[0]->\$* = \$c; # same as above with postfix syntax
    $_[1]->\$* = \$d;
}

or maybe even:

foo( (\$x, \$y) = (\$a, \$b));
sub foo {
    $_[0]->$* \= $c; # it's aliasing assignment operator, a whole different beast
    $_[1]->$* \= $d;
}

All of the above is just speculation on my part. Note the explicitness;
that is what I would prefer. But I'm not pushing this idea (maybe it is
even a different feature from the one discussed in this thread). Consider
it thought sharing. I can't imagine what it would take from the
implementor's point of view.

As for the current refaliasing behavior, my conclusion is to keep it
usefully simple, with proper documentation of what to expect (and what
not to).

--
Ivan Vorontsov <ivrntsv@yandex.ru>