Mailing List Archive

proposal: deprecate exists on array elements
Porters,

Every once in a while, somebody suggests that "exists $arr[3]" can be useful. This is almost never true, and most often based on a misapprehension of what it means. (No, it's not a sparse array. No, it's not a useful test of array length.)

I think it's time to deprecate this behavior so it can be fatalized, as "defined @array" finally was. The reasoning here is "this feature is not useful, and is providing a leak into our abstraction. It confuses experts and new users alike, and is unlikely to be performing important work in real code."

This is a "What objections, if any, still stand in the way of doing this?" email.

--
rjbs
Re: proposal: deprecate exists on array elements
On Fri, Sep 3, 2021 at 12:52 PM Ricardo Signes <perl.p5p@rjbs.manxome.org>
wrote:

> Porters,
>
> Every once in a while, somebody suggests that "exists $arr[3]" can be
> useful. This is almost never true, and most often based on a
> misapprehension of what it means. (No, it's not a sparse array. No, it's
> not a useful test of array length.)
>
> I think it's time to deprecate this behavior so it can be fatalized, as
> "defined @array" finally was. The reasoning here is "this feature is not
> useful, and is providing a leak into our abstraction. It confuses experts
> and new users alike, and is unlikely to be performing important work in
> real code."
>
> This is a "What objections, if any, still stand in the way of doing this?"
> email.
>

I have no objection except that 'delete $arr[3]' should be deprecated in
the same sweep, as these two functionalities are codependent.

-Dan
Re: proposal: deprecate exists on array elements
Quoth Ricardo Signes <perl.p5p@rjbs.manxome.org>:
> (No, it's not a sparse array. No, it's not a useful test of array length.)

Perl's _built-in_ arrays aren't sparse. A _tied_ array definitely can be.
Forcing all users of such to switch from
exists($a[$i])
to something like
tied(@a)->exists($i)
seems like a bad idea.
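
To make the tied-array point concrete, here is a minimal sketch of a tied array whose `exists` is meaningful. The `SparseArray` class and its `slot`/`size` fields are hypothetical names invented for this illustration; only the `TIEARRAY`, `FETCH`, `STORE`, `FETCHSIZE`, `STORESIZE`, `EXISTS` and `DELETE` method names come from Perl's tie interface.

```perl
use strict;
use warnings;

# Hypothetical class for illustration: a hash holds only the populated
# slots, and a separate counter tracks the logical size of the array.
package SparseArray;

sub TIEARRAY  { my ($class) = @_; return bless { slot => {}, size => 0 }, $class }
sub FETCH     { my ($self, $i) = @_; return $self->{slot}{$i} }
sub STORE     {
    my ($self, $i, $value) = @_;
    $self->{slot}{$i} = $value;
    $self->{size} = $i + 1 if $i >= $self->{size};
    return;
}
sub FETCHSIZE { my ($self) = @_; return $self->{size} }
sub STORESIZE { my ($self, $n) = @_; $self->{size} = $n; return }
sub EXISTS    { my ($self, $i) = @_; return exists $self->{slot}{$i} }
sub DELETE    { my ($self, $i) = @_; return delete $self->{slot}{$i} }

package main;

tie my @a, 'SparseArray';
$a[1_000_000] = 'far away';    # stores a single slot in the backing hash

my $far   = exists($a[1_000_000]) ? 1 : 0;    # dispatched to EXISTS
my $near  = exists($a[5])         ? 1 : 0;    # this index was never stored
my $slots = scalar keys %{ tied(@a)->{slot} };

print "far=$far near=$near size=", scalar(@a), " slots=$slots\n";
```

With such a class, `exists($a[$i])` is the natural spelling; without it, callers would have to reach through `tied(@a)` and call the method by name.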


/Bo Lindbergh
Re: proposal: deprecate exists on array elements
On Fri, Sep 03, 2021 at 12:58:41PM -0400, Dan Book wrote:
> On Fri, Sep 3, 2021 at 12:52 PM Ricardo Signes <perl.p5p@rjbs.manxome.org>
> wrote:
>
> > Porters,
> >
> > Every once in a while, somebody suggests that "exists $arr[3]" can be
> > useful. This is almost never true, and most often based on a
> > misapprehension of what it means. (No, it's not a sparse array. No, it's
> > not a useful test of array length.)
> >
> > I think it's time to deprecate this behavior so it can be fatalized, as
> > "defined @array" finally was. The reasoning here is "this feature is not
> > useful, and is providing a leak into our abstraction. It confuses experts
> > and new users alike, and is unlikely to be performing important work in
> > real code."
> >
> > This is a "What objections, if any, still stand in the way of doing this?"
> > email.
> >
>
> I have no objection except that 'delete $arr[3]' should be deprecated in
> the same sweep, as these two functionalities are codependent.

I think that deprecating the two together makes sense, because what does
`delete $arr[3]` "mean" if `exists $arr[3]` is no longer allowed? It seems
to just end up being an obfuscated form of undef.
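
A small sketch of that point, assuming an interior (not last) element of a plain array; the variable names are made up for the illustration. The only observable difference between the two operations here is what `exists` reports afterwards.

```perl
use strict;
use warnings;

my @by_delete = ('a', 'b', 'c');
my @by_undef  = ('a', 'b', 'c');

delete $by_delete[1];     # interior element: the array keeps its length
$by_undef[1] = undef;

my $same_length = (@by_delete == @by_undef) ? 1 : 0;
my $both_undef  = (!defined($by_delete[1]) && !defined($by_undef[1])) ? 1 : 0;

# Without exists, nothing below could tell the two arrays apart.
my $exists_after_delete = exists($by_delete[1]) ? 1 : 0;
my $exists_after_undef  = exists($by_undef[1])  ? 1 : 0;

print "same_length=$same_length both_undef=$both_undef ",
      "exists_after_delete=$exists_after_delete ",
      "exists_after_undef=$exists_after_undef\n";
```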


As to exists and delete on arrays, the insanity comes from the combination of
the two, sparse assignment and pop, because Perl doesn't really have sparse
arrays, and pop exposes this. You get the situation where you can delete an
element that does not exist and the array changes size:


$ cat sparse.pl
my @a;

sub show_array {
    my $desc = shift;
    my @status;
    for (my $i = 0; $i < @a; ++$i) {
        push(@status, defined($a[$i]) ? "'$a[$i]'" :
                      exists($a[$i])  ? '<undef>' :
                                        '...');
    }
    printf "%-21s: length=%u \@a = ( %s )\n",
        $desc, scalar(@a), join(', ', @status);
}

show_array("1. start");
$a[0] = "foo";
$a[3] = "bar";
show_array('2. set $a[0] & $a[3]');
pop @a;
show_array('3. popped $a[3]');
die if exists $a[2] || defined $a[2];
delete $a[2];
show_array('4. have deleted $a[2]');

__END__

$ perl sparse.pl
1. start             : length=0 @a = (  )
2. set $a[0] & $a[3] : length=4 @a = ( 'foo', ..., ..., 'bar' )
3. popped $a[3]      : length=3 @a = ( 'foo', ..., ... )
4. have deleted $a[2]: length=1 @a = ( 'foo' )


After step 3, $a[2] doesn't exist and isn't defined, yet deleting it changes
the size of the array!


Nicholas Clark
Re: proposal: deprecate exists on array elements
On 2021/09/03 09:50, Ricardo Signes wrote:
> Porters,
>
> Every once in a while, somebody suggests that "exists $arr[3]" can be
> useful. This is almost never true, and most often based on a
> misapprehension of what it means. (No, it's not a sparse array. No,
> it's not a useful test of array length.)
----
Reason for deprecation? Because usefulness is rare?
But it is useful as to whether or not it has been set in the
prog below. Just because roman numerals are almost never
used, do we deprecate them for clocks and numbering lists?
"almost 0" isn't the same as 0. Also, would you also deprecate
the test for indirect arrays? I know that test is used on CPAN.

Prog:
#!/usr/bin/perl
use warnings; use strict; use P;
our @results=();
sub isnum($) {
    defined($_[0]) && $_[0] =~ m{\d+};
}
my @tst0=(qw(1 0 1));
my @tst1=(qw(1 0 0 - 0 - 1));
my @tst2=(qw(1 0 0 1 0 ));
my @tstN=(qw(1 0 0 0 - - 0 1));
sub chkres(){
    #records vals in @results
}
sub set($){ my $tn=shift;
    my $c=0; for (@$tn) { $results[$c]=$_ if isnum($_); ++$c; }
}
sub clr($){ my $tn=shift;
    my $c=0; for (@$tn) { $results[$c]=undef if isnum($_); ++$c; }
}

sub show_cov(;$) {
    if (@_) {
        printf "after tst %s\n", $_[0];
    }
    P "highest testcase=%s; ", scalar @results;
    my $cnt=0; my $out="";
    for (@results) { $out.=P "%s ",$cnt unless exists $results[$cnt++]; }
    P "nothing for case: %s", $out if $out;
}
set(\@tst0); clr(\@tst0);
show_cov(0);
set(\@tst1); clr(\@tst1);
show_cov(1);
set(\@tst2); clr(\@tst2);
show_cov(2);
set(\@tstN); clr(\@tstN);
show_cov('N');
Re: proposal: deprecate exists on array elements
On Fri, 03 Sep 2021 12:50:34 -0400
"Ricardo Signes" <perl.p5p@rjbs.manxome.org> wrote:

> Porters,
>
> Every once in a while, somebody suggests that "exists $arr[3]" can be useful. This is almost never true, and most often based on a misapprehension of what it means. (No, it's not a sparse array. No, it's not a useful test of array length.)
>
> I think it's time to deprecate this behavior so it can be fatalized, as "defined @array" finally was. The reasoning here is "this feature is not useful, and is providing a leak into our abstraction. It confuses experts and new users alike, and is unlikely to be performing important work in real code."
>
> This is a "What objections, if any, still stand in the way of doing this?" email.

My personal opinion:

- delete($array[$n]) should be either removed *or* redefined as
"= undef". I *think* I prefer the former.

- exists($array[$n]) should be redefined as "$#array >= $n".

This way we would remove pseudo sparse arrays without compromising the
consistency of the language.
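
As a rough illustration of how the suggested redefinition would differ from today's behaviour, the sketch below compares the current result of `exists` with the expression `$#array >= $n` for a small array with a gap. It only covers non-negative indices, and the variable names are invented for the example.

```perl
use strict;
use warnings;

my @array;
$array[3] = 'x';      # indices 0..2 are never assigned

my (@current, @proposed);
for my $n (0 .. 4) {
    push @current,  exists($array[$n]) ? 1 : 0;   # today's behaviour
    push @proposed, ($#array >= $n)    ? 1 : 0;   # the suggested meaning
}

print "current:  @current\n";
print "proposed: @proposed\n";
```

The two disagree only on the never-assigned slots below the last index, which is exactly the pseudo-sparse state the redefinition would hide.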
Re: proposal: deprecate exists on array elements
On Sat, Sep 25, 2021, at 2:35 PM, L A Walsh wrote:
> On 2021/09/03 09:50, Ricardo Signes wrote:
> > Porters,
> >
> > Every once in a while, somebody suggests that "exists $arr[3]" can be
> > useful. This is almost never true, and most often based on a
> > misapprehension of what it means. (No, it's not a sparse array. No,
> > it's not a useful test of array length.)
> ----
> Reason for deprecation? Because usefulness is rare?

Because it confuses experts and new users alike, as I said in a section you did not quote.

You included a sizable hunk of code, but didn't indicate what the point of its use of exists on an array element was.

Also, what does "the test for indirect arrays" mean? This isn't a term that exists in the documentation.

--
rjbs
Re: proposal: deprecate exists on array elements
On Sun, 26 Sept 2021 at 10:29, Ricardo Signes <perl.p5p@rjbs.manxome.org>
wrote:

> On Sat, Sep 25, 2021, at 2:35 PM, L A Walsh wrote:
>
> On 2021/09/03 09:50, Ricardo Signes wrote:
> > Every once in a while, somebody suggests that "exists $arr[3]" can be
> > useful. This is almost never true, and most often based on a
> > misapprehension of what it means. (No, it's not a sparse array. No,
> > it's not a useful test of array length.)
> ----
> Reason for deprecation? Because usefulness is rare?
>
>
> Because it confuses experts and new users alike, as I said in a section
> you did not quote.
>

There are quite a few other confusing features in Perl - I think there's
even a document of "Perl quirks" going around at the moment with a list of
them! - so "can cause confusion" on its own is perhaps not the most
compelling reason?

Personally, I've used delete/exists on fixed-size Perl arrays a few times
in the past. The relevance for tied arrays has already been raised, so I'll
ignore that for now. Instead, here's one example for practical uses in pure
Perl - I can probably dig out a few more if this one isn't enough...

Having extra exists vs. defined state is useful for caching: it allows
differentiation between "it might be in the backing store, we haven't
checked yet" and "we have confirmed the absence of a value".

Yes, you can implement this in other ways, such as a placeholder object,
but compare that with the simplicity of the current approach:

use Object::Pad;
use Future::AsyncAwait;
class Cache::Array {
    has @cache;
    has $storage;
    async method get($id) {
        return $cache[$id] if exists $cache[$id];
        $self->after(seconds => 30, async method { delete $cache[$id] });
        return $cache[$id] = await $storage->get($id);
    }
}

This also provides excellent symmetry with hash functionality:

use Object::Pad;
use Future::AsyncAwait;
class Cache::Hash {
    has %cache;
    has $storage;
    async method get($id) {
        return $cache{$id} if exists $cache{$id};
        $self->after(seconds => 30, async method { delete $cache{$id} });
        return $cache{$id} = await $storage->get($id);
    }
}

Same concepts - near-identical code: only the punctuation changes.

I think that's a valuable feature, particularly considering the increased
alignment between hash and array functionality in recent years: keys(),
values() and kvslices for example.
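
For readers who have not used the array forms, a short sketch of the parallel being described. It assumes Perl 5.20 or later, since index/value slices arrived in that release; the `@list` and `%map` names are only for illustration.

```perl
use v5.20;
use warnings;

my @list = ('red', 'green', 'blue');
my %map  = (0 => 'red', 1 => 'green', 2 => 'blue');

my @array_keys   = keys @list;            # the indices, in order
my @hash_keys    = sort keys %map;        # the same set, once sorted
my @array_values = values @list;

my @array_pairs  = %list[0, 2];           # index/value slice of the array
my @hash_pairs   = %map{0, 2};            # key/value slice of the hash

say "array keys:  @array_keys";
say "hash keys:   @hash_keys";
say "array pairs: @array_pairs";
say "hash pairs:  @hash_pairs";
```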

So, what do we gain by removing delete/exists on arrays? Seems that'd be
important additional information to guide decisions - don't we have a
process for this already? Some typical reasons for removing a feature might
include:

- improved performance? how much difference might we expect to see?
- easier code maintenance? can we quantify this somehow? maybe links to the
relevant areas of concern?
- is it standing in the way of any proposed refactoring or new features?
are there any options for allowing the current behaviour to work alongside
such changes?
Re: proposal: deprecate exists on array elements
On Sun, Sep 26, 2021 at 12:02:18PM +0800, Tom Molesworth via perl5-porters wrote:
> On Sun, 26 Sept 2021 at 10:29, Ricardo Signes <perl.p5p@rjbs.manxome.org>
> wrote:

> > Because it confuses experts and new users alike, as I said in a section
> > you did not quote.
> >
>
> There are quite a few other confusing features in Perl - I think there's
> even a document of "Perl quirks" going around at the moment with a list of
> them! - so "can cause confusion" on its own is perhaps not the most
> compelling reason?
>
> Personally, I've used delete/exists on fixed-size Perl arrays a few times
> in the past. The relevance for tied arrays has already been raised, so I'll
> ignore that for now. Instead, here's one example for practical uses in pure
> Perl - I can probably dig out a few more if this one isn't enough...

I can see that it has some use on fixed-size arrays. But Perl can't constrain
an array to stay a fixed size, *and* delete on an array is defined to change
the size in certain conditions.

Here is the message I sent 2 weeks ago, as part of this thread, that no-one
responded to:


As to exists and delete on arrays, the insanity comes from the combination of
the two, sparse assignment and pop, because Perl doesn't really have sparse
arrays, and pop exposes this. You get the situation where you can delete an
element that does not exist and the array changes size:


$ cat sparse.pl
my @a;

sub show_array {
    my $desc = shift;
    my @status;
    for (my $i = 0; $i < @a; ++$i) {
        push(@status, defined($a[$i]) ? "'$a[$i]'" :
                      exists($a[$i])  ? '<undef>' :
                                        '...');
    }
    printf "%-21s: length=%u \@a = ( %s )\n",
        $desc, scalar(@a), join(', ', @status);
}

show_array("1. start");
$a[0] = "foo";
$a[3] = "bar";
show_array('2. set $a[0] & $a[3]');
pop @a;
show_array('3. popped $a[3]');
die if exists $a[2] || defined $a[2];
delete $a[2];
show_array('4. have deleted $a[2]');

__END__

$ perl sparse.pl
1. start             : length=0 @a = (  )
2. set $a[0] & $a[3] : length=4 @a = ( 'foo', ..., ..., 'bar' )
3. popped $a[3]      : length=3 @a = ( 'foo', ..., ... )
4. have deleted $a[2]: length=1 @a = ( 'foo' )


After step 3, $a[2] doesn't exist and isn't defined, yet deleting it changes
the size of the array!


Nicholas Clark
Re: proposal: deprecate exists on array elements
On Sun, 26 Sept 2021 at 15:10, Nicholas Clark <nick@ccl4.org> wrote:

> On Sun, Sep 26, 2021 at 12:02:18PM +0800, Tom Molesworth via perl5-porters
> wrote:
> > On Sun, 26 Sept 2021 at 10:29, Ricardo Signes <perl.p5p@rjbs.manxome.org
> >
> > wrote:
>
> > > Because it confuses experts and new users alike, as I said in a section
> > > you did not quote.
> > >
> >
> > There are quite a few other confusing features in Perl - I think there's
> > even a document of "Perl quirks" going around at the moment with a list
> of
> > them! - so "can cause confusion" on its own is perhaps not the most
> > compelling reason?
> >
> > Personally, I've used delete/exists on fixed-size Perl arrays a few times
> > in the past. The relevance for tied arrays has already been raised, so
> I'll
> > ignore that for now. Instead, here's one example for practical uses in
> pure
> > Perl - I can probably dig out a few more if this one isn't enough...
>
> I can see that it has some use on fixed-size arrays. But Perl can't
> constrain
> an array to stay a fixed size, *And* delete on an array is defined to
> change
> the size in certain conditions.
>
> Here is the message I sent 2 weeks ago, as part of this thread, that no-one
> responded to:
>

Yes, that message was partly why I sent this example: the abstraction does
indeed leak when considering array size, and in other places as well (e.g.
keys @x after a delete still shows the full 0..$#x range of indices, push
appends items without using deleted slots, etc.). Despite this, many useful
scenarios still exist where the size information is not important. There
may still be good reasons for removing the code, of course: I'm just
surprised not to see much discussion of counter-arguments first.
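
The two leaks mentioned in parentheses can be seen in a few lines. This is a minimal sketch with invented variable names, using a plain (untied) array.

```perl
use strict;
use warnings;

my @x = ('a', 'b', 'c', 'd');
delete $x[1];                       # interior delete: the size is unchanged

my @indices = keys @x;              # still lists every index, deleted or not

push @x, 'e';                       # appended at the end, not into slot 1
my $last_index   = $#x;
my $slot1_exists = exists($x[1]) ? 1 : 0;

print "indices=@indices last_index=$last_index slot1_exists=$slot1_exists\n";
```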

Anyway, the current delete/pop behaviour is quite logical, if perhaps
somewhat unintuitive:

- `pop @x` removes and returns *one* item (if the size is nonzero)
- `delete $x[$#x]` is similar, but is also free to trim nonexistent elements
from the end of the array

I find both of these behaviours useful in their own way - both the strict
delta-at-most-1 guarantee of `@post_x == max(0, @pre_x - 1)` that `pop @x`
provides, and the "clean up any leftover junk" that delete offers. You
wouldn't expect `pop` or `push` to change the size by more than 1, but it
seems reasonable to have less strict expectations of `delete`.
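
The difference between the two guarantees can be sketched as follows. The `gappy` helper is a made-up name for this illustration; it returns a reference so that the never-assigned slots are not filled in by a list copy.

```perl
use strict;
use warnings;

# Hypothetical helper: builds an array with never-assigned slots at 1 and 2.
sub gappy {
    my @x;
    $x[0] = 'foo';
    $x[3] = 'bar';
    return \@x;
}

my $p = gappy();
my $before_pop = @$p;
pop @$p;                     # shrinks by exactly one
my $after_pop = @$p;

my $d = gappy();
my $before_delete = @$d;
delete $d->[ $#$d ];         # also trims the trailing nonexistent slots
my $after_delete = @$d;

print "pop:    $before_pop -> $after_pop\n";
print "delete: $before_delete -> $after_delete\n";
```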
Re: proposal: deprecate exists on array elements
On 2021/09/25 21:02, Tom Molesworth wrote:
> On Sun, 26 Sept 2021 at 10:29, Ricardo Signes
> <perl.p5p@rjbs.manxome.org> wrote:
>
> On Sat, Sep 25, 2021, at 2:35 PM, L A Walsh wrote:
>> On 2021/09/03 09:50, Ricardo Signes wrote:
>> > Every once in a while, somebody suggests that "exists $arr[3]"
>> can be
>> > useful. This is almost never true, and most often based on a
>> > misapprehension of what it means. (No, it's not a sparse
>> array. No,
>> > it's not a useful test of array length.)
>> ----
>> Reason for deprecation? Because usefulness is rare?
>
> Because it confuses experts and new users alike, as I said in a
> section you did not quote.
>
>
> There are quite a few other confusing features in Perl - I think
> there's even a document of "Perl quirks" going around at the moment
> with a list of them! - so "can cause confusion" on its own is perhaps
> not the most compelling reason?
----
I'm in agreement w/Tom... I didn't think that part of your email was
really relevant to the issue.

As for my example -- you seemed to want an example of valid usage.
Off the top of my head, I could use it to see what entries in @results
had not been touched by the existing test sets. I thought it was a
sufficiently trivial and named example (cov=coverage).

As for the big hunk of code -- it is 37 lines, and was not meant to be a
minimal, hard-to-follow case. As for indirect usage -- before claiming it
isn't in the documentation, you might at least check:

perlfunc:
....
If the hook is an array reference, its first element must be a
subroutine reference. This subroutine is called as above, but the
first parameter is the array reference. This lets you indirectly
pass arguments to the subroutine.

It's talking about a syntax similar to the way P's print-to-string
can take an array (unlike sprintf) for its arguments -- put the
format as the 1st element, with the args following. I think you can also
unshift the FH param onto the array. Now you want to talk
"understanding" -- why would perl document that using an array with
sprintf is [almost] never useful (actually it is never used and could
be used). Seems like there is lower-hanging fruit on the Perl tree
that would not cause unnecessary breakage in any CPAN prog, whereas killing
eval's usage on ARRAY's definitely would.

There are other areas that wouldn't break compatibility with existing
CPAN programs where perl *creates* an error rather than doing what
the user meant. Having a language that tells the user that it knew
what the user meant, but then refuses to do it, is a quick way to lose
users -- like using a ref to something as equivalent to the something
where it can. If perl can't figure it out, then it can't. But if perl
*can* figure it out but *won't*, based on some case that doesn't even
exist in the language, that's just churlish.
Re: proposal: deprecate exists on array elements
On 2021/09/26 00:10, Nicholas Clark wrote:
>
> As to exists and delete on arrays, the insanity come from the combination of
> the two, sparse assignment and pop, because Perl doesn't really have sparse
> arrays, and pop exposes this. You get the situation where you can delete an
> element that does not exist and the array changes size:
>
----
Interesting example.
Bash supports sparse arrays through "some" mechanism.
I'm a bit surprised perl doesn't as well. Still it is an
interesting side effect, but with arrays, I always expect
the size to be 1+the last element index. Even though
perl doesn't support sparse arrays, exists is properly saying
that space for that element has never been initialized. However
that doesn't mean it might not have been allocated behind the
scenes as part of perl's internal ARRAY representation.
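
For comparison, one common way to get genuinely sparse behaviour in Perl is a hash keyed by integer index. This is only a sketch of that idiom, not something proposed in this thread, and the variable names are made up.

```perl
use strict;
use warnings;

my %sparse;
my $far = 1_000_000;

$sparse{0}    = 'foo';
$sparse{$far} = 'bar';           # no storage is used for the indices between

my $count   = keys %sparse;                          # populated slots only
my $highest = (sort { $b <=> $a } keys %sparse)[0];  # largest index in use
my $gap     = exists($sparse{500}) ? 1 : 0;          # never stored

delete $sparse{$far};            # removes one entry and nothing else
my $count_after = keys %sparse;

print "count=$count highest=$highest gap=$gap count_after=$count_after\n";
```

Here `exists` and `delete` have their ordinary hash meaning, and there is no notion of array length to leak through.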

I don't think the internal representation of such a structure
should be used as some reason to change the external functionality.
I.e. we wouldn't want the tail to be wagging the dog, would we?
AFAIK, the internals are "internal" and not really part of the
documented or external interface, no?