Mailing List Archive: Pre-RFC: `unknown` versus `undef`

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

Dec 19, 2021, 7:11 AM

Post #26 of 38 (872 views)

On Sun, 19 Dec 2021 12:45:18 +0000 (UTC), Ovid via perl5-porters <perl5-porters@perl.org> wrote:

> On Sunday, 19 December 2021, 13:28:23 CET, Oodler 577 <oodler577@sdf-eu.org> wrote:
>
>
> > Thank you, this is super helpful. My final comment is just to
> > reiterate what I most recently said; as long as this doesn't
> > affect how things currently work with undef/q{}/0 and existing
> > built-ins/ops; and we get a C<unknown> built-in that does for
> > unknown values what C<defined> does for undef'd values,
>
> For interpolation, I would suggest it behave like undef, but with a
> warning. I would (only half-joking here), also consider it to
> stringify to U+FFFD REPLACEMENT CHARACTER.

100% joking: 016844 𖡄 BAMUM LETTER PHASE-A UNKNOWN

> my $name = unknown;
> say "Hello, $name!";
>
> Output:
>
> Use of unknown value $name in say at ...
> Hello, ?!
>
> > As an exercise, I wonder how many use cases for undef would remain
> > if unknown was available. If the answer is "not many", then maybe
> > the answer would be a compatible tweak to undef and not the
> > creation of a new special value. Just a thought...
>
> I would not recommend changing current behavior of undef. That would
> be widespread carnage.
>
> Ovid
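
For contrast with the proposed behaviour, here is a minimal sketch of what interpolating a plain undef does today under "use warnings". The variable names and the warning handler are only for illustration; nothing here implements the proposed `unknown`.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be inspected.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my $name;                           # undef, not the proposed "unknown"
my $greeting = "Hello, $name!";     # warns: Use of uninitialized value $name ...

print "greeting: $greeting\n";
print "warning:  $_" for @warnings;
```

The undef interpolates as an empty string and the warning names the variable, which is the behaviour the proposal suggests mirroring for unknown values.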

--
H.Merijn Brand https://tux.nl Perl Monger http://amsterdam.pm.org/
using perl5.00307 .. 5.33 porting perl5 on HP-UX, AIX, and Linux
https://tux.nl/email.html http://qa.perl.org https://www.test-smoke.org

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

public at khwilliamson

Dec 19, 2021, 7:49 AM

Post #27 of 38 (872 views)

On 12/18/21 16:18, Paul "LeoNerd" Evans wrote:
> There may be a Lewis Carroll quote applicable here...

Shun the frumious bandersnatch?

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

grinnz at gmail

Dec 19, 2021, 8:09 AM

Post #28 of 38 (872 views)

On Sun, Dec 19, 2021 at 7:14 AM Ovid via perl5-porters <
perl5-porters@perl.org> wrote:

> The if/else is actually pretty simple if we step back for a moment. I
> think the confusion is that we misunderstand what an "else" block means in
> Perl. Let's consider this:
>
> if ( $var > 3 ) {
> ...
> }
> else {
> ...
> }
>
> In the above, in the else block, we mentally assume that "$var <= 3"
> holds. In many statically typed languages, that assumption might hold true.
>
> In Perl, $var might be undef and be evaluated as less than three. However,
> $var might be the string "Hello, World". $var might also be a reference to
> a hash, we get absolutely no warning, and we hit our else block with an
> assumption that is probably true ($var <= 3), but not in this particular
> case. We _should_ be verifying what kind of data that $var holds, but
> usually we don't.
>

But this isn't really what's going on here. In Perl, every scalar value is
a number, once you use it as one. Hash references numify to their refaddr.
So this comparison is still perfectly valid and there are no type
conflicts, for any scalar value except those which die when numified.
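
A small sketch of that numification, assuming plain (non-overloaded) values; the variable names are illustrative, and warnings are silenced only so the resulting numbers are easy to see:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $href = {};

# A plain hash reference used as a number yields its address.
my $ref_is_refaddr = ( $href + 0 == refaddr($href) ) ? 1 : 0;

# undef and a non-numeric string both numify to 0. Under "use warnings"
# each would warn, so the relevant categories are disabled in this block.
my ( $undef_num, $string_num );
{
    no warnings qw(uninitialized numeric);
    my $undef  = undef;
    my $string = "Hello, World";
    $undef_num  = $undef + 0;
    $string_num = $string + 0;
}

print "ref numifies to refaddr: $ref_is_refaddr\n";
print "undef as number: $undef_num\n";
print "string as number: $string_num\n";
```

With warnings enabled the undef and string cases do warn, but the hash reference case does not, which matches the "absolutely no warning" remark in the quoted text.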

-Dan

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

grinnz at gmail

Dec 19, 2021, 8:12 AM

Post #29 of 38 (872 views)

On Sun, Dec 19, 2021 at 11:09 AM Dan Book <grinnz@gmail.com> wrote:

> On Sun, Dec 19, 2021 at 7:14 AM Ovid via perl5-porters <
> perl5-porters@perl.org> wrote:
>
>> The if/else is actually pretty simple if we step back for a moment. I
>> think the confusion is that we misunderstand what an "else" block means in
>> Perl. Let's consider this:
>>
>> if ( $var > 3 ) {
>> ...
>> }
>> else {
>> ...
>> }
>>
>> In the above, in the else block, we mentally assume that "$var <= 3"
>> holds. In many statically typed languages, that assumption might hold true.
>>
>> In Perl, $var might be undef and be evaluated as less than three.
>> However, $var might be the string "Hello, World". $var might also be a
>> reference to a hash, we get absolutely no warning, and we hit our else
>> block with an assumption that is probably true ($var <= 3), but not in this
>> particular case. We _should_ be verifying what kind of data that $var
>> holds, but usually we don't.
>>
>
> But this isn't really what's going on here. In Perl, every scalar value is
> a number, once you use it as one. Hash references numify to their refaddr.
> So this comparison is still perfectly valid and there are no type
> conflicts, for any scalar value except those which die when numified.
>

Addendum: NaN and Inf, however, are "interesting" numbers, and NaN behaves
much like the proposed unknown in numeric comparisons: $nan > 3 and $nan <=
3 are both false, but it would hit the else block.
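
A quick sketch of the NaN case, assuming a platform with IEEE 754 floating point (NaN is produced here as infinity minus infinity; the variable names are illustrative):

```perl
use strict;
use warnings;

my $inf = 9**9**9;        # overflows to infinity on IEEE 754 platforms
my $nan = $inf - $inf;    # infinity minus infinity is NaN

# Both comparisons are false for NaN.
my $gt = ( $nan > 3 )  ? 1 : 0;
my $le = ( $nan <= 3 ) ? 1 : 0;

# The else branch is taken even though "$nan <= 3" does not hold.
my $branch = $nan > 3 ? 'if' : 'else';

print "nan > 3: $gt, nan <= 3: $le, branch taken: $branch\n";
```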

-Dan

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

darren at darrenduncan

Dec 19, 2021, 11:20 AM

Post #30 of 38 (872 views)

On 2021-12-19 4:07 a.m., Oodler 577 via perl5-porters wrote:
> The closest concept I can think that I might be familiar with is Fortran's
> NaN, which is part of the IEEE arithmetic standards. It is composed of a
> set of special values that can be detected in hardware or software.

This again, the NaN concept of Fortran or IEEE floating point numbers, is
another reason I actually advocate for a concept like the one I expressed with
Excuse, where an explicit reason for a missing regular value is encoded in the
not-regular-value result. An Excuse doubles as both an unthrown exception and
a NaN, of which there are several kinds, such as divide by zero, underflow,
etc. -- Darren Duncan

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

darren at darrenduncan

Dec 19, 2021, 1:13 PM

Post #31 of 38 (872 views)

On 2021-12-19 4:13 a.m., Ovid via perl5-porters wrote:
> In Perl, $var might be undef and be evaluated as less than three. However, $var might be the string "Hello, World". $var might also be a reference to a hash, we get absolutely no warning, and we hit our else block with an assumption that is probably true ($var <= 3), but not in this particular case. We _should_ be verifying what kind of data that $var holds, but usually we don't.
>
> Thus, in a dynamic language like Perl, barring validating our data up front, the else block very often makes no guarantees about what kinds of data that we have.

It seems to me that the real problem here is more that we're seeing a
consequence of Perl's weak typing, where values of one type are automatically
cast as values of another type when we try to treat them as the latter. The
"undef" problem is a key example of this but there are also the other examples
you cite.

I feel that a much better solution to the real problem is to support stronger
typing in Perl, make it possible for values to NOT automatically convert to
other types, and instead raise an error.
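
Something in this direction can already be approximated per class with the overload pragma. The sketch below is only an illustration: `Strict::Number` is a hypothetical class, and this is an object-level workaround rather than the language-level typing being argued for.

```perl
use strict;
use warnings;

package Strict::Number {    # hypothetical class for this sketch
    use overload
        '0+'     => sub { $_[0]{n} },
        '""'     => sub { die "refusing to stringify a Strict::Number\n" },
        fallback => 1;
    sub new { my ( $class, $n ) = @_; return bless { n => $n }, $class }
}

my $num = Strict::Number->new(42);

# Arithmetic is allowed: with fallback enabled, "+" uses the '0+' conversion.
my $sum = $num + 1;

# Interpolation needs a string, so the '""' handler runs and dies.
my $ok    = eval { my $text = "Value: $num"; 1 };
my $error = $@;

print "sum: $sum\n";
print "stringification failed: $error" unless $ok;
```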

Regardless, I feel that the best way to implement a solution, whether the things
I've proposed or the thing you proposed, is over top of and/or as an extension
of Corinna rather than as some other independent thing.

So your Unknown should be a Corinna object, thus leveraging this more generic
fundamental type Corinna is introducing to the Perl core and exploiting its
power, rather than being yet another orthogonal thing.

Especially as whatever bit flags or whatever for the underlying implementation
of Perl values are of very limited supply, I feel it would be a bad idea to
waste one on this Unknown concept.

The Unknown concept should be hoisted as a higher level concept implemented in
terms of other things, such as being a Corinna object, and not be fundamental.

This would not be part of the minimal MVP of Corinna, but the minimal MVP of
Corinna can be designed in such a way that support for
Unknown/Excuse/types/whatever can be added later in a non-compatibility-breaking
way.

I also see that when Perl later gains an extension supporting the option of
explicitly typing variables/fields/parameters/etc, which also helps the ability
to compile the code for speed, this Unknown etc concept could be added THEN, as
more of a set of complementary strict types that the Perl core defines, besides
just-an-integer, just-a-float, just-a-string, etc, and I would note those strict
types EXCLUDE undef.

What do you think about that? Putting Unknown/etc off to be a post-Corinna thing?

-- Darren Duncan

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

leonerd at leonerd

Dec 19, 2021, 3:59 PM

Post #32 of 38 (872 views)

On Sun, 19 Dec 2021 13:13:28 -0800
Darren Duncan <darren@darrenduncan.net> wrote:

> On 2021-12-19 4:13 a.m., Ovid via perl5-porters wrote:
> > In Perl, $var might be undef and be evaluated as less than three.
> > However, $var might be the string "Hello, World". $var might also
> > be a reference to a hash, we get absolutely no warning, and we hit
> > our else block with an assumption that is probably true ($var <=
> > 3), but not in this particular case. We _should_ be verifying what
> > kind of data that $var holds, but usually we don't.
> >
> > Thus, in a dynamic language like Perl, barring validating our data
> > up front, the else block very often makes no guarantees about what
> > kinds of data that we have.
>
> It seems to me that the real problem here is more that we're seeing a
> consequence of Perl's weak typing, where values of one type are
> automatically cast as values of another type when we try to treat
> them as the latter. The "undef" problem is a key example of this but
> there are also the other examples you cite.
>
> I feel that a much better solution to the real problem is to support
> stronger typing in Perl, make it possible for values to NOT
> automatically convert to other types, and instead raise an error.

Such as my "no stringification" idea, and also a similar one for
numbers.

Yes that's on my list to think about properly. Maybe sometime in the
5.37 series...

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

perl5-porters at perl

Dec 20, 2021, 3:00 AM

Post #33 of 38 (872 views)

On Sunday, 19 December 2021, 22:13:53 CET, Darren Duncan <darren@darrenduncan.net> wrote:

> I feel that a much better solution to the real problem is to support stronger
> typing in Perl, make it possible for values to NOT automatically convert to
> other types, and instead raise an error.

I think there are two problems with that.

1. Support for stronger typing in Perl is years away and we don't even know if we'll get it.
2. 3VL logic works with both static (throw an exception) and dynamic (apply Kleene's 3VL) typing.

So unknown values are orthogonal to how a type system is implemented.
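
As a reminder of what Kleene's 3VL means for the basic connectives, here is a minimal sketch. It uses a hypothetical encoding (1 = true, 0 = false, undef = unknown) and hypothetical function names purely for illustration; it is not the CPAN prototype's API.

```perl
use strict;
use warnings;

# Hypothetical encoding for this sketch only:
# 1 = true, 0 = false, undef = unknown.

sub k_not {
    my ($x) = @_;
    return undef unless defined $x;    # NOT unknown is unknown
    return $x ? 0 : 1;
}

sub k_and {
    my ( $x, $y ) = @_;
    # A known false decides the result, whatever the other operand is.
    return 0 if ( defined $x && !$x ) || ( defined $y && !$y );
    return undef if !defined $x || !defined $y;
    return 1;
}

sub k_or {
    my ( $x, $y ) = @_;
    # A known true decides the result, whatever the other operand is.
    return 1 if ( defined $x && $x ) || ( defined $y && $y );
    return undef if !defined $x || !defined $y;
    return 0;
}

my $show = sub { defined $_[0] ? $_[0] : 'unknown' };
print "false AND unknown = ", $show->( k_and( 0, undef ) ), "\n";
print "true  AND unknown = ", $show->( k_and( 1, undef ) ), "\n";
print "true  OR  unknown = ", $show->( k_or( 1, undef ) ),  "\n";
```

The point of the rules is that unknown only propagates when the known operand cannot settle the answer by itself.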

So my argument is:

1. Adding unknown values is a relatively small change (compared to adding a type system or building Corinna)
2. Its behavior is generally decoupled from current features, making it safer to implement
3. It has a working prototype and test suite on the CPAN
4. It has a high-value win in eliminating common types of errors we currently deal with

I think trying to add more to this simple idea to perfect it is premature (though I like your concept of Excuse). As an MVP, you can start using it today. If unknown values can be implemented in Perl, then we can see how it's actually being used and decide when (or if) we want to expand on it.

Best,
Ovid

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

darren at darrenduncan

Dec 20, 2021, 3:23 AM

Post #34 of 38 (872 views)

I feel that you underestimate how complicated your 3VL project would be to
thoroughly implement as a built-in native core type, at least if you want the
behavior of all the core operators and routines to change to respect it.

In particular I feel that the Corinna MVP is actually simpler than what the 3VL
would entail.

Also, the response I have seen so far to the idea, not just from me but from
others, seems to be quite mixed, and it seems considerably less of a sure thing
to be built-in than is Corinna.

I do not see unknown values as orthogonal to a type system, rather they are an
element of and inseparable from a type system, though multiple type systems can
have them.

Unlike Corinna, which I see as best built into Perl at a low
level, I believe that your Unknown proposal is best kept in the form of a CPAN
module rather than being built into the core language. That is, the final
form can be the same form as the prototype, which is that Unknown is an object
of a specific class, and it defines its own versions of operators or subs that
it wants special behavior for.

I strongly support Corinna; at best I only weakly support the Unknown proposal.

If time is limited then Corinna should be prioritized for landing, and Unknown
should wait until after it has landed rather than take resources away from it.

-- Darren Duncan

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

nick at ccl4

Dec 20, 2021, 4:21 AM

Post #35 of 38 (872 views)

On Mon, Dec 20, 2021 at 11:00:16AM +0000, Ovid via perl5-porters wrote:
> On Sunday, 19 December 2021, 22:13:53 CET, Darren Duncan <darren@darrenduncan.net> wrote:
>
> > I feel that a much better solution to the real problem is to support stronger
> > typing in Perl, make it possible for values to NOT automatically convert to
> > other types, and instead raise an error.
>
> I think there are two problems with that.
>
> 1. Support for stronger typing in Perl is years away and we don't even know if we'll get it.
> 2. 3VL logic works with both static (throw an exception) and dynamic (apply Kleene's 3VL) typing.
>
> So unknown values are orthogonal to how a type system is implemented.
>
> So my argument is:
>
>
> 1. Adding unknown values is a relatively small change (compared to adding a type system or building Corinna)
> 2. Its behavior is generally decoupled from current features, making it safer to implement
> 3. It has a working prototype and test suite on the CPAN

Limited by what you can do by overloading.

So no way of prototyping things like my question of how hashes behave when
unknown values are used as keys in RVALUE context, and in LVALUE context.
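
To make the overloading limit concrete, here is a minimal sketch using a hypothetical stand-in class (`My::Unknown` is not the CPAN prototype). A hash key is always a plain string, so the object only gets to choose which string it becomes:

```perl
use strict;
use warnings;

package My::Unknown {    # hypothetical stand-in, not the CPAN prototype
    use overload
        '""'     => sub { '<unknown>' },
        fallback => 1;
    sub new { return bless {}, shift }
}

my $foo  = My::Unknown->new;
my %hash = ( $foo => 'stored' );    # the key is stringified on the way in

my ($key) = keys %hash;
my $key_is_plain_string = ref($key) eq '' ? 1 : 0;    # the object is gone

# Any string equal to the stringification now finds the same slot.
my $lookup = $hash{'<unknown>'};

print "key: $key, plain string: $key_is_plain_string, lookup: $lookup\n";
```

Once the key is stored, the hash has no way to tell that it came from an unknown, so neither "return unknown" nor "croak" can be prototyped at this level.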

And I realised that even what I suggested wasn't complete. For this:

$bar = $hash{$foo};

If $foo happens to be unknown, is $bar always unknown?
Or is it undef if and only if %hash is empty?

That behaviour is arguably more consistent with what unknown means than the
"always return unknown".

And the thought experiment about what ternaries and hence if/else should do.

There are a *lot* of operations that aren't prototyped yet.

For example,

$$foo

dereferencing an unknown is what? An unknown? An exception?
Either answer seems reasonable.

eval $foo

trying to pass it to eval is what? An unknown? An exception?

Does trying to use an unknown value as a file handle trigger an exception?
Or an infinite stream of unknowns on read attempts?

What is the range [$foo .. 42] where $foo is unknown?

I *think* that most logically it's an empty list, but that does seem to
end up eliminating unknown-ness. Hence if we have first class unknowns,
should we be able to have arrays of unknown length?

> 4. It has a high-value win in eliminating common types of errors we currently deal with

And massive risk in introducing a lot of surprises in code not written to
expect unknowns, that is passed one within a data structure.

Basically all of CPAN.

This really (still) feels to me like it's currently a research project, not a
feature to add to a mature language used in production.

That changes only when there's a working prototype that can explore most
questions about "how does X behave with unknowns?"

There are about 400 opcodes in perl. I suspect that >90% are easy to figure
out for "unknown" (for example as "what would a NaN do here?") but a few
really aren't going to be obvious, or end up being trade offs between
conceptual correctness and what it's actually possible to implement.

Nicholas Clark

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

perl5-porters at perl

Dec 20, 2021, 7:20 AM

Post #36 of 38 (872 views)

OK, I can't argue against Nicholas' points.

Best,
Ovid
--
IT consulting, training, specializing in Perl, databases, and agile development
http://www.allaroundtheworld.fr/.

Buy my book! - http://bit.ly/beginning_perl

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

davidnicol at gmail

Dec 20, 2021, 8:30 AM

Post #37 of 38 (872 views)

I would write that, clearly and efficiently I hope, using a low priority
short-circuiting boolean operator:

my $total = grep { defined ( $_->value ) and $_->value > $limit }
@things;

no reason to abuse the ternary op like that.

On Sat, Dec 18, 2021 at 5:09 AM Ovid via perl5-porters <
perl5-porters@perl.org> wrote:

>
> I, for one, am tired of writing code like this:
>
> my $total = grep { defined $_->value ? $_->value > $limit : 0 }
> @things;
>
> Note: the following is *not* equivalent to the above:
>
> my $total = grep { ( $_->value // 0 ) > $limit } @things;
>
> I mean, it *looks* correct, but what if the value can be a negative number
> and the limit can be negative? You probably then want this:
>
> my $total = grep { ( $_->value // ( $limit - 1 ) ) > $limit } @things;
>
> Which arguably might be more confusing than using defined. With 3VL, we
> have this:
>
> my $total = grep { $_->resolution < $limit } @things;
>
> Worse, I'm tired of tracking down bugs caused by this.
>
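
To see how the variants in this exchange differ, here is a small runnable comparison with a negative limit. The `Thing` class and the sample values are hypothetical, chosen only to exercise the undef and negative-number cases:

```perl
use strict;
use warnings;

package Thing {    # hypothetical class, just enough to provide ->value
    sub new   { my ( $class, $value ) = @_; return bless { value => $value }, $class }
    sub value { return $_[0]{value} }
}

my $limit  = -3;
my @things = map { Thing->new($_) } ( undef, -5, -1 );

# Short-circuit form and ternary form: undef never reaches the comparison.
my $with_and     = grep { defined( $_->value ) and $_->value > $limit } @things;
my $with_ternary = grep { defined $_->value ? $_->value > $limit : 0 } @things;

# Defaulting undef to 0 counts it, because 0 > -3.
my $with_zero = grep { ( $_->value // 0 ) > $limit } @things;

# Defaulting to just below the limit excludes undef again.
my $with_offset = grep { ( $_->value // ( $limit - 1 ) ) > $limit } @things;

print "and: $with_and, ternary: $with_ternary, ",
      "// 0: $with_zero, // (limit - 1): $with_offset\n";
```

Only the value -1 exceeds the limit, so the forms that guard with defined (or default below the limit) count one item, while the "// 0" form also counts the undef.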

--
"Lay off that whiskey, and let that cocaine be!" -- Johnny Cash

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

neilb at neilb

Jun 10, 2022, 10:34 AM

Post #38 of 38 (677 views)

This Pre-RFC had slipped through the net, until Tomasz caught it, so I’ve just added it to the tracker.

Given Nicholas's comments and Curtis’s response of "OK, I can't argue against Nicholas' points", I have marked it as rejected.

Neil