Mailing List Archive: Pre-RFC: `unknown` versus `undef`

Pre-RFC: `unknown` versus `undef`

perl5-porters at perl

Dec 18, 2021, 12:57 AM

Post #1 of 38 (1988 views)

Hi there,

As most of you know, "undef" values often cause all sorts of interesting bugs in Perl. I wrote https://metacpan.org/pod/Unknown::Values to address this. Instead of the 2VL that undef uses, it uses Kleene's traditional 3VL (three-value logic) akin to SQL's NULL.

Basic usage looks like this:

    use Unknown::Values;

    my $value   = unknown;
    my @array   = ( 1, 2, 3, $value, 4, 5 );
    my @less    = grep { $_ < 4 } @array;    # (1,2,3)
    my @greater = grep { $_ > 3 } @array;    # (4,5)

    my @underpaid;
    foreach my $employee (@employees) {

        # this will never return true if salary is "unknown"
        if ( $employee->salary < $threshold ) {
            push @underpaid => $employee;
        }
    }
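
For contrast, here is a minimal sketch of what the same filter does with a plain undef in core Perl (no modules; the array contents are just illustrative). Because undef is coerced to 0 in numeric context, it slips through the "< 4" test:

```perl
use strict;
use warnings;

my @array = ( 1, 2, 3, undef, 4, 5 );

# undef is coerced to 0 in numeric context, so it satisfies "< 4".
my @less = do {
    no warnings 'uninitialized';    # silence the coercion warning for this demo
    grep { $_ < 4 } @array;
};

# Four elements match, and the fourth one is the undef itself.
printf "%d elements matched\n", scalar @less;
```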

I've also written about this here: http://blogs.perl.org/users/ovid/2013/02/three-value-logic-in-perl.html

I've always thought this belongs directly in a programming language, but never suggested this because I assumed there would be no interest.

To my surprise, brian d foy suggested it be in the core (https://twitter.com/briandfoy_perl/status/1471684211602042880)

He wrote: "Unknown::Value from @OvidPerl looks very interesting. These objects can't compare, do math, or most of the other default behavior that undef allows. This would be awesome in core."

Would there be interest?

Best,
Ovid
--
IT consulting, training, specializing in Perl, databases, and agile development
http://www.allaroundtheworld.fr/.

Buy my book! - http://bit.ly/beginning_perl

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

perl5-porters at perl

Dec 18, 2021, 1:06 AM

Post #2 of 38 (1988 views)

[top-posting]

I should add that what would make this even more interesting is if we could do this:

    use feature '3vl_undef'; # terrible name

And all undef values in the current lexical scope would use 3VL instead of 2VL. This would be like "use strict for undef". I firmly believe many bugs could be avoided this way.
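
To make the Kleene semantics concrete, here is a minimal sketch in core Perl that treats undef as "unknown". The names k_not, k_and and k_or are hypothetical and only for illustration; this is not how Unknown::Values or any proposed feature is implemented.

```perl
use strict;
use warnings;

# Kleene connectives over three values: 1 (true), 0 (false), undef (unknown).
sub k_not {
    my ($x) = @_;
    return defined $x ? ( $x ? 0 : 1 ) : undef;
}

sub k_and {
    my ( $x, $y ) = @_;
    return 0 if ( defined $x && !$x ) || ( defined $y && !$y );    # any false decides
    return undef if !defined $x || !defined $y;                    # otherwise unknown
    return 1;
}

sub k_or {
    my ( $x, $y ) = @_;
    return 1 if ( defined $x && $x ) || ( defined $y && $y );      # any true decides
    return undef if !defined $x || !defined $y;                    # otherwise unknown
    return 0;
}

# Print the truth tables, showing T, F or U for each value.
my $show = sub { defined $_[0] ? ( $_[0] ? 'T' : 'F' ) : 'U' };
for my $x ( 1, 0, undef ) {
    for my $y ( 1, 0, undef ) {
        my $and = k_and( $x, $y );
        my $or  = k_or( $x, $y );
        printf "%s AND %s = %s    %s OR %s = %s\n",
            $show->($x), $show->($y), $show->($and),
            $show->($x), $show->($y), $show->($or);
    }
}
```

The key property is that a known false decides an AND and a known true decides an OR, even when the other operand is unknown.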

Best,
Ovid


Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

Dec 18, 2021, 1:43 AM

Post #3 of 38 (1988 views)

On Sat, 18 Dec 2021 08:57:05 +0000 (UTC), Ovid via perl5-porters <perl5-porters@perl.org> wrote:

> Hi there,
>
> As most of you know, "undef" values often cause all sorts of interesting bugs in Perl. I wrote https://metacpan.org/pod/Unknown::Values to address this. Instead of the 2VL that undef uses, it uses Kleene's traditional 3VL (three-value logic) akin to SQL's NULL.

1. Known defined value
2. Known undefined value
3. Unknown value
4. Unauthorized to get the value
5. Value is defined but unauthorized to get it

When doing 3VL, number 4 is essential

> Basic usage looks like this:
>
>     use Unknown::Values;
>
>     my $value   = unknown;
>     my @array   = ( 1, 2, 3, $value, 4, 5 );
>     my @less    = grep { $_ < 4 } @array;    # (1,2,3)
>     my @greater = grep { $_ > 3 } @array;    # (4,5)
>
>     my @underpaid;
>     foreach my $employee (@employees) {
>
>         # this will never return true if salary is "unknown"
>         if ( $employee->salary < $threshold ) {
>             push @underpaid => $employee;
>         }
>     }
>
> I've also written about this here: http://blogs.perl.org/users/ovid/2013/02/three-value-logic-in-perl.html
>
> I've always thought this belongs directly in a programming language, but never suggested this because I assumed there would be no interest
>
> To my surprise, brian d foy suggested it be in the core (https://twitter.com/briandfoy_perl/status/1471684211602042880)
>
> He wrote: "Unknown::Value from @OvidPerl looks very interesting. These objects can't compare, do math, or most of the other default behavior that undef allows. This would be awesome in core."
>
> Would there be interest?

Yes, when 4VL (or 5VL)

Thinking about it, there might be a hook to add 5, 6, 7, 8, 9 etc :)

> Ovid

--
H.Merijn Brand https://tux.nl Perl Monger http://amsterdam.pm.org/
using perl5.00307 .. 5.33 porting perl5 on HP-UX, AIX, and Linux
https://tux.nl/email.html http://qa.perl.org https://www.test-smoke.org

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

darren at darrenduncan

Dec 18, 2021, 2:19 AM

Post #4 of 38 (1988 views)

For the record, which I partially discussed on a related Twitter thread a few
days ago, I feel that using anything other than 2VL in a fundamental capacity is
a serious mistake, and if anyone is considering 3VL, 4VL, etc then they probably
have a design flaw that should be corrected in some other way.

All the regular types and operators should operate with pure 2VL, including all
the regular equality or comparison or sorting operators. The behavior of
regular operators should not be overloaded or overridden, lexically or
otherwise, so that they behave in a 3+VL manner. This would be a huge source of
bugs where people look at code expecting certain behavior and getting something
else.

How something like a special Unknown value should work is that it provides a set
of operators/subs with DIFFERENT NAMES that provide the 3VL etc logic, and so
for example one writes:

  eq_3vl($x,$y)
  lt_3vl($x,$y)
  grep_3vl(...)

Or for simple 3VL you don't even need the special Unknown value, instead these
operators can treat the standard undef that way. The Unknown value is more
useful if you want to override the behavior of existing operators.
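
As a rough sketch of the "different names" approach, the helpers below treat the standard undef as unknown. The function names come from the examples above, but their signatures and bodies are assumptions made for illustration (in particular, grep_3vl taking a code reference first is a guess).

```perl
use strict;
use warnings;

# Each comparison returns 1 (true), 0 (false), or undef (unknown).
# The explicit "return undef" is intentional: undef is the third value.
sub lt_3vl {
    my ( $x, $y ) = @_;
    return undef unless defined $x && defined $y;
    return $x < $y ? 1 : 0;
}

sub eq_3vl {
    my ( $x, $y ) = @_;
    return undef unless defined $x && defined $y;
    return $x == $y ? 1 : 0;
}

# Keeps only the items for which the predicate is definitely true;
# unknown results are dropped just like false ones.
sub grep_3vl {
    my ( $predicate, @items ) = @_;
    return grep { scalar $predicate->($_) } @items;
}

my @less = grep_3vl( sub { lt_3vl( $_[0], 4 ) }, 1, 2, 3, undef, 4, 5 );
print "@less\n";    # 1 2 3
```

The built-in operators are untouched here, so a reader can tell from the call site alone that 3VL is in play.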

As for 4VL, 5VL, etc, once you even start thinking about that, there's an even
stronger case that what you really should be using is 2VL with a bunch of
singleton types, where each singleton represents a specific reason a normal
value is missing, such as:

Unknown
Not Applicable
Permission Denied
Record Not Found
etc
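
One way to picture "2VL with a bunch of singleton types" is sketched below. The Missing base class, its subclasses and is_missing are hypothetical names chosen for this illustration; the thread does not specify any such API.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical base class: one shared instance per subclass.
package Missing {
    my %instance;

    sub instance {
        my ($class) = @_;
        return $instance{$class} //= bless {}, $class;
    }
}

package Missing::Unknown {
    our @ISA = ('Missing');
    sub reason { 'Unknown' }
}

package Missing::PermissionDenied {
    our @ISA = ('Missing');
    sub reason { 'Permission Denied' }
}

# Ordinary two-valued test: is this one of the "missing" singletons?
sub is_missing {
    my ($value) = @_;
    return blessed($value) && $value->isa('Missing') ? 1 : 0;
}

for my $salary ( 50_000, Missing::Unknown->instance, Missing::PermissionDenied->instance ) {
    if ( is_missing($salary) ) {
        printf "no salary available: %s\n", $salary->reason;
    }
    else {
        print "salary: $salary\n";
    }
}
```

Every test stays strictly true or false; the reason a value is absent is carried by the type of the singleton rather than by extra truth values.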

The idea of changing the behavior of undef even lexically with a feature is
problematic. What if someone sees code in a file that declares the feature and
copies it into another file that doesn't, or the reverse? Then code which is
exactly the same has changed behavior.
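
The copy-and-paste concern can be illustrated with a lexical pragma that already exists in core Perl. This sketch uses "use integer" purely as an analogy; it says nothing about how a 3VL feature would behave.

```perl
use strict;
use warnings;

my $plain = 10 / 3;    # ordinary floating-point division

my $scoped = do {
    use integer;       # lexical pragma: changes "/" inside this block only
    10 / 3;
};

# The expression "10 / 3" is textually identical in both places,
# but it yields a fraction outside the block and 3 inside it.
print "outside: $plain, inside: $scoped\n";
```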

As a compromise, I would find either of these 2 things acceptable:

1. Have an Unknown::Values or whatever singleton class which overrides built-in
operators/subs but its effects are tightly bound to instances of that class.

2. Declare new operators/subs with new names that provide 3VL with standard undefs.

Those provide 3VL explicitly and on an opt-in basis if users want it, and it's
relatively easy for users reading the code later to know it's using 3VL.

But behavior of built-in operators or undefs should never change as the result
of a feature pragma or such.

Also, SQL NULLs are not actually 3VL, they are much more complicated than that,
and we don't want to try and imitate SQL if we want to provide 3VL.

-- Darren Duncan


Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

perl5-porters at perl

Dec 18, 2021, 3:09 AM

Post #5 of 38 (1988 views)

Yes, SQL NULL is broken in fundamental ways that CJ Date shows here: https://www.oreilly.com/library/view/sql-and-relational/9781449319724/ch04s04.html

And yes, I've been bitten by that bug in SQL in real-world code. Once. In over two decades. And I write lots of SQL. *Most* of the time, however, the 3VL NULL is what we need. Can you imagine if NULL followed "undef" behavior?

    SELECT count(*) FROM things WHERE value > ?;

That would be a disaster and it's easily replicable in Perl:

    my $total = grep { $_->value > $limit } @things;

I, for one, am tired of writing code like this:

    my $total = grep { defined $_->value ? $_->value > $limit : 0 } @things;

Note: the following is *not* equivalent to the above:

    my $total = grep { ( $_->value // 0 ) > $limit } @things;

I mean, it *looks* correct, but what if the value can be a negative number and the limit can be negative? You probably then want this:

    my $total = grep { ( $_->value // ( $limit - 1 ) ) > $limit } @things;

Which arguably might be more confusing than using defined. With 3VL, we have this:

    my $total = grep { $_->value > $limit } @things;

Worse, I'm tired of tracking down bugs caused by this.
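
A small sketch of the three 2VL workarounds above, using plain numbers instead of objects (the sample values and the negative limit are made up for illustration):

```perl
use strict;
use warnings;

my @values = ( -5, -1, undef, 3 );
my $limit  = -2;

# Explicit definedness check: undef never matches.
my $with_defined = grep { defined $_ ? $_ > $limit : 0 } @values;

# Defaulting to 0: undef becomes 0, and 0 > -2, so it is wrongly counted.
my $with_zero = grep { ( $_ // 0 ) > $limit } @values;

# Defaulting to a value just below the limit: undef never matches.
my $with_offset = grep { ( $_ // ( $limit - 1 ) ) > $limit } @values;

print "$with_defined $with_zero $with_offset\n";    # 2 3 2
```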

2VL logic on undef/null values has been broken for a long time and forces developers to remember to always write special case code to handle this.

However, while we could correct the underlying issue, going further into 4VL or 5VL adds complications that I doubt most developers are going to understand. In other words, SIMPLICITY IS YOUR FRIEND.

We don't need "perfect" because making something that covers all possible cases is simply going to be a mess and might even be counter-productive. For example, if you're unauthorized to get a value but you see that it's a "known defined value", that's an information leak. Also, given Merijn's original list:

1. Known defined value
2. Known undefined value
3. Unknown value
4. Unauthorized to get the value
5. Value is defined but unauthorized to get it

I don't see how 4+1 is different from 5. So we can bikeshed this to death, or fix the major underlying problem: $salary += 1000. Congrats. You've just given a raise to an unpaid volunteer.
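
The raise problem is easy to reproduce. In the sketch below, Sketch::Unknown is a hypothetical stand-in built on the core overload pragma; it is not the Unknown::Values code, and making arithmetic die is an assumption chosen for the demo.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an "unknown" value.
package Sketch::Unknown {
    use Carp qw(croak);
    use overload
        '<'  => sub { 0 },    # comparisons are never true
        '>'  => sub { 0 },
        '==' => sub { 0 },
        '+'  => sub { croak 'cannot do math on an unknown value' };

    sub new { return bless {}, shift }
}

# Plain undef: "+=" is exempt from the uninitialized warning,
# so the volunteer silently ends up with a salary of 1000.
my $volunteer_salary;
$volunteer_salary += 1000;

# Unknown value: "+=" falls back to the overloaded "+", which dies.
my $unknown_salary = Sketch::Unknown->new;
my $raised         = eval { $unknown_salary += 1000; 1 };
my $underpaid      = $unknown_salary < 50_000 ? 1 : 0;

print "plain undef is now $volunteer_salary\n";
print 'unknown salary: ', ( $raised ? 'raise applied' : 'raise refused' ), "\n";
print "counted as underpaid: $underpaid\n";
```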

Best,
Ovid


Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

perl5-porters at perl

Dec 18, 2021, 4:20 AM

Post #6 of 38 (1988 views)

Top posting a few points:

1. in DBI:: modules, NULL is equivalent to undef
2. in real world tables, "employee" usually has a "type" field (or external way to derive this attribute); volunteers (interns?), salaried, and hourlies are not differentiated by the value of the "salary" field
3. seems like the polymorphic OOP solution is to just have:

* a way to "get_employee_type" on the employee instance, and
* throw an exception if $emp->get_salary is called on $emp,
C<if ($emp->get_employee_type == VOLUNTOLD)>.
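
A minimal sketch of that polymorphic approach follows. The Employee class, its constructor arguments and the constant values are assumptions for illustration; only get_employee_type, get_salary and VOLUNTOLD are names taken from the points above.

```perl
use strict;
use warnings;

package Employee {
    use Carp qw(croak);
    use constant { SALARIED => 1, HOURLY => 2, VOLUNTOLD => 3 };

    sub new {
        my ( $class, %args ) = @_;
        my $self = {%args};
        return bless $self, $class;
    }

    sub get_employee_type { return $_[0]{type} }

    # Asking a volunteer for a salary is an error, not an undef.
    sub get_salary {
        my ($self) = @_;
        croak 'volunteers have no salary'
            if $self->get_employee_type == VOLUNTOLD;
        return $self->{salary};
    }
}

my $paid      = Employee->new( type => Employee::SALARIED(), salary => 50_000 );
my $volunteer = Employee->new( type => Employee::VOLUNTOLD() );

my $salary = eval { $volunteer->get_salary };
my $error  = $@;

print 'paid salary: ', $paid->get_salary, "\n";
print defined $salary ? "volunteer salary: $salary\n" : "volunteer: $error";
```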

I can add that the nuances between undef and q{} have caused me confusion
in the past; but adding another special value to mean a type of "nothing"
could be problematic. For example, to point #1 above, how is a DBI call to
know what you mean when it already treats undef as NULL when a) replacing
placeholders (used almost universally), or b) turning NULLs into undef when
returning results from a C<select_*> call?

With C<use warnings>,

TRUE:
  (q{} == 0)
  (undef == 0)

FALSE:
  (q{} eq 0)
  (undef eq 0)

What makes matters somewhat more confusing is that C<int> returns 0 for
both, even with C<use warnings>:

  int undef
  int q{}

Though, this is implied in the TRUE/FALSE examples provided above. In
addition to this, it seems like this would mess with C<defined>; would
this add a possible answer to that? I'm just now able to remember that
C<defined undef> is rightly different than C<defined 0> or C<defined q{}>.
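
The coercions described above can be checked with a short script. This is only a sketch: the warnings are switched off to keep the output readable, and variables stand in for the literal undef and q{}.

```perl
use strict;
use warnings;
no warnings qw(numeric uninitialized);    # these comparisons would otherwise warn

my $undef;
my $empty = q{};

my @rows = (
    [ 'q{} == 0',      ( $empty == 0 ) ? 'true' : 'false' ],
    [ 'undef == 0',    ( $undef == 0 ) ? 'true' : 'false' ],
    [ 'q{} eq 0',      ( $empty eq 0 ) ? 'true' : 'false' ],
    [ 'undef eq 0',    ( $undef eq 0 ) ? 'true' : 'false' ],
    [ 'int q{}',       int($empty) ],
    [ 'int undef',     int($undef) ],
    [ 'defined q{}',   defined($empty) ? 'true' : 'false' ],
    [ 'defined undef', defined($undef) ? 'true' : 'false' ],
);

printf "%-14s %s\n", @$_ for @rows;
```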

Anyway, I can't say I've ever wanted more out of a numerical field
other than its value. I'd never use undef or NULL to indicate anything
beyond that this field was not set with an actual value. And if I wanted
to know if $employee was a volunteer, I'd consult the "employee_type"
column.

Seems like this is best left to a module that overrides operators; but
the question for me remains - how would you represent this in a traditional
database other than storing either as a separate column or some sort
of composite value (e.g., "0;volunteer") that this value is not to be
taken like the rest of the numbers?

Cheers,
Brett


--
--
oodler@cpan.org
oodler577@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdfeu.org
irc.perl.org #openmp #pdl #native

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

perl5-porters at perl

Dec 18, 2021, 4:27 AM

Post #7 of 38 (1988 views)

* Oodler 577 via perl5-porters <perl5-porters@perl.org> [2021-12-18 12:20:35 +0000]:

> Top posting a few points:
>
> 1. in DBI:: modules, NULL is equivalent to undef
> 2. in real world tables, "employee" usually has a "type" field (or external way to derive this attribute); volunteers (interns?), salaried, and hourlies are not differentiated by the value of the "salary" field
> 3. seems like the polymorphic OOP solution is to just have:
>
> * a way to "get_employee_type" on the employee instance, and
> * throw an exception if $emp->get_salary is called on $emp,
> C<if ($emp->get_employee_type == VOLUNTOLD)>.
>
> I can add that the nuances between undef and q{} have caused me confusion
> in the past; but adding another special value to mean a type of "nothing"
> could be problematic. For example, to point #1 above, how is a DBI call to
> know what you mean when it already treats undef as NULL when a) replacing
> placeholders (used almost universally), or b) turning NULLs into undef when
> returning results from a C<select_*> call?
>
> With C<use warnings>,

Oof, I forgot to add: with warnings on, all of the following yell at you; but
even when warnings are not enabled, C<q{}> and C<undef> get coerced to C<0> in
numeric context, and in string context C<undef> gets coerced to C<q{}>.

>
> TRUE:
>   (q{} == 0)
>   (undef == 0)
>
> FALSE:
>   (q{} eq 0)
>   (undef eq 0)
>
> What makes matters somewhat more confusing is that C<int> returns 0 for
> both, even with C<use warnings>:
>
>   int undef
>   int q{}
>
> Though, this is implied in the TRUE/FALSE examples provided above. In
> addition to this, it seems like this would mess with C<defined>; would
> this add a possible answer to that? I'm just now able to remember that
> C<defined undef> is rightly different than C<defined 0> or C<defined q{}>.
>
> Anyway, I can't say I've ever wanted more out of a numerical field
> other than its value. I'd never use undef or NULL to indicate anything
> beyond that this field was not set with an actual value. And if I wanted
> to know if $employee was a volunteer, I'd consult the "employee_type"
> column.
>
> Seems like this is best left to a module that overrides operators; but
> the question for me remains - how would you represent this in a traditional
> database other than storing either as a separate column or some sort
> of composite value (e.g., "0;volunteer") that this value is not to be
> taken like the rest of the numbers?
>
> Cheers,
> Brett
>
> * Ovid via perl5-porters <perl5-porters@perl.org> [2021-12-18 11:09:02 +0000]:
>
> > Yes, SQL NULL is broken in fundamental ways that CJ Date shows here: https://www.oreilly.com/library/view/sql-and-relational/9781449319724/ch04s04.html
> >
> > And yes, I've been bitten by that bug in SQL in real-world code. Once. In over two decades. And I write lots of SQL. *Most* of the time, however, the 3VL NULL is what we need. Can you imagine if NULL followed "undef" behavior?
> >
> > ? ? SELECT count(*) FROM things WHERE value > ?;
> >
> > That would be a disaster and it's easily replicable in Perl:
> >
> > ? ? my $total = grep { $_->value > $limit } @things;
> >
> > I, for one, am tired of writing code like this:
> >
> > ? ? my $total = grep { defined $_->value ? $_->value > $limit : 0 } @things;
> >
> > Note: the following is *not* equivalent to the above:
> >
> > ? ? my $total = grep { ( $_->value // 0 )? > $limit } @things;
> >
> > I mean, it *looks* correct, but what if the value can be a negative number and the limit can be negative? You probably than want this:
> >
> > ? ? my $total = grep { ( $_->value // ( $limit - 1 ) )? > $limit } @things;
> >
> > Which arguably might be more confusing than using defined. With 3VL, we have this:
> >
> > ? ? my $total = grep { $_->resolution < $limit } @things;
> >
> > Worse, I'm tired of tracking down bugs caused by this.
> >
> > 2VL logic on undef/null values been broken for a long time and forces developers to remember to always write special case code to handle this.
> >
> > However, while we could correct the underlying issue, going further into 4VL or 5VL adds complications that I doubt most developers are going to understand. In other words, SIMPLICITY IS YOUR FRIEND.
> >
> > We don't need "perfect" because making something that covers all possible cases is simply going to be a mess and might even be counter-productive. For example, if you're unauthorized to get a value but you see that it's a "known defined value", that's an information leak. Also, given Merijn's original list:
> >
> > 1. Known defined value
> > 2. Known undefined value
> > 3. Unknown value
> > 4. Unauthorized to get the value
> > 5. Value is defined but unauthorized to get it
> >
> > I don't see how 4+1 is different from 5. So we can bikeshed this to death, or fix the major underlying problem: $salary += 1000. Congrats. You've just given a raise to an unpaid volunteer.
> >
> > Best,
> > Ovid
> > --?
> > IT consulting, training, specializing in Perl, databases, and agile development
> > http://www.allaroundtheworld.fr/.?
> >
> > Buy my book! - http://bit.ly/beginning_perl
> >
> > On Saturday, 18 December 2021, 11:20:06 CET, Darren Duncan <darren@darrenduncan.net> wrote:
> >
> > For the record, which I partially discussed on a related Twitter thread a few
> > days ago, I feel that using anything other than 2VL in a fundamental capacity is
> > a serious mistake, and if anyone is considering 3VL, 4VL, etc then they probably
> > have a design flaw that should be corrected in some other way.
> >
> > All the regular types and operators should operate with pure 2VL, including all
> > the regular equality or comparison or sorting operators. The behavior of
> > regular operators should not be overloaded or overridden, lexically or
> > otherwise, so that they behave in a 3+VL manner. This would be a huge source of
> > bugs where people look at code expecting certain behavior and getting something
> > else.
> >
> > How something like a special Unknown value should work is that it provides a set
> > of operators/subs with DIFFERENT NAMES that provide the 3VL etc logic, and so
> > for example one writes:
> >
> >   eq_3vl($x,$y)
> >   lt_3vl($x,$y)
> >   grep_3vl(...)
> >
> > Or for simple 3VL you don't even need the special Unknown value, instead these
> > operators can treat the standard undef that way. The Unknown value is more
> > useful if you want to override the behavior of existing operators.
> >
> > As for 4VL, 5VL, etc, once you even start thinking about that, there's an even
> > stronger case that what you really should be using is 2VL with a bunch of
> > singleton types, where each singleton represents a specific reason a normal
> > value is missing, such as:
> >
> >   Unknown
> >   Not Applicable
> >   Permission Denied
> >   Record Not Found
> >   etc
> >
> > The idea of changing the behavior of undef even lexically with a feature is
> > problematic. What if someone sees code in such a file and copies it into
> > another file, or in reverse, where the other doesn't have that feature declared,
> > then code which is exactly the same has changed behavior.
> >
> > As a compromise, I would find either of these 2 things acceptable:
> >
> > 1. Have an Unknown::Values or whatever singleton class which overrides built-in
> > operators/subs but its effects are tightly bound to instances of that class.
> >
> > 2. Declare new operators/subs with new names that provide 3VL with standard undefs.
> >
> > Those provide this 3VL opt-in and explicitly if users want it, and it's
> > relatively easy for users reading the code later to know it's using 3VL.
> >
> > But behavior of built-in operators or undefs should never change as the result
> > of a feature pragma or such.
> >
> > Also, SQL NULLs are not actually 3VL, they are much more complicated than that,
> > and we don't want to try and imitate SQL if we want to provide 3VL.
> >
> > -- Darren Duncan
> >
> > On 2021-12-18 1:43 a.m., H.Merijn Brand wrote:
> > > On Sat, 18 Dec 2021 08:57:05 +0000 (UTC), Ovid via perl5-porters <perl5-porters@perl.org> wrote:
> > >
> > >> Hi there,
> > >>
> > >> As most of you know, "undef" values often cause all sorts of interesting bugs in Perl. I wrote https://metacpan.org/pod/Unknown::Values to address this. Instead of the 2VL that undef uses, it uses Kleene's traditional 3VL (three-value logic) akin to SQL's NULL.
> > >
> > >  1. Known defined value
> > >  2. Known undefined value
> > >  3. Unknown value
> > >  4. Unauthorized to get the value
> > >  5. Value is defined but unauthorized to get it
> > >
> > > When doing 3VL, number 4 is essential
> > >
> > >> Basic usage looks like this:
> > >>
> > >>     use Unknown::Values;
> > >>
> > >>     my $value   = unknown;
> > >>     my @array   = ( 1, 2, 3, $value, 4, 5 );
> > >>     my @less    = grep { $_ < 4 } @array;   # (1,2,3)
> > >>     my @greater = grep { $_ > 3 } @array;   # (4,5)
> > >>
> > >>     my @underpaid;
> > >>     foreach my $employee (@employees) {
> > >>
> > >>         # this will never return true if salary is "unknown"
> > >>         if ( $employee->salary < $threshold ) {
> > >>             push @underpaid => $employee;
> > >>         }
> > >>     }
> > >>
> > >> I've also written about this here: http://blogs.perl.org/users/ovid/2013/02/three-value-logic-in-perl.html
> > >>
> > >> I've always thought this belongs directly in a programming language, but never suggested this because I assumed there would be no interest
> > >>
> > >> To my surprise, brian d foy suggested it be in the core (https://twitter.com/briandfoy_perl/status/1471684211602042880)
> > >>
> > >> He wrote: "Unknown::Value from @OvidPerl looks very interesting. These objects can't compare, do math, or most of the other default behavior that undef allows. This would be awesome in core."
> > >>
> > >> Would there be interest?
> > >
> > > Yes, when 4VL (or 5VL)
> > >
> > > Thinking about it, there might be a hook to add 5, 6, 7, 8, 9 etc :)
> > >
> > >> Ovid
> > >
> >
> >
>
> --
> --
> oodler@cpan.org
> oodler577@sdf-eu.org
> SDF-EU Public Access UNIX System - http://sdfeu.org
> irc.perl.org #openmp #pdl #native
>

--
--
oodler@cpan.org
oodler577@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdfeu.org
irc.perl.org #openmp #pdl #native

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

perl5-porters at perl

Dec 18, 2021, 6:15 AM

Post #8 of 38 (1988 views)

On Saturday, 18 December 2021, 13:21:03 CET, Oodler 577 via perl5-porters <perl5-porters@perl.org> wrote:

> 1. in DBI:: modules, NULL is equivalent to undef

I wish that were true, but NULL and undef behave very differently.

I've thought about writing a component for this for DBIx::Class. If `undef` values returned `unknown` instead, then the semantics of your Perl would be much closer to the semantics of the database, eliminating another aspect of the object-relational impedance mismatch. You couldn't catch all of the cases, but you would catch many.

Heck, can we even force a warning on `my $foo; $foo++`? I don't think so because I thought that was special-cased in Perl. `my $foo = unknown; $foo++` issues a warning we might often like to see.
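
For reference, a small sketch of that special-casing. Under `use warnings`, `++` and `+=` applied to an undefined variable are exempt from the "uninitialized" warning, while plain addition is not. The variable names here are illustrative only.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be counted.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $foo;
$foo++;                # undef is treated as 0; no warning is issued

my $salary;
$salary += 1000;       # also exempt: the unpaid volunteer now earns 1000

my $bar;
my $sum = $bar + 1;    # plain addition on undef does warn

printf "foo=%d salary=%d sum=%d warnings=%d\n",
    $foo, $salary, $sum, scalar @warnings;
```

Only the last statement produces a warning, which is why `$salary += 1000` on a missing salary goes unnoticed.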

I'm not saying that `unknown` is always superior (and overloaded objects are *slow*), but in most code dealing with undef that I see, 3VL makes the code both shorter and more correct.

I admit that this would probably make Perl radically depart from some other languages, but I'm unconvinced that this would be a Bad Thing.

Best,
Ovid
--
IT consulting, training, specializing in Perl, databases, and agile development
http://www.allaroundtheworld.fr/.

Buy my book! - http://bit.ly/beginning_perl

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

Dec 18, 2021, 8:55 AM

Post #9 of 38 (1988 views)

On Sat, 18 Dec 2021 11:09:02 +0000 (UTC), Ovid via perl5-porters <perl5-porters@perl.org> wrote:

> 2VL logic on undef/null values has been broken for a long time and forces
> developers to remember to always write special case code to handle
> this.
>
> However, while we could correct the underlying issue, going further
> into 4VL or 5VL adds complications that I doubt most developers are
> going to understand. In other words, SIMPLICITY IS YOUR FRIEND.
>
> We don't need "perfect" because making something that covers all
> possible cases is simply going to be a mess and might even be
> counter-productive. For example, if you're unauthorized to get a
> value but you see that it's a "known defined value", that's an
> information leak. Also, given Merijn's original list:
>
> 1. Known defined value
> 2. Known undefined value
> 3. Unknown value
> 4. Unauthorized to get the value
> 5. Value is defined but unauthorized to get it
>
> I don't see how 4+1 is different from 5. So we can bikeshed this to
> death, or fix the major underlying problem: $salary += 1000.
> Congrats. You've just given a raise to an unpaid volunteer.

I've met one case where the design was authorized to give me back the
fact that the value is known but that I was not authorized to get
the value by this interface. I needed higher authorization to get the
actual value. The added value of this is that one can decide to pursue
getting the value if it is required for a job, a step that can be omitted
if the feedback is that the value is unknown anyway. Edge cases, but useful.

In my example, 1 also implied getting the actual value.

> Ovid

--
H.Merijn Brand https://tux.nl Perl Monger http://amsterdam.pm.org/
using perl5.00307 .. 5.33 porting perl5 on HP-UX, AIX, and Linux
https://tux.nl/email.html http://qa.perl.org https://www.test-smoke.org

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

Dec 18, 2021, 12:05 PM

Post #10 of 38 (1988 views)

On Sat, Dec 18, 2021 at 08:57:05AM +0000, Ovid via perl5-porters wrote:

> He wrote: "Unknown::Value from @OvidPerl looks very interesting. These objects can't compare, do math, or most of the other default behavior that undef allows. This would be awesome in core."
>
> Would there be interest?

So, to try to clarify, an RFC for this would need to be a design for how
`unknown` values interact with all parts of the language?

So, I think most unary and binary operators are fairly clear.

But it would also have to figure out everything else. So, say this:

$bar = $hash{$foo};

we need a spec for what $bar is, if $foo is unknown?

Which, I think, makes most sense as $bar also is unknown.

So continuing this thought experiment:

$hash{$foo} = $bar;

Where $foo is unknown, what happens? We have to specify that too...
I *guess* that this has to be a runtime error - "attempt to assign to an
unknown hash element"

etc.

Likewise:

$result = $foo ? bar() : baz();

I had thought - ternary, where $foo is unknown - can't take the true side,
can't take the false side, so logically surely it short-circuits to assign
$result as unknown.

But then that means that to be consistent with that, here:

    if ($foo) {
        bar();
    } else {
        baz();
    }

if $foo is unknown we can't enter either block

and then to be consistent with that, this:

    if ($foo) {
        bar();
    } elsif (rand < .5) {
        baz();
    } else {
        ...
    }

should be treated like this:

    if ($foo) {
        bar();
    } else {
        if (rand < .5) {
            baz();
        } else {
            ...
        }
    }

as soon as $foo is unknown, we can't take the if block, and we can't
take either side of the elsif,

and this seems nuts, hence ? : ternaries can't behave the way I first
suggested...

So an RFC would have to go down all these paths, to figure out how to make
a language that is consistent in how it handles unknown in all places?

Nicholas Clark

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

darren at darrenduncan

Dec 18, 2021, 3:02 PM

Post #11 of 38 (1988 views)

On 2021-12-18 12:05 p.m., Nicholas Clark wrote:
> Likewise:
>
> $result = $foo ? bar() : baz();
>
> I had thought - ternary, where $foo is unknown - can't take the true side,
> can't take the false side, so logically surely it short-circuits to assign
> $result as unknown.

I would agree with that.

> But then that means that to be consistent with that, here:
>
>     if ($foo) {
>         bar();
>     } else {
>         baz();
>     }
>
> if $foo is unknown we can't enter either block

Yes, and that's the answer. I would argue that when given an unknown input the
effect of this block would be a no-op.

Speaking more generally, the effect of unknown could be what Perl does with an
empty statement block, which as far as I recall is do nothing and return
undef/unknown.

> So an RFC would have to go down all these paths, to figure out how to make
> a language that is consistent in how it handles unknown in all places?

Yes we do.

-- Darren Duncan

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

leonerd at leonerd

Dec 18, 2021, 3:18 PM

Post #12 of 38 (1988 views)

On Sat, 18 Dec 2021 15:02:05 -0800
Darren Duncan <darren@darrenduncan.net> wrote:

> > But then that means that to be consistent with that, here:
> >
> >     if ($foo) {
> >         bar();
> >     } else {
> >         baz();
> >     }
> >
> > if $foo is unknown we can't enter either block
>
> Yes, and that's the answer. I would argue that when given an unknown
> input the effect of this block would be a no-op.

I think if you were to present this if/else code to anyone not
intimately familiar with this three-value logic idea, and tell them
there is a value you can put in $foo which makes neither branch
execute, they would look at you strangely and consider you quite mad.

There may be a Lewis Carroll quote applicable here...

Suffice to say, at this point I really don't feel like it's something
we want to be entertaining in a regular language with regular keywords.
If such a 3VL system were added, it really wants to have its own
keywords and syntax, so people are appropriately warned that Weird
Things are afoot.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

darren at darrenduncan

Dec 18, 2021, 4:44 PM

Post #13 of 38 (1988 views)

I want to try and clarify my position here.

The main thing is I feel that it is important in this discussion to treat some
matters separately that I consider very distinct:

1. Plain and focused 3-valued-logic which seems to be what Ovid is actually
advocating for, we have exactly 1 special value Unknown, and we deal with how
its appearance in every situation in place of any other value affects the behavior.

2. The idea of N-valued logic, referred to by H.Merijn Brand, where you have an
arbitrary number of special values where each adds a different dimension to the
logic.

My position is that it is N-valued logic and its arbitrary count of logical
dimensions that is the largest problem here. In general its complexity is
exponential, doubling with each new dimension, as each one needs to specifically
define its interactions with the others. For example, what does "unknown value"
plus "unauthorized" return?

So I would hope that one thing we can all agree on is that the idea of N-valued
logic is excluded from the Pre-RFC, and it sticks to pure 3-valued logic.

I can find acceptable plain 3-valued logic with clearly defined rules such that
every possible scenario involving one is completely deterministic with fully
defined behavior.
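
As a rough illustration of what "clearly defined rules" could look like, here is a minimal sketch of Kleene's three-valued connectives written as ordinary named subs, with the standard undef standing in for unknown. The names `not_3vl`, `and_3vl`, `or_3vl` and `lt_3vl` are hypothetical and are not part of Perl or of Unknown::Values.

```perl
use strict;
use warnings;

# Each sub returns 1 (true), 0 (false) or undef (unknown).

sub not_3vl {
    my ($x) = @_;
    return defined $x ? ( $x ? 0 : 1 ) : undef;
}

sub and_3vl {
    my ( $x, $y ) = @_;
    return 0 if ( defined $x && !$x ) || ( defined $y && !$y );    # false dominates
    return undef if !defined $x || !defined $y;                    # otherwise unknown spreads
    return 1;
}

sub or_3vl {
    my ( $x, $y ) = @_;
    return 1 if ( defined $x && $x ) || ( defined $y && $y );      # true dominates
    return undef if !defined $x || !defined $y;
    return 0;
}

sub lt_3vl {
    my ( $x, $y ) = @_;
    return undef if !defined $x || !defined $y;                    # comparison with unknown is unknown
    return $x < $y ? 1 : 0;
}

my @values  = ( 1, 2, 3, undef, 4, 5 );
my @less    = grep { lt_3vl( $_, 4 ) } @values;    # 1, 2, 3
my @greater = grep { lt_3vl( 3, $_ ) } @values;    # 4, 5

print "less=@less greater=@greater\n";
```

Every input combination has exactly one result, so the behavior is deterministic, and the unknown element is left out of both `grep` results.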

In fact I will pivot and semi-endorse Ovid's proposal for some kind of support
of 3-valued-logic in principle, and that it is mainly down to the details to
discuss.

I say semi-endorse because I recognize that when done well it can be very
helpful as Ovid has illustrated, but at the same time I don't consider it a
silver bullet and I feel that having it can introduce new problems when you
consider all of its implications. For one thing, automatic refactoring or
optimizations for performance could be a lot more complicated because some
assumptions that are safe under 2VL may not hold under 3VL.

The most important thing is that this new feature is fully backwards-compatible,
meaning that all of the 3-valued logic behavior is strictly tied to instances of
the new special unknown value, and that all Perl code will behave identically to
before in a Perl supporting the special unknown value where the special unknown
value isn't explicitly made present. So there are NO changes at all to how Perl
undef is treated in the absence of the special unknown value.

Since the presence of the special unknown value would fundamentally alter how a
lot of existing operators/builtins would behave, we would need to add some more
builtins to provide functionality that would then be missing. For example, if
we want to test whether something is or is not the special unknown value, we
would need new builtins that return true or false on that question, rather than
returning unknown. It must be possible to reason about anything involving the
special unknown value where the result is not that unknown value.
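
A minimal sketch of such a predicate, assuming a hypothetical singleton class `My::Unknown` whose ordering comparisons are always false. This is only a stand-in for illustration and is not how Unknown::Values or any core feature is implemented.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

package My::Unknown;    # hypothetical stand-in class

# Any ordering comparison that involves the unknown value is false.
use overload
    '<'  => sub { 0 },
    '>'  => sub { 0 },
    '<=' => sub { 0 },
    '>=' => sub { 0 };

my $singleton = bless {}, __PACKAGE__;
sub instance { return $singleton }

package main;

# Always answers with a plain 1 or 0, never with the unknown value itself.
sub is_unknown {
    my ($v) = @_;
    return blessed($v) && $v->isa('My::Unknown') ? 1 : 0;
}

my $unknown = My::Unknown->instance;
my @array   = ( 1, 2, 3, $unknown, 4, 5 );
my @less    = grep { $_ < 4 } @array;            # unknown is skipped
my @greater = grep { $_ > 3 } @array;            # and skipped here too
my @missing = grep { is_unknown($_) } @array;    # a definite answer

print "less=@less greater=@greater unknown_count=", scalar @missing, "\n";
```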

Given how the presence of "unknown" could make some workflows worse, or
alternately that in some workflows it should be encouraged, I feel that Perl
should provide mechanisms to flag when it is or isn't being used, similar to use
strict or use warnings. For example, have something that warns if unknown is
used as the input to any operation, similarly to what we have for undef.

Regarding full implications, what do we expect to happen if "unknown" is
interpolated into a string, does the whole string become "unknown"? What happens
if one says "print unknown;"? What if one asks, does this array or hash contain
any unknown elements?

I also suggest that we consider some new term or keyword to refer to the special
value rather than "unknown", something that is easy to search for and not get
lost in a haystack of other uses of that word in English. Ideally not one that
matches one of many possible reasons we might not have a regular value.
Although just using "unknown" isn't terrible.

==========

Now going off on a tangent, I feel that something we may want to look at is
something analogous to Raku's Failure concept.

The way I see it, there is a generalized concept of "regular value is missing
with a specific reason" such that N-valued logic might try to address, and the
concept of an exception in a typical language that gives a reason for the problem.

So what I propose would be useful, and this can be implemented with either
2-valued logic or 3-valued logic, is that any Perl code which would otherwise
produce undef or unknown to represent that it is not giving a normal answer, it
instead returns a value of a dedicated type that represents a declaration that
we don't have a regular result and here is the specific reason why.

Providing that would be a similar complexity level to providing a formal
exception class hierarchy, including that Perl has some built-in and users can
and frequently would define their own.

I will refer to this concept for the moment with "Excuse".

So we have a distinct core data type called Excuse which is like the singleton
Unknown concept in some ways but that Excuse is basically an object-like
collection type whose contents specify the reason.

So rather than treating a singleton unknown as special, the 3-valued logic would
treat any instance of Excuse as special, but each reason does NOT add a logic
dimension, rather it is just further information "if you want to know".

Having this, we can both support logic like Ovid's demonstrations where we don't
want to have to care about a reason a regular value is missing and just do the
right thing when a regular value is missing, and we can support logic where we
also do want to know WHY the regular value is missing, we can ask that question,
and be able to distinguish say "nothing matched the search query" from "there
was a match but you don't have permission to see it" and so on.

-- Darren Duncan

On 2021-12-18 3:09 a.m., Ovid via perl5-porters wrote:
> Yes, SQL NULL is broken in fundamental ways that CJ Date shows here: https://www.oreilly.com/library/view/sql-and-relational/9781449319724/ch04s04.html
>
> And yes, I've been bitten by that bug in SQL in real-world code. Once. In over two decades. And I write lots of SQL. *Most* of the time, however, the 3VL NULL is what we need. Can you imagine if NULL followed "undef" behavior?
>
> SELECT count(*) FROM things WHERE value > ?;
>
> That would be a disaster and it's easily replicable in Perl:
>
> my $total = grep { $_->value > $limit } @things;
>
> I, for one, am tired of writing code like this:
>
> my $total = grep { defined $_->value ? $_->value > $limit : 0 } @things;
>
> Note: the following is *not* equivalent to the above:
>
> my $total = grep { ( $_->value // 0 ) > $limit } @things;
>
> I mean, it *looks* correct, but what if the value can be a negative number and the limit can be negative? You probably then want this:
>
> my $total = grep { ( $_->value // ( $limit - 1 ) ) > $limit } @things;
>
> Which arguably might be more confusing than using defined. With 3VL, we have this:
>
> my $total = grep { $_->resolution < $limit } @things;
>
> Worse, I'm tired of tracking down bugs caused by this.
>
> 2VL logic on undef/null values has been broken for a long time and forces developers to remember to always write special-case code to handle this.
>
> However, while we could correct the underlying issue, going further into 4VL or 5VL adds complications that I doubt most developers are going to understand. In other words, SIMPLICITY IS YOUR FRIEND.
>
> We don't need "perfect" because making something that covers all possible cases is simply going to be a mess and might even be counter-productive. For example, if you're unauthorized to get a value but you see that it's a "known defined value", that's an information leak. Also, given Merijn's original list:
>
> 1. Known defined value
> 2. Known undefined value
> 3. Unknown value
> 4. Unauthorized to get the value
> 5. Value is defined but unauthorized to get it
>
> I don't see how 4+1 is different from 5. So we can bikeshed this to death, or fix the major underlying problem: $salary += 1000. Congrats. You've just given a raise to an unpaid volunteer.

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

darren at darrenduncan

Dec 18, 2021, 5:29 PM

Post #14 of 38 (1988 views)

Another implication we also have to think about is how this would affect
interchange with other languages or serialization formats.

In particular, JSON has "null".

Before this proposal, it seemed fairly clear cut that a Perl "undef" would map
to JSON "null" bi-directionally and losslessly.

So if we also have "unknown", then what would a JSON "null" map to on importing
from JSON, and would both "undef" and "unknown" map to null on exporting to JSON?

I would suggest that both "unknown" and "undef" map to null when exporting to
JSON, and that when importing from JSON there is a configuration option to the
importer that says whether to use "unknown" or "undef".
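
A rough sketch of that mapping using the core JSON::PP module. The `null_as` option, the helper subs and the `My::Unknown` placeholder are all hypothetical; JSON::PP itself has no such option, so the conversion is done by walking the data before encoding and after decoding.

```perl
use strict;
use warnings;
use JSON::PP ();
use Scalar::Util qw(blessed);

# Hypothetical placeholder for an "unknown" value.
my $UNKNOWN = bless {}, 'My::Unknown';

# Return a copy of a nested array/hash structure with $code applied to each leaf.
sub map_leaves {
    my ( $data, $code ) = @_;
    if ( ref $data eq 'ARRAY' ) {
        return [ map { scalar map_leaves( $_, $code ) } @$data ];
    }
    if ( ref $data eq 'HASH' ) {
        my %copy;
        $copy{$_} = map_leaves( $data->{$_}, $code ) for keys %$data;
        return \%copy;
    }
    return $code->($data);
}

# Export: both undef and unknown become JSON null.
sub export_json {
    my ($data) = @_;
    my $plain = map_leaves(
        $data,
        sub {
            my ($v) = @_;
            return undef if blessed($v) && $v->isa('My::Unknown');
            return $v;
        }
    );
    return JSON::PP->new->canonical->encode($plain);
}

# Import: null becomes undef by default, or unknown when null_as => 'unknown'.
sub import_json {
    my ( $json, %opt ) = @_;
    my $data = JSON::PP->new->decode($json);
    return $data unless ( $opt{null_as} // 'undef' ) eq 'unknown';
    return map_leaves( $data, sub { defined $_[0] ? $_[0] : $UNKNOWN } );
}

print export_json( { salary => $UNKNOWN, bonus => undef } ), "\n";
```

The export direction is lossy by design: once both values have become null, only the importer's option decides which one comes back.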

-- Darren Duncan

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

Dec 19, 2021, 2:01 AM

Post #15 of 38 (1988 views)

Good thinking points, but:

Op 19-12-2021 om 01:44 schreef Darren Duncan:
> My position is that it is N-valued logic and its arbitrary count of
> logical dimensions that is the largest problem here. In general its
> complexity is exponential, doubling with each new dimension, as each
> one needs to specifically define its interactions with the others.
> For example, what does "unknown value" plus "unauthorized" return?
>
...
>
> So rather than treating a singleton unknown as special, the 3-valued
> logic would treat any instance of Excuse as special, but each reason
> does NOT add a logic dimension, rather it is just further information
> "if you want to know".
>

So what does Excuse("unknown value") plus Excuse("unauthorized") return?
You just reintroduced the same problem that made you decide against NVL,
unless I'm missing something.

HTH,

M4

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

Dec 19, 2021, 2:13 AM

Post #16 of 38 (1988 views)

Op 19-12-2021 om 00:18 schreef Paul "LeoNerd" Evans:
> I think if you were to present this if/else code to anyone not
> intimately familiar with this three-value logic idea, and tell them
> there is a value you can put in $foo which makes neither branch
> execute, they would look at you strangely and consider you quite mad.
>
> There may be a Lewis Carroll quote applicable here...

My thoughts exactly. But even worse, this will introduce bugs in
existing code, unless you need a feature flag for this to happen, in
which case code without the feature flag (for instance a library called
from code that uses unknown) running against this situation probably
should die. Silently skipping both branches of an if is insane if not
guarded by a feature flag and only slightly less insane with.

So what to do? Make such code die? But that reintroduces the original
problem, then we have a new singleton value which avoids many calls to
defined(), just to replace them with calls to is_unknown() (in other
places, but still).

I do think there must be a satisfactory answer to this question before
we can proceed with implementation. It does not have to stand in the way
of an RFC.

HTH,

M4

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

darren at darrenduncan

Dec 19, 2021, 2:41 AM

Post #17 of 38 (1988 views)

On 2021-12-19 2:01 a.m., Martijn Lievaart wrote:
> Op 19-12-2021 om 01:44 schreef Darren Duncan:
>> My position is that it is N-valued logic and its arbitrary count of logical
>> dimensions that is the largest problem here. In general its complexity is
>> exponential, doubling with each new dimension, as each one needs to
>> specifically define its interactions with the others. For example, what does
>> "unknown value" plus "unauthorized" return?
>>
>> So rather than treating a singleton unknown as special, the 3-valued logic
>> would treat any instance of Excuse as special, but each reason does NOT add a
>> logic dimension, rather it is just further information "if you want to know".
>
> So what does Excuse("unknown value") plus Excuse("unauthorized") return? You just
> reintroduced the same problem that made you decide against NVL, unless I'm
> missing something.

The answer is that attempting to do that fails because addition is not defined
for Excuse values. For example, it could result in something like:

Excuse("No_Matching_Routine_Found", "plus", [@_])

That Excuse value can either be returned, or if desired, thrown as an exception.

A key part of what makes this NOT N-valued-logic is that the result of trying to
use some arbitrary non-existent routine with an Excuse value always returns the
same base result, an Excuse of subtype "No_Matching_Routine_Found", with added
info naming the name and arguments attempted, and that there is NOT logic to
test every possible Excuse subtype to determine what kind of result to give.

Basically an Excuse is an unthrown exception, and is treated similarly.

In this context, the version where there is exactly the 1 singleton Unknown etc
is analogous to die() always throwing the same generic exception "I died"
without any further information.
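
To make that concrete, here is a minimal sketch in Perl, assuming a hypothetical `Excuse` class (no existing module is implied). Addition is overloaded only to report that no matching routine exists; the reasons carried by the operands are never inspected, so adding new reasons adds no new logic.

```perl
use strict;
use warnings;

package Excuse;    # hypothetical class for illustration

use overload '+' => \&_no_routine_plus;

sub new {
    my ( $class, $reason, @details ) = @_;
    return bless { reason => $reason, details => [@details] }, $class;
}

sub reason  { return $_[0]{reason} }
sub details { return @{ $_[0]{details} } }

# Same base result whatever the operands' reasons are.
sub _no_routine_plus {
    my ( $left, $right, $swapped ) = @_;
    ( $left, $right ) = ( $right, $left ) if $swapped;
    return Excuse->new( 'No_Matching_Routine_Found', 'plus', [ $left, $right ] );
}

package main;

my $sum = Excuse->new('Unknown') + Excuse->new('Unauthorized');
my ( $routine, $args ) = $sum->details;

print $sum->reason, " for '$routine'\n";
print "operands were: ", join( ', ', map { $_->reason } @$args ), "\n";
```

The caller can return `$sum` as is, or pass it to `die` if an exception is preferred.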

-- Darren Duncan

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

Dec 19, 2021, 2:46 AM

Post #18 of 38 (1988 views)

Op 19-12-2021 om 11:41 schreef Darren Duncan:
> On 2021-12-19 2:01 a.m., Martijn Lievaart wrote:
>>
>> So what does Excuse("unknown value") plus Excuse("unauthorized")
>> return? You just reintroduced the same problem that made you decide
>> against NVL, unless I'm missing something.
>
> The answer is that attempting to do that fails because addition is not
> defined for Excuse values. For example, it could result in something
> like:
>
> Excuse("No_Matching_Routine_Found", "plus", [@_])
>
> That Excuse value can either be returned, or if desired, thrown as an
> exception.
>
> A key part of what makes this NOT N-valued-logic is that the result of
> trying to use some arbitrary non-existent routine with an Excuse value
> always returns the same base result, an Excuse of subtype
> "No_Matching_Routine_Found", with added info naming the name and
> arguments attempted, and that there is NOT logic to test every
> possible Excuse subtype to determine what kind of result to give.
>
> Basically an Excuse is an unthrown exception, and is treated similarly.
>
> In this context, the version where there is exactly the 1 singleton
> Unknown etc is analogous to die() always throwing the same generic
> exception "I died" without any further information.
>

Ahh, makes sense. I'm still not completely convinced it's a good idea
(nor saying it isn't!), but I agree you solved the original problem.

HTH,

M4

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

darren at darrenduncan

Dec 19, 2021, 2:48 AM

Post #19 of 38 (1988 views)

On 2021-12-19 2:13 a.m., Martijn Lievaart wrote:
> Op 19-12-2021 om 00:18 schreef Paul "LeoNerd" Evans:
>> I think if you were to present this if/else code to anyone not
>> intimately familiar with this three-value logic idea, and tell them
>> there is a value you can put in $foo which makes neither branch
>> execute, they would look at you strangely and consider you quite mad.
>>
>> There may be a Lewis Carroll quote applicable here...
>
> My thoughts exactly. But even worse, this will introduce bugs in existing code,
> unless you need a feature flag for this to happen, in which case code without
> the feature flag (for instance a library called from code that uses unknown)
> running against this situation probably should die. Silently skipping both
> branches of an if is insane if not guarded by a feature flag and only slightly
> less insane with.
>
> So what to do? Make such code die? But that reintroduces the original problem,
> then we have a new singleton value which avoids many calls to defined(), just to
> replace them with calls to is_unknown() (in other places, but still).
>
> I do think there must be a satisfactory answer to this question before we can
> proceed with implementation. It does not have to stand in the way of an RFC.

Well the other reasonable solution is to treat the pair as "if the expression
result is true" and "if the expression result is anything else". I actually
thought of proposing that and then I didn't.

This also means a ternary expression does NOT result in unknown if the test
expression is unknown, rather it returns the "else" expression.

However, doing that means these 2 things will behave differently in the presence
of unknown where they wouldn't otherwise:

    if ($x) {
        foo();
    }
    else {
        bar();
    }

    if ($x) {
        foo();
    }
    if (!$x) {
        bar();
    }

Now that may be less surprising and more ideal in the presence of unknown.
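
A small sketch of the difference, with undef standing in for unknown and a hypothetical `not_3vl` sub playing the role of a 3VL-aware `!` that keeps unknown as unknown. It models the "else means anything other than true" rule and is not a statement about how a real implementation would work.

```perl
use strict;
use warnings;

# Hypothetical 3VL negation: returns 1, 0, or undef for unknown.
sub not_3vl {
    my ($x) = @_;
    return defined $x ? ( $x ? 0 : 1 ) : undef;
}

# Form 1: if/else, where else runs for anything that is not true.
sub if_else {
    my ($x) = @_;
    my @ran;
    if   ($x) { push @ran, 'foo' }
    else      { push @ran, 'bar' }
    return @ran;
}

# Form 2: two separate tests, the second on the 3VL negation.
sub if_then_if_not {
    my ($x) = @_;
    my @ran;
    if ($x)            { push @ran, 'foo' }
    if ( not_3vl($x) ) { push @ran, 'bar' }
    return @ran;
}

for my $x ( 1, 0, undef ) {
    printf "%-7s if/else: %-3s  if + if-not: %s\n",
        defined $x ? $x : 'unknown',
        join( ',', if_else($x) ),
        join( ',', if_then_if_not($x) ) || '(nothing)';
}
```

For 1 and 0 both forms run the same branch. For unknown, the first form runs `bar` and the second runs nothing.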

What do you think of this alternative?

-- Darren Duncan

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

Dec 19, 2021, 2:53 AM

Post #20 of 38 (1988 views)

Op 19-12-2021 om 11:48 schreef Darren Duncan:
>
> Well the other reasonable solution is to treat the pair as "if the
> expression result is true" and "if the expression result is anything
> else". I actually thought of proposing that and then I didn't.
>
> This also means a ternary expression does NOT result in unknown if the
> test expression is unknown, rather it returns the "else" expression.
>
> However, doing that means these 2 things will behave differently in
> the presence of unknown where they wouldn't otherwise:
>
> if ($x) {
>     foo();
> }
> else {
>     bar();
> }
>
> if ($x) {
>     foo();
> }
> if (!$x) {
>     bar();
> }
>
> Now that may be less surprising and more ideal in the presence of
> unknown.
>
> What do you think of this alternative?
>

I like it from a conceptual point of view. It feels right. It feels
correct. But only for new code that is written with this 3VL in mind.
Otherwise I think it will open a can of bugs in existing code, so that
probably should still die.

HTH,

M4

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

darren at darrenduncan

Dec 19, 2021, 3:34 AM

Post #21 of 38 (1988 views)

On 2021-12-19 2:53 a.m., Martijn Lievaart wrote:
> Op 19-12-2021 om 11:48 schreef Darren Duncan:
>> Well the other reasonable solution is to treat the pair as "if the expression
>> result is true" and "if the expression result is anything else". I actually
>> thought of proposing that and then I didn't.
>>
>> However, doing that means these 2 things will behave differently in the
>> presence of unknown where they wouldn't otherwise:
>>
>> if ($x) {
>>     foo();
>> }
>> else {
>>     bar();
>> }
>>
>> if ($x) {
>>     foo();
>> }
>> if (!$x) {
>>     bar();
>> }
>>
> I like it from a conceptual point of view. It feels right. It feels correct. But
> only for new code that is written with this 3VL in mind. Otherwise I think it
> will open a can of bugs in existing code, so that probably should still die.

I still think, however, my original proposal of neither if nor else running when
an unknown is input, is probably more correct and consistent with the wider context.

That interpretation means both of the above examples behave the same as each
other, same as would be the case under 2VL.

A stronger supporting point is that the do-nothing answer is more consistent
with the examples Ovid gave about desired behavior for "grep" or such, which is
that unknown values are omitted from the result no matter whether the grep
condition tests for truthiness or falsiness, which is also the same as how SQL
WHERE behaves with NULL.
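
The grep behavior described above can be sketched in plain Perl. This is only an illustration under stated assumptions: `My::Unknown` is a hypothetical stand-in class, not the Unknown::Values module, and it overloads just the operators the example needs. The key detail is that `!` must be overloaded too, otherwise Perl derives it from `bool` and the negated test would let the unknown value through.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an "unknown" value; NOT the Unknown::Values module.
package My::Unknown {
    use overload
        '<'    => sub { $_[0] },    # comparing with unknown yields unknown
        '>'    => sub { $_[0] },
        '!'    => sub { $_[0] },    # negating unknown is still unknown
        'bool' => sub { 0 };        # unknown never satisfies a condition
    sub new { my $class = shift; return bless {}, $class }
}

my $unknown = My::Unknown->new;
my @array   = ( 1, 2, 3, $unknown, 4, 5 );

# The unknown element is dropped by the test and by its negation,
# much as SQL WHERE drops rows whose predicate evaluates to NULL.
my @less     = grep { $_ < 4 } @array;         # (1, 2, 3)
my @not_less = grep { !( $_ < 4 ) } @array;    # (4, 5)
```

Note that this only covers expression-level filtering. Making both branches of an if/else skip an unknown condition is not something overloading can express; that would need support in the language itself.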

-- Darren Duncan

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

perl5-porters at perl

Dec 19, 2021, 4:07 AM

Post #22 of 38 (1988 views)

* Darren Duncan <darren@darrenduncan.net> [2021-12-19 03:34:06 -0800]:

> On 2021-12-19 2:53 a.m., Martijn Lievaart wrote:
> > Op 19-12-2021 om 11:48 schreef Darren Duncan:
> > > Well the other reasonable solution is to treat the pair as "if the
> > > expression result is true" and "if the expression result is anything
> > > else". I actually thought of proposing that and then I didn't.
> > >
> > > However, doing that means these 2 things will behave differently in
> > > the presence of unknown where they wouldn't otherwise:
> > >
> > >   if ($x) {
> > >       foo();
> > >   }
> > >   else {
> > >       bar();
> > >   }
> > >
> > >   if ($x) {
> > >       foo();
> > >   }
> > >   if (!$x) {
> > >       bar();
> > >   }
> > >
> > I like it from a conceptual point of view. It feels right. It feels
> > correct. But only for new code that is written with this 3VL in mind.
> > Otherwise I think it will open a can of bugs in existing code, so that
> > probably should still die.
>
> I still think, however, my original proposal of neither if nor else running
> when an unknown is input, is probably more correct and consistent with the
> wider context.
>
> That interpretation means both of the above examples behave the same as each
> other, same as would be the case under 2VL.
>
> A stronger supporting point is that the do-nothing answer is more consistent
> with the examples Ovid gave about desired behavior for "grep" or such, which
> is that unknown are omitted from the result no matter whether the grep
> condition tests for truthiness or falsiness, which is also the same as how
> SQL WHERE behaves with NULL.
>
> -- Darren Duncan

I think this concept is above my head. For example, looking at the examples on
Unknown::Values it's hard for me to figure out what's different. I get the idea
of an actual "unknown" or "unable to get" value, but I keep going to the notion
that this is an attribute of the data or access method; and therefore would be
more appropriate implemented as:

* a special "value" with a class
* something that could benefit from Tie
* something still that could benefit from a suite of overloaded ops
* something that could be expected to throw an exception rather than affect
the semantics of ordinary looking code

The closest concept I can think of that I might be familiar with is Fortran's
NaN, which is part of the IEEE floating-point arithmetic standard. It consists
of special values that can be detected in hardware or software.

If this was to be implemented as a special value, I do indeed think that the
existing operators and keywords should behave in a way that is consistent with
current coercion of undef, falsey things, etc. In addition to this, it makes
sense to me to expect there to be a keyword that checks explicitly for an
unknown value, in the same way C<defined> works with undef; which provides
an extra degree of precision when you really want to differentiate undef from
a falsey value (q{}, $x <=, e.g.).

    # give a 5% raise to lowest wage earners who are not volunteers
    if (defined $emp->salary and not unknown $emp->salary and $emp->salary < 4_000) {
        $emp->salary($emp->salary + $emp->salary * 0.05);
    }

Cheers,
Brett

--
--
oodler@cpan.org
oodler577@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdfeu.org
irc.perl.org #openmp #pdl #native

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

perl5-porters at perl

Dec 19, 2021, 4:13 AM

Post #23 of 38 (1988 views)

The if/else is actually pretty simple if we step back for a moment. I think the confusion is that we misunderstand what an "else" block means in Perl. Let's consider this:

    if ( $var > 3 ) {
        ...
    }
    else {
        ...
    }

In the above, in the else block, we mentally assume that "$var <= 3" holds. In many statically typed languages, that assumption might hold true.

In Perl, $var might be undef and be evaluated as less than three. However, $var might be the string "Hello, World". $var might also be a reference to a hash, we get absolutely no warning, and we hit our else block with an assumption that is probably true ($var <= 3), but not in this particular case. We _should_ be verifying what kind of data that $var holds, but usually we don't.

Thus, in a dynamic language like Perl, barring validating our data up front, the else block very often makes no guarantees about what kinds of data that we have. Thus, we have this (pseudo-code):

    if I can verify some condition:
        # take some action
    else:
        # condition not verified

An else block in Perl tends not to be the negation of the previous "if" so much as a "catch" for unverified conditions.

Following the principle of least surprise, an unknown value would hit the else branch in the above code because it matches the semantics of Perl. We didn't have guarantees before, we don't have guarantees now, but if they try to do something with that unknown value, they currently get plenty of warnings.
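
To make the "else as a catch-all" point concrete, here is a minimal sketch in today's Perl. `branch_for` is a hypothetical helper invented for this illustration, and the `uninitialized` and `numeric` warnings are switched off deliberately so that only the branching semantics are visible.

```perl
use strict;
use warnings;
no warnings qw(uninitialized numeric);    # silenced on purpose for the demo

# Hypothetical helper: report which branch a value ends up in.
sub branch_for {
    my ($var) = @_;
    if ( $var > 3 ) {
        return 'if';
    }
    else {
        return 'else';    # reached by "not greater than 3" AND by junk input
    }
}

# undef and a non-numeric string both numify to 0, so they land in the
# else branch even though nobody verified that "$var <= 3" is meaningful.
my @results = map { branch_for($_) } ( 5, 2, undef, "Hello, World" );
# @results is ('if', 'else', 'else', 'else')
```

Under that reading, an unknown value falling into the else branch would be in the same position that undef and the string occupy in this sketch.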

Side note: for the ultra-paranoid, I allow "use Unknown::Values 'fatal';" which tries to make any use of unknown values (aside from testing if they're unknown) a fatal error. So you need to do this to be safe:

    my @filtered = grep { !is_unknown $_ && $_ > 3 } @list;

If unknown values had their own warnings category:

    use feature 'unknown';
    use warnings FATAL => 'unknown';

Devs could opt-in to the strictness they want.

Also, I would kill to be able to write a class and have attributes/slots/fields, whatever, default to `unknown` instead of `undef`. The "is this slot initialized" problem sorta goes away.

Best,
Ovid
--
IT consulting, training, specializing in Perl, databases, and agile development
http://www.allaroundtheworld.fr/.

Buy my book! - http://bit.ly/beginning_perl

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

perl5-porters at perl

Dec 19, 2021, 4:28 AM

Post #24 of 38 (1988 views)

Thank you, this is super helpful. My final comment is just to
reiterate what I most recently said; as long as this doesn't
affect how things currently work with undef/q{}/0 and existing
built-ins/ops; and we get a C<unknown> built-in that does for
unknown values what C<defined> does for undef'd values, then I
can see value in it...particularly for differentiating
uninitialized data. And if the built-in came to pass, I suppose
there'd be some sort of need for the equivalent of //, for unknown.
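
A rough sketch of what a "//" equivalent for unknown might look like as an ordinary function. Everything named here is an assumption made for illustration: `My::Unknown` is a hypothetical stand-in class, and `is_unknown` and `unknown_or` are hypothetical helpers, not an existing or proposed core API. Unlike a real operator, the function evaluates its default eagerly.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-in for an unknown value; NOT the Unknown::Values module.
package My::Unknown {
    sub new { my $class = shift; return bless {}, $class }
}

# Hypothetical predicate, playing the role that "defined" plays for undef.
sub is_unknown {
    my ($value) = @_;
    return blessed($value) && $value->isa('My::Unknown') ? 1 : 0;
}

# Hypothetical analogue of //: keep the value unless it is unknown.
# Unlike //, both arguments are evaluated before the call.
sub unknown_or {
    my ( $value, $default ) = @_;
    return is_unknown($value) ? $default : $value;
}

my $unknown = My::Unknown->new;
my $salary  = unknown_or( $unknown, 1000 );    # default used: 1000
my $zero    = unknown_or( 0, 1000 );           # 0 is a known value, so kept
my $undef   = unknown_or( undef, 1000 );       # undef passes through untouched
```

The last line shows the separation asked for above: undef and unknown stay distinct, so existing undef handling is left alone.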

As an exercise, I wonder how many use cases for undef would remain
if unknown was available. If the answer is "not many", then maybe
the answer would be a compatible tweak to undef and not the
creation of a new special value. Just a thought...

Brett

* Ovid via perl5-porters <perl5-porters@perl.org> [2021-12-19 12:13:34 +0000]:

> The if/else is actually pretty simple if we step back for a moment. I think the confusion is that we misunderstand what an "else" block means in Perl. Let's consider this:
>
>     if ( $var > 3 ) {
>         ...
>     }
>     else {
>         ...
>     }
>
> In the above, in the else block, we mentally assume that "$var <= 3" holds. In many statically typed languages, that assumption might hold true.
>
> In Perl, $var might be undef and be evaluated as less than three. However, $var might be the string "Hello, World". $var might also be a reference to a hash, we get absolutely no warning, and we hit our else block with an assumption that is probably true ($var <= 3), but not in this particular case. We _should_ be verifying what kind of data that $var holds, but usually we don't.
>
> Thus, in a dynamic language like Perl, barring validating our data up front, the else block very often makes no guarantees about what kinds of data that we have. Thus, we have this (pseudo-code):
>
>     if I can verify some condition:
>         # take some action
>     else:
>         # condition not verified
>
> An else block in Perl tends not to be the negation of the previous "if" so much as a "catch" for unverified conditions.
>
> Following the principle of least surprise, an unknown value would hit the else branch in the above code because it matches the semantics of Perl. We didn't have guarantees before, we don't have guarantees now, but if they try to do something with that unknown value, they currently get plenty of warnings.
>
> Side note: for the ultra-paranoid, I allow "use Unknown::Values 'fatal';" which tries to make any use of unknown values (aside from testing if they're unknown) a fatal error. So you need to do this to be safe:
>
>     my @filtered = grep { !is_unknown $_ && $_ > 3 } @list;
>
> If unknown values had their own warnings category:
>
>     use feature 'unknown';
>     use warnings FATAL => 'unknown';
>
> Devs could opt-in to the strictness they want.
>
> Also, I would kill to be able to write a class and have attributes/slots/fields, whatever, default to `unknown` instead of `undef`. The "is this slot initialized" problem sorta goes away.
>
> Best,
> Ovid
> --
> IT consulting, training, specializing in Perl, databases, and agile development
> http://www.allaroundtheworld.fr/.
>
> Buy my book! - http://bit.ly/beginning_perl
>

--
--
oodler@cpan.org
oodler577@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdfeu.org
irc.perl.org #openmp #pdl #native

Re: Pre-RFC: `unknown` versus `undef` [ In reply to ]

perl5-porters at perl

Dec 19, 2021, 4:45 AM

Post #25 of 38 (1988 views)

On Sunday, 19 December 2021, 13:28:23 CET, Oodler 577 <oodler577@sdf-eu.org> wrote:

> Thank you, this is super helpful. My final comment is just to
> reiterate what I most recently said; as long as this doesn't
> affect how things currently work with undef/q{}/0 and existing
> built-ins/ops; and we get a C<unknown> built-in that does for
> unknown values what C<defined> does for undef'd values,

For interpolation, I would suggest it behave like undef, but with a warning. I would (only half-joking here) also consider having it stringify to U+FFFD REPLACEMENT CHARACTER.

    my $name = unknown;
    say "Hello, $name!";

Output:

Use of unknown value $name in say at ...
Hello, ?!
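
The interpolation idea can be approximated today with string overloading. This is a sketch under stated assumptions: `My::Unknown` is a hypothetical stand-in class, not the Unknown::Values module, and the warning text is made up for the example (a real core feature could name the variable and the op, which overloading cannot).

```perl
use strict;
use warnings;

# Hypothetical stand-in for an unknown value; NOT the Unknown::Values module.
package My::Unknown {
    use overload '""' => sub {
        warn "Use of unknown value in string\n";    # made-up warning text
        return "\x{FFFD}";                          # U+FFFD REPLACEMENT CHARACTER
    };
    sub new { my $class = shift; return bless {}, $class }
}

# Collect warnings instead of printing them, so the effect can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $name     = My::Unknown->new;
my $greeting = "Hello, $name!";    # interpolation triggers the overload
```

After this runs, `$greeting` contains the replacement character where the name would have been, and `@warnings` holds the message emitted during stringification.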

> As an exercise, I wonder how many use cases for undef would remain
> if unknown was available. If the answer is "not many", then maybe
> the answer would be a compatible tweak to undef and not the
> creation of a new special value. Just a thought...

I would not recommend changing current behavior of undef. That would be widespread carnage.

Best,
Ovid
--
IT consulting, training, specializing in Perl, databases, and agile development
http://www.allaroundtheworld.fr/.

Buy my book! - http://bit.ly/beginning_perl