Mailing List Archive: A troubling thought

A troubling thought - smartmatch reïmagined

Jun 25, 2022, 1:13 PM

Post #1 of 13 (611 views)

Preface: This is less a concrete idea, and more a rambling set of
thoughts that lead me to a somewhat awkward place. I'm writing it out
here in the hope that others can lend suggestions and ideas, and see
if we can arrive at a better place.

This started off as a quick email, that turned into a long email, that
became a *very* long email that I eventually decided to turn into a
blog post.

For the full content, you can see

https://leonerds-code.blogspot.com/2022/06/a-troubling-thought-smartmatch.html

but as a summary:

I wonder if we're going about things the wrong way, regarding strings
vs numbers, equality test operators, smartmatch, and a bunch of other
ideas. Maybe - maybe - now that we can distinguish strings vs numbers
a little better, there's a way we can make a single operator Do The
Right Thing.

Comments welcome - on blog or replied here.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/

Re: A troubling thought - smartmatch reïmagined

leonerd at leonerd

Jun 25, 2022, 1:23 PM

Post #2 of 13 (611 views)


By way of an addendum, here's an entire additional section I was going
to write in the blog post. It doesn't directly relate to the thoughts
presented there, but it's another perspective on things:

## Comparisons on User-defined Classes

Specifically, let's think about an object type like String::Tagged. In
brief, for people not familiar with it, it's an object class that
presents string-like behaviour, but with extra data stored on extents
of the string buffer. For example, it might want to store formatting
information like colours and font-rendering attributes. There are various
subclasses of it for handling HTML, POD, etc.

Right now, it doesn't actually define an `eq` operator, so the usual
fallback logic will apply and just compare the individual characters in
the actual string buffer, ignoring any of the extra (formatting) tags.
E.g.

    my $x = String::Tagged->new_tagged( "This is my message",
        fg => "red"
    );
    my $y = String::Tagged->new_tagged( "This is my message",
        fg => "green"
    );

    say "Equal" if $x eq $y;

will claim the two strings are equal. This probably isn't what the user
wants. Probably what should happen is that the String::Tagged class
should provide an overloaded `eq` operator that compares not only that
the string buffers are equal, but also that any additional tags are all
equal as well. Thus, given the above example, the operator would
conclude the instances are not equal, because they have a different
value for the 'fg' tag across the whole range.
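
As a rough sketch of what such an overloaded `eq` could look like, here is a
small stand-in class. My::TaggedString is hypothetical and is not the real
String::Tagged API; it also assumes every tag covers the whole string, which
sidesteps the per-extent bookkeeping. Note that it compares tag values with
`eq`, which is exactly the choice questioned next.

```perl
use v5.14;
use warnings;

# Hypothetical stand-in for a tagged-string class; NOT the real
# String::Tagged API. Tags are assumed to apply to the whole string.
package My::TaggedString {
    use Scalar::Util qw(blessed);
    use overload
        '""' => sub { $_[0]{str} },
        'eq' => \&_equals,
        'ne' => sub { !_equals(@_) };

    sub new {
        my ( $class, $str, %tags ) = @_;
        return bless { str => $str, tags => {%tags} }, $class;
    }

    sub _equals {
        my ( $x, $y ) = @_;

        # Against a plain scalar, fall back to comparing the text only
        return "$x" eq "$y"
            unless blessed($y) && $y->isa(__PACKAGE__);

        return 0 unless $x->{str} eq $y->{str};

        my ( $xt, $yt ) = ( $x->{tags}, $y->{tags} );
        return 0 unless keys %$xt == keys %$yt;

        for my $name ( keys %$xt ) {
            return 0 unless exists $yt->{$name};
            # Assumption: tag values are compared as strings
            return 0 unless $xt->{$name} eq $yt->{$name};
        }
        return 1;
    }
}

my $x = My::TaggedString->new( "This is my message", fg => "red" );
my $y = My::TaggedString->new( "This is my message", fg => "green" );
my $z = My::TaggedString->new( "This is my message", fg => "red" );

say "x vs y: ", ( $x eq $y ? "equal" : "not equal" );    # not equal
say "x vs z: ", ( $x eq $z ? "equal" : "not equal" );    # equal
```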

This is where the problem starts to become apparent. By what choice of
comparison operator should that nested test operate? In particular,
let's think about the following pair of instances:

    my $x = String::Tagged->new_tagged( "message",
        fg => "red", weight => "1.0"
    );
    my $y = String::Tagged->new_tagged( "message",
        fg => "red", weight => 1.0
    );

It's fairly obvious to the casual human observer that these ought to be
equal, yes? Same string, same set of tags with the same values...

But wait (pardon the phrasing) - what about the value of the 'weight'
tag here? It has the string value "1.0" in the first instance, but the
numerical value 1.0 in the second. A test performed by `==` would
compare these values equal, but by `eq` they would not.
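
The difference is easy to check in isolation. This short snippet uses two
illustrative variables (the names are made up here) holding the same values
as the 'weight' tags above:

```perl
use v5.14;
use warnings;

my $as_string = "1.0";
my $as_number = 1.0;

# Numeric comparison converts the string "1.0" to the number 1
say "== : ", ( $as_string == $as_number ? "equal" : "not equal" );    # equal

# String comparison stringifies the number 1.0, which yields "1"
say "eq : ", ( $as_string eq $as_number ? "equal" : "not equal" );    # not equal

say "number as a string: '$as_number'";                               # '1'
```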

Superficially, the thought might appear that some specific subclass of
String::Tagged, built for a given use-case, might want to provide
metadata about each of its possible tag types, to say how to compare
them; but that has a lot of troubling consequences. Plus it won't work
nicely for ad-hoc uses.

It sort of feels like this is another case where if we had a single
equality-test operator that could better pick its comparison semantics,
it would be much easier to distribute downwards through these objects
without declaring that sort of typing information upfront.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/

Re: A troubling thought - smartmatch reimagined

me at xenu

Jun 25, 2022, 9:46 PM

Post #3 of 13 (611 views)


On Sat, 25 Jun 2022 21:13:17 +0100
"Paul \"LeoNerd\" Evans" <leonerd@leonerd.org.uk> wrote:

> Preface: This is less a concrete idea, and more a rambling set of
> thoughts that lead me to a somewhat awkward place. I'm writing it out
> here in the hope that others can lend suggestions and ideas, and see
> if we can arrive at a better place.
>
> This started off as a quick email, that turned into a long email, that
> became a *very* long email that I eventually decided to turn into a
> blog post.
>
> For the full content, you can see
>
> https://leonerds-code.blogspot.com/2022/06/a-troubling-thought-smartmatch.html
>
> but as a summary:
>
> I wonder if we're going about things the wrong way, regarding strings
> vs numbers, equality test operators, smartmatch, and a bunch of other
> ideas. Maybe - maybe - now that we can distinguish strings vs numbers
> a little better, there's a way we can make a single operator Do The
> Right Thing.
>
> Comments welcome - on blog or replied here.

It turns out I was right to worry that the scalar flags changes will
give people the wrong idea that Perl has types. It doesn't and it
shouldn't. Going that way is IMO a big mistake.

Sure, the "types" are now less volatile, so it's tempting to make use of
them. But the problem is their volatility wasn't the only reason why we
shouldn't rely on them.

Explicitness is one of Perl's great strengths. The main example of
that explicitness is our comparison operators. Each of the operators
explicitly casts both operands to some type; for example, "$foo eq $bar"
in Python would be expressed as "str(foo) == str(bar)".

Your proposal throws away that explicitness. I don't like this. Less
explicit code is harder to reason about. It requires more context to be
understood. Explicitness is what distinguishes Perl from Python,
JavaScript and PHP.

Now, to address your "undef" problem. I think your post lacks context.
If it was just about "match", it would've been trivial to fix, you could
just add "defined { }". But I believe the real problem is that you're
also planning an "in" operator ("VAR in:OP LIST", e.g.
"$foo in:eq @bar").

Let's review some of the potential solutions:

1. As mentioned in your post, new comparison operators that treat
"undef" specially, i.e. "equ", "===" and "==~". They solve the problem
well, they're explicit, but: they are a bit ugly and they suffer from
being the same thing that already exists but slightly different, which
is a cardinal programming language design sin.

2. Redefine "eq", "==", "=~" with a feature flag to treat "undef"
specially. IMO that's how they should've been designed from the
beginning, but it may be too drastic of a change now.

3. Special-case defined() like this: defined() in @bar. It's harder
to parse, but it *isn't* ambiguous, although it might feel too magical.
Also, it could be done without special syntax, with a function that
returns an object with overloaded "eq", but that would be a bit too
convoluted for my taste.

4. New operator that compares definedness: "undef in:definedness @bar".
Naturally, the name is just a placeholder :P The problem is that
operator isn't very useful outside "match" and "in". Also, it's hard to
come up with a good spelling for it.

5. Just ignore the problem for "in". "match" can have "defined {}",
while "in" won't handle undefs specially. Users can use any/grep
instead.

Personally, I don't have a strong preference for any of the options, but
I think 3. *might* be the best.
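
For reference, option 5 can already be written today with List::Util. This
is only a sketch of the "use any/grep instead" route; the helper name
in_list is made up for illustration and is not a proposed API:

```perl
use v5.14;
use warnings;
use List::Util qw(any);

# Hypothetical helper: an undef needle only matches undef elements, and a
# defined needle never matches undef (so no "uninitialized" warnings).
sub in_list {
    my ( $needle, @haystack ) = @_;

    if ( !defined $needle ) {
        return any { !defined($_) } @haystack;
    }
    return any { defined($_) && $_ eq $needle } @haystack;
}

my @bar = ( "a", undef, "b" );

say "undef in \@bar: ", ( in_list( undef, @bar ) ? "yes" : "no" );    # yes
say "'b' in \@bar:   ", ( in_list( "b",   @bar ) ? "yes" : "no" );    # yes
say "'' in \@bar:    ", ( in_list( "",    @bar ) ? "yes" : "no" );    # no
```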

Re: A troubling thought - smartmatch reimagined

me at xenu

Jun 25, 2022, 9:51 PM

Post #4 of 13 (611 views)


On Sun, 26 Jun 2022 06:46:58 +0200
Tomasz Konojacki <me@xenu.pl> wrote:

> Now, to address your "undef" problem. I think your post lacks context.
> If it was just about "match", it would've been trivial to fix, you could
> just add "defined { }".

I meant "undef {}". "defined {}", while not completely pointless, would
be much less useful.

Re: A troubling thought - smartmatch reïmagined

curtis.poe at gmail

Jun 26, 2022, 3:46 AM

Post #5 of 13 (611 views)


On Sat, Jun 25, 2022 at 10:13 PM Paul "LeoNerd" Evans <
leonerd@leonerd.org.uk> wrote:

> I wonder if we're going about things the wrong way, regarding strings
> vs numbers, equality test operators, smartmatch, and a bunch of other
> ideas. Maybe - maybe - now that we can distinguish strings vs numbers
> a little better, there's a way we can make a single operator Do The
> Right Thing.
>
> Comments welcome - on blog or replied here.
>
>
I like where you're going. However, sooner or later, we're going to need to
address the type problem up front.

I suspect, however, that we're going to punt on the type problem and build
things into the language that are workarounds/heuristics. Those
workarounds/heuristics might preclude some later type opportunities, or we
might need to rollback some workarounds/heuristics. This is not a good
place to be.

Is there any interest in discussing the actual issue with the lack of
types? (by types, I mean the kinds of non-reference data a scalar can hold
and the operators allowed on them).

Best,
Curtis "Ovid" Poe
CTO, All Around the World
World-class software development and consulting
https://allaroundtheworld.fr/

Re: A troubling thought - smartmatch reïmagined

wlindley at wlindley

Jun 26, 2022, 4:06 AM

Post #6 of 13 (611 views)


On 6/25/22 16:13, Paul "LeoNerd" Evans pointed to his article saying:
> no boolean is ever ? to any non-boolean

On this point I disagree.

Rant (the longer): The whole "let's spell the logical 0, basis of the
binary number system built on George Boole's logic, as the letter 'f'
the letter 'a' the letter 'l' the letter 's' the letter 'e' instead of
the number zero" thing is creeping up and its problems will only grow.
In actual machine code, as in C, there is only 0 and 1 for Boolean
logic. Representing a logical value which inside the computer is
identical to a number as an English word is a confusion both in data and
in programming, not to mention it uses 36 bits on average to represent a
1-bit value ('true' or 'false' instead of '0' or '1'). 1 should always
be true, 0 should always be false. That's how we have always written and
conceptualized text and data files, because of their numerical and
electronic (hardware) basis.

Rant (the shorter): Emulating Ruby where 0 is true is not going to do
Perl any favors. [1]

\\/

[1] In Ruby, everything is an object; thus 0 is an object; and all
objects are true. The twisted thought required for this construction
beggars belief.

Re: A troubling thought - smartmatch reïmagined

nick at ccl4

Jun 26, 2022, 7:10 AM

Post #7 of 13 (611 views)


On Sat, Jun 25, 2022 at 09:13:17PM +0100, Paul "LeoNerd" Evans wrote:
> Preface: This is less a concrete idea, and more a rambling set of
> thoughts that lead me to a somewhat awkward place. I'm writing it out
> here in the hope that others can lend suggestions and ideas, and see
> if we can arrive at a better place.
>
> This started off as a quick email, that turned into a long email, that
> became a *very* long email that I eventually decided to turn into a
> blog post.
>
> For the full content, you can see
>
> https://leonerds-code.blogspot.com/2022/06/a-troubling-thought-smartmatch.html
>
> but as a summary:
>
> I wonder if we're going about things the wrong way, regarding strings
> vs numbers, equality test operators, smartmatch, and a bunch of other
> ideas. Maybe - maybe - now that we can distinguish strings vs numbers
> a little better, there's a way we can make a single operator Do The
> Right Thing.
>
> Comments welcome - on blog or replied here.

You have this:

It is now possible to classify any given scalar value into exactly one
of the following five categories:

undef
boolean
initially string
initially number
reference

...

I'd also like to suggest a rule that given any pair of scalars of
different categories, the result is always false.

I don't think that that categorisation handles "big integers" well.
(Or bigrats, or similar).

In that BigInts are stored as references with overloading, and attempt to
behave like a scalar, even though they aren't really.

If I read your table/plan correctly, a BigInt would be matched as a reference,
hence a number "~~" BigInt matchup would fail.
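
To make the concern concrete, here is a small check using the core
Math::BigInt module. The "category" line is only a guess at how a
reference-first classification would see the value; it is not code from
the blog post:

```perl
use v5.14;
use warnings;
use Math::BigInt;

my $big = Math::BigInt->new("10");

# The value is a blessed reference, so a category test that looks at
# ref() first would file it under "reference", not "number".
say "ref:      ", ref($big);                                 # Math::BigInt
say "category: ", ( ref($big) ? "reference" : "number" );    # reference

# Yet through overloading it compares like an ordinary number or string.
say "== 10:    ", ( $big == 10   ? "yes" : "no" );           # yes
say "eq '10':  ", ( $big eq "10" ? "yes" : "no" );           # yes
```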

I'm not sure if there's a slightly more complex bounded plan that can handle
this well. The start seems to be "take your 5 categories, and add a 6th which
is objects with overloading..." but it's then unclear if one is only
permitted to consider objects with overloaded "not-so-smartmatch", or also
objects that can overload C<eq> and C<==> comparison.

But I don't know if this start is a rabbit hole that never terminates. Or if
there is an almost-as-simple way of categorising scalars that is

* simple (enough)
* consistent
* handles "big" integers and "big" numbers as if they are numbers
* doesn't special case specific core classes of big integers and big numbers

Nicholas Clark

Re: A troubling thought - smartmatch reimagined

matthew.persico at gmail

Jun 27, 2022, 11:27 AM

Post #8 of 13 (611 views)


TL;DR: It's time to stop trying to reinvent "case" in Perl.
Long story:
https://www.reddit.com/r/perl/comments/vkzawl/comment/idyjrbw/?utm_source=share&utm_medium=web2x&context=3

On Sun, Jun 26, 2022 at 12:52 AM Tomasz Konojacki <me@xenu.pl> wrote:

> On Sun, 26 Jun 2022 06:46:58 +0200
> Tomasz Konojacki <me@xenu.pl> wrote:
>
> > Now, to address your "undef" problem. I think your post lacks context.
> > If it was just about "match", it would've been trivial to fix, you could
> > just add "defined { }".
>
> I meant "undef {}". "defined {}", while not completely pointless, would
> be much less useful.
>

--
Matthew O. Persico

RE: A troubling thought - smartmatch reïmagined

Vadim.Konovalov at dell

Jun 27, 2022, 9:30 PM

Post #9 of 13 (611 views)


Usage of "ï" in this context is obviously incorrect.
Not only are you messing with search engines, you're also applying the wrong meaning to the umlaut.

IMO adding this kind of joke doesn't improve the overall impression.

This joke is stupid.


-----Original Message-----
From: Paul "LeoNerd" Evans <leonerd@leonerd.org.uk>
Sent: Saturday, June 25, 2022 11:13 PM
To: Perl5 Porters
Subject: A troubling thought - smartmatch reïmagined


Preface: This is less a concrete idea, and more a rambling set of
thoughts that lead me to a somewhat awkward place. I'm writing it out
here in the hope that others can lend suggestions and ideas, and see
if we can arrive at a better place.

This started off as a quick email, that turned into a long email, that became a *very* long email that I eventually decided to turn into a blog post.

For the full content, you can see

https://leonerds-code.blogspot.com/2022/06/a-troubling-thought-smartmatch.html

but as a summary:

I wonder if we're going about things the wrong way, regarding strings
vs numbers, equality test operators, smartmatch, and a bunch of other
ideas. Maybe - maybe - now that we can distinguish strings vs numbers
a little better, there's a way we can make a single operator Do The
Right Thing.

Comments welcome - on blog or replied here.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/

Re: A troubling thought - smartmatch reïmagined

aaron at priven

Jun 27, 2022, 10:28 PM

Post #10 of 13 (611 views)


> Maybe - maybe - now that we can distinguish strings vs numbers
> a little better,
Can we, though?

Even aside from the difficulty of determining whether a number can be compared to a string, there are lots of times when one might read booleans from a text file or database or something, where it’s not clear whether “0” is a boolean or a number or a string, or whether “” is a boolean or a string. It seems to me that if we want to be certain, then at best, we really only have three different types: undef, reference, and everything else.

Smartmatch was the previous decade’s attempt to get around Perl’s lack of types, and builtin::created_as_string etc. is this decade’s. In each case, it works sometimes, but not all the time, and it’s not always obvious to everyone what a “string-like nature” or “numerical nature” is, to quote from the builtin docs. It’s less obscure than smartmatch was, but only somewhat.
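
A quick way to see the limitation is to ask the builtin functions directly.
This sketch assumes Perl 5.36 or later, where these functions are available
(experimental in 5.36); the variable names and the yn helper are made up
for illustration:

```perl
use v5.36;
no warnings 'experimental::builtin';
use builtin qw(created_as_string created_as_number is_bool);

my $from_file = "0";    # as if read from a text file or database column
my $literal   = 0;      # written as a number in the source code
my $empty     = "";     # might have been intended as a false value

sub yn ($v) { $v ? "yes" : "no" }

# Whatever the data was meant to be, text input is always "a string"
say "from_file created as string: ", yn( created_as_string($from_file) );    # yes
say "from_file created as number: ", yn( created_as_number($from_file) );    # no
say "literal created as number:   ", yn( created_as_number($literal) );      # yes
say "empty string is a boolean:   ", yn( is_bool($empty) );                  # no
```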

So I think you were right the first time. I like, in general, your "match (VALUE : OPERATOR)" syntax. I also think your proposed operators are very helpful. I strongly believe “any { $_ eq 'A' } @arr” is, while more flexible and very valuable, less comprehensible to most people than “'A' elem @arr” (although “'A' in @arr” would be even more clear, and I believe that clarity would be worth losing the parallelism with ∈).

(I’d suggest <== as a way of spelling ∈ in 7-bit ASCII, if that doesn’t break anything. I admit I don’t know whether it would.)

I don’t love the word “match,” since most of the time when we use the word “match” in discussing Perl, we’re talking about regular expression matching. I would just as soon go back to “given.” While “when” is deeply problematic, now that there’s no lexical $_, “given” is basically the same as “for scalar”, if I understand correctly. But it’s not that important.

I would suggest a couple of things. First, match (or given) without an operator should just default to a Boolean test.

    match ($x) { # or given ($x), both meaning "for (scalar $x)"
        case ($_ eq 'A') {
            …
        }
        case ($_ == 0) {
            …
        }
    }

And this could work with “for” as well.

    for ($x, $y) {
        case ($_ eq 'A') {
            …
        }
        case ($_ == 0) {
            …
        }
    }

Admittedly, this doesn’t add that much, because “case” becomes just equivalent to

    if (EXPR) {
        …
        next;
    }

But it’s convenient to be able to encapsulate that in a single statement in one place.

More interestingly though, it leads to the idea of extending for to include your operator:

    for (@x, @q : == ) {
        # I have not checked to see if this syntax could work
        case (0) {
            …
        }
        case (1) {
            …
        }
    }

I don’t really know if that would get much use, but maybe it would. And I think making “for” and “match” (or “given”) so clearly parallel is appealing.

--
Aaron Priven, aaron@priven.com, www.priven.com/aaron

Re: A troubling thought - smartmatch reïmagined

me at xenu

Jun 28, 2022, 1:05 AM

Post #11 of 13 (611 views)


On Tue, 28 Jun 2022, at 06:30, Konovalov, Vadim wrote:
> Usage of "ï" in this context is obviously incorrect.
> Not only are you messing with search engines, you're also
> applying the wrong meaning to the umlaut.
>
> IMO adding this kind of joke doesn't improve the overall impression.
>
> This joke is stupid.

It isn't an umlaut, and it isn't "obviously incorrect", just unusual:

https://www.merriam-webster.com/words-at-play/mary-norris-diaeresis

RE: A troubling thought - smartmatch reïmagined

Vadim.Konovalov at dell

Jun 28, 2022, 3:29 AM

Post #12 of 13 (611 views)


> From: Tomasz Konojacki <me@xenu.pl>
> On Tue, 28 Jun 2022, at 06:30, Konovalov, Vadim wrote:
> > Usage of "ï" in this context is obviously incorrect.
> > Not only are you messing with search engines, you're also
> > applying the wrong meaning to the umlaut.
> >
> > IMO adding this kind of joke doesn't improve the overall impression.
> >
> > This joke is stupid.
>
> It isn't umlaut, and it isn't "obviously incorrect", just unusual:

Thanks for the correction, it is indeed "Diaeresis", not umlaut.
Thanks for the link.
It underlines my error: "Often mistakenly called an umlaut" :D

I thought the diaeresis was mostly from French.
The French often use the diaeresis nowadays, as opposed to English, where I have only seen ï in naïve.

IMO using a diaeresis in coöperate and in reïmagined is playing smartass, until all "foo" instances are replaced with "foö" throughout the Perl source code.

This approach also disrespects German (and Estonian) reading of such letters.

>
> https://www.merriam-webster.com/words-at-play/mary-norris-diaeresis
>


Re: A troubling thought - smartmatch reïmagined

happy.barney at gmail

Jun 28, 2022, 5:27 AM

Post #13 of 13 (611 views)


> Comments welcome - on blog or replied here.
>
>
Instead of inventing something "similar to other languages but different ...
yet still with the same limitations", what about a challenge first?

Can you write all your examples (and more) using Type::Tiny-based
constraint dispatch (current or reasonably improved)?
e.g.:
    given ($foo) {
        when (My::Domain::Accepted) { ... }
        when (My::Domain::Rejected) { ... }
        ...
    }