Mailing List Archive

A thought on type safety
I've been giving the idea of type safety in Perl some thought recently, since
adding a type system to Perl has been under discussion lately.

To be clear, I'm thinking about making the following (or something similar)
into a compile time error.

my $var = "string";
if ($var < 10) { ... }

The above example is fairly trivial, in that you can actually know at
compile time what that value will be (and it probably even gets optimized
into a constant? I don't actually know how these things work). But what
about the following:

#assuming you're using mojo
my $res = $ua->get('some.web.site/api/data')->res;
if ($res->code ne 'failed') { ... }

where code is expected to be an HTTP response code, but you goofed up. This
is more difficult b/c you can't know beforehand what sort of information is
gonna come from the outside world.

So I was wondering if the following could work (for some lexically scoped
pragma): make sure that all data which comes from outside will die if it is
not assigned a type, and then trust that assigned type in the rest of the
code. Something akin to taint mode (without actually knowing the details of
how taint works and its restrictions, forgive me for being a millennial).

Meaning that given a little bit of annotation, akin to this (using
attributes, just b/c it's a syntax which exists):

package Mojo::Message::Response; #this is where ->res lives
sub code :returns(Int) { ... }

we can now know that any usage of this will return an Integer.
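
As a rough sketch of how such an annotation could at least be recorded
today: Perl calls a package's MODIFY_CODE_ATTRIBUTES hook at compile time
for each attribute on a sub. The package name Typed::Response, the %RETURNS
registry, and declared_return_type below are hypothetical names for
illustration and are not part of Mojolicious. This only stores the
declaration; nothing here checks or enforces the returned value.

```perl
use strict;
use warnings;

package Typed::Response {
    # Lowercase attribute names are reserved for core use, so perl warns
    # about them; silence that warning for this illustration only.
    no warnings 'reserved';

    # Hypothetical registry: maps a code reference to its declared type name.
    our %RETURNS;

    # Called by perl at compile time for every attribute on a sub declared
    # in this package. Whatever is returned is reported as invalid.
    sub MODIFY_CODE_ATTRIBUTES {
        my ($package, $coderef, @attributes) = @_;
        my @unrecognized;
        for my $attribute (@attributes) {
            if ($attribute =~ /\Areturns\((\w+)\)\z/) {
                $RETURNS{$coderef} = $1;
            }
            else {
                push @unrecognized, $attribute;
            }
        }
        return @unrecognized;
    }

    # Look up the declared return type of a sub, or undef if none was given.
    sub declared_return_type {
        my ($class, $coderef) = @_;
        return $RETURNS{$coderef};
    }

    sub code :returns(Int) { return 200 }
    sub message            { return 'OK' }
}

my $code_type    = Typed::Response->declared_return_type(\&Typed::Response::code);
my $message_type = Typed::Response->declared_return_type(\&Typed::Response::message);

print "code returns: $code_type\n";
print "message returns: ", (defined $message_type ? $message_type : "(undeclared)"), "\n";
```

A real pragma would still need something to consume this registry, either
at compile time or by wrapping the sub.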

Or maybe this:

my $line :is(Int) = <>;

where on any assignment the check (actually, assert_valid, but whatever)
method of Int is called (assuming this is a Type::Tiny constraint, just b/c
that's what I'm used to).
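
To make the "check on every assignment" idea concrete, here is a minimal
sketch using a tied scalar from core Perl. The CheckedInt class is a
hypothetical, hand-rolled stand-in for an Int constraint; it is not
Type::Tiny's API, and its regex is only one possible definition of "Int".

```perl
use strict;
use warnings;

# Hypothetical stand-in for an Int constraint (not Type::Tiny).
package CheckedInt {
    sub TIESCALAR {
        my ($class) = @_;
        return bless { value => undef }, $class;
    }

    sub FETCH {
        my ($self) = @_;
        return $self->{value};
    }

    # Runs on every assignment to the tied variable and dies on bad input.
    sub STORE {
        my ($self, $value) = @_;
        die "Value '" . (defined $value ? $value : 'undef') . "' is not an Int\n"
            unless defined $value && $value =~ /\A-?[0-9]+\z/;
        $self->{value} = $value;
        return;
    }
}

tie my $line, 'CheckedInt';

$line = "42";                                # passes the check

my $ok = eval { $line = "2 donuts"; 1 };     # STORE dies, eval returns undef
print "rejected: $@" unless $ok;
print "still holds: $line\n";
```

Note that a line read with <> normally ends in a newline, so it would need
a chomp before a check this strict would accept it. Every assignment also
pays for a method call, which is the runtime cost in question.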

Two questions on this thought direction:

1. Is the taint-mode type thing doable?
2. Is there a slot in a variable available for typing data?

Re: A thought on type safety
So let me start with this: from a technical perspective, I don't think there
is anything stopping you from doing what you want in your examples below
using Type::Tiny and some XS hackery that people on this list or on #xs on
irc.perl.org could help you implement. But ... (there's always a but, isn't
there?) I've been thinking about type systems and Perl for a while now.
Stevan, back in the Moose days, used to go on and on about OCaml and how
great it was as a typed language. It was during a side conversation with Ric
(or several, really) that it finally dawned on me why I'd been banging my
head against a type system for Perl 5 for so long: Perl 5 doesn't do types
the way most languages do.

On Tue, Aug 11, 2020 at 4:22 PM Veesh Goldman <rabbiveesh@gmail.com> wrote:

> I've been giving the idea of type safety in Perl some thought recently, since
> adding a type system to Perl has been under discussion lately.
>
> To be clear, I'm thinking about making the following (or something
> similar) into a compile time error.
>
> my $var = "string";
> if ($var < 10) { ... }
>
> The above example is fairly trivial, in that you can actually know at
> compile time what that value will be (and it probably even gets optimized
> into a constant? I don't actually know how these things work). But what
> about the following:
>
>
Right off the bat, your simple example isn't really simple in Perl. There is
nothing illegal about what you typed, except that it'll do something
nonsensical. As you've written it, it's obviously a bug to us, but to perl
... how is it different from:

my $var = "1string";
if ($var < 10) { ... }
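
For what it's worth, here is a small sketch of the closest thing Perl
offers today, assuming you are willing to promote the "isn't numeric"
warning to an exception. The helper name less_than_ten is made up for
illustration. Note that both strings are rejected, and only at runtime:

```perl
use strict;
use warnings FATAL => 'numeric';   # turn "isn't numeric" warnings into exceptions

# Returns 1 or 0 for a clean comparison, or undef (reason in $@) if the
# operand did not numify cleanly.
sub less_than_ten {
    my ($var) = @_;
    return scalar eval { $var < 10 ? 1 : 0 };
}

for my $var ("string", "1string", "7") {
    my $result = less_than_ten($var);
    if (defined($result)) {
        print "'$var' compared cleanly: $result\n";
    }
    else {
        print "'$var' died at runtime: $@";
    }
}
```

Nothing is caught when the program is compiled; the check only fires when
the comparison actually executes.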

I'm going to be corrected, I'm sure, but at a high level, types in Perl are
created by the operators, not by the values. This is how sort { $a <=> $b }
0..100 and sort { $a cmp $b } 0..100 both do different things *and* return
different values (one returns a list of integers, the other a list of
strings).
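
A minimal illustration of the operator deciding how the same values are
treated (the sample values are arbitrary):

```perl
use strict;
use warnings;

my @values = (10, 9, 2, 33, 100);

# <=> treats its operands as numbers; cmp treats the same operands as strings.
my @numeric = sort { $a <=> $b } @values;
my @string  = sort { $a cmp $b } @values;

print "numeric: @numeric\n";   # 2 9 10 33 100
print "string:  @string\n";    # 10 100 2 33 9
```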

> #assuming you're using mojo
> my $res = $ua->get('some.web.site/api/data')->res;
> if ($res->code ne 'failed') { ... }
>
> where code is expected to be an HTTP response code, but you goofed up. This
> is more difficult b/c you can't know beforehand what sort of information is
> gonna come from the outside world.
>

I've been doing a non-trivial amount of programming in Go recently. In your
example here, none of the data would be coming from the outside world. In
Go, the get() method would return a struct that has a res() method defined
on it, and that res() method would declare the type it returns. It would all
be knowable and detectable at compile time. Go's solution is in fact very
similar to your proposal (I'm guessing that's not a surprise).


> So I was wondering if the following could work (for some lexically scoped
> pragma): make sure that all data which comes from outside will die if it is
> not assigned a type, and then trust that assigned type in the rest of the
> code. Something akin to taint mode (without actually knowing the details of
> how taint works and its restrictions, forgive me for being a millennial).
>
> Meaning that given a little bit of annotation, akin to this (using
> attributes, just b/c it's a syntax which exists):
>
> package Mojo::Message::Response; #this is where ->res lives
> sub code :returns(Int) { ... }
>
> we can now know that any usage of this will return an Integer.
>
> Or maybe this:
>
> my $line :is(Int) = <>;
>
> where on any assignment the check (actually, assert_valid, but whatever)
> method of Int is called (assuming this is a Type::Tiny constraint, just b/c
> that's what I'm used to).
>
> Two questions on this thought direction:
>
> 1. Is the taint-mode type thing doable?
> 2. Is there a slot in a variable available for typing data?
>
>
So this illustrates the problem I think most people have with type
constraints when it comes to Perl. They don't work the way you *think* they
do, and even if you're right about what you think ... they don't work that
way in Perl.

Your first example has nothing to do with checking external data. It's
*entirely* about program correctness. Languages with good type systems (Go,
Rust, OCaml, etc) perform these type checks at compile time so that you
can't build a program without knowing that it passes some level of
correctness. We *could* do this in Perl, possibly, but it would require
fundamentally changing some expectations on the language and how it
operates. At some level though this is sorta-kinda how XS in Perl operates
and ... for example your "$var < 10" example earlier:

perl -MDevel::Peek -e'$var = "a"; Dump($var); $var < 10; Dump($var);'
SV = PV(0x22deb20) at 0x23068f8
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x2305050 "a"\0
  CUR = 1
  LEN = 10
  COW_REFCNT = 1
SV = PVNV(0x22dce70) at 0x23068f8
  REFCNT = 1
  FLAGS = (POK,IsCOW,pIOK,pNOK,pPOK)
  IV = 0
  NV = 0
  PV = 0x2305050 "a"\0
  CUR = 1
  LEN = 10
  COW_REFCNT = 1

Your string (PV) is converted to a StringOrNum (PVNV) by the comparison
operator. I'm not sure people expect comparison operators to mutate the
type of the operands. We'd have to figure out how to handle that in a way
that is intuitive for Perl developers. Also (and since I'm not a core
developer, I'm not 100% certain of this), that type cast happens at runtime,
not compile time ... that would have to change if we want the compile-time
benefits that statically typed languages like Go have. These are some
fundamental changes to the language with deep and subtle effects. I'm
pretty sure it could be done, but none of the proposals I've seen so far
have even acknowledged it as a concern (as best I could tell).

Your second example is related. Moose, and by extension Type::Tiny, have
shown the power of runtime type checking. However, a lot of people have also
noted that the runtime cost of these checks isn't trivial. To truly get the
most benefit (correctness, safety, and optimizations) out of a type system,
we'd really need something that has some kind of compile-time component. So
your second example, my $line :is(Int) = <>; ... requires some kind of
runtime performance hit to get the benefit you're looking for, but ... would
it really provide that benefit? We've already seen that Perl autocasts
strings to integers when we do a comparison. Wouldn't we expect assignments
to operate similarly? I mean, we'd expect byte 49 in an ASCII file (the
character "1") to be converted to an IV with IV=1 ... would our Int type
allow a coercion from anything that matches \d in Unicode? What about:

perl -MDevel::Peek -e'$v = "2 donuts"; 0+$v; Dump($v)'
SV = PVNV(0x10fae70) at 0x11248e8
  REFCNT = 1
  FLAGS = (POK,IsCOW,pIOK,pNOK,pPOK)
  IV = 2
  NV = 2
  PV = 0x1123040 "2 donuts"\0
  CUR = 8
  LEN = 10
  COW_REFCNT = 1

0+$v here is to trigger the cast operation that Perl *currently* does, the
one we're all trained to use when we want to "numify" a value. Feel free
to replace it with whatever hypothetical cast operation we'd do implicitly
to convert ASCII 49 into 1. Currently Perl defines every valid String as a
valid Int; most of them are just exactly equal to 0.
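
To put some rough numbers on the question of what an Int check would have
to decide, here is a small sketch comparing what numification, the core
Scalar::Util::looks_like_number function, \d, and [0-9] each say about a
few inputs. The numify helper and the sample inputs are mine, chosen only
for illustration:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Numify the way 0+$v does, with the "isn't numeric" warning silenced.
sub numify {
    my ($value) = @_;
    no warnings 'numeric';
    return 0 + $value;
}

my @inputs = (
    [ 'ASCII "49"',              "49"       ],
    [ '"2 donuts"',              "2 donuts" ],
    [ '"string"',                "string"   ],
    [ 'U+0663 (Arabic-Indic 3)', "\x{0663}" ],
);

for my $input (@inputs) {
    my ($label, $value) = @$input;
    printf "%-26s numifies to %s, looks_like_number=%d, \\d=%d, [0-9]=%d\n",
        $label,
        numify($value),
        looks_like_number($value) ? 1 : 0,
        $value =~ /\A\d+\z/       ? 1 : 0,
        $value =~ /\A[0-9]+\z/    ? 1 : 0;
}
```

The last input matches \d but numifies to 0, so "anything that matches \d"
and "anything Perl can turn into a number" are already different sets.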

You're not the first to bump up against this. Trying to google for the
thread behind Chip Salzenberg's "magicflags" branch(es) led me to
https://www.nntp.perl.org/group/perl.perl5.porters/2013/05/msg201351.html
where someone tried simple annotations. That thread covers many of the same
problems I've covered here, put much more succinctly by people like DaveM
and Nick Clark. So I think from a technical standpoint it's tricky but
do-able ... maybe.

-Chris