Mailing List Archive

Data checks semantics
Hi all,

No one responded to my email linking the full data checks spec. I'm not
surprised because it was linking to a huge and overwhelming document. I've
now had the time to summarize the key semantics that we've defined.

I hope these can help clarify some things and help the conversation move
forward. Note that the syntax is how we defined it in the Data::Checks
module. Consider it "for information purposes" only. We're largely focusing
on semantics here. I came up with eight key points, with the third point
being troublesome.

*1. Checks are on the variable, not the data*

my $foo :of(INT) = 4;
$foo = 'hello'; # fatal

However:

my $foo :of(INT) = 4;
my $bar = $foo;
$bar = 'hello'; # legal

This is because we don't want checks to have "infectious" side effects that
might surprise you. The developer should have full control over the data
checks.
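
For anyone who wants to experiment with this behaviour today, it can be roughly approximated with a tied scalar. This is only a sketch under stated assumptions: the `CheckedScalar` package and its `is_int` helper are hypothetical names made up for illustration, they are not part of Data::Checks, and `tie` is merely one way to emulate a check that lives on the variable rather than on the data.

```perl
use strict;
use warnings;

# Hypothetical emulation of `my $foo :of(INT)` via tie; not the Data::Checks API.
package CheckedScalar {
    # Assumed definition of INT for this sketch: an optionally signed digit string.
    sub is_int { defined $_[0] && !ref $_[0] && $_[0] =~ /\A-?[0-9]+\z/ }

    sub TIESCALAR {
        my ($class, $value) = @_;
        die "initial value fails INT check\n" unless is_int($value);
        return bless { value => $value }, $class;
    }

    sub FETCH { $_[0]{value} }

    sub STORE {
        my ($self, $new) = @_;
        die "value fails INT check\n" unless is_int($new);
        $self->{value} = $new;
    }
}

tie my $foo, 'CheckedScalar', 4;

# Assigning a non-integer to the checked variable dies inside STORE.
my $foo_assignment_ok = eval { $foo = 'hello'; 1 } ? 1 : 0;

# Copying only reads the value; the check stays behind with $foo.
my $bar = $foo;
$bar = 'hello';    # legal: $bar carries no check
```

The copy into `$bar` goes through `FETCH` only, which is why nothing "infects" the new variable.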

*2. No type inference*

No surprises. The developer should have full control over the data checks.


*3. Checks are on assignment to the variable*
*This is probably the most problematic bit.*

A check applied to a variable is not an invariant on that variable. It's a
prerequisite for assignment to that variable.

An invariant on the variable would guarantee that the contents of the
variable must always meet a given constraint; a "prerequisite for
assignment" only guarantees that each element must be assigned values that
meet the constraint at the moment they are assigned.

So an array such as `my @data :of(HASH[INT])` only requires that each
element of `@data` must be assigned a hashref whose values are integers. If
you were to subsequently modify an element like so (with the caveat that
the two lines aren't exactly equivalent):

$data[$idx] = { $key => 'not an integer' }; # fatal
$data[$idx]{$key} = 'not an integer'; # not fatal !

The second assignment is not modifying `@data` directly, only retrieving a
value from it and modifying the contents of an entirely different variable
through the retrieved reference value.

We *could* specify that checks are invariants, instead of prerequisites,
but that would require that any reference value stored within a checked
arrayref or hashref would have to have checks automatically and recursively
applied to them as well, which would greatly increase the cost of checking,
and might also lead to unexpected action-at-a-distance, when the
now-checked references are modified through some other access mechanism.

Moreover, we would have to ensure that such auto-subchecked references were
appropriately “de-checked” if they are ever removed from the checked
container. And how would we manage any conflict if the nested referents
happened to have their own (possibly inconsistent) checks?

So the checks are simply assertions on direct assignments, rather than
invariants over a variable’s entire nested data structure.
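
The difference between the two assignments can be reproduced with a tied array. Again, this is a sketch and not the proposed implementation: `CheckedArrayOfIntHashes` and `is_hash_of_int` are hypothetical names, and only `STORE` is overridden, so a fuller emulation would also need to cover `PUSH`, `SPLICE` and friends.

```perl
use strict;
use warnings;
use Tie::Array;    # provides Tie::StdArray

# Hypothetical emulation of `my @data :of(HASH[INT])`; not the Data::Checks API.
package CheckedArrayOfIntHashes {
    our @ISA = ('Tie::StdArray');

    # True only for a hashref whose values all look like integers.
    sub is_hash_of_int {
        my ($value) = @_;
        return 0 unless ref $value eq 'HASH';
        for my $v (values %$value) {
            return 0 unless defined $v && !ref $v && $v =~ /\A-?[0-9]+\z/;
        }
        return 1;
    }

    # The check runs only when an element of the array itself is assigned.
    sub STORE {
        my ($self, $index, $value) = @_;
        die "element fails HASH[INT] check\n" unless is_hash_of_int($value);
        $self->[$index] = $value;
    }
}

tie my @data, 'CheckedArrayOfIntHashes';
my ($idx, $key) = (0, 'count');

$data[$idx] = { $key => 1 };    # passes the check

# Direct assignment to an element goes through STORE and is rejected.
my $direct_ok = eval { $data[$idx] = { $key => 'not an integer' }; 1 } ? 1 : 0;

# This fetches the existing hashref and modifies the hash behind it;
# STORE on the array is never called, so nothing is checked.
my $nested_ok = eval { $data[$idx]{$key} = 'not an integer'; 1 } ? 1 : 0;
```

After this runs, the array holds a hash that no longer satisfies the check, which is exactly the "prerequisite, not invariant" behaviour described above.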

This is unsatisfying, but we're playing with the matches we have, not the
flamethrower we want.


*4. Signature checks*
We need to work out the syntax, but the current plan is something like this:

sub count_valid :returns(UINT) (@customers :of(OBJ[Customer])) {
...
}

The `@customers` variable should maintain the check in the body of the sub,
but the return check is applied once and only once on the data returned at
the time that it's returned.


*5. Scalars require valid assignments*
my $total :of(NUM); # fatal, because undef fails the check

This is per previous discussions. Many languages allow this:

int foo;

But as soon as you assign something to `foo`, it's fatal if it's not an
integer. For Perl, that's a bit tricky as there's no clear difference
between uninitialized and undef. While using that variable prior to
assignment is fatal in many languages, that would be more difficult in
Perl. Thus, we require a valid assignment.

As a workaround, this is bad, but valid:

my $total :of(INT|UNDEF);

This restriction doesn't apply to arrays or hashes because being empty
trivially passes the check.


*6. Fatal*
By default, a failed check is fatal. We have provisions to downgrade them
to warnings or disable them completely.


*7. Internal representation*
my $foo :of(INT) = "0";
Dump($foo);

"0" naturally coerces to an integer, so that's allowed. However, we don't
plan (for the MVP) to guarantee that Dump shows an IV instead of a PV.
We're hoping that can be addressed post-MVP.

*8. User-defined checks*

Users should be able to define their own checks:

check LongStr :params($N :of(PosInt)) :isa(STR) ($n) { length $n >= $N }

The above would allow this:

my $name :of(LongStr[10]) = get_name(); # must be at least 10 characters

The body of a check definition should return a true or false value, or
die/croak with a more useful message.

A user-defined check is *not* allowed to change the value of the variable
passed in. Otherwise, we could not safely disable checks on demand
(coercions are not planned for the MVP, but we have them specced and they
use a separate syntax).

I was thinking user-defined checks should be post-MVP, but it's unclear to
me how useful checks would be without them. That's a discussion for later.

Best,
Ovid
Re: Data checks semantics
Op 25-05-2023 om 22:22 schreef Ovid:
> Hi all,
>
> No one responded to my email linking the full data checks spec. I'm
> not surprised because it was linking to a huge and overwhelming
> document. I've now had the time to summarize the key semantics that
> we've defined.


I read it. It's a good document. I just don't like the coercion bit, but
other than that it seems fine.


> I was thinking user-defined checks should be post-MVP, but it's
> unclear to me how useful checks would be without them. That's a
> discussion for later.


Or you can first do a MVP without and then one with. (Watering down the
meaning of MVP, but whatever).


HTH,

M4
Re: Data checks semantics
On Thu, May 25, 2023 at 10:30 PM Martijn Lievaart via perl5-porters <
perl5-porters@perl.org> wrote:


> > No one responded to my email linking the full data checks spec. I'm
> > not surprised because it was linking to a huge and overwhelming
> > document. I've now had the time to summarize the key semantics that
> > we've defined.
>
> I read it. It's a good document. I just don't like the coercion bit, but
> other than that it seems fine.
>

Thank you. I don't like coercions either. I think, given other comments,
that we might be able to leave them out entirely. Though I have to admit,
they are extremely useful at times. They can make it very easy, for
example, to receive data from something like a JSON front-end and still get
useful objects you can work with instead of having to constantly manually
fetch the user/order/catalog from the database (I don't write coercions,
but I've colleagues who do and they do make life easier at the expense of
debugging pain when things go wrong.)

Best,
Ovid
Re: Data checks semantics
On Thu, May 25, 2023 at 4:22 PM Ovid <curtis.poe@gmail.com> wrote:
>
> Hi all,
>
> No one responded to my email linking the full data checks spec. I'm not surprised because it was linking to a huge and overwhelming document. I've now had the time to summarize the key semantics that we've defined.

I got distracted trying to work out (with code) my arguments around
your original question about syntax. I have something but it's not
comprehensive to (all of) the semantics below. It's also not entirely
relevant to this thread, though I'm gonna use that syntactic style for
my examples.

> I hope these can help clarify some things and help the conversation move forward. Note that the syntax is how we defined it in the Data::Checks module. Consider it "for information purposes" only. We're largely focusing on semantics here. I came up with eight key points, with the third point being troublesome.
>
> 1. Checks are on the variable, not the data
>
> This is because we don't want checks to have "infectious" side effects that might surprise you. The developer should have full control over the data checks.

Agreed.

> 2. No type inference
>
> No surprises. The developer should have full control over the data checks.

I'm definitely in the "I'd prefer inference" camp, but I'm not a
computer scientist and don't fully understand the implementation
details. I also think that, given the outline of things, this could
possibly be introduced later without breaking backwards compatibility.

> 3. Checks are on assignment to the variable
>
> A check applied to a variable is not an invariant on that variable. It's a prerequisite for assignment to that variable.
>
> An invariant on the variable would guarantee that the contents of the variable must always meet a given constraint; a "prerequisite for assignment" only guarantees that each element must be assigned values that meet the constraint at the moment they are assigned.
[...]
> We *could* specify that checks are invariants, instead of prerequisites, but that would require that any reference value stored within a checked arrayref or hashref would have to have checks automatically and recursively applied to them as well, which would greatly increase the cost of checking, and might also lead to unexpected action-at-a-distance, when the now-checked references are modified through some other access mechanism.
[...]
> This is unsatisfying, but we're playing with the matches we have, not the flamethrower we want.

I think this is a false dilemma: "either everything is invariant or
nothing is". We already have places in Perl where after the first
level of a reference we don't enforce a guarantee. An example would be
`constant`:
```
use constant AnArrayRef => [];
AnArrayRef = []; # boom
AnArrayRef->[0] = []; # no boom
```
There are operations that take place at the surface that aren't
assignments that would benefit from a check operation happening.
Quickly off the top of my head something like:
```
my Int $id = 0;
$id .= " "; # this may not be obvious, but I'd like it to be
json_encode({ id => $id });
```
Just checking at the top level that we're not implicitly coercing the
value in the variable into something invalid I think would be useful …
even though it doesn't guarantee the data is invariant all the way
down. That said, maybe that’s for a later pass.

> 4. Signature checks

Yep.

> 5. Scalars require valid assignments

I had an argument here about `my $foo;` and `my $foo = undef;` not
necessarily being identical, but Devel::Peek just disabused me of
that.

> This restriction doesn't apply to arrays or hashes because being empty trivially passes the check.

Does it? `my ARRAY[DEFINED] @foo;` or `check ArrayOfExactlyOne = sub
(@array) { @array == 1 }; my ArrayOfExactlyOne @foo;`

> 6. Fatal

I think we could live with fatalizable warnings, but I agree with
having both fatal and non-fatal, with a way to toggle between them.

> 7. Internal representation

If we get the rest of semantics correct this hopefully won't matter
cause it will naturally fall out.

> 8. User-defined checks

First what is `ARRAY[HASHREF[m/a-z/ => OBJECT[Login]]]` except a user
defined check written using a restricted DSL first designed for
MooseX::AttributeHelpers?

I personally would rather have simple user defined checks than the
slew of builtins and the container syntax from
MooseX::AttributeHelpers. I know I’m likely in the minority there. I
think however it provides a better base to iterate from than a more
maximal system.

Yes it could get a bit silly:
```
check ArrayOfHashesOfLoginObjectsWithLowercaseKeys (Array @data) {
    all {
        # every key is lowercase and every value is a Login object
        (all { m/^[a-z]+$/ } keys %$_) && (all { $_ isa Login } values %$_)
    } @data
}
my ArrayOfHashesOfLoginObjectsWithLowercaseKeys @Logins =
    $dao->get_login_objects_for($user);
```
But in reality I think we'd likely break that up a bit.
```
check LowercaseHash (%hash) { all { m/^[a-z]+$/ } keys %hash }
check HashOfLogins (LowercaseHash %logins) { all { $_ isa Login } values %logins }
check ArrayOfLoginHashes (@data) {
    ref $_ eq 'HASH' && (my HashOfLogins %check = %$_)
}

my ArrayOfLoginHashes @logins = $DAO->get_logins_for_user($user);
```

Again I'm probably in the minority here, but I've never really liked
the Moose container type syntax that got inherited by everything.

-Chris
Re: Data checks semantics
On Fri, May 26, 2023 at 6:47 AM Chris Prather <chris@prather.org> wrote:


> I got distracted trying to work out (with code) my arguments around
> your original question about syntax. I have something but it's not
> comprehensive to (all of) the semantics below. It's also not entirely
> relevant to this thread, though I'm gonna use that syntactic style for
> my examples.
>

This is the key thing. I still want to write up a full proposal based on a
syntax P5P will accept, but so far, we can't seem to get there.


> > 3. Checks are on assignment to the variable
> >
> > A check applied to a variable is not an invariant on that variable. It's
> a prerequisite for assignment to that variable.
> I think this is a false dilemma "either everything is invariant or
> nothing is". We already have places in Perl where after the first
> level of a reference we don't enforce a guarantee. An example would be
> `constant`:
> ```
> use constant AnArrayRef = [];
> AnArrayRef = []; # boom
> AnArrayRef->[0] = []; # no boom
> ```
> There are operations that take place at the surface that aren't
> assignments that would benefit from a check operation happening.
>

Agreed, but there were several key issues raised.

1. Performance
2. How do we handle conflicting checks bound to internal references?
3. If you extract a reference from a data structure, how do you "undo"
the check binding?

For the last point:

my @grades = $records->{hakim}{english}->@*;

Sure, you probably *want* to have a check on whatever's in @grades, but the
developer didn't ask for any and we default to trusting the developer.


> > 5. Scalars require valid assignments
>
> > This restriction doesn't apply to arrays or hashes because being empty
> trivially passes the check.
>
> Does it? `my ARRAY[DEFINED] @foo;` or `check ArrayOfExactlyOne = sub
> (@array) { length @array == 1 }; my ArrayOfExactlyOne @foo;`
>

For `my ARRAY[DEFINED] @foo;`, an empty array is just fine (unless your
suggested syntax means something I'm unaware of). However, the
`ArrayOfExactlyOne` check would not accept an empty array because you said it
should not.

Best,
Ovid
Re: Data checks semantics
On 5/25/23 16:22, Ovid wrote:
> Hi all,
>
> No one responded to my email linking the full data checks spec. I'm
> not surprised because it was linking to a huge and overwhelming
> document. I've now had the time to summarize the key semantics that
> we've defined.

I read the spec; I'm a very heavy user of Type::Tiny, so the material in
the spec seems very familiar.  Without re-writing some of my existing
type/coercion code into this system, it's difficult for me to assess if
there's something that can't easily be translated or done differently.

None of your eight key summary points seem out of line with what I'd
expect (again as a user of Type::Tiny).

I don't think user-defined checks are necessary in the MVP.  I can see a
hybrid approach, using the built-in checks in the MVP alongside
Type::Tiny for the user-defined checks.

I use Type::Tiny coercions quite a bit, so I would very much like
coercions to be part of this effort.

To be honest, I would probably not use this system if coercions weren't
part of it.  To have to (re-) introduce all of the boilerplate
conversion code into the caller would be a giant step backwards for me. 
I agree that debugging failing coercions can be challenging, but I don't
think that's integral to coercions; I think that the tooling could be
improved.

Diab
Re: Data checks semantics
On Thu, 25 May 2023 at 22:22, Ovid <curtis.poe@gmail.com> wrote:

> Hi all,
>
> No one responded to my email linking the full data checks spec. I'm not
> surprised because it was linking to a huge and overwhelming document. I've
> now had the time to summarize the key semantics that we've defined.
>

I read it as well.


>
> I hope these can help clarify some things and help the conversation move
> forward. Note that the syntax is how we defined it in the Data::Checks
> module. Consider it "for information purposes" only. We're largely focusing
> on semantics here. I came up with eight key points, with the third point
> being troublesome.
>
> *1. Checks are on the variable, not the data*
>
> my $foo :of(INT) = 4;
> $foo = 'hello'; # fatal
>
> However:
>
> my $foo :of(INT) = 4;
> my $bar = $foo;
> $bar = 'hello'; # legal
>
> This is because we don't want checks to have "infectious" side effects
> that might surprise you. The developer should have full control over the
> data checks.
>
>
additional use cases:
- when $foo is an alias rather than a copy (e.g. in for loops or sub arguments)

my @data :of(INT) = ...;
for my $foo (@data) { $foo = q (hello) }

- what if I want to copy both the value and the check? e.g.:

my $bar :of($foo);
my $bar :of(OF[$foo]);



> *2. No type inference*
>
> No surprises. The developer should have full control over the data checks.
>
>
> *3. Checks are on assignment to the variable*
> *This is probably the most problematic bit.*
>
> A check applied to a variable is not an invariant on that variable. It's a
> prerequisite for assignment to that variable.
>
> An invariant on the variable would guarantee that the contents of the
> variable must always meet a given constraint; a "prerequisite for
> assignment" only guarantees that each element must be assigned values that
> meet the constraint at the moment they are assigned.
>
> So an array such as `my @data :of(HASH[INT])` only requires that each
> element of `@data` must be assigned a hashref whose values are integers. If
> you were to subsequently modify an element like so (with the caveat that
> the two lines aren't exactly equivalent):
>
> $data[$idx] = { $key => 'not an integer' }; # fatal
> $data[$idx]{$key} = 'not an integer"; # not fatal !
>
> The second assignment is not modifying `@data` directly, only retrieving a
> value from it and modifying the contents of an entirely different variable
> through the retrieved reference value.
>
> We *could* specify that checks are invariants, instead of prerequisites,
> but that would require that any reference value stored within a checked
> arrayref or hashref would have to have checks automatically and recursively
> applied to them as well, which would greatly increase the cost of checking,
> and might also lead to unexpected action-at-a-distance, when the
> now-checked references are modified through some other access mechanism.
>
>
This depends on the implementation:
one can change the SV implementation to contain a "magic function map" plus "its
local storage" (as a linked list). In that case every value stores both its
required checks and its verified checks (two different maps), so assignment
will look like:

for each required check {
    skip it if exists verified_checks[ required check ];
    otherwise perform the required check, and scream if it fails;
}


> Moreover, we would have to ensure that such auto-subchecked references
> were appropriately “de-checked” if they are ever removed from the checked
> container. And how would we manage any conflict if the nested referents
> happened to have their own (possibly inconsistent) checks?
>

This is solvable by splitting checks into two categories: on assignment, only
the verified checks of the value will be used, while the required checks will
come from the assignment context.


> So the checks are simply assertions on direct assignments, rather than
> invariants over a variable’s entire nested data structure.
>
> This is unsatisfying, but we're playing with the matches we have, not the
> flamethrower we want.
>
>
> *4. Signature checks*
> We need to work out the syntax, but the current plan is something like
> this:
>
> sub count_valid :returns(UINT) (@customers :of(OBJ[Customer])) {
> ...
> }
>
> The `@customers` variable should maintain the check in the body of the
> sub, but the return check is applied once and only once on the data
> returned at the time that it's returned.
>
>
- why not "sub count_valid :of(UINT)"? (it makes more sense with my later
comments)
- quite often I run into the problem that the check on one parameter depends on
the value of another parameter, e.g.:
- parameter A is required when parameter B is defined
- parameters A and B must both be positive or both be negative numbers


>
> *5. Scalars require valid assignments*
> my $total :of(NUM); # fatal, because undef fails the check
>
> This is per previous discussions. Many languages allow this:
>
> int foo;
>
> But as soon as you assign something to `foo`, it's fatal if it's not an
> integer. For Perl, that's a bit tricky as there's no clear difference
> between uninitialized and undef. While using that variable prior to
> assignment is fatal in many languages, that would be more difficult in
> Perl. Thus, we require a valid assignment.
>
>
Implementation details:
it's possible to distinguish between a variable that is only declared and one
that is declared with an assigned value (two different code paths in C).
Storing the status of the assigned value may also lead to an easier
implementation of a "not-exists" value.

With the aforementioned "magic" list, such a variable will look like:
{ fatal_on_read, NULL }
{ check_on_assign, { required: { ... } } }
{ sv } - current SV operations

where fatal_on_read will implement "assign value" as:
- execute the next "assign value"
- on success, drop fatal_on_read from the list

> As a workaround, this is bad, but valid:
>
> my $total :of(INT|UNDEF);
>
> This restriction doesn't apply to arrays or hashes because being empty
> trivially passes the check.
>
>
> *6. Fatal*
> By default, a failed check is fatal. We have provisions to downgrade them
> to warnings or disable them completely.
>
>
> *7. Internal representation*
> my $foo :of(INT) = "0";
> Dump($foo);
>
> "0" naturally coerces to an integer, so that's allowed. However, we don't
> plan (for the MVP) to guarantee that Dump shows an IV instead of a PV.
> We're hoping that can be addressed post-MVP.
>
> *8. User-defined checks*
>
> Users should be able to define their own checks:
>
> check LongStr :params($N :of(PosInt)) :isa(STR) ($n) { length $n >= $N
> }
>

I'd add a rule that every user-defined check must extend an existing check
(either user-defined or builtin).
I'd also prefer (maybe as a default warning) that primitive checks can be
used only as bases for user-defined checks.

personal preference:
- :extends instead of :isa
- use $_
- adopt the concept of facets used by XML Schema, e.g.:

check Long_String
:facet(Min_Length :of(PosInt))
:extends(STR)
{ length >= $Min_Length }

or:
check Long_String
:facet(Min_Length :of(PosInt))
:extends(STR[.Min_Length ($Min_Length)])
;

This will allow us to:
- mix multiple facets, e.g.:
Str [. Min_Length (10), Max_Length (32), Pattern (qr/.../) ]

- embed additional information for outside use (e.g. to generate OpenAPI /
WSDL), e.g.:
Limit [ Extends (Positive_Int), Max_Value (1_000), Abstract (q (...)) ]


> The above would allow this:
>
> my $name :of(LongStr[10]) = get_name(); # must be at least 10
> characters
>
> The body of a check definition should return a true or false value, or
> die/croak with a more useful message.
>
> A user-defined check is *not* allowed to change the value of the variable
> passed in. Otherwise, we could not safely disable checks on demand
> (coercions are not planned for the MVP, but we have them specced and they
> use a separate syntax).
>
> I was thinking user-defined checks should be post-MVP, but it's unclear to
> me how useful checks would be without them. That's a discussion for later.
>
>
Missing topics:
- what is the scope of check names? (do they respect packages? do they collide
with sub/phaser/variables/formats/...?)
- how to use/import checks from a different scope? (if possible)
- how to define a callback check?

I'd like to point out that a CHECK phaser already exists.
Having read and thought more about it, I'd like to suggest a different name:
contract

my $foo :contract(...);
sub foo :contract(...) ...


> Best,
> Ovid
>
Re: Data checks semantics
On 2023-05-25 10:24 p.m., Ovid wrote:
> Agreed, but there were several key issues raised.
>
> 2. How do we handle conflicting checks bound to internal references?

This is actually trivially easy. A check is a predicate function. It doesn't
have side-effects. You just run all of them and do a logical "AND" of the
results. The check passes if all attached checks pass. The end. Entirely
consistent and predictable. If it so happens that there are no input values
that satisfy all checks, then all assignments are rejected, which is logically
consistent.
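
As a rough illustration of the "logical AND" idea, here is a minimal sketch. The predicates and the `passes_all_checks` helper are hypothetical names invented for this example; nothing here reflects how checks would actually be stored or attached.

```perl
use strict;
use warnings;
use List::Util qw(all);

# Hypothetical predicates standing in for several checks on one container.
my @checks = (
    sub { defined $_[0] && $_[0] =~ /\A-?[0-9]+\z/ },    # looks like an integer
    sub { $_[0] >= 0 },                                   # is non-negative
);

# A value is accepted only if every attached check passes.
# `all` stops at the first failing predicate.
sub passes_all_checks {
    my ($value, @predicates) = @_;
    my $ok = all { $_->($value) } @predicates;
    return $ok ? 1 : 0;
}

my $result_for_42      = passes_all_checks(42,      @checks);
my $result_for_minus_1 = passes_all_checks(-1,      @checks);
my $result_for_text    = passes_all_checks('hello', @checks);
```

Adding a third predicate that contradicts the first two would simply make `passes_all_checks` return false for every input, which matches the "all assignments are rejected" case above.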

-- Darren Duncan
Re: Data checks semantics
I have a general idea or several which might resolve some of the contentious
issues. I will keep myself brief but some of it should probably be expanded.

First main point, generalize the concept of "variable" to "container", or
possibly "reference".

Anything that can be the target of an assignment or assignment-like operation,
which is conceptually a container whose content is being replaced, that
container is what has the check attached to it, and the check is asserted any
time some operation tries to replace the content of that container.

Second main point, it should be possible to define a check on a value
expression, which has the effect of asserting the result of the expression, and
the check is itself an expression, like a predicate function call, but it is
special in that Perl will raise an exception or whatever if it fails.

Third main point, let's see if we can support immutable or frozen versions of
everything, for example hashrefs and arrayrefs that guarantee that no elements
can be replaced or added or removed etc. This immutability is not itself
recursive, but if elements are themselves separately declared immutable, it is
effectively so.

Once we have this third thing, then we effectively DO support invariant
checks, where that guarantee is in effect recursively up to any point where we
have a mutable container, which is where it stops, and if the whole thing is
declared immutable, then the invariant guarantee goes all the way.
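
Perl's core Hash::Util already gives a feel for non-recursive freezing through restricted hashes. The sketch below is only an analogy to the idea above, not the proposed mechanism; the `%config` data is made up for illustration.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_hash);

my %config = (
    name   => 'example',
    limits => { max => 10 },
);

# lock_hash freezes the keys and values of %config itself, but not the
# separate hash that $config{limits} refers to.
lock_hash(%config);

# Modifying a top-level value of the locked hash dies.
my $top_level_ok = eval { $config{name} = 'changed'; 1 } ? 1 : 0;

# The nested hash is a different container, so it is still mutable.
my $nested_ok = eval { $config{limits}{max} = 99; 1 } ? 1 : 0;

# Freezing the inner hash separately extends the guarantee one level down.
lock_hash(%{ $config{limits} });
my $nested_after_lock_ok = eval { $config{limits}{max} = 100; 1 } ? 1 : 0;
```

The guarantee stops at the first mutable container and reaches further only when the nested containers are frozen as well.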

-- Darren Duncan

Re: Data checks semantics
On Fri, May 26, 2023 at 3:44 AM Darren Duncan <darren@darrenduncan.net> wrote:
>
> On 2023-05-25 10:24 p.m., Ovid wrote:
> > Agreed, but there were several key issues raised.
> >
> > 2. How do we handle conflicting checks bound to internal references?
>
> This is actually trivially easy. A check is a predicate function. It doesn't
> have side-effects. You just run all of them and do a logical "AND" of the
> results. The check passes if all attached checks pass. The end. Entirely
> consistent and predictable. If it so happens that there are no input values
> that satisfy all checks, then all assignments are rejected; it's logically
> consistent.

Actually it's easier than that …

1. Checks are on the variable, not the data

References are data not variables, there are no checks bound to them …
or there shouldn't be.

-Chris
Re: Data checks semantics [ In reply to ]
Op 26-05-2023 om 09:35 schreef Branislav Zahradník:
>
> I'd like to point out that a CHECK phaser already exists.
> Reading and thinking more about it, I'd like to suggest a different name:
> contract
>
> my $foo :contract(...);
> sub foo :contract(...) ...


I don't particularly like :of nor :contract. I liked the earlier
proposed :is better. Or what about simply :check(...)? Does what it says
on the tin.


M4
Re: Data checks semantics [ In reply to ]
> I don't particularly like :of nor :contract. I liked the earlier proposed
> :is better. Or what about simply :check(...)? Does what it says on the tin.
>
There are many other words available, for example constraint.

I was only pointing out that there already exists behaviour named CHECK,
different from what is proposed here, especially if you will have code like:
CHECK { ... }
check Foo, ...;

I chose `contract` because it is already used in many computer science
books I have read. With that (and well-written documentation) it sounds like
a more powerful version of the words currently used by other languages,
like type or check.

Later it can also be used not only as `requires contract`, but also as
`provides contract` (available for static/compile-time checks, like Java's
class Foo implements Bar).


>
> M4
>
>
>
Re: Data checks semantics [ In reply to ]
On Sat, May 27, 2023 at 10:44 AM Martijn Lievaart via perl5-porters <
perl5-porters@perl.org> wrote:

> I'd like to point out that a CHECK phaser already exists.
> Reading and thinking more about it, I'd like to suggest a different name:
> contract
>
> my $foo :contract(...);
> sub foo :contract(...) ...
>
> I don't particularly like :of nor :contract. I liked the earlier proposed
> :is better. Or what about simply :check(...)? Does what it says on the tin.
>

It's a contentious issue. We wanted something short so that we don't
discourage its use. However, there's a need for subroutines to distinguish
between what they accept and what they return. For example, I have some
code which does this:

sub get :returns(DEF & !VOID) ( $self, $key :of(STR) ) {
...
}

So it only accepts a string as an argument, the value it returns must be
defined, and calling it in void context is a fatal error. That !VOID
doesn't make any sense in an :of(). Also, :returns(LIST[INT]) is a thing
which doesn't make sense in an :of(). So this is again a case of semantics
impacting syntax.
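
For illustration, the semantics described above can be approximated in today's Perl with hand-written checks. This is only a minimal sketch under stated assumptions: the `%store` hash is hypothetical, `$self` is dropped for brevity, and "string" is assumed to mean "defined and not a reference", which may not match what Data::Checks means by STR.

```perl
use strict;
use warnings;
use Carp qw(croak);

my %store = ( name => 'Ovid' );    # hypothetical backing data

# Hand-rolled stand-in for:
#   sub get :returns(DEF & !VOID) ( $key :of(STR) ) { ... }
sub get {
    my ($key) = @_;

    # !VOID: wantarray returns undef only in void context
    croak 'get() called in void context' unless defined wantarray;

    # :of(STR): approximated as "defined and not a reference" (assumption)
    croak 'key must be a string' if !defined $key || ref $key;

    my $value = $store{$key};

    # DEF: the returned value must be defined
    croak "get('$key') returned undef" unless defined $value;

    return $value;
}

my $name = get('name');    # passes all three checks

# Each of these fails one check; eval traps the fatal error.
eval { get('name'); 1 }            or print "void context: $@";
eval { my $v = get('missing'); 1 } or print "undefined: $@";
```

The point of the sketch is that the void-context rule is a property of the call, not of any value, which is why it has no sensible meaning inside an :of().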

It's interesting that the attribute declaration doesn't seem to put people
off as much as I thought it would, but that being said, we still need to
agree on syntax. Maybe I should call a vote at some point? I dunno. Hard to
get consensus on a mailing list.

While thinking about syntax, I was thinking about Branislav's desire to
attach this information to expressions. The following came to mind:

@sorted = map { $_->[0] }
sort { $a->[1] cmp $b->[1] or $a->[0] cmp $b->[0] }
map *:returns(TUPLE[INT, STR])* { [$_, expensive($_)] }
@unsorted;

I've had long map/sort/grep pipelines before and they can be very painful
to debug. Slapping a check on them, even temporarily, would make this
much easier. I don't know how easy that would be, particularly if we don't
use attributes and decide to lowercase check names. I've no idea if this
would even parse?

@sorted = map { $_->[0] }
sort { $a->[1] cmp $b->[1] or $a->[0] cmp $b->[0] }
map [int, str] { [$_, expensive($_)] }
@unsorted;

Whatever syntax we choose, we're likely to cut off some options and it
would be nice to think about that.
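
Until some such syntax exists, a similar debugging aid can be had in current Perl by passing the list through a checking function. The following is a rough sketch only: `assert_tuples` and the placeholder `expensive` are hypothetical names, and the INT and STR tests are simple approximations rather than the Data::Checks definitions.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: returns its arguments unchanged, but dies on the
# first element that is not a two-element [INT, STR] array reference.
sub assert_tuples {
    for my $t (@_) {
        croak 'expected an [INT, STR] tuple'
          unless ref $t eq 'ARRAY'
          && @$t == 2
          && defined $t->[0] && $t->[0] =~ /\A-?[0-9]+\z/
          && defined $t->[1] && !ref $t->[1];
    }
    return @_;
}

sub expensive { return "item-$_[0]" }    # placeholder for the real work

my @unsorted = ( 3, 1, 2 );

# Same Schwartzian transform as above, with the check slotted into the
# pipeline between the inner map and the sort.
my @sorted = map { $_->[0] }
    sort { $a->[1] cmp $b->[1] or $a->[0] cmp $b->[0] }
    assert_tuples(
        map { [ $_, expensive($_) ] }
        @unsorted
    );

print "@sorted\n";
```

Because the helper is an ordinary function call, it can be added or removed without touching the surrounding map and sort blocks, though it checks the whole list at once rather than each return value of the block.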

Best,
Ovid
Re: Data checks semantics [ In reply to ]
On Sat, 27 May 2023 at 11:24, Ovid <curtis.poe@gmail.com> wrote:

> On Sat, May 27, 2023 at 10:44 AM Martijn Lievaart via perl5-porters <
> perl5-porters@perl.org> wrote:
>
>> I'd like to point out that a CHECK phaser already exists.
>> Reading and thinking more about it, I'd like to suggest a different name:
>> contract
>>
>> my $foo :contract(...);
>> sub foo :contract(...) ...
>>
>> I don't particularly like :of nor :contract. I liked the earlier proposed
>> :is better. Or what about simply :check(...)? Does what it says on the tin.
>>
>
> It's a contentious issue. We wanted something short so that we don't
> discourage its use. However, there's a need for subroutines to distinguish
> between what they accept and what they return. For example, I have some
> code which does this:
>
> sub get :returns(DEF & !VOID) ( $self, $key :of(STR) ) {
> ...
> }
>
>
Just to explain how my brain (probably) works: I don't care what a function
returns, I care about the contract it fulfills, e.g.:
sub foo :contract(VOID);
sub bar :contract(DEF);

Regarding VOID: what do you mean by !VOID? Prohibiting calls to the function
in void context?

> So it only accepts a string as an argument, the value it returns must be
> defined, and calling it in void context is a fatal error. That !VOID
> doesn't make any sense in an :of(). Also, :returns(LIST[INT]) is a thing
> which doesn't make sense in an :of(). So this is again a case of
> semantics impacting syntax.
>
> It's interesting that the attribute declaration doesn't seem to put people
> off as much as I thought it would, but that being said, we still need to
> agree on syntax. Maybe I should call a vote at some point? I dunno. Hard to
> get consensus on a mailing list.
>
> While thinking about syntax, I was thinking about Branislav's desire to
> attach this information to expressions. The following came to mind:
>
> @sorted = map { $_->[0] }
> sort { $a->[1] cmp $b->[1] or $a->[0] cmp $b->[0] }
> map *:returns(TUPLE[INT, STR])* { [$_, expensive($_)] }
> @unsorted;
>
> I've had long map/sort/grep pipelines before and they can be very painful
> to debug. Slapping a check on them, even temporarily, would make this
> much easier. I don't know how easy that would be, particularly if we don't
> use attributes and decide to lowercase check names. I've no idea if this
> would even parse?
>
> @sorted = map { $_->[0] }
> sort { $a->[1] cmp $b->[1] or $a->[0] cmp $b->[0] }
> map [int, str] { [$_, expensive($_)] }
> @unsorted;
>
>
Unless you introduce a new sigil, every expression without attribute-like
syntax will conflict with current syntax. Example:
map { Enum_Foo => Str } { [ $_, expensive ($_) ] }

I think here we should benefit from a single keyword, providing an implicit
ARRAY / HASH when [] / {} is used in a check definition context:
map :contract([INT, INT])

(using 'contract' only as a representation of the single-keyword approach)
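
As background for the conflict mentioned above, braces directly after map already have two meanings in current Perl, so a third meaning would have to compete with both. A small sketch (the variable names are illustrative only):

```perl
use strict;
use warnings;

my @words = qw(a b);

# Here the leading { ... } is a BLOCK; each iteration returns a
# key/value pair, and no comma follows the closing brace.
my %seen = map { $_ => 1 } @words;

# With a unary plus, the braces are an anonymous-hash EXPR instead,
# which is why a comma is required before the list.
my @hashes = map +{ name => $_ }, @words;

print join( ',', sort keys %seen ), "\n";
print scalar(@hashes), " hashrefs\n";
```

Perl decides between the two readings by looking at what follows the opening brace, so a bare `{ Enum_Foo => Str }` in that position would most likely be read as one of the existing forms rather than as a check.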

> Whatever syntax we choose, we're likely to cut off some options and it
> would be nice to think about that.
>
> Best,
> Ovid
>
Re: Data checks semantics [ In reply to ]
Hi there,

On Sat, 27 May 2023, Ovid wrote:

> ...
> It's interesting that the attribute declaration doesn't seem to put people
> off as much as I thought it would ...

It does, but we've probably given up on the whole thing already.

--

73,
Ged.
Re: Data checks semantics [ In reply to ]
On Sat, May 27, 2023 at 2:05 PM G.W. Haywood <
perl5porters@jubileegroup.co.uk> wrote:

> > ...
> > It's interesting that the attribute declaration doesn't seem to put people
> > off as much as I thought it would ...
>
> It does, but we've probably given up on the whole thing already.


Could you provide more context? Given up on what? The syntax, or the whole
idea of data checks? Why did you "give up"?

I know that the syntax would cause the proposal to be rejected, but I can't
tell what syntax people want. Worse, while people don't seem to be
objecting to a PPC for data checks, I can't tell if people actually *want* it.
So I really would like to know what you mean by "given up."

Best,
Ovid
Re: Data checks semantics [ In reply to ]
On 5/27/23 12:58, Ovid wrote:
> Worse, while people don't seem to be objecting to a PPC for data
> checks, I can't tell if people actually /want/ it.
>
I want it.
Re: Data checks semantics [ In reply to ]
Of course, we want data checks.
Ovid, maybe you can create another GitHub repository similar to Corinna
(Ovid/Cor) and document the syntax?
I think the syntax came before the implementation for feature 'class' as
well.
Thanks.

On Sat, May 27, 2023 at 8:55 PM Diab Jerius <dj.p5p@avoiding.work> wrote:

> On 5/27/23 12:58, Ovid wrote:
>
> Worse, while people don't seem to be objecting to a PPC for data checks, I
> can't tell if people actually *want* it.
>
> I want it.
>
>
>
>
Re: Data checks semantics [ In reply to ]
Hi there,

On Sat, 27 May 2023, Ovid wrote:
> On Sat, May 27, 2023 at 2:05 PM G.W. Haywood wrote:
>
>>> ...
>>> It's interesting that the attribute declaration doesn't seem to put
>>> people off as much as I thought it would ...
>>
>> It does, but we've probably given up on the whole thing already.
>
> ...Why did you "give up"?

As I said in

https://www.nntp.perl.org/group/perl.perl5.porters/2023/05/msg266363.html

if you absolutely must do something like this I'd be happy with

my int $scalar;

and nothing else for the time being. The KISS principle.

As I said in that message, some of what I see makes my flesh crawl.
We're off into the weeds, as I think happens in Perl discussions, with
something that's going to take years to get right - assuming it ever
does turn out right, and doesn't turn into another given/when fiasco.

> ... I can't tell if people actually *want* it. ...

I think you can tell how much people really want it by dividing the
number of people saying so on this list by the number of Perl coders
Out There coding in Perl. For the great majority, all this is of no
interest whatever, and they'll never use it. It might even push them
still further away from Perl coding, because there's more than enough
complexity already. I can't see that I'd ever invest any time in it;
if I wanted to try coding in a new language it wouldn't be a version
of Perl that I've never tried before, because I've tried the one that
I'm using, and it's broken, and nobody seems interested in fixing it.
They just seem to want to add more stuff to it that will surely break
and also need fixing, and probably never get fixed because the really
good people don't do stuff like that.

It's not that I don't want it. I actually want it to *not* happen, at
least not to Perl 5. Knock yourself out with Perl 7.

Sorry if this isn't the sort of thing you want to hear. You did ask.

--

73,
Ged.
Re: Data checks semantics [ In reply to ]
G.W. Haywood via perl5-porters <perl5-porters@perl.org> wrote:
> On Sat, 27 May 2023, Ovid wrote:
>>
>> ... I can't tell if people actually *want* it. ...

Personally, I'm following this discussion with great interest. However, I feel the difficulty of performing data checks inside nested hash/array structures may ultimately hamstring it to a degree that I'm not actually sure I'd really use it for variables at all.

The signature checks, on the other hand, are something I really do want to have available in Perl and I hope they will get added.

Syntax-wise, I'm not sure yet what I'd prefer myself, much less what's right for Perl.


> I can't see that I'd ever invest any time in it;
> if I wanted to try coding in a new language it wouldn't be a version
> of Perl that I've never tried before, because I've tried the one that
> I'm using, and it's broken, and nobody seems interested in fixing it.

Is it a supported version of Perl? If so, could you point to a specific bug report?


> It's not that I don't want it. I actually want it to *not* happen, at
> least not to Perl 5. Knock yourself out with Perl 7.

For stuff to get into Perl 7.0, it needs to be added to Perl 5.x first.

This is according to the PSC's strategy; see:
https://blogs.perl.org/users/psc/2022/05/what-happened-to-perl-7.html


--
Arne Johannessen
<https://arne.johannessen.de/>
Re: Data checks semantics [ In reply to ]
Hi there,

On Sun, 28 May 2023, Arne Johannessen wrote:
> G.W. Haywood via perl5-porters <perl5-porters@perl.org> wrote:
>>
>> ... it's broken, and nobody seems interested in fixing it.
>
> Is it a supported version of Perl? If so, could you point to a specific bug report?

1. Apparently this one isn't just one issue, it's a class of issues:

https://github.com/Perl/perl5/issues/12573#issuecomment-1406905309

I still don't know if it's supposed to be open or closed. Github's UI
seems to be saying it's closed, but there's a note from 2013 (and yes,
that's 2013 not 2023) saying

@cpansprout - Status changed from 'resolved' to 'open'

then the next note after that is mine of Jan 27 2023. I'd hoped to be
able to make some sort of a contribution there, but when I discovered
the scale of the issue I decided that with my then current workload it
was too much to even think about taking on. With my current workload
I wouldn't even have looked for the issue on Github in the first place.

2. Something like the below may or may not be in Perl/perl5/issues/:

https://www.nntp.perl.org/group/perl.perl5.porters/2023/05/msg266363.html

See my paragraph which begins "Isn't there more to it than that?".

If it is indeed a Perl core issue - maybe it is, maybe it isn't - then,
given that my browser and Github don't get on with each other, if it is
in Github I don't know how I'd find it. Whatever it is, it took two of
us a month to pin down and work around. Short version: "Don't print it".

I'm just trying to remember what the expression was when I brought these
up here... oh, yes, that was it: "Pet peeves." Not very encouraging.

> For stuff to get into Perl 7.0, it needs to be added to Perl 5.x first.

Indeed. In my opinion, for something as old and as widely used as
Perl 5, that's completely crackers. My main pet peeve is that when
things don't work as designed you should stop adding more bits that
don't work as designed either, and fix the things that need fixing.
I know it's not as much fun, I'm sorry but life's like that and you
just have to think yourself lucky you're not in a trench in Bakhmut.

Sorry, rant over for now.

--

73,
Ged.
Re: Data checks semantics [ In reply to ]
On Sun, May 28, 2023 at 1:02 PM Arne Johannessen via perl5-porters <
perl5-porters@perl.org> wrote:


> > It's not that I don't want it. I actually want it to *not* happen, at
> > least not to Perl 5. Knock yourself out with Perl 7.
>
> For stuff to get into Perl 7.0, it needs to be added to Perl 5.x first.
>

While I don't *think* this is what you were referring to, I think data
checks should *not* be a target for Perl 7. I mean, I would love for them
to work, but this spans the entire language and will have a huge impact.
I'm thinking Perl 8 for this.

Best,
--
Curtis "Ovid" Poe
--
CTO, All Around the World
World-class software development and consulting
https://allaroundtheworld.fr/
Re: Data checks semantics [ In reply to ]
On Sun, 28 May 2023 at 13:55, G.W. Haywood via perl5-porters
<perl5-porters@perl.org> wrote:
>
> Hi there,
>
> On Sun, 28 May 2023, Arne Johannessen wrote:
> > G.W. Haywood via perl5-porters <perl5-porters@perl.org> wrote:
> >>
> >> ... it's broken, and nobody seems interested in fixing it.
> >
> > Is it a supported version of Perl? If so, could you point to a specific bug report?
>
> 1. Apparently this one isn't just one issue, it's a class of issues:
>
> https://github.com/Perl/perl5/issues/12573#issuecomment-1406905309
...
> 2. Something like the below may or may not be in Perl/perl5/issues/:
>
> https://www.nntp.perl.org/group/perl.perl5.porters/2023/05/msg266363.html
>
...
> > For stuff to get into Perl 7.0, it needs to be added to Perl 5.x first.
>
> Indeed. In my opinion, for something as old and as widely used as
> Perl 5, that's completely crackers. My main pet peeve is that when
> things don't work as designed you should stop adding more bits that
> don't work as designed either, and fix the things that need fixing.

Neither of the two issues you raised have to do with this thread or
the subject that Ovid is trying to discuss.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"
Re: Data checks semantics [ In reply to ]
Hi there,

On Sun, 28 May 2023, demerphq wrote:

> On Sun, 28 May 2023 at 13:55, G.W. Haywood via perl5-porters wrote:
>> On Sun, 28 May 2023, Arne Johannessen wrote:
>>> G.W. Haywood via perl5-porters <perl5-porters@perl.org> wrote:
>>>>
>>>> ... it's broken, and nobody seems interested in fixing it.
>>>
>>> Is it a supported version of Perl? If so, could you point to a specific bug report?
>>
>> https://github.com/Perl/perl5/issues/12573#issuecomment-1406905309
>> https://www.nntp.perl.org/group/perl.perl5.porters/2023/05/msg266363.html
>> ...
>
> Neither of the two issues you raised have to do with this thread or
> the subject that Ovid is trying to discuss.

Indeed not - I was just answering Arne's question. Our perambulation
*started* when I answered Ovid's question, which was something along
the lines of "does anybody care?" Here's the fallout. It's my view
that, to a first and very likely a second approximation, nobody does.
Having said that it's interesting Perl 8 just got a mention. I'll
crawl back into my cave now and bother you no more in this thread.

--

73,
Ged.
Re: Data checks semantics [ In reply to ]
On 5/28/23 10:40, G.W. Haywood via perl5-porters wrote:
> Hi there,
>
> On Sun, 28 May 2023, demerphq wrote:
>
>> On Sun, 28 May 2023 at 13:55, G.W. Haywood via perl5-porters wrote:
>>> On Sun, 28 May 2023, Arne Johannessen wrote:
>>>> G.W. Haywood via perl5-porters <perl5-porters@perl.org> wrote:
>>>>>
>>>>> ... it's broken, and nobody seems interested in fixing it.
>>>>
>>>> Is it a supported version of Perl? If so, could you point to a
>>>> specific bug report?
>>>
>>> https://github.com/Perl/perl5/issues/12573#issuecomment-1406905309
>>> https://www.nntp.perl.org/group/perl.perl5.porters/2023/05/msg266363.html
>>>
>>> ...
>>
>> Neither of the two issues you raised have to do with this thread or
>> the subject that Ovid is trying to discuss.
>
> Indeed not - I was just answering Arne's question.  Our perambulation
> *started* when I answered Ovid's question, which was something along
> the lines of "does anybody care?"  Here's the fallout.  It's my view
> that, to a first and very likely a second approximation, nobody does.

Type::Tiny has 477 reverse dependencies on CPAN.  I would say that to a
first approximation a lot of people care.
