use utf8 WAS Re: Pre-RFC: a `module` keyword
> On Jan 24, 2022, at 14:20, Ovid via perl5-porters <perl5-porters@perl.org> wrote:
>
> Benefits:
>
> * Postfix block lexically scopes changes
> * Strict, warnings, utf8 source, signatures, and "no feature 'indirect'" by default

The last discussion of adding utf8.pm to the feature bundle seemed to conclude that it’s a bad idea for `print "hi"` to be subtly wrong in “modern” Perl, enough so that utf8.pm should stay out of the feature bundle.
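To make the “subtly wrong” part concrete, here is a minimal sketch. It assumes the source file is saved as UTF-8, and it uses a non-ASCII literal because plain "hi" produces the same bytes either way (the problem there is only conceptual). The variable names are mine, and the in-memory filehandle merely stands in for an STDOUT that has no encoding layer.

```perl
use strict;
use warnings;
use utf8;                       # literals in this file are decoded into characters
use Encode qw(encode);

my $literal = "é";              # one character, U+00E9 (two bytes in the source file)

# Under `use utf8`, length() counts characters, not source bytes.
my $char_count = length $literal;

# Print to a handle with no encoding layer. An in-memory handle is used here
# so the emitted bytes can be inspected; this is an illustration only.
open my $fh, '>', \my $implicit or die "open failed: $!";
print {$fh} $literal;
close $fh;

# What explicitly encoding the same string to UTF-8 yields.
my $explicit = encode('UTF-8', $literal);

printf "chars=%d implicit=%d byte(s) explicit=%d byte(s)\n",
    $char_count, length $implicit, length $explicit;
```

The unlayered print emits a single byte (0xE9) with no warning, while the explicit encoding emits two (0xC3 0xA9). Without `use utf8`, the same literal would pass through as its two source bytes and happen to look right on a UTF-8 terminal.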

The alternative proposal was to have the feature bundle require that any Perl source without `use utf8` be plain ASCII. That way, people only hit the `print "hi"`-is-subtly-wrong problem if they go out of their way to `use utf8`, and there is still something to discourage Unicode matches against undecoded string constants.

I personally would like to see Perl gain the ability to distinguish bytes from characters reliably before any default `use utf8` stuff goes in, even for “new hotness” like the proposed `module` keyword. (Then, teach Perl to reject “mismatches”, e.g., JSON-decode of character strings.) The fact that `use utf8` at least *conceptually*--if not visibly--breaks the print()ing of string constants is, IMO, a compelling reason to avoid it. It fixes one problem but introduces a new one.
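As a rough illustration of why the bytes/characters distinction matters, the sketch below shows that Perl today treats a byte and a character with the same ordinal as the same string, and lets a double encode through silently. The variable names are hypothetical, and this only demonstrates current behavior; it is not a proposal for how a fix should work.

```perl
use strict;
use warnings;
use utf8;                                  # assumes this file is saved as UTF-8
use Encode qw(encode decode);

my $character = "é";                       # meant as text: one code point, U+00E9
my $raw_byte  = "\xE9";                    # meant as binary data: one byte, 0xE9

# String comparison only sees ordinals, so "text" and "binary" compare equal.
my $same = ($character eq $raw_byte) ? 1 : 0;

# Correct use: encode once for output, decode once on input.
my $octets = encode('UTF-8', $character);  # two bytes: 0xC3 0xA9
my $round  = decode('UTF-8', $octets);     # back to one character

# Mismatch: encoding what is already bytes. Perl cannot tell, so nothing
# warns, and the result is four bytes of mojibake.
my $twice = encode('UTF-8', $octets);

printf "same=%d octets=%d twice=%d\n", $same, length $octets, length $twice;
```

A Perl that tracked which strings are bytes and which are characters could reject the last encode() call, and likewise a JSON decode handed a character string, instead of producing garbage quietly.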

cheers,
-Felipe