Mailing List Archive

Pre-RFC: command-line flag for slurping
"-0777" flag is the usual way to read the whole file at once (instead of
line by line) in one-liners.

I feel this isn't ideal. "-0" is a bad flag. It's overly general, users
rarely need $/ to be set to anything other than undef or "\n". Also, the
input record separator has to be specified as an octal number, which is
weird. The fact that the numbers above 0o377 are special-cased to mean
"undef" makes it even more confusing.

Slurping is an extremely common operation and it deserves its own
one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp). I wish
it could be "-s", but sadly it's already taken :(
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
I think there should be a standard slurp keyword built in to the language.
It doesn't matter what it's called, but reading a file into a string comes under "easy things should be easy".


Re: Pre-RFC: command-line flag for slurping [ In reply to ]
On Wed, Sep 29, 2021 at 11:00 AM Tomasz Konojacki <me@xenu.pl> wrote:

> "-0777" flag is the usual way to read the whole file at once (instead of
> line by line) in one-liners.
>
> I feel this isn't ideal. "-0" is a bad flag. It's overly general, users
> rarely need $/ to be set to anything other than undef or "\n". Also, the
> input record separator has to be specified as an octal number, which is
> weird. The fact that the numbers above 0o377 are special-cased to mean
> "undef" makes it even more confusing.
>
> Slurping is an extremely common operation and it deserves its own
> one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp). I wish
> it could be "-s", but sadly it's already taken :(
>

I think this is an excellent idea. This sort of processing using Perl
oneliners is extremely common and spread across the internet, and the
'-0777' flag is a constant source of confusion. Ideally perl would have
support for long options so we didn't have to take up the dwindling
one-letter options, but in the meantime, they are not in high demand so IMO
it's fine to use one for this.

-Dan
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
On 9/29/21 08:21, Dan Book wrote:
> On Wed, Sep 29, 2021 at 11:00 AM Tomasz Konojacki <me@xenu.pl> wrote:
>
> "-0777" flag is the usual way to read the whole file at once
> (instead of
> line by line) in one-liners.
>
> I feel this isn't ideal. "-0" is a bad flag. It's overly general,
> users
> rarely need $/ to be set to anything other than undef or "\n".
> Also, the
> input record separator has to be specified as an octal number,
> which is
> weird. The fact that the numbers above 0o377 are special-cased to mean
> "undef" makes it even more confusing.
>
> Slurping is an extremely common operation and it deserves its own
> one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp). I wish
> it could be "-s", but sadly it's already taken :(
>
>
> I think this is an excellent idea. This sort of processing using Perl
> oneliners is extremely common and spread across the internet, and the
> '-0777' flag is a constant source of confusion. Ideally perl would
> have support for long options so we didn't have to take up the
> dwindling one-letter options, but in the meantime, they are not in
> high demand so IMO it's fine to use one for this.
>
> -Dan

I totally support this feature... and I think it should be a --long
option instead of "-g". The Perl interpreter is *long overdue* for long
options.

- Scott
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
On Wed, 29 Sep 2021 17:00:15 +0200
Tomasz Konojacki <me@xenu.pl> wrote:

> Slurping is an extremely common operation and it deserves its own
> one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp). I wish
> it could be "-s", but sadly it's already taken :(

+1

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
* Scott Baker <scott@perturb.org> [2021-09-29 08:37:57 -0700]:

> On 9/29/21 08:21, Dan Book wrote:
> > On Wed, Sep 29, 2021 at 11:00 AM Tomasz Konojacki <me@xenu.pl> wrote:
> >
> > "-0777" flag is the usual way to read the whole file at once
> > (instead of
> > line by line) in one-liners.
> >
> > I feel this isn't ideal. "-0" is a bad flag. It's overly general,
> > users
> > rarely need $/ to be set to anything other than undef or "\n".
> > Also, the
> > input record separator has to be specified as an octal number,
> > which is
> > weird. The fact that the numbers above 0o377 are special-cased to mean
> > "undef" makes it even more confusing.
> >
> > Slurping is an extremely common operation and it deserves its own
> > one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp). I wish
> > it could be "-s", but sadly it's already taken :(
> >
> >
> > I think this is an excellent idea. This sort of processing using Perl
> > oneliners is extremely common and spread across the internet, and the
> > '-0777' flag is a constant source of confusion. Ideally perl would have
> > support for long options so we didn't have to take up the dwindling
> > one-letter options, but in the meantime, they are not in high demand so
> > IMO it's fine to use one for this.

Seems like anything would need to consider also -n and -l, any others? I'm not
a perl oneliner wizard by any means, but these two came up in my google.

#Note:
#$ perl -v
#This is perl 5, version 32, subversion 1 (v5.32.1) built for x86_64-netbsd-thread-multi

...

-0777, only

$ perl -0777 -MO=Deparse -e 'chomp; print $_' ./file.txt
BEGIN { $/ = undef; $\ = undef; }
chomp $_;
print $_;
-e syntax OK

-l, only

$ perl -l -MO=Deparse -e 'chomp; print $_' ./file.txt
BEGIN { $/ = "\n"; $\ = "\n"; }
chomp $_;
print $_;
-e syntax OK

-n, only

$ perl -n -MO=Deparse -e 'chomp; print $_' ./file.txt
LINE: while (defined($_ = readline ARGV)) {
chomp $_;
print $_;
}
-e syntax OK

-l -0777,

$ perl -l -0777 -MO=Deparse -e 'chomp; print $_' ./file.txt
BEGIN { $/ = undef; $\ = "\n"; }
chomp $_;
print $_;
-e syntax OK

-n -0777,

$ perl -n -0777 -MO=Deparse -e 'chomp; print $_' ./file.txt
BEGIN { $/ = undef; $\ = undef; }
LINE: while (defined($_ = readline ARGV)) {
chomp $_;
print $_;
}
-e syntax OK

-l -n -0777, (note "double chomp" - not quite sure where that came from)

$ perl -l -n -0777 -MO=Deparse -e 'chomp; print $_' ./file.txt
BEGIN { $/ = undef; $\ = "\n"; }
LINE: while (defined($_ = readline ARGV)) {
chomp $_;
chomp $_;
print $_;
}
-e syntax OK

Cheers,
Brett

ps: I like the idea of "long" options, too.

> >
> > -Dan
>
> I totally support this feature... and I think it should be a --long option
> instead of "-g". The Perl interpreter is *long overdue* for long options.
>
> - Scott

--
--
oodler@cpan.org
oodler577@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdfeu.org
irc.perl.org #openmp #pdl #native
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
On Wed, Sep 29, 2021 at 2:55 PM Oodler 577 via perl5-porters <
perl5-porters@perl.org> wrote:

> * Scott Baker <scott@perturb.org> [2021-09-29 08:37:57 -0700]:
>
> > On 9/29/21 08:21, Dan Book wrote:
> > > On Wed, Sep 29, 2021 at 11:00 AM Tomasz Konojacki <me@xenu.pl> wrote:
> > >
> > > "-0777" flag is the usual way to read the whole file at once
> > > (instead of
> > > line by line) in one-liners.
> > >
> > > I feel this isn't ideal. "-0" is a bad flag. It's overly general,
> > > users
> > > rarely need $/ to be set to anything other than undef or "\n".
> > > Also, the
> > > input record separator has to be specified as an octal number,
> > > which is
> > > weird. The fact that the numbers above 0o377 are special-cased to
> mean
> > > "undef" makes it even more confusing.
> > >
> > > Slurping is an extremely common operation and it deserves its own
> > > > one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp). I
> wish
> > > it could be "-s", but sadly it's already taken :(
> > >
> > >
> > > I think this is an excellent idea. This sort of processing using Perl
> > > oneliners is extremely common and spread across the internet, and the
> > > '-0777' flag is a constant source of confusion. Ideally perl would have
> > > support for long options so we didn't have to take up the dwindling
> > > one-letter options, but in the meantime, they are not in high demand so
> > > IMO it's fine to use one for this.
>
> Seems like anything would need to consider also -n and -l, any others? I'm
> not
> a perl oneliner wizard by any means, but these two came up in my google.
>

-0777 interacts with these options in a reasonable and expected way, and
the proposed -g would do the same.

-Dan
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
* Dan Book <grinnz@gmail.com> [2021-09-29 15:12:53 -0400]:

> On Wed, Sep 29, 2021 at 2:55 PM Oodler 577 via perl5-porters <
> perl5-porters@perl.org> wrote:
>
> > * Scott Baker <scott@perturb.org> [2021-09-29 08:37:57 -0700]:
> >
> > > On 9/29/21 08:21, Dan Book wrote:
> > > > On Wed, Sep 29, 2021 at 11:00 AM Tomasz Konojacki <me@xenu.pl> wrote:
> > > >
> > > > "-0777" flag is the usual way to read the whole file at once
> > > > (instead of
> > > > line by line) in one-liners.
> > > >
> > > > I feel this isn't ideal. "-0" is a bad flag. It's overly general,
> > > > users
> > > > rarely need $/ to be set to anything other than undef or "\n".
> > > > Also, the
> > > > input record separator has to be specified as an octal number,
> > > > which is
> > > > weird. The fact that the numbers above 0o377 are special-cased to
> > mean
> > > > "undef" makes it even more confusing.
> > > >
> > > > Slurping is an extremely common operation and it deserves its own
> > > > > one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp). I
> > wish
> > > > it could be "-s", but sadly it's already taken :(
> > > >
> > > >
> > > > I think this is an excellent idea. This sort of processing using Perl
> > > > oneliners is extremely common and spread across the internet, and the
> > > > '-0777' flag is a constant source of confusion. Ideally perl would have
> > > > support for long options so we didn't have to take up the dwindling
> > > > one-letter options, but in the meantime, they are not in high demand so
> > > > IMO it's fine to use one for this.
> >
> > Seems like anything would need to consider also -n and -l, any others? I'm
> > not
> > a perl oneliner wizard by any means, but these two came up in my google.
> >
>
> -0777 interacts with these options in a reasonable and expected way, and
> the proposed -g would do the same.

If it's going to be equivalent to "-0777", which directly impacts the record
separator, then "-R" is probably better. None of the mnemonics proposed to fit
with "-g" are common parlance. In gcc, "-g" is also associated with creating
executables containing extra information for debuggers. A "-g" flag seems like
it could be helpful at some point for actual debugging, or at least its use for
slurping seems a little dissonant for CLI jockeys.

Then again, the general move towards long opts would make it pretty easy to do
something like "--record-separator|--slurp". So it seems to me that supporting
long opts is probably the real "ask".

Brett

>
> -Dan

--
--
oodler@cpan.org
oodler577@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdfeu.org
irc.perl.org #openmp #pdl #native
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
On Wed, Sep 29, 2021 at 6:46 PM Oodler 577 <oodler577@sdf-eu.org> wrote:

>
> If it's going to be equivalent to "-0777", which directly impacts the
> record separator, then "-R" is probably better. None of the mnemonics
> proposed to fit with "-g" are common parlance. In gcc, "-g" is also
> associated with creating executables containing extra information for
> debuggers. A "-g" flag seems like it could be helpful at some point for
> actual debugging, or at least its use for slurping seems a little
> dissonant for CLI jockeys.


I don't find either of the suggestions particularly intuitive, so have no
preference.

> Then again, the general move towards long opts would make it pretty easy
> to do something like "--record-separator|--slurp". So it seems to me that
> supporting long opts is probably the real "ask".
>

This would be fantastic, and this flag could serve as one of several
examples of its benefit. But long options would of course need their own
RFC, proposed by someone willing to implement it, which so far we don't
have, so they shouldn't hold up this or similar ideas.

-Dan
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
Tomasz Konojacki writes:

> "-0777" flag is the usual way to read the whole file at once in
> one-liners.
>
> Slurping is an extremely common operation and it deserves its own
> one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp).

Yes, please. I have taught Perl for a living, and having a single option
for this would make teaching one-liners much easier.

The current -0 option has the flexibility of being completely general,
but that generality gets in the way of learning the most common cases:
you need to understand the concepts of record separators, octal, and
then a non-octal token.

There are now several messages of support and none against, so I think
you may now proceed to writing and submitting an RFC.

Oodler 577 via perl5-porters writes:

> If it's going to be equivalent to "-0777", which directly impacts the
> record separator, then "-R" is probably better.

One of the advantages I see of such a flag is *not* needing to think
about it at the level of the record separator (unless you want to). The
flag would cause Perl to read in a whole file/all of standard input at
once. That it would do this by setting the input record separator to a
specific value is *how* it does that, but not *why* you would use the
flag.

> Then again, the general move towards long opts

It sounds like there are many in favour of long options. An RFC on that
matter may be worthwhile.

But that isn't a reason to delay *this* RFC for something else which may
or may not happen.

Smylers
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
2021-9-30 0:00 Tomasz Konojacki <me@xenu.pl> wrote:

> "-0777" flag is the usual way to read the whole file at once (instead of
> line by line) in one-liners.
>
> I feel this isn't ideal. "-0" is a bad flag. It's overly general, users
> rarely need $/ to be set to anything other than undef or "\n". Also, the
> input record separator has to be specified as an octal number, which is
> weird. The fact that the numbers above 0o377 are special-cased to mean
> "undef" makes it even more confusing.
>
> Slurping is an extremely common operation and it deserves its own
> one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp). I wish
> it could be "-s", but sadly it's already taken :(
>
>
>
I feel slurping is a very general operation in this age.

For example, HTML.

HTML is not a line-by-line format. The whole file needs to be parsed by a
regular expression.

<h1>
Foo
</h1>

And the available memory is large enough.
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
I agree that the "slurp" operation is important, but I would argue that it
is more important in code than as a global setting defined by a flag. The
flag is fine, but is there a recommended way to "slurp" a file? I usually use
$_ = join('',<>); since it is easy to remember, and a short custom-made
function getfile($filename) for slurping files. It would be nice to
have a recommended way to "slurp" in-line as well as having a flag.

On Fri, 1 Oct 2021, Yuki Kimoto wrote:

>
>
> 2021-9-30 0:00 Tomasz Konojacki <me@xenu.pl> wrote:
> "-0777" flag is the usual way to read the whole file at once (instead of
> line by line) in one-liners.
>
> I feel this isn't ideal. "-0" is a bad flag. It's overly general, users
> rarely need $/ to be set to anything other than undef or "\n". Also, the
> input record separator has to be specified as an octal number, which is
> weird. The fact that the numbers above 0o377 are special-cased to mean
> "undef" makes it even more confusing.
>
> Slurping is an extremely common operation and it deserves its own
> one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp). I wish
> it could be "-s", but sadly it's already taken :(
>
>
>
> I feel slurping is a very general operation in this age.
>
> For example, HTML.
>
> HTML is not a line to line protocol. the whole file needs to be parsed by regular expression.
>
>   <h1>
>      Foo
>   </h1>
>
> And the available memory is large enough.
>
>
>
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
Vlado Keselj writes:

> It would be nice to have a recommended way to "slurp" in-line ...

Indeed. Maybe that's on a similar level to trim in usefulness, how it's
done now, and CPAN modules.

> ...as well as having a flag.

This (pre-)RFC is about having a command-line flag.

Having a command-line flag doesn't make the in-code slurping situation
any worse (in that it doesn't affect it at all).

Let's not derail this RFC by burdening it with other things that could
also be done. Separate ideas can have their own RFCs, and any interested
party is welcome to propose one.

Smylers
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
On Fri, Oct 1, 2021 at 12:12 PM Ed Avis <ed.avis@pgim.com> wrote:

> I think there should be a standard slurp keyword built in to the language.
> It doesn't matter what it's called, but reading a file into a string comes
> under "easy things should be easy".
>

Please keep this thread focused on the request at hand.

-Dan
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
Sorry about that... I will keep it in mind for the future.

On Fri, 1 Oct 2021, Smylers wrote:

> Let's not derail this RFC by burdening it with other things that could
> also be done. Separate ideas can have their own RFCs, and any interested
> party is welcome to propose one.
>
> Smylers
>
>
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
On Thu, Sep 30, 2021 at 09:19:05AM +0100, Smylers wrote:
> Tomasz Konojacki writes:
>
> > "-0777" flag is the usual way to read the whole file at once in
> > one-liners.
> >
> > Slurping is an extremely common operation and it deserves its own
> > one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp).
>
> Yes, please. I have taught Perl for a living, and having a single option
> for this would make teaching one-liners much easier.
>
> The current -0 option has the flexibility of being completely general,
> but that generality gets in the way of learning the most common cases:
> you need to understand the concepts of record separators, octal, and
> then a non-octal token.
>
> There are now several messages of support and none against, so I think
> you may now proceed to writing and submitting an RFC.

Yes, please. As in:

1) specifically, Tomasz, please draft an RFC.

Whilst there is an "official" blank template, it might be easier to start
with https://github.com/Perl/RFCs/blob/master/rfcs/rfc0003.md
which was also about command line options. (It's also suitably short)

2) generally, "this" is roughly how the pre-RFC "workflow" was supposed to flow:

a) mail an elevator pitch
b) rapid feedback
c) if rapid feedback is "several people said yes" and "no-one said no"
then anyone competent can say "right, this isn't totally daft, go for it"
(including the original submitter)

It's the first "?" decision point on:

https://github.com/Perl/RFCs/blob/master/docs/process.md

and clearly a "command line flag" can't be a CPAN module. :-)

Nicholas Clark
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
Oodler 577 via perl5-porters <perl5-porters@perl.org> wrote:

> If it's going to be equivalent to "-0777", which directly impacts the record
> separator, then "-R" is probably better.

I thought we could do better than that, but maybe not... of the
available choices -R isn't bad. The remaining unused single ascii
letters are:

b g j k o q r y z A B G H J K L N O P Q R Y Z

Though there are some numerics also:

-1 do it in one gulp (confusable with -l, though)

Possibly:

-o do it in "one" gulp (myself I'd keep it in case there's a need
for "output")

Maybe:

-N the "opposite" of -n
-P the "opposite" of -p

so instead of, say, -nR or -pR you could just use -N or -P
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
* Joseph Brenner <doomvox@gmail.com> [2021-10-19 12:25:48 -0700]:

> Oodler 577 via perl5-porters <perl5-porters@perl.org> wrote:
>
> > If it's going to be equivalent to "-0777", which directly impacts the record
> > separator, then "-R" is probably better.
>
> I thought we could do better than that, but maybe not... of the
> available choices -R isn't bad. The remaining unused single ascii
> letters are:
>
> b g j k o q r y z A B G H J K L N O P Q R Y Z
>
> Though there are some numerics also:
>
> -1 do it in one gulp (confusable with -l, though)
>
> Possibly:
>
> -o do it in "one" gulp (myself I'd keep it in case there's a need
> for "output")
>
> Maybe:
>
> -N the "opposite" of -n
> -P the "opposite" of -p
>
> so instead of, say, -nR or -pR you could just use -N or -P

FYI ~

In awk, "RS" is the "record separator" (like Perl's $/); so
any combination of "[rR][sS]" might do - I understand we're working
with just single character options, so the "r" and the "s" would
be defined such that the combination "RS" does the right thing - I'm
not sure if that's possible but if it is, this would be the most
historically relevant approach.

That said, Perl reappropriates a lot of stuff from awk, e.g., "BEGIN"
and "END".

Thanks for the comment.

Cheers,
Brett

--
--
oodler@cpan.org
oodler577@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdfeu.org
irc.perl.org #openmp #pdl #native
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
On Tue, 19 Oct 2021 18:14:42 +0000
Nicholas Clark <nick@ccl4.org> wrote:

> 1) specifically, Tomasz, please draft an RFC.

Here it is:

https://github.com/Perl/RFCs/pull/8
Re: Pre-RFC: command-line flag for slurping [ In reply to ]
On Wed, 29 Sep 2021 17:00:15 +0200
Tomasz Konojacki <me@xenu.pl> wrote:

> "-0777" flag is the usual way to read the whole file at once (instead
> of line by line) in one-liners.
>
> I feel this isn't ideal. "-0" is a bad flag. It's overly general,
> users rarely need $/ to be set to anything other than undef or "\n".
> Also, the input record separator has to be specified as an octal
> number, which is weird. The fact that the numbers above 0o377 are
> special-cased to mean "undef" makes it even more confusing.
>
> Slurping is an extremely common operation and it deserves its own
> one-letter flag. I propose "-g" (mnemonics: gobble, grab, gulp). I wish
> it could be "-s", but sadly it's already taken :(

This RFC is now accepted as RFC 0011

https://github.com/Perl/RFCs/blob/master/rfcs/rfc0011.md

We eagerly await an implementation.

--
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk | https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/ | https://www.tindie.com/stores/leonerd/