The following way to "trim" using "split" seems to provide a constant-time solution, independent of the length of the string. Although I don't know how "split" is implemented, this constancy is not surprising.
In fact, the filthy way I'm generating the strings very quickly comes to dominate the time it takes to run this.
# bench.sh
for NUM in $(seq 1 100); do
    STRING=$(perl -e "printf qq{ %s }, ' a b ' x $NUM")
    time perl x.pl "$STRING" 2>&1 | grep real
done
# x.pl
my $foo = $ARGV[0];
my $trimmed = (split /^\s*|\s*$/, $foo)[-1];
print qq{'$trimmed'\n}; # <- commenting out provides no benefit timewise
Excerpt of output ('real' bounces between 7ms and 16ms, which suggests a sensitivity to the macOS process scheduler itself and is even more indicative of the efficiency of this solution):
real 0m0.007s
user 0m0.002s
sys 0m0.004s
'a ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba b'
real 0m0.007s
user 0m0.002s
sys 0m0.003s
'a ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba b'
real 0m0.015s
user 0m0.003s
sys 0m0.006s
'a ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba b'
real 0m0.007s
user 0m0.002s
sys 0m0.003s
'a ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba b'
real 0m0.008s
user 0m0.002s
sys 0m0.004s
'a ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba ba b'
real 0m0.008s
user 0m0.002s
sys 0m0.003s
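
A minimal in-process sketch for separating the cost of the trim itself from interpreter startup, which the "time perl x.pl" numbers above also include. It compares the split-based trim from x.pl with the two-pass substitution ("case 1" in the quoted message below) at a few string lengths. The sub names, the seconds_per_call helper, the chosen lengths, and the iteration count are all assumptions made for illustration, not part of the original scripts, and no particular timings are claimed.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Trim via split, using the same expression as x.pl.
# Note: returns undef for an empty or all-whitespace string.
sub trim_split {
    my ($s) = @_;
    return (split /^\s*|\s*$/, $s)[-1];
}

# Two-pass substitution ("case 1" in the quoted message).
sub trim_two_pass {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

# Hypothetical helper: average wall-clock seconds per call over $iters calls.
sub seconds_per_call {
    my ($code, $arg, $iters) = @_;
    my $start = time;
    $code->($arg) for 1 .. $iters;
    return (time - $start) / $iters;
}

# Same string shape as bench.sh, but over a much wider range of lengths.
for my $num (1, 100, 10_000) {
    my $string = ' ' . (' a b ' x $num) . ' ';
    printf "%6d chars: split %.2e s/call, two-pass %.2e s/call\n",
        length $string,
        seconds_per_call(\&trim_split,    $string, 200),
        seconds_per_call(\&trim_two_pass, $string, 200);
}
```

If the per-call time grows with the string length here, that would suggest the flat 'real' figures above mostly reflect process startup rather than the trim.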
Cheers,
Brett
--
oodler@cpan.org
https://github.com/oodler577 #pdl #p5p #p7-dev #native @ irc.perl.org
------- Original Message -------
On Friday, May 28, 2021 5:52 PM, Joseph Brenner <doomvox@gmail.com> wrote:
> Some quick-and-dirty benchmarking, trimming 100,000 short strings:
>
> case 1:
> $line =~ s/^\s+//;
> $line =~ s/\s+$//;
>
> real 0m1.427s
>
> ==============
>
> case 2:
> $line =~ s/^\s*(.+?)\s*$/$1/;
>
> real 0m1.853s
>
> ==============
>
> case 3:
> $line =~ s/^\s*|\s*$//g;
>
> real 0m2.864s
>
> ==============
>
> So, case 2 is 30% slower, case 3 is 100% slower.
>
> There's a simple fix that improves case 3 quite a bit:
>
> case 4:
> $line =~ s/^\s+|\s+$//g;
>
> real 0m1.704s
>
> ==============
>
> However: I took it very easy on this case using short lines... it's
> very sensitive to line length (that /g is checking every point in the
> string) and it slows down by a factor of ten with lines that are only
> around 80 chars long.
>
> Anyway, these speed penalties are Not Good, but they're also not
> (usually) a reason to care.
> Granted I was exaggerating calling these hairy and
> unreadable, but I think they're all harder to read.
>
> (For example, with "case 3", my first thought was it was
> broken and wouldn't strip trailing whitespace if it
> had stripped leading whitespace, but then I noticed the /g.
> And further, it's using a * instead of a +, so without the /g
> it never strips trailing space: so there were two things
> I didn't understand.)
>
> The thing you should ask yourself as a perl programmer is
> "what did I think I would gain from doing this in one
> line?".
>
> The key point for the perl5-porters though is that there
> is indeed a need for a built-in trim.
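
The quoted remarks about "case 3" (the role of /g, and of * versus +) can be checked with a short sketch. It runs the four quoted substitutions, plus case 3 with the /g dropped, on two sample inputs. The %trim hash, its key names, and the sample strings are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Each sub applies one of the quoted substitutions to a copy of its argument.
my %trim = (
    case1 => sub { my $s = shift; $s =~ s/^\s+//; $s =~ s/\s+$//; $s },
    case2 => sub { my $s = shift; $s =~ s/^\s*(.+?)\s*$/$1/;       $s },
    case3 => sub { my $s = shift; $s =~ s/^\s*|\s*$//g;            $s },
    case4 => sub { my $s = shift; $s =~ s/^\s+|\s+$//g;            $s },
    # Case 3 with the /g dropped: ^\s* always matches at position 0
    # (possibly matching nothing), so the \s*$ branch is never reached.
    case3_no_g => sub { my $s = shift; $s =~ s/^\s*|\s*$//; $s },
);

# One input with leading and trailing whitespace, one with trailing only.
for my $input ('  a b  ', 'a b  ') {
    printf "%-10s '%s' -> '%s'\n", $_, $input, $trim{$_}->($input)
        for sort keys %trim;
}
```

Only the case3_no_g variant should leave trailing whitespace in place, which matches the explanation in the quoted message.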