Mailing List Archive

In regular expression, "minimalist" quantified subpattern (*?) can be greedy
Hello

I've found the following problem with the "minimalist" quantified subpattern.

With this line :
$ echo "_foo_bar-" |perl -ne '@test= (/_(\w*?)-/) ; print $test[0],"\n" ;'


I get :
foo_bar

But I'd think I should get only 'bar' since I asked for a minimum match.
(although it's a "backward" minimum match).

Note that this works when I ask for a "forward" minimum match :

$ echo "-foo_bar_" |perl -ne '@test= (/-(\w*?)_/) ; print $test[0],"\n" ;'
foo

Here's the output of myconfig :

Summary of my perl5 (patchlevel 1) configuration:
  Platform:
    osname=hpux, osver=9, archname=hpux
    uname='hp-ux marlis a.09.05 a 9000715 2010519172 two-user license '
    hint=recommended
  Compiler:
    cc='cc', optimize='+O2', ld='ld'
    cppflags='-D_POSIX_SOURCE -D_HPUX_SOURCE -Aa'
    ccflags ='-D_POSIX_SOURCE -D_HPUX_SOURCE -Aa'
    ldflags =''
    stdchar='unsigned char', d_stdstdio=define, usevfork=true
    voidflags=15, castflags=0, d_casti32=define, d_castneg=define
    intsize=4, alignbytes=8, usemymalloc=n, randbits=15
  Libraries:
    so=sl
    libpth=/lib/pa1.1 /lib /usr/lib /usr/local/lib
    libs=-lm -ldld
    libc=/lib/libc.sl
  Dynamic Linking:
    dlsrc=dl_hpux.xs, dlext=sl, d_dlsymun=undef
    cccdlflags='+z', ccdlflags='-Wl,-E ', lddlflags='-b'


Thanks for the wonderful work you've done on perl5.


--
-----------------------------------------------------------------------------
Name: Dominique Dumont
^^^^^^ Email: Dominique_Dumont@grenoble.hp.com
/ O O \ HP Desk: Dominique DUMONT / HP6300/UM
( \____/ ) Address : HEWLETT PACKARD, 38053 Grenoble Cedex 09 FRANCE
\______/ Tel,Telnet: (33) 76 62 57 24 - 7 779 5724
Telex,Fax: 980 124 - (33) 76 62 14 88
-----------------------------------------------------------------------------

"What is the sound of Perl? Is it not the sound of a wall that
people have stopped banging their heads against?"
--Larry Wall
Re: In regular expression, "minimalist" quantified subpattern (*?) can be greedy
>>>>> "Dominique" == Dominique Dumont <domi@ss7serv.grenoble.hp.com> writes:

Dominique> Hello
Dominique> I've found the following problem with the "minimalist" quantified subpattern.

Dominique> With this line :
Dominique> $ echo "_foo_bar-" |perl -ne '@test= (/_(\w*?)-/) ; print $test[0],"\n" ;'


Dominique> I get :
Dominique> foo_bar

Dominique> But I'd think I should get only 'bar' since I asked for a minimum match.
Dominique> (although it's a "backward" minimum match).

No. It gets what I expect it to get. It's not psychic... but perhaps
because I know the underlying shtick I know that I'm looking for the
leftmost underscore followed by the least number of \w characters that
will let me end at a dash. That's because there's a "first match left
to right" rule as well as the "greedy/stingy" rule. If you want a
"first match right to left", just throw a greedy .* in front, like:

/.*_(\w*?)-/

and that'll get exactly what you want.

Name: Randal L. Schwartz / Stonehenge Consulting Services (503)777-0095
Keywords: Perl training, UNIX[tm] consulting, video production, skiing, flying
Email: <merlyn@stonehenge.com> Snail: (Call) PGP-Key: (finger merlyn@ora.com)
Web: <A HREF="http://www.teleport.com/~merlyn/">My Home Page!</A>
Quote: "I'm telling you, if I could have five lines in my .sig, I would!" -- me