Hullo Folk!
Perhaps I am truly taxing perl's regular expression
capabilities.
I have discovered a bug (or at least a regexp limitation)
in perl 5.001 and perl 5.001m.
Perl dies with a segmentation fault under the isolated
conditions I detail below.
The current setup:
I have compiled perl5.001m with gcc's -g flag on a 68k machine
running NeXTstep 3.2.
Try this:
On a Unix machine, make a text file larger than 4K
that contains no ampersands or semicolons:
grasshopper> ls -al /dev > /tmp/capture;
Add one ampersand (&) to the beginning of this file,
/tmp/capture, with your favorite editor.
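If you would rather not edit a file by hand, an input with the same
properties can be generated. This is only a sketch: the helper name
make_capture and the filler lines are invented for illustration, and it is
an assumption (not something tested on the setup above) that synthetic text
triggers the fault the same way the ls output does.

```perl
#!/usr/bin/perl
# mkcapture.pl (hypothetical helper, not part of the original report)
# Writes a file that starts with exactly one ampersand and is followed
# by more than 4K of text containing no ampersands or semicolons.

sub make_capture {
    my ($file) = @_;
    my $i;
    open(OUT, "> $file") || die "cannot write $file: $!";
    print OUT "&";                       # the single leading ampersand
    for ($i = 0; $i < 200; $i++) {       # about 50 bytes per line
        print OUT "crw-rw-rw-  1 root  wheel  0, $i Jan  1 00:00 dummy$i\n";
    }
    close(OUT) || die "cannot close $file: $!";
}

# Default to the path used in this report; a different path may be
# given as the first command-line argument.
make_capture(@ARGV ? $ARGV[0] : "/tmp/capture");
```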
Then try this Perl script:
---cut here---
#!/b/penrose/albini/perl
#
# iso.pl (an isolation of an html parsing bug)
require 5.001;
@document = <>;
$document = join(' ', @document);
$document =~ s/&([^&;]|\w|\s)*;//g;
printf STDERR "Made it!!!\n\n";
exit;
---cut here---
Unfortunately, with the above file, /tmp/capture,
as input, the script never makes it to the end.
Here are the results:
grasshopper> cd ~/lord/perl/perl5.001m
grasshopper> gdb perl
(gdb) run -d ~/perl/traverse/DDT/iso.pl < /tmp/capture
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /homes/penrose/lord/perl/perl5.001m/perl -d
~/perl/traverse/DDT/iso.pl < /tmp/capture
Loading DB routines from $RCSfile: perl5db.pl,v $$Revision: 4.1
$$Date: 92/08/07 18:24:07 $
Emacs support available.
Enter h for help.
main::(/b/penrose/perl/traverse/DDT/iso.pl:3):
3: require 5.001;
DB<1> r
Program generated(1): Memory access exception on address 0x3f7ffd8
(invalid address).
Reading in symbols for regexec.c...done.
0x4242e in regmatch (prog=0x1c0f3d "\t") at regexec.c:570
570 {
(gdb)
If you remove the ampersand, the script shouldn't have any problems.
I haven't had the time to figure out regmatch() and why this
memory fault occurs.
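A possible workaround, offered as a sketch rather than a diagnosis: \w and
\s can never match an ampersand or a semicolon, so ([^&;]|\w|\s)* accepts
exactly the same strings as [^&;]*. The assumption here is that a starred
character class is matched by a simple loop instead of one nested
regmatch() call per character, so a long run of text with no closing
semicolon should not exhaust the stack. The name strip_entities is made up
for this example.

```perl
#!/usr/bin/perl
# iso2.pl (hypothetical variant of iso.pl, using a simpler pattern)

# strip_entities is an illustrative helper name, not part of iso.pl.
sub strip_entities {
    my ($text) = @_;
    # [^&;]* matches the same strings as ([^&;]|\w|\s)*, because
    # \w and \s are both subsets of [^&;].
    $text =~ s/&[^&;]*;//g;
    return $text;
}

# An unterminated ampersand followed by more than 4K of text,
# mirroring the shape of /tmp/capture.
$sample = "&" . ("x y_z\n" x 2000);
if (strip_entities($sample) eq $sample) {
    print STDERR "Made it!!!\n\n";
}
```

To use it in iso.pl, only the substitution line needs to change.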
Christopher Penrose
penrose@ucsd.edu
http://www-crca.ucsd.edu/TajMahal/after.html