Yes, I reported this to Larry on Friday, although I hadn't come
up with a minimal case yet. Using # comments in patterns can break them.
In short, changing

    \@

to

    \@ # oops

breaks it, but

    \@ (?# break)

will be ok.
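
To check whether a given perl is affected, a minimal sketch along these lines may help. The hash keys and the capture() helper are names of my own, and qr// assumes a perl newer than the one discussed here. On a perl without the bug, all three spellings should capture the same text.

```perl
use strict;
use warnings;

my $text = 'I sent mail to person@foo.bar.';

# Three spellings of the same /x pattern; only the comment style differs.
my %pattern = (
    plain  => qr/ ( \S+ \@ [^.\s]+ ) /x,
    hash   => qr/ ( \S+ \@   # oops
                    [^.\s]+ ) /x,
    inline => qr/ ( \S+ \@ (?# break) [^.\s]+ ) /x,
);

# Return the captured address fragment, or undef when the pattern fails.
sub capture {
    my ($re) = @_;
    my ($hit) = $text =~ $re;
    return $hit;
}

for my $name (sort keys %pattern) {
    my $hit = capture($pattern{$name});
    printf "%-6s %s\n", $name, defined $hit ? $hit : '(no match)';
}
```

If the hash line reports no match, or a different capture, while the other two agree, the # comment is the likely culprit.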
--tom
+-------------------------------------------------+
| In article <455cmu$8dd@csnews.cs.colorado.edu>, |
| Tom Christiansen <tchrist@mox.perl.com> wrote: |
| > |
| > $addr = 'I sent mail to person@foo.bar.'; |
| > $addr =~ /\S+\@[^.\s]+(\.[^.\s]+)*/; |
| ^^^^ ^^^^ |
+-------------------------------------------------+
+------------------------------------------------------------------+
| >But that seems like a lot of work compared to the first example |
| >I gave up there. |
+------------------------------------------------------------------+
+-------------------------+
| Not enough work, Tom :) |
+-------------------------+
+-----------------------------------------------------------------------+
| Trying to bring your regex into conformance with RFC 1035, I stumbled |
| into an annoying bug. |
+-----------------------------------------------------------------------+
+----------------------------+
| First the correct version: |
+----------------------------+
+---------------------------------------------------------------+
| % perl -le ' |
| $_= "First I got mail from person\@foo.bzz-gw4.bar; then from |
| tchrist\@perl-.com and lwall\@5perl"; |
| s{ |
| ( |
| \S* # username |
| \@ |
| [A-Za-z] # letter |
| ( |
| [\w\-]* # let-dig-hyp |
| \w # let-dig |
| )* |
| ( |
| \. |
| [A-Za-z] # letter |
| ( |
| [\w\-]* # let-dig-hyp |
| \w # let-dig |
| )* |
| )* |
| ) |
| } |
| {>>>$1<<<}gsx; |
| print; |
| ' |
| First I got mail from >>>person@foo.bzz-gw4.bar<<<; then from |
| >>>tchrist@perl<<<-.com and lwall@5perl |
+---------------------------------------------------------------+
+-----------------------------------------------+
| You may notice that I didn't comment the line |
+-----------------------------------------------+
+-------+
| \@ |
+-------+
+----------------------------------------------------------------------+
| Well, that's the bug report. Any comment after the @ sign breaks the |
| whole regex. |
+----------------------------------------------------------------------+
+------------------------------------------------------------------------+
| The problem isn't limited to the character '@'; any character seems to |
| show the same bug. |
+------------------------------------------------------------------------+
+---------------------------------------------------------------------+
| This is probably related to NETaa13992, but maybe not, because I DO |
| use the /x modifier. |
+---------------------------------------------------------------------+
+---------+
| andreas |
+---------+