Mailing List Archive

Finally fixing ESCAPE_CHARS::std
It seems that a bug was introduced back in 2007 by the addition of
"\X" as an 'escape char' for HTML::Entities. When it was added, the
string was single-quoted, which is appropriate for "\X", but the
string was later changed back to double-quoting, which fixes the "\n"
and "\t" it also contains, but also breaks the "\X" and causes the
following warning:

unrecognized escape \X

I believe the following double-backslash for \X will fix this:

-$ESCAPE_CHARS::std = qq{^\n\t\X !\#\$%\'-;=?-Z\\\]-~};
+$ESCAPE_CHARS::std = qq{^\n\t\\X !\#\$%\'-;=?-Z\\\]-~};

It eliminates the warning, but I am not sure of the reasoning behind
using \X as an escape char, so before I push this patch, could somebody
please check me on this?
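
To make the difference between the three spellings concrete, here is a minimal sketch. It assumes only core Perl and uses a shortened prefix of the string (the variable names are made up for illustration, and the rest of the character list is left out):

```perl
use strict;
use warnings;

# Shortened prefix of the string from the patch; names are illustrative only.

# Single-quoted: every backslash is kept, so \n and \t are NOT newline and tab.
my $single = q{^\n\t\X};

# Double-quoted with a bare \X: Perl drops the backslash and keeps "X".
# The warning is silenced here only so the demo runs quietly.
my $broken = do { no warnings 'misc'; qq{^\n\t\X} };

# Double-quoted with \\X: real newline and tab, plus a literal backslash before X.
my $patched = qq{^\n\t\\X};

printf "single-quoted: %d characters\n", length $single;    # 7
printf "bare \\X:       %d characters\n", length $broken;   # 4
printf "patched:       %d characters\n", length $patched;   # 5
```

Only the patched form contains both a real newline and tab and a backslash in front of the X.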

Thanks,
Josh
--
Josh Lavin
Perusion -- Expert Interchange Consulting http://www.perusion.com/
... ask me about job opportunities ...

_______________________________________________
interchange-users mailing list
interchange-users@icdevgroup.org
http://www.icdevgroup.org/mailman/listinfo/interchange-users
Re: Finally fixing ESCAPE_CHARS::std (ATTN; Stefan)
On 05/22/2015 07:50 AM, Josh Lavin wrote:
> It seems that a bug was introduced back in 2007 by the addition of
> "\X" as an 'escape char' for HTML::Entities. When it was added, the
> string was single-quoted, which is appropriate for "\X", but the
> string was later changed back to double-quoting, which fixes the "\n"
> and "\t" it also contains, but also breaks the "\X" and causes the
> following warning:
>
> unrecognized escape \X
>
> I believe the following double-backslash for \X will fix this:
>
> -$ESCAPE_CHARS::std = qq{^\n\t\X !\#\$%\'-;=?-Z\\\]-~};
> +$ESCAPE_CHARS::std = qq{^\n\t\\X !\#\$%\'-;=?-Z\\\]-~};
>
> It eliminates the warning, but I am not sure of the reasoning behind
> using \X as an escape char, so before I push this patch, could somebody
> please check me on this?

It was commit #3f45ec14 by Stefan that added the \X and changed the
string from double quotes to single quotes. The git log references
ticket #58 from the RT system, so I would imagine there is much more
detail about the reasoning behind the changes in there. I can't seem to
find the old RT system to look it up anymore. Most of the tickets have
since been moved to the GitHub issue tracker, but unfortunately #58 has
not been.

Stefan, can you comment on the reason for the change, or perhaps dig up
the old RT entry for the ticket?


Peter

Re: Finally fixing ESCAPE_CHARS::std (ATTN; Stefan)
On Fri, 22 May 2015, Peter wrote:

> On 05/22/2015 07:50 AM, Josh Lavin wrote:
>> It seems that a bug was introduced back in 2007 by the addition of
>> "\X" as an 'escape char' for HTML::Entities. When it was added, the
>> string was single-quoted, which is appropriate for "\X", but the
>> string was later changed back to double-quoting, which fixes the "\n"
>> and "\t" it also contains, but also breaks the "\X" and causes the
>> following warning:
>>
>> unrecognized escape \X
>>
>> I believe the following double-backslash for \X will fix this:
>>
>> -$ESCAPE_CHARS::std = qq{^\n\t\X !\#\$%\'-;=?-Z\\\]-~};
>> +$ESCAPE_CHARS::std = qq{^\n\t\\X !\#\$%\'-;=?-Z\\\]-~};
>>
>> It eliminates the warning, but I am not sure of the reasoning behind
>> using \X as an escape char, so before I push this patch, could somebody
>> please check me on this?
>
> It was commit #3f45ec14 by Stefan that added the \X and changed the
> string from double quotes to single quotes. The git log references
> ticket #58 from the RT system, so I would imagine there is much more
> detail about the reasoning behind the changes in there. I can't seem to
> find the old RT system to look it up anymore. Most of the tickets have
> since been moved to the GitHub issue tracker, but unfortunately #58 has
> not been.
>
> Stefan, can you comment on the reason for the change, or perhaps dig up
> the old RT entry for the ticket?

This question seems to have gotten dropped, and the fix never got
committed.

A raw \X was fine while the string was single-quoted, but once the
string became double-quoted, \X turned into an invalid string escape. It
seems pretty clear it should be changed to \\X as Josh Lavin suggested.
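
One way to check this without touching the Interchange source is to compile both spellings at run time and collect the warnings. This is a rough sketch: compile_and_collect is a hypothetical helper written for this note, and the strings are shortened to the relevant prefix. Perl's own wording of the diagnostic is "Unrecognized escape \X passed through".

```perl
use strict;
use warnings;

# Hypothetical helper: compile a snippet of Perl source at run time and
# return its value together with any warnings emitted while compiling it.
sub compile_and_collect {
    my ($source) = @_;
    my @caught;
    local $SIG{__WARN__} = sub { push @caught, $_[0] };
    my $value = eval $source;
    die $@ if $@;
    return ($value, \@caught);
}

# Single-quoted heredocs keep every backslash exactly as typed.
my $before = <<'END';
qq{^\n\t\X}
END
my $after = <<'END';
qq{^\n\t\\X}
END

my ($old_value, $old_warnings) = compile_and_collect($before);
my ($new_value, $new_warnings) = compile_and_collect($after);

print "before: ", scalar(@$old_warnings), " warning(s)\n";
print "  $_" for @$old_warnings;
print "after:  ", scalar(@$new_warnings), " warning(s)\n";
```

The string eval is there only so that the warning is raised after the handler is installed. A literal in the file itself would warn at compile time, before any run-time handler exists.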

(For the record, \X in a regex is an atom that matches a Unicode
"eXtended grapheme cluster", as per the perlre manpage.)
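
As a small illustration of that atom in an ordinary regex, here is a sketch assuming only core Perl. It says nothing about how \X behaves once ESCAPE_CHARS::std is handed to HTML::Entities, which is a separate question.

```perl
use strict;
use warnings;

# "e" followed by COMBINING ACUTE ACCENT (U+0301), then "a":
# three code points, but only two user-perceived characters.
my $text = "e\x{301}a";

my @code_points = $text =~ /./gs;   # one element per code point
my @clusters    = $text =~ /\X/g;   # one element per extended grapheme cluster

printf "code points: %d\n", scalar @code_points;   # 3
printf "clusters:    %d\n", scalar @clusters;      # 2
```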

If nobody objects soon, I propose you just commit it, Josh.

Jon


--
Jon Jensen
End Point Corporation
https://www.endpoint.com/
