On Sun, 2003-02-16 at 04:26, Tomasz Wegrzanowski wrote:
> Please change regular expressions used to match URLs so
> that final `)' isn't matched.
The trouble with ) is of course that the URLs of various pages on
Wikipedia (and perhaps elsewhere) _do_ end with a close-paren. However,
unless there is an open-paren in the URL, we can likely leave a trailing
close-paren out of the link. It should be possible to come up with a fun
regexp for that. :)
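A rough sketch of that idea, not the code that is in MediaWiki: the
function name splitTrailingParen() and the example URLs are made up for
illustration. It assumes the URL has already been matched by some looser
pattern, and it only handles a single trailing close-paren.

```php
<?php
// Hypothetical helper (not part of MediaWiki): split an already-matched URL
// into the part that should become the link and a trailing ")" that should
// stay outside it.
function splitTrailingParen($url) {
    // The pattern only matches when the URL contains no parentheses at all
    // except one final ")". A URL with "(" somewhere in it never matches,
    // so its close-paren stays part of the link.
    if (preg_match('/^([^()]*)\)\z/', $url, $m)) {
        return array($m[1], ')');
    }
    return array($url, '');
}

// Usage: the first element is the link target, the second is the text to
// emit right after the link.
list($link, $rest) = splitTrailingParen('http://www.example.com/foo)');
```

With this, "http://www.example.com/foo)" would be linked without the final
paren, while something like "http://www.example.com/wiki/Mercury_(planet)"
would be left whole.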
> Probably some other characters should be disallowed too,
> like comma and dot.
OutputPage::subReplaceExternalLinks():
# this is the list of separators that should be ignored if they
# are the last character of a URL but that should be included
# if they occur within the URL, e.g. "go to www.foo.com, where .."
# in this case, the last comma should not become part of the URL,
# but in "www.foo.com/123,2342,32.htm" it should.
$sep = ",;\.:";
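For illustration only, here is roughly how such a separator list could be
applied. This is an assumption about the approach, not the actual body of
subReplaceExternalLinks(); the helper name splitTrailingSeparators() is
invented.

```php
<?php
// Hypothetical helper (not the real OutputPage code): peel separator
// characters off the end of a matched URL, leaving any that occur inside it.
function splitTrailingSeparators($url, $sep = ',;\.:') {
    // $sep is dropped into a character class, so "\." there is a literal dot.
    // The lazy (.*?) takes as little as possible, and the second group then
    // takes the whole run of separators at the very end of the string.
    preg_match('/^(.*?)([' . $sep . ']*)\z/s', $url, $m);
    return array($m[1], $m[2]);
}

// Usage: the first element becomes the link, the second is plain text
// emitted after it.
list($link, $tail) = splitTrailingSeparators('www.foo.com,');
```

For "www.foo.com," this gives the link "www.foo.com" followed by a plain
",", while "www.foo.com/123,2342,32.htm" comes back unchanged because its
commas and dots are not at the end.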
-- brion vibber (brion @ pobox.com)