This is an update to yesterday's post on URLs which are not
currently being parsed by SpamAssassin (SA) version 2.63.
Further cases:
6. MSN redirection service (g.msn.com)
Workaround for PerMsgStatus.pm:
$uri =~ s/^http:\/\/g.msn.com\/[^\*]+\?http\:(.*)$/http\:$1/g;
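To show what this substitution does, here is a minimal standalone sketch. The sub name strip_msn_redirect and the sample link are made up for illustration; only the s/// itself comes from the workaround.

```perl
use strict;
use warnings;

# Hypothetical helper name; the substitution is the one from the workaround.
sub strip_msn_redirect {
    my ($uri) = @_;
    # Drop everything up to the "?" and keep the target that starts at "http:".
    $uri =~ s/^http:\/\/g.msn.com\/[^\*]+\?http\:(.*)$/http\:$1/g;
    return $uri;
}

# Made-up redirect link; only the host and the "?http:" marker matter here.
my $redirect = 'http://g.msn.com/some/path?http://www.example.com/page.html';
print strip_msn_redirect($redirect), "\n";

# A URL that is not a g.msn.com redirect passes through unchanged.
print strip_msn_redirect('http://www.example.com/'), "\n";
```

Note that the pattern only fires when the redirect target itself begins with "http:", so other forms of g.msn.com links are left as they are.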
7. Use of HTML escape sequences (character entities) in the URL
http://toform.net/mcp/879/1352/cap112.html
To translate these into the equivalent ASCII characters,
I have used HTML::Entities rather than reinvent the wheel.
Workaround for PerMsgStatus.pm:
use HTML::Entities;
$uri = HTML::Entities::decode($uri);
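As a rough illustration of the decoding step, assuming HTML::Entities (from the HTML-Parser distribution) is installed: the sketch below uses the module's documented decode_entities function, and the obfuscated link is a made-up sample rather than one taken from real spam.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);

# Made-up obfuscated link: "." is written as &#46; and "t" as &#116;.
my $uri = 'http://www.example&#46;com/&#116;est.html';

# Called with its result assigned, decode_entities returns the decoded string.
my $decoded = decode_entities($uri);
print "$decoded\n";
```

With this sample the decoded link should read http://www.example.com/test.html, which is the form the URI tests need to see.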
Here is a cumulative diff containing the workarounds for these
and the previous cases. The diff is against PerMsgStatus.pm from
2.63, already patched with SpamCopUri 0.09.
Hopefully someone can include these
in version 3, and more elegantly.
diff PerMsgStatus.pm.orig PerMsgStatus.pm
----cut-------
45a47
> use HTML::Entities;
1777a1780,1789
> dbg("Got URI: $uri");
> $uri =~ s/\%68/h/g;
> $uri =~ s/\%74/t/g;
> $uri =~ s/\%70/p/g;
> $uri =~ s/http:\/([^\/])/http:\/\/$1/g;
> $uri =~ s/http:\/\/http:\/\//http:\/\//g;
> $uri =~ s/^http:\/\/(?:drs|rd).yahoo.com\/[^\*]+\*(.*)$/$1/g;
> $uri =~ s/^http:\/\/g.msn.com\/[^\*]+\?http\:(.*)$/http\:$1/g;
> $uri = HTML::Entities::decode($uri);
> dbg("URI after filter: $uri");
----cut-------
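For anyone who wants to try the filter outside SpamAssassin, here is a small test harness sketch. The sub name filter_uri and the sample URLs are hypothetical; the substitutions are copied from the diff, the dbg() calls are left out because they belong to SpamAssassin, and the decode step uses the documented decode_entities name.

```perl
use strict;
use warnings;
use HTML::Entities;    # exports decode_entities by default

# Hypothetical wrapper around the filter lines from the diff.
sub filter_uri {
    my ($uri) = @_;
    # Undo percent-encoding of the letters h, t and p only.
    $uri =~ s/\%68/h/g;
    $uri =~ s/\%74/t/g;
    $uri =~ s/\%70/p/g;
    # Repair "http:/host" and collapse a doubled "http://http://".
    $uri =~ s/http:\/([^\/])/http:\/\/$1/g;
    $uri =~ s/http:\/\/http:\/\//http:\/\//g;
    # Strip Yahoo and MSN redirectors, keeping the target URL.
    $uri =~ s/^http:\/\/(?:drs|rd).yahoo.com\/[^\*]+\*(.*)$/$1/g;
    $uri =~ s/^http:\/\/g.msn.com\/[^\*]+\?http\:(.*)$/http\:$1/g;
    # Translate HTML character entities last.
    $uri = decode_entities($uri);
    return $uri;
}

# Made-up inputs, one per obfuscation style.
my @samples = (
    '%68%74%74%70://www.example.com/',
    'http:/www.example.com/',
    'http://rd.yahoo.com/some/path/*http://www.example.com/',
    '&#104;ttp://www.example.com/',
);
print filter_uri($_), "\n" for @samples;
```

Each of the four samples should come out as http://www.example.com/. Feeding real spam URLs through the same sub is a quick way to spot obfuscations the filter still misses.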