I've rewritten wikiPage::removeHTMLtags again. (Checked into CVS, diff:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/wikipedia/phpwiki/fpw/wikiPage.php.diff?r1=1.60&r2=1.61)
Exciting new features:
* Removes unwanted tag attributes, such as the scripting attributes
(onclick, onmouseout, etc.), which can be used to create fake links
or automatically redirect the browser to another web site (see the
previous version of the [[Goatse.cx]] article for an example)
* Makes a more serious attempt to fix mismatched open/close tag pairs.
Relatedly, makes some attempts at normalizing tables, e.g. <tr> is not
allowed outside of <table>, etc.
* Nested tables now work.
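
To illustrate the attribute filtering described in the first item, here is a minimal sketch in Python (not the PHP from wikiPage.php). The whitelist contents and the names ALLOWED_ATTRS and strip_unwanted_attributes are hypothetical, chosen only for the example; the real function's list of permitted attributes is not given in this message.

```python
import re

# Hypothetical whitelist; the attributes actually permitted by
# wikiPage::removeHTMLtags are not listed in this message.
ALLOWED_ATTRS = {"align", "border", "cellpadding", "cellspacing",
                 "colspan", "rowspan", "width"}

# Matches name="value", name='value', or name=value inside a tag.
ATTR_RE = re.compile(r'''([a-zA-Z][\w-]*)\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+)''')

# Matches an opening tag and captures its name and raw attribute text.
# Simplification: a '>' inside a quoted attribute value is not handled.
TAG_RE = re.compile(r'<([a-zA-Z][a-zA-Z0-9]*)\b([^>]*)>')


def strip_unwanted_attributes(html):
    """Rebuild each opening tag, keeping only whitelisted attributes."""
    def clean_tag(match):
        name, attr_text = match.group(1), match.group(2)
        kept = []
        for attr in ATTR_RE.finditer(attr_text):
            attr_name = attr.group(1).lower()
            if attr_name in ALLOWED_ATTRS:
                kept.append(f"{attr_name}={attr.group(2)}")
        return "<" + name + "".join(" " + a for a in kept) + ">"

    return TAG_RE.sub(clean_tag, html)


print(strip_unwanted_attributes(
    '<td align="left" onmouseout="location.href=\'http://example.com\'">x</td>'
))
```

Using a whitelist rather than a blacklist means that any scripting attribute not anticipated in advance is dropped by default.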
The function feels more weighty than it ought to be, but it works on
everything I've tried throwing at it so far, which is an improvement
over the previous versions.
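
The tag-pair repair and table normalization mentioned above could be approached with a stack of open tags. The following Python sketch shows the general idea only and is not a port of the actual PHP code; VOID_TAGS, REQUIRED_ANCESTOR and balance_tags are hypothetical names, and the ancestor check is deliberately simplified.

```python
import re

TAG_RE = re.compile(r'<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>')

# Tags that never take a closing tag (assumed subset for this sketch).
VOID_TAGS = {"br", "hr", "img"}

# Simplified rule: a tag is only allowed if this ancestor is currently open.
REQUIRED_ANCESTOR = {"tr": "table", "td": "tr", "th": "tr"}


def balance_tags(html):
    """Close unclosed tags, drop stray closers and misplaced table tags."""
    out, stack, pos = [], [], 0
    for match in TAG_RE.finditer(html):
        out.append(html[pos:match.start()])  # text before this tag
        pos = match.end()
        closing = match.group(1) == "/"
        name = match.group(2).lower()

        if not closing:
            ancestor = REQUIRED_ANCESTOR.get(name)
            if ancestor is not None and ancestor not in stack:
                continue  # e.g. <tr> outside of <table>: drop it
            if name not in VOID_TAGS:
                stack.append(name)
            out.append(match.group(0))
        elif name in stack:
            # Close anything left open inside this element, then the element.
            while True:
                top = stack.pop()
                out.append(f"</{top}>")
                if top == name:
                    break
        # Otherwise: a closing tag with no matching opener is dropped.

    out.append(html[pos:])
    while stack:  # close whatever is still open at the end
        out.append(f"</{stack.pop()}>")
    return "".join(out)


print(balance_tags("<b><i>text</b>"))
print(balance_tags("<table><tr><td>a"))
```

Because the stack records nesting depth, an inner <table> inside a <td> is tracked like any other element, which is one way nested tables can be made to work.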
I also threw in fixes for:
* Character entities in <pre> sections
* ISBN numbers with letters in them
* == Section headers == at the edges of HTML tags
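
On the ISBN item: an ISBN-10 may end in the check character X, so a digits-only pattern misses such numbers. The following Python sketch shows one way to match them; it is an assumption about the kind of fix involved, not the actual change, and ISBN_RE and find_isbns are hypothetical names.

```python
import re

# Digits and hyphens, ending in a digit or the check character X/x.
ISBN_RE = re.compile(r'ISBN\s+([0-9][0-9-]*[0-9Xx])\b')


def find_isbns(text):
    """Return the ISBNs found in text, with hyphens removed and X uppercased."""
    return [m.group(1).replace("-", "").upper() for m in ISBN_RE.finditer(text)]


print(find_isbns("See ISBN 0-8044-2957-X and ISBN 0123456789."))
```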
-- brion vibber (brion @ pobox.com)