
HTML parser/reformatter/filter wanted
I'm looking for a piece of Python code that can filter HTML and tie up
unclosed tags.

By way of explanation: I'm building a bulletin-board-ish solution where
users should be able to post HTML-formatted submissions; I only want to
enable a subset of HTML (e.g., <b>, <i> and so forth), plus I want to make
sure text-level tags are ended correctly. I could sit down and write a
smaller HTML parser, but I don't want to reinvent wheels if I can just
grab this stuff off the shelf.

--
Alexander Staubo http://www.mop.no/~alex/
"He could open a tin of sardines with his teeth, strike a Swan Vestas
on his chin, rope steers, drive a steam locomotive and hum all the
works of Gilbert and Sullivan without becoming confused or breaking
down in tears."
--Robert Rankin, _The Book of Ultimate Truths_
HTML parser/reformatter/filter wanted
On Tue, Jul 13, 1999 at 09:57:29PM +0200, Alexander Staubo wrote:
>
> I'm looking for a piece of Python code that can filter HTML and tie up
> unclosed tags.
>
> By way of explanation: I'm building a bulletin-board-ish solution where
> users should be able to post HTML-formatted submissions; I only want to
> enable a subset of HTML (e.g., <b>, <i> and so forth), plus I want to make
> sure text-level tags are ended correctly. I could sit down and write a
> smaller HTML parser, but I don't want to reinvent wheels if I can just
> grab this stuff off the shelf.
>

greetings,

I'm not sure if it will do *exactly* what you want, but a good starting
point is probably 'htmllib.py' in the standard distribution. It will at
least give you something to build on.
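
As a rough illustration of the kind of filter being asked for, here is a
minimal sketch. It assumes a current Python, where 'htmllib' no longer
exists, so it uses html.parser.HTMLParser from the standard library as a
stand-in. The names ALLOWED_TAGS, TagFilter and filter_html are made up for
this example, and the whitelist itself is only a guess at what "a subset of
HTML" might mean here.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; adjust to whatever subset the board should allow.
ALLOWED_TAGS = {"b", "i", "em", "strong", "u"}


class TagFilter(HTMLParser):
    """Keep whitelisted tags (without attributes) and close unclosed ones."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # output fragments, joined at the end
        self.open_tags = []  # stack of whitelisted tags currently open

    def handle_starttag(self, tag, attrs):
        # Tags outside the whitelist are dropped; attributes are always dropped.
        if tag in ALLOWED_TAGS:
            self.out.append("<%s>" % tag)
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Ignore end tags that do not match anything currently open.
        if tag not in self.open_tags:
            return
        # Close everything opened after the matching tag, then the tag itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        # Text arrives with character references decoded, so re-escape it.
        self.out.append(escape(data, quote=False))

    def result(self):
        # Tie up whatever is still open, innermost tag first.
        closing = ["</%s>" % t for t in reversed(self.open_tags)]
        return "".join(self.out + closing)


def filter_html(text):
    parser = TagFilter()
    parser.feed(text)
    parser.close()
    return parser.result()


print(filter_html("<b>bold <i>both"))
```

Note that this only drops the tags it does not know; the text inside them is
kept. Anything that has to hold up against hostile input would need more care
than this sketch takes.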

regards,
J
--
|| visit gfd <http://quark.newimage.com/>
|| psa member #293 <http://www.python.org/>
|| New Image Systems & Services, Inc. <http://www.newimage.com/>