Mailing List Archive

Browsers
Lynx is a text mode browser:

http://lynx.browser.org/

For fancier stuff, look at Mozilla:

http://www.mozilla.org/

Also go to http://www.w3.org/, and look for both "amaya" and
"arena".

I believe that source is available for most, if not all, of these.

- Dave



Daniel Faulkner <m01ymu00@cwcom.net> wrote:
> Is there a basic browser somewhere that I can look at to see how it
> works? (not Grail)
> I can't understand much of the Python Internet software and don't
> understand how to parse the HTML once I've got it.

> Thanks

> Dan
Browsers
>Daniel Faulkner <m01ymu00@cwcom.net> wrote:
>> Is there a basic browser somewhere that I can look at to see how it
>> works? (not Grail)
>> I can't understand much of the Python Internet software and don't
>> understand how to parse the HTML once I've got it.

G. David Kuhlman writes:
>Lynx is a text mode browser:
> http://lynx.browser.org/
>For fancier stuff, look at Mozilla:
> http://www.mozilla.org/

Note, however, that an HTML parser capable of coping with all
the invalid HTML on the Web is a complicated beast. For example, Lynx
currently has an SGML-style parser that has been brain-damaged in
various ways to cope with invalid HTML. I don't know how much error
correction Grail includes, but it might actually be a simpler parser
if it hasn't been complicated with various error-recovery hacks.
Another good option might be to look at the test code in htmllib.py,
which does simple HTML-to-text formatting. (When trying to figure out
a module, always look in the module's code first, since authors will
often include simple examples or test scripts inside an
'if __name__ == "__main__":' block.)
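
To give a rough idea of what simple HTML-to-text formatting looks
like, here is a minimal sketch. It is not the htmllib.py test code:
it assumes a current Python 3, where htmllib is no longer available,
and uses html.parser.HTMLParser as a stand-in. TextFormatter and
html_to_text are made-up names for illustration, and the list of
block-level tags is an arbitrary choice. Note that it makes no real
attempt at error recovery; it just ignores tags it doesn't know.

```python
from html.parser import HTMLParser
import textwrap


class TextFormatter(HTMLParser):
    """Hypothetical helper: collect the readable text of an HTML page."""

    # Tags that start a new paragraph of output (an arbitrary selection).
    BLOCK_TAGS = {"p", "br", "div", "li", "tr", "title",
                  "h1", "h2", "h3", "h4", "h5", "h6"}
    # Tags whose contents are not meant to be read.
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = [[]]   # each paragraph is a list of words
        self.skip_depth = 0      # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.paragraphs.append([])
        # Any other tag (b, i, a, font, ...) is dropped; its text is kept.

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS:
            if self.skip_depth > 0:
                self.skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.paragraphs.append([])

    def handle_data(self, data):
        if not self.skip_depth:
            self.paragraphs[-1].extend(data.split())

    def text(self, width=70):
        """Return the collected paragraphs, wrapped to the given width."""
        filled = [textwrap.fill(" ".join(words), width)
                  for words in self.paragraphs if words]
        return "\n\n".join(filled)


def html_to_text(html, width=70):
    """Convert an HTML string to wrapped plain text."""
    formatter = TextFormatter()
    formatter.feed(html)
    formatter.close()
    return formatter.text(width)


# The kind of small self-test that module authors often leave in place.
SAMPLE = "<h1>Hello</h1><p>One <b>two<p>three &amp; four"

if __name__ == "__main__":
    print(html_to_text(SAMPLE))
```

Because unknown tags are skipped and unclosed ones are never checked,
sloppy markup such as the unclosed <b> and <p> in SAMPLE passes
through without complaint. That is a long way from the error recovery
a real browser needs, but it is enough to see how the pieces fit.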

--
A.M. Kuchling http://starship.python.net/crew/amk/
Time, place, and action may with pains be wrought, / But Genius must be born;
and can never be taught.
-- John Congreve
Browsers
I have just been messing around with some network modules, and I was
wondering how you would go about turning HTML into a basic picture of
the page: just get rid of the HTML tags, make some of the writing
bold, and include some spaces or center it; nothing too hard or
complicated. I have messed around with some of the parsers, but they
don't seem to make a huge difference. Any tips?

Dan