Mailing List Archive

Question about current code
I want to add support for TeX mode to the Wikipedia script.

I wrote a program which takes TeX code on stdin,
validates and standardizes it (so that "x+y" and "x + y" don't have to be
generated twice), and even has some extensions ("%" -> "\%"),
checks if the given image already exists, and if it doesn't, passes it to latex,
dvips and convert to get a nice antialiased png file.
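
A minimal sketch of such a pipeline, in Python, might look like the
following. Everything named here is an assumption for illustration and not
the actual program: the function names, the cache directory, the MD5-based
file naming, and the exact command lines. The whitespace rule is deliberately
conservative and does not cover every TeX construct.

```python
import hashlib
import os
import re

CACHE_DIR = "/tmp/math-cache"  # hypothetical directory for rendered images


def normalize_tex(tex):
    """Standardize TeX so equivalent inputs share one image (crude sketch)."""
    # Collapse every run of whitespace into a single space.
    tex = re.sub(r"\s+", " ", tex.strip())
    # Drop a space unless it separates two letters (as in "\alpha x")
    # or follows a backslash (the TeX control space "\ ").
    tex = re.sub(r"(?<![A-Za-z\\]) |(?<!\\) (?![A-Za-z])", "", tex)
    # Extension mentioned in the post: a bare "%" becomes "\%".
    return re.sub(r"(?<!\\)%", r"\\%", tex)


def image_path(tex):
    """Name the image after a hash of the normalized source (assumed scheme)."""
    digest = hashlib.md5(normalize_tex(tex).encode("utf-8")).hexdigest()
    return os.path.join(CACHE_DIR, digest + ".png")


def needs_render(tex):
    """True when no image for this formula exists yet."""
    return not os.path.exists(image_path(tex))


def render_commands(tex_file, png_file):
    """Commands for the latex -> dvips -> convert chain; flags are assumptions."""
    base = os.path.splitext(tex_file)[0]
    return [
        ["latex", "-interaction=batchmode", tex_file],
        ["dvips", "-E", "-o", base + ".ps", base + ".dvi"],
        ["convert", "-density", "120", base + ".ps", png_file],
    ]
```

With this scheme "x+y" and "x + y" map to the same file name, so the
expensive commands only need to run when needs_render() is true.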

How should I integrate it with the Wikipedia script?

When rendering, the Wikipedia script should find all <math>.*</math> in the
markup, call this program (which is very fast if the images are already
rendered), and use the info it provides to render the page. I'm not even sure
what that program should return.

There are 3 possibilities:
* illegal markup - <tt>Illegal markup: ^$%^*^%$%(^$%(^$^$%^$</tt>
* markup validated ok, latex runs correctly - <img src="/path/345456986858674.png">
* markup validated ok, latex failed - ???

The last case shouldn't happen too often, but if we always wait for latex
to finish, that will unnecessarily increase latency.
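
The rendering step and the three outcomes could be sketched roughly as
below. The names validate and lookup_image are hypothetical callbacks
standing in for the external program, and the fallback used when latex
fails (showing the escaped source) is only one possible answer to the
open question above.

```python
import html
import re

# Non-greedy match so two formulas on one page are not merged into one.
MATH_RE = re.compile(r"<math>(.*?)</math>", re.DOTALL)


def render_math(tex, validate, lookup_image):
    """Return the HTML fragment for one formula.

    validate(tex) -> bool and lookup_image(tex) -> path or None are
    hypothetical stand-ins for the external TeX program.
    """
    if not validate(tex):
        # Case 1: illegal markup.
        return "<tt>Illegal markup: " + html.escape(tex) + "</tt>"
    path = lookup_image(tex)
    if path is None:
        # Case 3: valid markup but no image (assumed fallback: show source).
        return "<tt>" + html.escape(tex) + "</tt>"
    # Case 2: valid markup and a rendered image exists.
    return '<img src="%s" alt="%s">' % (
        html.escape(path, quote=True),
        html.escape(tex, quote=True),
    )


def render_page(markup, validate, lookup_image):
    """Replace every <math>...</math> span in the page markup."""
    return MATH_RE.sub(
        lambda m: render_math(m.group(1), validate, lookup_image), markup
    )
```

If lookup_image only checks for an existing file and queues the latex
run separately, the page never has to wait for latex to finish.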

I'm also thinking that some similar solution should be added for
chemical reactions.

Any opinions?
Re: Question about current code
On Wed, Nov 27, 2002 at 02:50:24AM +0100, Tomasz Wegrzanowski wrote:
>How to integrate it with Wikipedia script ?
>
>Wikipedia script when rendering should find all <math>.*</math> in markup,
>call this program (which is very fast if images are already rendered)
>and get info provided by it to render a page. I'm not even sure what should
>that program return.

An easier way to implement it is this: have a special page where you can
type in the TeX; when you hit "submit", it gets rendered and saved as
a png file under the name you have specified. It would look like, and be
analogous to, the image upload page. Then you could use the image
directly in your page, and in any other page that needs it, and it would
save re-parsing and re-generating the image every time someone views the
page.

Jonathan

--
Geek House Productions, Ltd.

Providing Unix & Internet Contracting and Consulting,
QA Testing, Technical Documentation, Systems Design & Implementation,
General Programming, E-commerce, Web & Mail Services since 1998

Phone: 604-435-1205
Email: djw@reactor-core.org
Webpage: http://reactor-core.org
Address: 2459 E 41st Ave, Vancouver, BC V5R2W2
Re: Question about current code
On Wed, Nov 27, 2002 at 01:13:11AM -0500, The Cunctator wrote:
>We would be best served by not duplicating the efforts of
>http://planetmath.org, a GFDL math encyclopedia.
>
>I recommend that Wikipedia either 1) not use Tex-graphics, and stick to
>what can be done with plain-jane wiki/HTML, or 2) integrate efforts with
>planetmath.org.

PlanetMath, alas, is not a wiki. Totally different dynamic. I would be
happy to see PlanetMath integrate with us.

Jonathan

Re: Question about current code
On Tue, 2002-11-26 at 22:12, Jonathan Walther wrote:
> On Wed, Nov 27, 2002 at 02:50:24AM +0100, Tomasz Wegrzanowski wrote:
> >How to integrate it with Wikipedia script ?
> >
> >Wikipedia script when rendering should find all <math>.*</math> in markup,
> >call this program (which is very fast if images are already rendered)
> >and get info provided by it to render a page. I'm not even sure what should
> >that program return.

We would be best served by not duplicating the efforts of
http://planetmath.org, a GFDL math encyclopedia.

I recommend that Wikipedia either 1) not use Tex-graphics, and stick to
what can be done with plain-jane wiki/HTML, or 2) integrate efforts with
planetmath.org.
Re: Question about current code
On Wed, 2002-11-27 at 07:13, The Cunctator wrote:

> We would be best served by not duplicating the efforts of
> http://planetmath.org, a GFDL math encyclopedia.
>
> I recommend that Wikipedia either 1) not use Tex-graphics, and stick to
> what can be done with plain-jane wiki/HTML, or 2) integrate efforts with
> planetmath.org.

Hi Cunc,

We should try to incorporate the PlanetMath material into Wikipedia, as
it's very much compatible with our mission statement. I for one welcome
Tomasz' hard work to bring more powerful math rendering features to
Wikipedia, regardless of what we do about PlanetMath.

Regards

Erik
--
FOKUS - Fraunhofer Institute for Open Communication Systems
Project BerliOS - http://www.berlios.de
Re: Question about current code
On Wed, Nov 27, 2002 at 01:13:11AM -0500, The Cunctator wrote:
> On Tue, 2002-11-26 at 22:12, Jonathan Walther wrote:
> > On Wed, Nov 27, 2002 at 02:50:24AM +0100, Tomasz Wegrzanowski wrote:
> > >How to integrate it with Wikipedia script ?
> > >
> > >Wikipedia script when rendering should find all <math>.*</math> in markup,
> > >call this program (which is very fast if images are already rendered)
> > >and get info provided by it to render a page. I'm not even sure what should
> > >that program return.
>
> We would be best served by not duplicating the efforts of
> http://planetmath.org, a GFDL math encyclopedia.
>
> I recommend that Wikipedia either 1) not use Tex-graphics, and stick to
> what can be done with plain-jane wiki/HTML, or 2) integrate efforts with
> planetmath.org.

Neither of these two solutions would work well, nor would "upload pngs".

First, math in HTML is way too ugly. Even things as simple as fractions
and integrals are impossible to render well.

Second, if we want to export Wikipedia to different formats, in particular dict,
the Wikipedia markup model will have to change from "parse the fragments you
understand, leave everything else" to "parse everything; if you don't understand
something, then it is not markup". This is a good step in that direction: contributors
will be able to use more powerful markup, and it will be renderable as plain text too.

Third, planetmath.org is a TeX-only, math-only project and not a wiki, so technically
it is very different, and we need special markup for other purposes like chemistry
too.

Oh, and fourth:
* <math>.*</math> is much easier for parsers to live with than [[math:.*]],
  as `]]' is much more likely to occur in equations than `</math>',
* it can easily be extended to different math syntaxes, <math syntax=foo></math>
  (TeX syntax is very good, but many people don't know it),
* `> <' look more like delimiters than [[math: ]],
* this isn't a link, it's markup.
So I prefer <math></math> to [[math: ]].
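
The delimiter argument can be illustrated with a small Python sketch. The
regular expressions, the function names and the default syntax name "tex"
are assumptions made for this example, not part of the Wikipedia script.

```python
import re

# Hypothetical patterns for the two candidate syntaxes.
BRACKET_RE = re.compile(r"\[\[math:(.*?)\]\]", re.DOTALL)
TAG_RE = re.compile(r"<math(?:\s+syntax=(\w+))?>(.*?)</math>", re.DOTALL)


def extract_bracketed(markup):
    """Formula bodies found with the [[math: ]] syntax."""
    return BRACKET_RE.findall(markup)


def extract_tagged(markup):
    """(syntax, body) pairs found with the <math> syntax; 'tex' is assumed."""
    return [(m.group(1) or "tex", m.group(2)) for m in TAG_RE.finditer(markup)]


formula = "a[b[0]] + 1"

# The first "]]" inside the formula ends the match too early.
print(extract_bracketed("[[math:" + formula + "]]"))

# The tag form survives, because "</math>" does not occur in the formula.
print(extract_tagged("<math>" + formula + "</math>"))

# An optional attribute selects a different input syntax.
print(extract_tagged("<math syntax=foo>x over y</math>"))
```

The bracket form would need escaping or bracket counting to handle such
formulas, while the tag form only breaks on a literal `</math>'.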
Re: Question about current code
On Wed, Nov 27, 2002 at 02:00:08PM +0100, Tomasz Wegrzanowski wrote:
>Neither of these two solutions would work well, nor would "upload pngs".

On further thought, I recommend a new namespace, TeX, and that the TeX
source for an equation goes in its own entry; every time the TeX is
updated, a new png is generated and the old png discarded, but the old
TeX is kept, the same way old versions of articles are kept.

This scheme is comparatively easy to code because it is an add-on
instead of involving changes to the parser itself, and it reduces cpu
load. Running TeX for every page with an equation on it could bring the
server to its knees.

It would be nice if the resulting png could be made a link to the
original TeX, so one could edit that directly.
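
A rough Python sketch of that scheme follows. The class name, the render
callback and the URL layout are all hypothetical; the point is only that
rendering happens once per edit, not once per page view.

```python
class TexEntry:
    """One entry in a hypothetical TeX namespace."""

    def __init__(self, title):
        self.title = title
        self.revisions = []   # every TeX source ever saved, oldest first
        self.png_name = None  # only the image for the latest revision

    def save(self, tex, render):
        """Store a new revision and regenerate the image at edit time.

        render(tex) -> file name is a hypothetical callback that runs TeX.
        """
        self.revisions.append(tex)
        # The previous image name is overwritten, i.e. the old png is dropped.
        self.png_name = render(tex)
        return self.png_name

    def image_link(self):
        """HTML showing the image as a link to the TeX source (assumed URLs)."""
        return '<a href="/wiki/TeX:%s?action=edit"><img src="/images/%s"></a>' % (
            self.title,
            self.png_name,
        )
```

Pages that use the equation would only emit image_link(), so viewing them
never runs TeX.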

Jonathan

Re: Question about current code
On Wed, 2002-11-27 at 08:00, Tomasz Wegrzanowski wrote:
> On Wed, Nov 27, 2002 at 01:13:11AM -0500, The Cunctator wrote:
> > On Tue, 2002-11-26 at 22:12, Jonathan Walther wrote:
> > > On Wed, Nov 27, 2002 at 02:50:24AM +0100, Tomasz Wegrzanowski wrote:
> > > >How to integrate it with Wikipedia script ?
> > > >
> > > >Wikipedia script when rendering should find all <math>.*</math> in markup,
> > > >call this program (which is very fast if images are already rendered)
> > > >and get info provided by it to render a page. I'm not even sure what should
> > > >that program return.
> >
> > We would be best served by not duplicating the efforts of
> > http://planetmath.org, a GFDL math encyclopedia.
> >
> > I recommend that Wikipedia either 1) not use Tex-graphics, and stick to
> > what can be done with plain-jane wiki/HTML, or 2) integrate efforts with
> > planetmath.org.
>
> Neither of these two solutions would work well, nor would "upload pngs".

I fail to see how the second method wouldn't work well; it's a philosophical
approach, not a particular method. All I'm saying is that we should
figure out ways to work with planetmath.org, not duplicate their effort.
Re: Question about current code
On Wed, Nov 27, 2002 at 11:49:28AM -0500, The Cunctator wrote:
> On Wed, 2002-11-27 at 08:00, Tomasz Wegrzanowski wrote:
> > On Wed, Nov 27, 2002 at 01:13:11AM -0500, The Cunctator wrote:
> > > I recommend that Wikipedia either 1) not use Tex-graphics, and stick to
> > > what can be done with plain-jane wiki/HTML, or 2) integrate efforts with
> > > planetmath.org.
> >
> > Neither of these two solutions would work well, nor would "upload pngs".
>
> I fail to see how the second method wouldn't work well-it's a philosophical
> approach, not a particular method. All I'm saying is that we should
> figure out ways to work with planetmath.org, not duplicate their effort.

It seems to me that our needs and theirs differ too much.
Some effort could be saved, but it's almost impossible that one
solution would fit both problems.
Re: Question about current code
Why don't we just implement a generic tag (or namespace, or whatever)
for generating images from some sort of text? Really, it shouldn't
matter what the source of the image looks like (TeX, HTML, some crazy
mix or something homespun).

The images can be generated by some sort of "plugin". The person
entering the code could choose from a list of possible plugins. This
way, we could have somebody who sees the value in using planetmath
write a planetmath parsing plugin, and other plugins can be written as
necessary.

If we tie ourselves to any one thing (like we did with MySQL), we're
bound to end up spending a lot of time rewriting it all...
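
As a sketch of what such a plugin mechanism could look like in Python: the
registry, the decorator and the plugin names "tex" and "chem" are all
hypothetical, and the plugin bodies are placeholders that return a label
instead of producing a real image.

```python
# Hypothetical registry mapping a plugin name to a rendering function.
RENDERERS = {}


def renderer(name):
    """Decorator that registers a function as the plugin called `name`."""
    def register(func):
        RENDERERS[name] = func
        return func
    return register


@renderer("tex")
def render_tex(source):
    # Placeholder: a real plugin would run TeX and return an image path.
    return "tex-image-for:" + source


@renderer("chem")
def render_chem(source):
    # Placeholder for a chemistry notation plugin.
    return "chem-image-for:" + source


def render(plugin, source):
    """Dispatch to the chosen plugin; unknown names are an error."""
    try:
        func = RENDERERS[plugin]
    except KeyError:
        raise ValueError("unknown plugin: %r" % plugin)
    return func(source)


def available_plugins():
    """Names to offer in the list the person entering the code picks from."""
    return sorted(RENDERERS)
```

Adding a new source format then means adding one registered function,
without touching the code that calls render().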

Jason

Tomasz Wegrzanowski wrote:

> On Wed, Nov 27, 2002 at 11:49:28AM -0500, The Cunctator wrote:
> > On Wed, 2002-11-27 at 08:00, Tomasz Wegrzanowski wrote:
> > > On Wed, Nov 27, 2002 at 01:13:11AM -0500, The Cunctator wrote:
> > > > I recommend that Wikipedia either 1) not use Tex-graphics, and stick to
> > > > what can be done with plain-jane wiki/HTML, or 2) integrate efforts with
> > > > planetmath.org.
> > >
> > > Neither of these two solutions would work well, nor would "upload pngs".
> >
> > I fail to see how the second method wouldn't work well-it's a philosophical
> > approach, not a particular method. All I'm saying is that we should
> > figure out ways to work with planetmath.org, not duplicate their effort.
>
> It seems to me that our needs and theirs differ too much.
> Some efford could be saved but it's almost impossible that one
> solution would fit both problems.
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@wikipedia.org
> http://www.wikipedia.org/mailman/listinfo/wikitech-l

--
"Jason C. Richey" <jasonr@bomis.com>