Mailing List Archive

Re: Re: Chat about Wikipedia performance?
Nick Reinking wrote:

>* Converts < > and & inside <nowiki>

& shouldn't be treated specially inside <nowiki>.
Character entities aren't wiki markup;
they're bona fide HTML character entities.
(And if you don't buy that argument --
it will change the behaviour of several pages,
not least [[en:Wikipedia:How to edit a page]].)
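
To make the point concrete, here is a minimal sketch in Python (not the
actual Wikipedia code) of a <nowiki> handler that escapes < and > but
leaves & alone, so existing character entities keep working. The name
escape_nowiki and the regular expression are assumptions made for
illustration only.

```python
import re

# Hypothetical pattern: matches one <nowiki>...</nowiki> region, even
# when it spans several lines.
NOWIKI = re.compile(r"<nowiki>(.*?)</nowiki>", re.DOTALL)


def escape_nowiki(text):
    """Escape < and > inside <nowiki> regions; leave & untouched."""
    def repl(match):
        inner = match.group(1)
        # '&' is deliberately not escaped, so '&amp;' stays an entity.
        return inner.replace("<", "&lt;").replace(">", "&gt;")
    return NOWIKI.sub(repl, text)
```

A real parser would also have to shield the region from later wiki
passes (the '' markers, links, and so on); this sketch does not.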


-- Toby
Re: Re: Chat about Wikipedia performance?
> (Nick Reinking <nick@twoevils.org>):
>
> As far as the \n closing, I can change it to behave like that, although
> it seems clearer to me if it continues to span lines (ala HTML). It
> doesn't bother me either way (other than a bit more coding). Do a lot
> of people not close their ''/'''/'''''s?

Wiki syntax is a line-based syntax. There is /no/ wiki markup that
spans lines. It makes editing much simpler: if you make a mistake and
forget to close something, it gets closed off quickly. HTML is not
designed to be human-editable; wiki syntax is.

> This is how it works, based on what seems to be represented on
> http://www.wikipedia.org/wiki/Wikipedia%3AHow_to_edit_a_page
>
> '' ... '' <---- <em> ... </em>
> ''' ... ''' <---- <strong> ... </strong>
> ''''' ... ''''' <---- <strong><em> ... </em></strong>

Sure, those are the easy cases. The code version before this one did
those right, but screwed up on other cases (like ''a'''b'''c'').
Like I said, half the battle here will be defining exactly what
/should/ be done in all cases.
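
One possible definition, as a rough sketch in Python rather than the
real parser: treat each run of quotes as a toggle, keep a stack of open
tags so the output stays properly nested, and close whatever is still
open at the end of the line. The tag mapping comes from the how-to
page; the names render_line, close_tag, MARKERS and SPLITTER, and the
handling of odd run lengths, are assumptions of this sketch.

```python
import re

# Tags toggled by each run of apostrophes, outermost first.
MARKERS = {"''": ["em"], "'''": ["strong"], "'''''": ["strong", "em"]}
# Runs of 5, 3 or 2 quotes; any leftover quote stays literal (assumed).
SPLITTER = re.compile(r"('{5}|'{3}|'{2})")


def close_tag(tag, open_tags, out):
    """Close `tag`, closing and then reopening anything nested inside it."""
    idx = open_tags.index(tag)
    inner = open_tags[idx + 1:]
    for t in reversed(open_tags[idx:]):
        out.append(f"</{t}>")
    del open_tags[idx:]
    for t in inner:
        out.append(f"<{t}>")
        open_tags.append(t)


def render_line(line):
    """Convert quote markers on one line; close leftovers at line end."""
    out, open_tags = [], []
    for i, piece in enumerate(SPLITTER.split(line)):
        if i % 2 == 0:                  # even pieces are plain text
            out.append(piece)
            continue
        tags = MARKERS[piece]
        to_close = [t for t in open_tags if t in tags]
        to_open = [t for t in tags if t not in open_tags]
        for t in reversed(to_close):    # innermost first
            close_tag(t, open_tags, out)
        for t in to_open:
            out.append(f"<{t}>")
            open_tags.append(t)
    for t in reversed(open_tags):       # implicit close at end of line
        out.append(f"</{t}>")
    return "".join(out)
```

With this definition, render_line("''a'''b'''c''") gives
"<em>a<strong>b</strong>c</em>", and an unclosed "'''bold" comes out as
"<strong>bold</strong>". Escaping of the surrounding text is left out.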

--
Lee Daniel Crocker <lee@piclab.com> <http://www.piclab.com/lee/>
"All inventions or works of authorship original to me, herein and past,
are placed irrevocably in the public domain, and may be used or modified
for any purpose, without permission, attribution, or notification."--LDC
Re: Re: Chat about Wikipedia performance?
On Wed, 7 May 2003, Lee Daniel Crocker wrote:

> Date: Wed, 7 May 2003 23:56:54 -0500
> From: Lee Daniel Crocker <lee@piclab.com>
> Subject: Re: [Wikitech-l] Re: Chat about Wikipedia performance?
>
> > (Nick Reinking <nick@twoevils.org>):
> >
> > As far as the \n closing, I can change it to behave like that, although
> > it seems clearer to me if it continues to span lines (ala HTML). It
> > doesn't bother me either way (other than a bit more coding). Do a lot
> > of people not close their ''/'''/'''''s?
>
> Wiki syntax is a line-based syntax. There is /no/ wiki markup that
> spans lines. It makes editing much simpler: if you make a mistake and
> forget to close something, it gets closed off quickly. HTML is not
> designed to be human-editable; wiki syntax is.

Not that that's not enough, but of course, lines (that don't continue on
the next one without a *, :, etc.) are also enclosed by (unclosed! grr)
<p> tags, and you can't put a <em> in the middle of a <p> and close it
after the next <p> and call it good HTML.

--
John R. Owens http://www.ghiapet.homeip.net/
Life's full of mysteries. Consider this one of them.
--Commander Jeffrey Sinclair
Re: Re: Chat about Wikipedia performance?
On Thu, May 08, 2003 at 01:07:19AM -0500, John R. Owens wrote:
> > > As far as the \n closing, I can change it to behave like that, although
> > > it seems clearer to me if it continues to span lines (ala HTML). It
> > > doesn't bother me either way (other than a bit more coding). Do a lot
> > > of people not close their ''/'''/'''''s?
> >
> > Wiki syntax is a line-based syntax. There is /no/ wiki markup that
> > spans lines. It makes editing much simpler: if you make a mistake and
> > forget to close something, it gets closed off quickly. HTML is not
> > designed to be human-editable; wiki syntax is.
>
> Not that that's not enough, but of course, lines (that don't continue on
> the next one without a *, :, etc.) are also enclosed by (unclosed! grr)
> <p> tags, and you can't put a <em> in the middle of a <p> and close it
> after the next <p> and call it good HTML.

A couple things... I guess I don't see it as being too difficult or too
complicated for users to understand that you need to enclose text you
want to place emphasis on with '''. I know that stopping at newlines
prevents an entire article from being emphasized, but it also prevents
users from having large sections of emphasized text (without putting
emphasis marks on every line). The Howto page certainly makes it look
like you have to enclose your text, and in all the pages I've edited,
I've never come across a page that leaves open emphasis marks.

Also, there are lots of line spanning constructs in wikitext. <pre>,
<nowiki>, <tr>, <td>, etc. Now, I _know_ that (with the exception of
<nowiki>) these are HTML constructs (and not Wikitext constructs),
but to the average user, they are exactly the same. So, some things
span multiple lines, and some things don't. I think that is confusing.

--
Nick Reinking -- eschewing obfuscation since 1981 -- Minneapolis, MN
Re: Re: Chat about Wikipedia performance?
On Wed, May 07, 2003 at 09:31:29PM -0700, Toby Bartels wrote:
> Nick Reinking wrote:
>
> >* Converts < > and & inside <nowiki>

Whoops, sorry. I misinterpreted what I saw on the howto page. It's
been fixed. :)

--
Nick Reinking -- eschewing obfuscation since 1981 -- Minneapolis, MN
Re: Re: Chat about Wikipedia performance?
> > > Wiki syntax is a line-based syntax. There is /no/ wiki markup that
> > > spans lines. It makes editing much simpler: if you make a mistake and
> > > forget to close something, it gets closed off quickly. HTML is not
> > > designed to be human-editable; wiki syntax is.
> >
> > Not that that's not enough, but of course, lines (that don't continue on
> > the next one without a *, :, etc.) are also enclosed by (unclosed! grr)
> > <p> tags, and you can't put a <em> in the middle of a <p> and close it
> > after the next <p> and call it good HTML.
>
> A couple things... I guess I don't see it as being too difficult or too
> complicated for users to understand that you need to enclose text you
> want to place emphasis on with '''. I know that stopping at newlines
> prevents an entire article from being emphasized, but it also prevents
> users from having large sections of emphasized text (without putting
> emphasis marks on every line). The Howto page certainly makes it look
> like you have to enclose your text, and in all the pages I've edited,
> I've never come across a page that leaves open emphasis marks.
>
> Also, there are lots of line spanning constructs in wikitext. <pre>,
> <nowiki>, <tr>, <td>, etc. Now, I _know_ that (with the exception of
> <nowiki>) these are HTML constructs (and not Wikitext constructs),
> but to the average user, they are exactly the same. So, some things
> span multiple lines, and some things don't. I think that is confusing.

I guess I'd like to clarify one thing. I don't want to sound pushy, or
"not a team player", or somebody who is just jumping in and disrupting
the good work that everybody else is doing. And you all are doing a
great job, BTW. ;)

Anyways, what I want to clarify. To me, when my mind encounters '' or
''', it thinks, "Ooo! Quotation marks that make stuff bold!". My mind
is used to closing quotation marks, so I guess that's why it makes the
most sense for them to span multiple lines until they are closed. I'm
not sure how quotes work in a lot of other languages, but I know they're
closed in Japanese much the same way (but with different symbols than
quotation marks).

That's not to say that stopping them at the end of a line is a bad idea
- I'm sure it helps a lot to prevent new users from making a bad
mistake, seeing a messed up page, and giving up.

In the end, I'm writing a new parser for Wikipedia, not for myself. If
everybody thinks it should end at newlines, I can make it do that, and
that will be that. :)

--
Nick Reinking -- eschewing obfuscation since 1981 -- Minneapolis, MN
Re: Re: Chat about Wikipedia performance?
On Thu, 2003-05-08 at 06:58, Nick Reinking wrote:
> The Howto page certainly makes it look
> like you have to enclose your text, and in all the pages I've edited,
> I've never come across a page that leaves open emphasis marks.

That's right. If you don't have both close and open marks, they're left
as literal '' and ''' sequences.
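
A rough Python illustration of that pairing behaviour, assuming a
simple two-pass substitution (this is a guess at the general approach,
not a copy of the existing code; render_paired is a made-up name):

```python
import re


def render_paired(line):
    """Replace only complete pairs; unmatched markers stay literal."""
    # Bold first, so ''' is not consumed as '' plus a stray quote.
    line = re.sub(r"'''(.+?)'''", r"<strong>\1</strong>", line)
    line = re.sub(r"''(.+?)''", r"<em>\1</em>", line)
    return line
```

Here render_paired("''open only") returns the input unchanged, while
"'''b''' and ''i''" becomes "<strong>b</strong> and <em>i</em>". A
substitution this naive can misnest tags: "'''''x'''''" comes out as
"<strong><em>x</strong></em>".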

> Also, there are lots of line spanning constructs in wikitext. <pre>,
> <nowiki>, <tr>, <td>, etc. Now, I _know_ that (with the exception of
> <nowiki>) these are HTML constructs (and not Wikitext constructs),
> but to the average user, they are exactly the same. So, some things
> span multiple lines, and some things don't. I think that is confusing.

Yes, that's why we should destroy the pseudo-HTML and make things
consistent and happy. :)

-- brion vibber (brion @ pobox.com)
Re: Re: Chat about Wikipedia performance?
> (Nick Reinking <nick@twoevils.org>):
>
> Also, there are lots of line spanning constructs in wikitext. <pre>,
> <nowiki>, <tr>, <td>, etc. Now, I _know_ that (with the exception of
> <nowiki>) these are HTML constructs (and not Wikitext constructs),
> but to the average user, they are exactly the same. So, some things
> span multiple lines, and some things don't. I think that is confusing.

HTML things span lines; most of those we can eventually eliminate.
We'll probably always be stuck with <nowiki>, but all of the others
you mention above are totally unnecessary. And there will be a much
better way to emphasize whole paragraphs using style elements.

Wiki syntax is an evolving language, but I'm determined to make it
a clean, well-specified, useful, powerful, and consistent one instead
of the current hodgepodge.

--
Lee Daniel Crocker <lee@piclab.com> <http://www.piclab.com/lee/>
"All inventions or works of authorship original to me, herein and past,
are placed irrevocably in the public domain, and may be used or modified
for any purpose, without permission, attribution, or notification."--LDC
Re: Re: Chat about Wikipedia performance?
> (Nick Reinking <nick@twoevils.org>):
>
> Anyways, what I want to clarify. To me, when my mind encounters '' or
> ''', it thinks, "Ooo! Quotation marks that make stuff bold!". My mind
> is used to closing quotation marks, so I guess that's why it makes the
> most sense for them to span multiple lines until they are closed. I'm
> not sure how quotes work in a lot of other languages, but I know they're
> closed in Japanese much the same way (but with different symbols than
> quotation marks).
>
> That's not to say that stopping them at the end of a line is a bad idea
> - I'm sure it helps a lot to prevent new users from making a bad
> mistake, seeing a messed up page, and giving up.
>
> In the end, I'm writing a new parser for Wikipedia, not for myself. If
> everybody thinks it should end at newlines, I can make it do that, and
> that will be that. :)

There are other reasons to kill them at line-ends. Primarily, the
block-level elements like lists and <p>s and <pre>s are defined by the
first character of each line; allowing '' to span lines would require
that we close and re-open them at paragraph boundaries to stay valid
HTML, and that's complicated and error-prone. Second, just /defining/
proper behavior requires specifying some maximum scope; otherwise,
things like '' a ''' b '' c ''' d '' ... will just stack up without
closing. If we define them to close at line-end (and I'd further
define them to stack at most two levels), then they're easier to
cleanly specify.
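
The first-character rule can be sketched as below (Python, purely
illustrative). The prefix table is an assumption based on the usual
wiki conventions (* and # for lists, : for indents, a leading space for
preformatted text), and classify_line is a hypothetical name.

```python
# Assumed mapping from a line's first character to its block type.
BLOCK_PREFIXES = {
    "*": "ul-item",
    "#": "ol-item",
    ":": "indent",
    " ": "pre",
}


def classify_line(line):
    """Return (block_kind, content) based only on the first character."""
    if not line.strip():
        return ("blank", "")
    kind = BLOCK_PREFIXES.get(line[0], "paragraph")
    # Paragraph lines keep all their text; other kinds drop the prefix.
    content = line if kind == "paragraph" else line[1:]
    return (kind, content)
```

Because every line is classified on its own, an inline marker that ran
past the end of a line could cross from one block into another, which
is where the close-and-reopen bookkeeping would come in.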

--
Lee Daniel Crocker <lee@piclab.com> <http://www.piclab.com/lee/>
"All inventions or works of authorship original to me, herein and past,
are placed irrevocably in the public domain, and may be used or modified
for any purpose, without permission, attribution, or notification."--LDC
Re: Re: Chat about Wikipedia performance?
On Thu, May 08, 2003 at 01:08:58PM -0500, Lee Daniel Crocker wrote:
> > (Nick Reinking <nick@twoevils.org>):
> >
> > Anyways, what I want to clarify. To me, when my mind encounters '' or
> > ''', it thinks, "Ooo! Quotation marks that make stuff bold!". My mind
> > is used to closing quotation marks, so I guess that's why it makes the
> > most sense for them to span multiple lines until they are closed. I'm
> > not sure how quotes work in a lot of other languages, but I know they're
> > closed in Japanese much the same way (but with different symbols than
> > quotation marks).
> >
> > That's not to say that stopping them at the end of a line is a bad idea
> > - I'm sure it helps a lot to prevent new users from making a bad
> > mistake, seeing a messed up page, and giving up.
> >
> > In the end, I'm writing a new parser for Wikipedia, not for myself. If
> > everybody thinks it should end at newlines, I can make it do that, and
> > that will be that. :)
>
> There are other reasons to kill them at line-ends. Primarily, the
> block-level elements like lists and <p>s and <pre>s are defined by the
> first character of each line; allowing '' to span lines would require
> that we close and re-open them at paragraph boundaries to stay valid
> HTML, and that's complicated and error-prone. Second, just /defining/
> proper behavior requires specifying some maximum scope; otherwise,
> things like '' a ''' b '' c ''' d '' ... will just stack up without
> closing. If we define them to close at line-end (and I'd further
> define them to stack at most two levels), then they're easier to
> cleanly specify.

So, should I go ahead implementing the C parser to handle the current
Wikitext, or should I wait until we have an actual specification?

--
Nick Reinking -- eschewing obfuscation since 1981 -- Minneapolis, MN
Re: Re: Chat about Wikipedia performance?
> (Nick Reinking <nick@twoevils.org>):
>
> So, should I go ahead implementing the C parser to handle the current
> Wikitext, or should I wait until we have an actual specification?

I'm generally in the code-first ask-questions-later school, but since
the act of coding brings up questions, we might as well try to deal
with them when they come up.

--
Lee Daniel Crocker <lee@piclab.com> <http://www.piclab.com/lee/>
"All inventions or works of authorship original to me, herein and past,
are placed irrevocably in the public domain, and may be used or modified
for any purpose, without permission, attribution, or notification."--LDC
Re: Re: Chat about Wikipedia performance?
> > > Wiki syntax is a line-based syntax. There is /no/ wiki markup that
> > > spans lines. It makes editing much simpler: if you make a mistake and
> > > forget to close something, it gets closed off quickly. HTML is not
> > > designed to be human-editable; wiki syntax is.

I'm having a bit of trouble implementing the C parser because the
Wikitext parser has a lot of quirks. For example, you say that no wiki
markup spans lines, but if you take a look at:
http://www.wikipedia.org/wiki/User:Marumari/Wikitext_Rendering_Quirks
you can see that headers do span lines.

--
Nick Reinking -- eschewing obfuscation since 1981 -- Minneapolis, MN
Re: Re: Chat about Wikipedia performance?
> (Nick Reinking <nick@twoevils.org>):
> > > > Wiki syntax is a line-based syntax. There is /no/ wiki markup that
> > > > spans lines. It makes editing much simpler: if you make a mistake and
> > > > forget to close something, it gets closed off quickly. HTML is not
> > > > designed to be human-editable; wiki syntax is.
>
> I'm having a bit of trouble implementing the C parser because the
> Wikitext parser has a lot of quirks. For example, you say that no wiki
> markup spans lines, but if you take a look at:
> http://www.wikipedia.org/wiki/User:Marumari/Wikitext_Rendering_Quirks
> you can see that headers do span lines.

It's not easy, I agree. I wasn't aware that headers could span lines;
I'm not sure whether or not I like that. I don't think so offhand.
You'll also notice that I changed my mind about quotes--I think it's
probably true that users will be somewhat surprised by forcing quotes
to close on every line break, so in my long-range vision I closed
them only at paragraph end.

My suggestion to an implementor is this: if the current code has
quirks, it's largely because the expected behavior is undefined, so
don't be afraid to define it yourself--if your definition makes
sense, and doesn't screw up too many existing pages, it will likely
be adopted.

--
Lee Daniel Crocker <lee@piclab.com> <http://www.piclab.com/lee/>
"All inventions or works of authorship original to me, herein and past,
are placed irrevocably in the public domain, and may be used or modified
for any purpose, without permission, attribution, or notification."--LDC
