Mailing List Archive

mod_rewrite configuration
Check this out:

http://www.wikipedia.org//

I typo'd that earlier. Check it out. I'm sure the above is a fairly common
typo, so somebody might wanna fix the mod_rewrite rules to allow for it.
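One way to tolerate the doubled slash is to collapse runs of slashes in the request path before any rule is matched. The following is a minimal Python sketch of that idea, not an actual mod_rewrite rule; the function name is made up for illustration.

```python
import re


def collapse_slashes(path: str) -> str:
    """Collapse every run of consecutive slashes in a URL path into one."""
    return re.sub(r"/{2,}", "/", path)


print(collapse_slashes("//"))              # the path from the typo'd URL
print(collapse_slashes("/wiki//Main_Page"))
```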

Peace out,

Derek

Re: mod_rewrite configuration [ In reply to ]
On Wed, Dec 25, 2002 at 07:47:30PM -0600, Derek Moore wrote:
>Check this out:
>
>http://www.wikipedia.org//
>
>I typo'd that earlier. Check it out. I'm sure the above is a fairly
>common typo, so somebody might wanna fix the mod_rewrite rules to allow for
>it.

Speaking of which, I got my Wikipedia configuration ironed out; here it
is for anyone who is interested. Brion, could something like this be
incorporated into the INSTALL file? In mod_wiki I would like to alter
the design such that direct access to php or other scripts can be
abolished. That may require using POST for everything, and Javascript.
The handling of the "&" character throws gum in the works.

I had to "touch" the logfile before the Wikipedia software would run.

httpd.conf:

Alias /mywiki-style /usr/src/newcodebase/stylesheets
Alias /mywiki-upload /usr/src/newcodebase/images
Alias /mywiki-wiki.phtml /usr/src/newcodebase/wiki.phtml
Alias /mywiki-redirect.phtml /usr/src/newcodebase/redirect.phtml
RewriteEngine On
RewriteRule ^/mywiki$ /usr/src/newcodebase/wiki.phtml?title=
RewriteRule ^/mywiki/(.*)$ /usr/src/newcodebase/wiki.phtml?title=$1
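To see what the two RewriteRule lines do to an incoming path, here is a rough Python model of the pattern matching only. It is a sketch under the assumption that the first matching rule decides the result; it does not reproduce Apache's actual request processing, and the names RULES and rewrite are illustrative.

```python
import re

# The two RewriteRule patterns from the httpd.conf above, in order.
RULES = [
    (re.compile(r"^/mywiki$"), "/usr/src/newcodebase/wiki.phtml?title="),
    (re.compile(r"^/mywiki/(.*)$"), r"/usr/src/newcodebase/wiki.phtml?title=\1"),
]


def rewrite(path: str) -> str:
    """Return the rewritten target, or the path unchanged if no rule matches."""
    for pattern, target in RULES:
        match = pattern.match(path)
        if match:
            # expand() substitutes \1 with the captured article title.
            return match.expand(target)
    return path


print(rewrite("/mywiki"))
print(rewrite("/mywiki/Main_Page"))
print(rewrite("/mywiki-style/main.css"))  # left alone; the Alias handles it
```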

LocalSettings.php:

$wgServer = "http://reactor-core.org";
$wgStyleSheetPath = "$wgServer/mywiki-style";
$wgScript = "/mywiki-wiki.phtml";
$wgRedirectScript = "/mywiki-redirect.phtml";
$wgArticlePath = "$wgServer/mywiki/$1";
$wgUploadPath = "http://reactor-core.org/mywiki-upload";
$wgUploadDirectory = "/usr/src/newcodebase/images";
$wgLogo = "$wgUploadPath/wiki.png";
$wgDebugLogFile = "$wgUploadDirectory/logfile";
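The "$1" in $wgArticlePath acts as a placeholder for the article title. A small Python sketch of that substitution, using the values from the settings above; the function name is hypothetical and the real software may also escape the title first.

```python
WG_SERVER = "http://reactor-core.org"
WG_ARTICLE_PATH = WG_SERVER + "/mywiki/$1"


def article_url(title: str) -> str:
    """Fill the $1 placeholder of the article path with a page title."""
    return WG_ARTICLE_PATH.replace("$1", title)


print(article_url("Main_Page"))
```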

I hope this helps someone!

Jonathan

--
Geek House Productions, Ltd.

Providing Unix & Internet Contracting and Consulting,
QA Testing, Technical Documentation, Systems Design & Implementation,
General Programming, E-commerce, Web & Mail Services since 1998

Phone: 604-435-1205
Email: djw@reactor-core.org
Webpage: http://reactor-core.org
Address: 2459 E 41st Ave, Vancouver, BC V5R2W2
Re: mod_rewrite configuration [ In reply to ]
> On Wed, Dec 25, 2002 at 07:47:30PM -0600, Derek Moore wrote:
>> Check this out:
>>
>> http://www.wikipedia.org//
>>
>> I typo'd that earlier. Check it out. I'm sure the above is a fairly
>> common typo, so somebody might wanna fix the mod_rewrite rules to allow for
>> it.

> Speaking of which, I got my Wikipedia configuration ironed out; here it
> is for anyone who is interested. Brion, could something like this be
> incorporated into the INSTALL file?

Have you considered applying for developer status? It's fairly easy to get
and makes fixing things much easier.

Regards,

Erik
Re: mod_rewrite configuration [ In reply to ]
> On Wed, Dec 25, 2002 at 07:47:30PM -0600, Derek Moore wrote:
> >Check this out:
> >
> >http://www.wikipedia.org//
> >
> >I typo'd that earlier. Check it out. I'm sure the above is a fairly
> >common typo, so somebody might wanna fix the mod_rewrite rules to allow for
> >it.

I'll tweak the rewrite rules later; in the meantime I've replaced the
classic Apache welcome page with a nice simple redirect-to-main-wiki-
page.

On Wed, 2002-12-25 at 18:55, Jonathan Walther wrote:
> Speaking of which, I got my Wikipedia configuration ironed out; here it
> is for anyone who is interested. Brion, could something like this be
> incorporated into the INSTALL file? In mod_wiki I would like to alter
> the design such that direct access to php or other scripts can be
> abolished. That may require using POST for everything, and Javascript.
> The handling of the "&" character throws gum in the works.

Should be straightforward -- just use %26 wherever it represents content
and & where it's a magic query string component separator:

http://foobar/wiki/Beeswax_%26_Honey
http://foobar/wiki/Beeswax_%26_Honey?action=edit
http://foobar/wiki/Special:Whatlinkshere?target=Beeswax_%26_Honey&limit=500

The URL-building functions may need tweaking to know to use the question
mark to start building a query string. (And remember Apache's
mod_rewrite URL-decodes the path components before you get to them;
that's why we use a patched ampescape function to re-encode the
ampersands in titles being moved from the path to the query string.)
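A minimal Python sketch of a URL-building helper along these lines: the title goes into the path with "&" escaped as %26, and "?" starts the query string only when there are parameters. The helper name wiki_url is an assumption for illustration and is not the project's actual function.

```python
from urllib.parse import quote, urlencode


def wiki_url(base, title, params=None):
    """Build a wiki URL with the title in the path and params in the query."""
    # safe=":" keeps the namespace colon readable; "&" becomes %26.
    url = base + "/" + quote(title, safe=":")
    if params:
        # urlencode joins pairs with "&" and escapes "&" inside values.
        url += "?" + urlencode(params)
    return url


print(wiki_url("http://foobar/wiki", "Beeswax_&_Honey"))
print(wiki_url("http://foobar/wiki", "Beeswax_&_Honey", {"action": "edit"}))
print(wiki_url("http://foobar/wiki", "Special:Whatlinkshere",
               {"target": "Beeswax_&_Honey", "limit": 500}))
```

Run as written, this prints the three example URLs listed above.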

> I had to "touch" the logfile before the Wikipedia software would run.

Check the ownership/permissions on your upload directory. If it can't
create the log file, you may not be able to upload files either. The dir
should be writable by the webserver's process.
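A quick way to test this is a small Python check like the sketch below. It only gives a meaningful answer when run as the same user as the webserver process, and the function name is made up for illustration.

```python
import os


def can_create_files(directory: str) -> bool:
    """True if the current process may create entries in the directory."""
    # Creating a file needs write and search (execute) permission on the dir.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


print(can_create_files("/usr/src/newcodebase/images"))
```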

-- brion vibber (brion @ pobox.com)
Re: mod_rewrite configuration [ In reply to ]
On Wed, Dec 25, 2002 at 08:39:44PM -0800, Brion Vibber wrote:
>Should be straightforward -- just use %26 wherever it represents content
>and & where it's a magic query string component separator:
>
>http://foobar/wiki/Beeswax_%26_Honey
>http://foobar/wiki/Beeswax_%26_Honey?action=edit
>http://foobar/wiki/Special:Whatlinkshere?target=Beeswax_%26_Honey&limit=500

Actually, Brion, because mod_wiki is a module in its own right, the
URL can be in any format we like; it doesn't need to be ?foo=bar&x=y
format. In fact, I had no intention of going through mod_rewrite at
all. My intention was that you could tell mod_wiki to handle all URLs
that start with a certain prefix. Because of this, the Apache module
could be distributed separately, without patching Apache at all.

Like so:

<Location /w>
SetHandler mod-wiki-handler
</Location>

>mark to start building a query string. (And remember Apache's
>mod_rewrite URL-decodes the path components before you get to them;

Maybe we could use an alternate minilanguage. For instance, I'm also
thinking that the namespace should be a variable instead of a part of
the title:

http://foo/w/Whatlinkshere

Then the following would get passed in as POST data:

namespace=Special&target=A%26W+Root+Beer&limit=500
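For reference, that POST body is ordinary form encoding. A short Python sketch that produces and decodes it with the standard library; the field names are the ones proposed above.

```python
from urllib.parse import parse_qs, urlencode

fields = {"namespace": "Special", "target": "A&W Root Beer", "limit": 500}

# urlencode escapes "&" inside a value as %26 and turns spaces into "+".
body = urlencode(fields)
print(body)

# Decoding restores the raw title, ampersand included.
print(parse_qs(body)["target"][0])
```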

It would be nice to keep compatibility with mod_rewrite though.

How about the following?

http://foo/w/Whatlinkshere#ns=Special&target=A%26W+Root+Beer&limit=500

I also would like language to be a variable, so the URL would become:

http://foo/w/Whatlinkshere#ns=Special&lang=en&target=A%26W+Root+Beer&limit=50

Any objections? You did tell me earlier that the # character was not
allowed in page titles, and is mod_rewrite likely to escape it? With
mod_wiki, mod_rewrite shouldn't be needed, but one never knows when one
will want to redirect one URL to another.

>Check the ownership/permissions on your upload directory. If it can't
>create the log file, you may not be able to upload files either. The dir
>should be writable by the webserver's process.

Ah. Thanks for the tip. I changed the permissions after I tried to do
my first upload, so upload has been working.

Jonathan

Re: mod_rewrite configuration [ In reply to ]
On Wed, 2002-12-25 at 22:51, Jonathan Walther wrote:
> On Wed, Dec 25, 2002 at 08:39:44PM -0800, Brion Vibber wrote:
> >Should be straightforward -- just use %26 wherever it represents content
> >and & where it's a magic query string component separator:
> >
> >http://foobar/wiki/Beeswax_%26_Honey
> >http://foobar/wiki/Beeswax_%26_Honey?action=edit
> >http://foobar/wiki/Special:Whatlinkshere?target=Beeswax_%26_Honey&limit=500
>
> Actually, Brion, because mod_wiki is a module in its own right, the
> URL can be in any format we like; it doesn't need to be ?foo=bar&x=y
> format.

The backend implementation is not relevant. By using mod_rewrite, the
URL can be in any format we like with the PHP system. But that's no
excuse for making URLs long, confusing, and fragile.

> Maybe we could use an alternate minilanguage. For instance, I'm also
> thinking that the namespace should be a variable instead of a part of
> the title:
>
> http://foo/w/Whatlinkshere
>
> Then the following would get passed in as POST data:
> namespace=Special&target=A%26W+Root+Beer&limit=500

That, I think, would be annoying, as you couldn't bookmark it or send
someone a URL without digging through backroom docs for the magic
incantation.

> How about the following?
>
> http://foo/w/Whatlinkshere#ns=Special&target=A%26W+Root+Beer&limit=500

Standard URL syntax provides for a query string (starting with "?"); we
shouldn't be afraid to use it. Overloading the anchor syntax (starting
with "#") isn't a good idea, as I believe it won't be sent to the HTTP
server, and is intended to be interpreted by the browser (and would
break the asked-by-some ability to actually use anchors in pages).
Technically you could use the hex-code for #, but that's fragile.
(Somebody's going to try to type it, and it won't work.)
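Python's standard URL parser shows the split. In the sketch below, request_target is a hypothetical helper that models what a client would put on the request line, on the assumption that the fragment is kept on the client side.

```python
from urllib.parse import urlsplit

URL = "http://foo/w/Whatlinkshere#ns=Special&target=A%26W+Root+Beer&limit=500"

parts = urlsplit(URL)
print("path:    ", parts.path)
print("query:   ", parts.query)      # empty: nothing follows a "?"
print("fragment:", parts.fragment)   # everything after "#"


def request_target(url: str) -> str:
    """Path plus query, i.e. the URL minus scheme, host and fragment."""
    p = urlsplit(url)
    return p.path + ("?" + p.query if p.query else "")


print(request_target(URL))
```

Under this model the server sees only /w/Whatlinkshere, so the namespace, target and limit never arrive.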

> I also would like language to be a variable, so the URL would become:
>
> http://foo/w/Whatlinkshere#ns=Special&lang=en&target=A%26W+Root+Beer&limit=50

I'd prefer:
http://foo/en/Special:Whatlinkshere?target=A%26W+Root+Beer&limit=50

or, knowing that many special pages operate explicitly on a target, one
could rearrange, folding most of them back into the already existing
"action" sequence:

http://foo/en/A&W_Root_Beer?action=whatlinkshere&limit=50
http://foo/en/User:Billybob?action=contributions

Or even yet, we could take advantage of the path syntax, as long as
special pages are never named with slashes:

http://foo/en/Special:Whatlinkshere/A&W_Root_Beer?limit=50
http://foo/en/Special:Contributions/Billybob

This has the advantage also of preserving ampersand functionality;
ampersands aren't special in the _path_ portion of a URL, so if someone
types them in raw it's okay, whereas it will generally break if someone
puts in a raw ampersand when constructing a query string.
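A minimal Python sketch of how the path form could be taken apart, assuming special page names never contain a slash. The function name and the "/en/" prefix handling are illustrative only.

```python
from urllib.parse import parse_qs, urlsplit


def split_special(url: str, lang_prefix: str = "/en/"):
    """Split /en/Special:Page/Target?opts into (page, target, options)."""
    parts = urlsplit(url)
    if not parts.path.startswith(lang_prefix):
        raise ValueError("path does not start with " + lang_prefix)
    page = parts.path[len(lang_prefix):]
    # Everything after the first slash is the target; a raw "&" survives
    # because ampersands are not separators in the path.
    special, _, target = page.partition("/")
    return special, target, parse_qs(parts.query)


print(split_special("http://foo/en/Special:Whatlinkshere/A&W_Root_Beer?limit=50"))
print(split_special("http://foo/en/Special:Contributions/Billybob"))
```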

Remember, URLs should be human-readable and human-rememberable if
possible; people *will* try to muck about with them manually. They
*will* e-mail URLs to friends and colleagues. They *will* print them and
send them to other people who will have to type them in. They *will* try
to speak them over the phone. Non-ascii characters and special
punctuation marks can be a pain for this, alas, but we should minimize
the trouble we make in the basic syntax.

-- brion vibber (brion @ pobox.com)
Re: mod_rewrite configuration [ In reply to ]
On Wed, Dec 25, 2002 at 11:32:50PM -0800, Brion Vibber wrote:
>The backend implementation is not relevant. By using mod_rewrite, the
>URL can be in any format we like with the PHP system. But that's no
>excuse for making URLs long, confusing, and fragile.

True. What I meant was that by bypassing mod_rewrite, we might no
longer need a special patch for Apache?

>> http://foo/w/Whatlinkshere#ns=Special&target=A%26W+Root+Beer&limit=500
>
>Standard URL syntax provides for a query string (starting with "?"); we
>shouldn't be afraid to use it.

Ok. Let's say I have an article "Who Killed JFK?" and I want to view
the article's history. I want to type in the URL, and I can't remember
the hex code for '?'. What do I do? Just require people to use the hex
code anyway for cases where the ? is part of the article title?

http://foo/w/Who_Killed_JFK??action=history&limit=10

Also, I'm not clear; what "escaping" does mod_rewrite do? How does it
determine whether to escape the ? and & or not? When does it do the
escaping?

So, if I put in the following URL:

http://foo/w/Who_Killed_JFK%3F?action=history

Will mod_rewrite change the second ? to a %3F if I rewrite the URL
somehow? Will it transform the %3F to a ? if I rewrite the URL somehow?

>I'd prefer:
>http://foo/en/Special:Whatlinkshere?target=A%26W+Root+Beer&limit=50

That's fair enough; do you have a page already written up giving your
reasons for wanting languages to be part of a hierarchy, but not
namespaces?

>or, knowing that many special pages operate explicitly on a target, one
>could rearrange, folding most of them back into the already existing
>"action" sequence:
>
>http://foo/en/A&W_Root_Beer?action=whatlinkshere&limit=50
>http://foo/en/User:Billybob?action=contributions

I heartily approve and endorse that idea. If no one objects, I will
implement it that way in mod_wiki.

>Or even yet, we could take advantage of the path syntax, as long as
>special pages are never named with slashes:
>
>http://foo/en/Special:Whatlinkshere/A&W_Root_Beer?limit=50
>http://foo/en/Special:Contributions/Billybob

I'm having difficulty understanding; could you show me what manglement
would happen under other schemes, that doesn't happen under this one?

>Remember, URLs should be human-readable and human-rememberable if
>possible; people *will* try to muck about with them manually. They
>*will* e-mail URLs to friends and colleagues. They *will* print them and
>send them to other people who will have to type them in. They *will* try
>to speak them over the phone. Non-ascii characters and special
>punctuation marks can be a pain for this, alas, but we should minimize
>the trouble we make in the basic syntax.

I think we are on the same page; I concur with everything you said in
that previous paragraph.

Jonathan

Re: mod_rewrite configuration [ In reply to ]
On Thu, 2002-12-26 at 00:33, Jonathan Walther wrote:
> On Wed, Dec 25, 2002 at 11:32:50PM -0800, Brion Vibber wrote:
> >The backend implementation is not relevant. By using mod_rewrite, the
> >URL can be in any format we like with the PHP system. But that's no
> >excuse for making URLs long, confusing, and fragile.
>
> True. What I meant was that by bypassing mod_rewrite, we might no
> longer need a special patch for Apache?

Sure.

> >> http://foo/w/Whatlinkshere#ns=Special&target=A%26W+Root+Beer&limit=500
> >
> >Standard URL syntax provides for a query string (starting with "?"); we
> >shouldn't be afraid to use it.
>
> Ok. Let's say I have an article "Who Killed JFK?" and I want to view
> the article's history. I want to type in the URL, and I can't remember
> the hex code for '?'. What do I do? Just require people to use the hex
> code anyway for cases where the ? is part of the article title?
>
> http://foo/w/Who_Killed_JFK??action=history&limit=10

See RFC 2396, which defines the structure of URIs.
http://www.ietf.org/rfc/rfc2396.txt

That would break up as:
(scheme)(authority)(path)(query)
(http)(foo)(/w/Who_Killed_JFK)(?action=history&limit=10)

(Technically the question mark is "reserved" in the contents of a query
string, and I'm not sure it's allowed except as the separator from the
path.)

Question mark is simply *not* a valid URL _path_ character; there's
nothing we can do about that that doesn't flout standards and break
things, any more than we can decide we want a domain name with a slash
or a colon in it. ;)
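The same breakdown can be checked with Python's standard URL parser, which splits path from query at the first "?". In this sketch the helper name is made up, and the title is assumed to be the last path segment.

```python
from urllib.parse import unquote, urlsplit


def title_and_query(url: str):
    """Return (decoded last path segment, raw query string)."""
    parts = urlsplit(url)
    title = unquote(parts.path.rsplit("/", 1)[-1])
    return title, parts.query


# Raw "?" in the title: the title loses it and the query gains a stray "?".
print(title_and_query("http://foo/w/Who_Killed_JFK??action=history&limit=10"))

# Escaped as %3F: the title keeps its question mark.
print(title_and_query("http://foo/w/Who_Killed_JFK%3F?action=history"))
```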

> Also, I'm not clear; what "escaping" does mod_rewrite do? How does it
> determine whether to escape the ? and & or not? When does it do the
> escaping?

Before Apache gives the path to mod_rewrite, Apache has already
normalized escaped (I should say URL-encoded; %26 etc) characters. So a
path of "/wiki/A%26W_soda" comes to our mod_rewrite rules as
"/wiki/A&W_soda". If we take it and put it raw into a query string:
"title=A&W_soda", no good because that will break into "title=A" and
"W_soda" when PHP interprets the query. That's why we add an explicit
escaping for that character.
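The effect is easy to reproduce in Python. In this sketch unquote plays the role of Apache's decoding step and the replace call stands in for the patched ampescape function, whose real implementation is not shown in this thread.

```python
from urllib.parse import parse_qs, unquote

# What the rewrite rules see after the path has been URL-decoded.
decoded_path = unquote("/wiki/A%26W_soda")
title = decoded_path[len("/wiki/"):]

naive = "title=" + title                          # raw "&" splits the value
escaped = "title=" + title.replace("&", "%26")    # re-encode before reuse

print(decoded_path)
print(parse_qs(naive))
print(parse_qs(escaped))
```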

> So, if I put in the following URL:
>
> http://foo/w/Who_Killed_JFK%3F?action=history
>
> Will mod_rewrite change the second ? to a %3F if I rewrite the URL
> somehow? Will it transform the %3F to a ? if I rewrite the URL somehow?

The second ? is interpreted as a separator between path and query before
rewriting comes up.

By default if you create a new query string via a rewrite rule it wipes
out any existing query string completely, replacing it. There's an
option [QSA] to append instead; I haven't yet checked to see if it adds
an ampersand separator or if that needs to be tweaked further.
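As far as the Apache mod_rewrite documentation describes it, [QSA] does join the new and the original query string with an ampersand. A small Python model of the two behaviours follows; it is a sketch of the documented semantics, not Apache's code, and the function name is hypothetical.

```python
def rewritten_query(new_query: str, original_query: str, qsa: bool = False) -> str:
    """Query string after a rule whose substitution creates new_query."""
    if not qsa or not original_query:
        # Default: the rule's query string replaces the original one.
        return new_query
    if not new_query:
        return original_query
    # With QSA the original query is appended after an "&".
    return new_query + "&" + original_query


print(rewritten_query("title=Who_Killed_JFK%3F", "action=history"))
print(rewritten_query("title=Who_Killed_JFK%3F", "action=history", qsa=True))
```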

> >I'd prefer:
> >http://foo/en/Special:Whatlinkshere?target=A%26W+Root+Beer&limit=50
>
> That's fair enough; do you have a page already written up giving your
> reasons for wanting languages to be part of a hierarchy, but not
> namespaces?

Our namespaces aren't really hierarchical: Talk:, User:, User_talk:
etc... We have a flat namespace of namespaces. ;)

> >Or even yet, we could take advantage of the path syntax, as long as
> >special pages are never named with slashes:
> >
> >http://foo/en/Special:Whatlinkshere/A&W_Root_Beer?limit=50
> >http://foo/en/Special:Contributions/Billybob
>
> I'm having difficulty understanding; could you show me what manglement
> would happen under other schemes, that doesn't happen under this one?

Somebody types:
http://foo/en/Special:Whatlinkshere?target=A&W_Root_Beer

they get:

(http)(foo)(/en/Special:Whatlinkshere)((target=A)(W_Root_Beer))

Thus they see links to the page [[A]]. Question marks remain problematic
in the other case, of course. You can't win. ;)

-- brion vibber (brion @ pobox.com)
Re: mod_rewrite configuration [ In reply to ]
On Thu, Dec 26, 2002 at 01:23:23AM -0800, Brion Vibber wrote:
>Our namespaces aren't really hierarchical: Talk:, User:, User_talk:
>etc... We have a flat namespace of namespaces. ;)

Didn't someone have the idea that "Talk" pages shouldn't be separate
namespaces, but just a boolean flag?

And when I mentioned hierarchical, what I meant was: are there any
objections to having namespace as a component of the path, like so:

http://foo/en/Special/Whatlinkshere?target=A%26W_Root_Beer&limit=50

That is just to illustrate what I meant by namespaces being
hierarchical; I agree with you that something along the following lines
is superior for the particular example above:

http://foo/en/Global/A&W_Root_Beer?action=whatlinkshere&limit=50

Hm. I see a potential problem; no way to gracefully elide the "Global"
namespace for the default case. Maybe /en/g/ for the global namespace?
What the heck, why change it from what we use in the Wikipedia links;
law of least surprise says don't change if we don't have to.

But then we should do languages the same way.

http://foo/en:Global:A&W_Root_Beer
http://foo/en::A&W_Root_Beer
http://foo/en:A&W_Root_Beer

Should we support one of the above, or none of them?

Jonathan
