Mailing List Archive

bot indexing of edit pages
If you visit http://www.google.com/search?q=site%3Awikipedia.com+edit,
you'll notice that Google indexes all our edit pages. I find this
moderately annoying: I don't want to find the edit link when I search,
and new people unfamiliar with Wikipedia won't know what to do when
they're confronted with a textarea. Also, I don't know what percentage
of accesses come from bots, but this may cut down on useless accesses.

I have a solution in mind: add an Apache rewrite rule that rewrites
something like:

http://www.wikipedia.com/edit/Wikipedia:Bug_reports

into:

http://www.wikipedia.com/wiki.phtml?title=Wikipedia:Bug_reports&action=edit

Make the edit links point to the first form, then use robots.txt to
request that bots not harvest anything under /edit/.
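For the sake of discussion, here's a rough sketch of what the two pieces
might look like. This assumes mod_rewrite is enabled and the rule lives
in the server config (in a .htaccess file the leading slash would be
dropped); the exact setup on the wikipedia.com server may differ.

```apache
# Hypothetical sketch: map the bot-friendly /edit/ URLs back onto
# the real script URL. $1 captures the page title.
RewriteEngine On
RewriteRule ^/edit/(.*)$ /wiki.phtml?title=$1&action=edit [L,QSA]
```

And the matching robots.txt entry, asking all crawlers to skip the
rewritten edit URLs:

```
User-agent: *
Disallow: /edit/
```

Note that robots.txt is advisory, so this only keeps out well-behaved
bots, but Google's crawler does respect it.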

I'm sending this here as it's not a feature request for the PHP script,
but rather site-specific for wikipedia.com. If there's a better place,
please tell me.

BTW, great work on the script. It just keeps getting better. I look
forward to Software Phase III. :)

--Dan Keshet