Mailing List Archive

google and umlauts
Google doesn't seem to like our umlauts in page titles. Try
http://www.google.de/search?q=%C3%84nderungen+site%3Ade.wikipedia.com&hl=de&ie=UTF8&oe=UTF8

Google says "Ungültige Seite Letzte_Änderungen. " (which means "invalid
page Recent_Changes").

Is the problem on Google's side or on ours? Will it be solved with the
new software?

----

The most important bug in the new software on test-de.wikipedia.com
seems to be that the upload feature doesn't work. The others shouldn't
be too hard to solve, but I'm no programmer :-)

http://test-de.wikipedia.com/wiki/wikipedia:Beobachtete+Fehler

Bye,
Kurt
Re: google and umlauts
Kurt Jansson wrote:

> Google doesn't seem to like our umlauts in page titles. Try
> http://www.google.de/search?q=%C3%84nderungen+site%3Ade.wikipedia.com&hl=de&ie=UTF8&oe=UTF8
>
> Google says "Ungültige Seite Letzte_Änderungen. " (which means "invalid
> page Recent_Changes").

It works fine for me. Google gives me a list of 1610 hits, and only
the first hit has the title "Ungültige Seite...". This is because it
points to the URL
http://de.wikipedia.com/wiki.cgi?Letzte_%C3%84nderungen
for which Wikipedia will return a page having exactly that title.
The reason is that the old UseModWiki software does not support
UTF-8. If the URL is changed from UTF-8 (%C3%84) to ISO 8859-1 (%C4),
the correct page is returned:
http://de.wikipedia.com/wiki.cgi?Letzte_%C4nderungen
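
To illustrate the difference between the two URLs, here is a minimal Python sketch using only the standard library. It percent-encodes the page title once as UTF-8 and once as ISO 8859-1. The helper name utf8_to_latin1_query is made up for this example and is not part of any wiki software.

```python
from urllib.parse import quote, unquote


def utf8_to_latin1_query(query: str) -> str:
    """Re-encode a UTF-8 percent-encoded query string as ISO 8859-1.

    Hypothetical helper for illustration only. Raises UnicodeError if the
    input is not valid UTF-8 or contains characters outside ISO 8859-1.
    """
    title = unquote(query, encoding="utf-8", errors="strict")
    return quote(title, encoding="latin-1", errors="strict")


title = "Letzte_Änderungen"

# "Ä" (U+00C4) is two bytes in UTF-8 but a single byte in ISO 8859-1.
print(quote(title, encoding="utf-8"))    # Letzte_%C3%84nderungen
print(quote(title, encoding="latin-1"))  # Letzte_%C4nderungen

# Converting the URL that Google indexed into the one the old software expects.
print(utf8_to_latin1_query("Letzte_%C3%84nderungen"))  # Letzte_%C4nderungen
```

Software that only understands ISO 8859-1 would read the two bytes C3 84 as two separate characters, so the title it looks up no longer matches any existing page.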

This is not Google's fault. Some webpage somewhere has a link to
the wrong URL, and Google has indexed the page that Wikipedia returns
for that URL. Neither the old nor the new Wiki software returns 404
errors for pages that don't exist.
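
As a rough sketch of the alternative, a wiki could answer requests for unknown titles with a 404 status, which would give crawlers a signal not to index the error page. The PAGES dictionary and the respond function below are assumptions for illustration, not code from either wiki software.

```python
from http import HTTPStatus

# Hypothetical page store; the title and content are illustrative only.
PAGES = {"Letzte_Änderungen": "(list of recent changes)"}


def respond(title: str, pages: dict = PAGES) -> tuple:
    """Return (status_code, body) for a requested page title."""
    if title in pages:
        return HTTPStatus.OK.value, pages[title]
    # Unknown title: still show an error page, but flag it with 404.
    return HTTPStatus.NOT_FOUND.value, f"Ungültige Seite {title}."


print(respond("Letzte_Änderungen")[0])       # 200
print(respond("Letzte_Ã\x84nderungen")[0])   # 404 (UTF-8 bytes misread as ISO 8859-1)
```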


--
Lars Aronsson (lars@aronsson.se)
Aronsson Datateknik
Teknikringen 1e, SE-583 30 Linuxköping, Sweden
tel +46-70-7891609
http://aronsson.se/ http://elektrosmog.nu/ http://susning.nu/