Mailing List Archive

Re: Google Translate now assists with human translations of Wikipedia articles
Brian wrote:

> In the absence of a sentence aligned corpus one must be created.

It would be nice if such a corpus (or rather, the resulting
dictionary of translated words, phrases and sentences) could also
be "open content". Are you in talks with Google about this,
Brian? Would they be interested in providing open content output
in exchange for open content input?


--
Lars Aronsson (lars@aronsson.se)
Aronsson Datateknik - http://aronsson.se

_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: Google Translate now assists with human translations of Wikipedia articles
In talks with Google? Oh I wish ;)

There are lots of algorithms that do sentence alignment automatically. The
different language articles don't have to be identical for Google to align
them. So we've basically already got what they've got in terms of Wikipedia
data.
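
As a rough illustration of what automatic sentence alignment can look like, here is a minimal sketch of a length-based aligner using dynamic programming, in the spirit of the classic length-based approach of Gale and Church. It is not what Google uses, as far as anyone on this list knows. The function names (bead_cost, align), the bead penalties and the scaling factor are all assumptions picked for the example, not tuned values.

```python
from typing import List, Tuple

# Allowed alignment "beads": (source sentences consumed, target sentences consumed).
# The penalty attached to each bead type is a hand-picked assumption.
BEADS = {(1, 1): 0.0, (1, 0): 4.0, (0, 1): 4.0, (2, 1): 2.0, (1, 2): 2.0}


def bead_cost(src: List[str], tgt: List[str], penalty: float) -> float:
    """Cost of pairing a group of source sentences with a group of target sentences."""
    ls = sum(len(s) for s in src)
    lt = sum(len(t) for t in tgt)
    # Relative character-length mismatch in [0, 1]; the factor 10 is an assumption
    # that makes the mismatch comparable in size to the bead penalties.
    mismatch = abs(ls - lt) / max(ls, lt, 1)
    return 10.0 * mismatch + penalty


def align(src: List[str], tgt: List[str]) -> List[Tuple[List[str], List[str]]]:
    """Return the cheapest sequence of beads covering both sentence lists."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    # Forward dynamic programming over (sentences consumed in src, in tgt).
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                c = cost[i][j] + bead_cost(src[i:ni], tgt[j:nj], penalty)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the chosen beads.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    pairs.reverse()
    return pairs
```

The point of the sketch is only that the two texts do not need to match sentence for sentence: the 2-1, 1-2, 1-0 and 0-1 beads let the aligner merge or skip sentences where the articles diverge. Real aligners add a proper statistical length model and usually lexical cues on top of this.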

On Wed, Jun 10, 2009 at 1:05 AM, Lars Aronsson <lars@aronsson.se> wrote:

> Brian wrote:
>
> > In the absence of a sentence aligned corpus one must be created.
>
> It would be nice if such a corpus (or rather, the resulting
> dictionary of translated words, phrases and sentences) could also
> be "open content". Are you in talks with Google about this,
> Brian? Would they be interested in providing open content output
> in exchange for open content input?
>
>
> --
> Lars Aronsson (lars@aronsson.se)
> Aronsson Datateknik - http://aronsson.se
Re: Google Translate now assists with human translations of Wikipedia articles
2009/6/10 Brian <Brian.Mingus@colorado.edu>:
> Not only did you not provide a critique of my more general claim (that the
> user does not enter into a contract with Google regarding Wikipedia's data)
> but you have not provided any sort of well founded critique of this one.
> You've basically said, in both cases, "I don't believe that."
>


That's because you've provided zero evidence to back your position.
Have you even read the TOS?

"By using Google Translator Toolkit (the “Service”), you agree to be
bound by our Google Terms of Services located at
http://www.google.com/accounts/TOS as well as these additional terms."

"1. Your relationship with Google

1.1 Your use of Google’s products, software, services and web
sites (referred to collectively as the “Services” in this document and
excluding any services provided to you by Google under a separate
written agreement) is subject to the terms of a legal agreement
between you and Google. "

"2.1 In order to use the Services, you must firstly agree to the
Terms. You may not use the Services if you do not accept the Terms."

"2.3 You may not use the Services and may not accept the Terms if (a)
you are not of legal age to form a binding contract with Google, or
(b) you are a person barred from receiving the Services under the laws
of the United States or other countries including the country in which
you are resident or from which you use the Services."



". By submitting, posting or displaying the content you give Google a
perpetual, irrevocable, worldwide, royalty-free, and non-exclusive
licence to reproduce, adapt, modify, translate, publish, publicly
perform, publicly display and distribute any Content which you submit,
post or display on or through, the Services."


Even if we took your highly non-standard position that providing Google
with a URL is not submitting the content, the output is still displayed by
Google, and you have no way to grant them the above rights over
third-party CC-BY-SA content.


--
geni

Re: Google Translate now assists with human translations of Wikipedia articles
On Tue, Jun 9, 2009 at 23:42, Brian<Brian.Mingus@colorado.edu> wrote:
> Google has built in support for using its machine translation technology to
> help bootstrap human translations of Wikipedia articles.
>
> http://translate.google.com/toolkit/docupload
>
> The benefit to Google is clear - they need sentence-aligned text in multiple
> languages in order to bootstrap their automated system.
>
> This is a great example of machines helping people help machines help
> people, etc... I'm sure this is now the most efficient way to produce high
> quality translations of Wikipedia articles en masse.
>
> We should check the ToS to make sure the translated text can be CC-BY-SA
> licensed.

OK, after a bit of drama in this discussion, I actually tried this toolkit.

First I tried to translate the Hebrew article [[שלום גד]] into English
(that's Shalom Gad, one of my favorite Israeli musicians). Apparently,
it can only translate from English. I am more interested in
translating Wikipedia articles from Hebrew into English, so it was
quite disappointing, but they'll probably fix it soon enough.

Then I tried to translate [[Art critic]] from English into Hebrew.
There were a few pleasant surprises, but on the whole the machine
translation was bad to the point of being unusable. It is much easier
to translate it using vi.

Google want side by side translations. It is not quite possible. A
grammar of a language is not just subjects, objects, tenses and
adjectives. Google seem to ignore [[Text linguistics]] - rules which
apply way beyond the word and the sentence. And these are *grammar
rules*, not just "style". (Disclaimer: The Department of Linguistics
at the Hebrew University of Jerusalem, where I study, is very keen on
this subject.)

I *had* to make very deep changes to paragraph structure, not to
mention sentence structure, and not just because the Hebrew
Wikipedia has a different MOS, but because it's the basis of the
Hebrew language. A text without these changes would be next to
unreadable. I doubt that a document which is changed so deeply is very
useful to Google at this point. I certainly know that it is not useful
to me - I gave up after two paragraphs.

So yes, Google can revise the legalese of their TOS, but this is not a
very urgent problem. The uselessness of the technology makes the TOS
pretty irrelevant.

--
אמיר אלישע אהרוני
Amir Elisha Aharoni

http://aharoni.wordpress.com

"We're living in pieces,
I want to live in peace." - T. Moore

Re: Google Translate now assists with human translations of Wikipedia articles
On Wed, Jun 10, 2009 at 00:54, masti<mastigm@gmail.com> wrote:
> current level of sophistication of translation tools, especially of
> languages that do not belong to the same group as English, German,
> French, etc. is completely useless.

Let me disagree. Hungarian is not in the same group by far, and the
results make it possible to understand more than 50% of the text
(sometimes I'd say above 90%). While this is far from proper
translation it is by no means _useless_, since its obvious use is to
understand a completely foreign text to some extent.

And I'd like to second that the quality has been really improving,
whether the state of the art linguistic science backs its theory up or
not. This is observation, and not theory.

But I see this is an exaggeration contest, so I'll go back to the shadow. :-)

grin

Re: Google Translate now assists with human translations of Wikipedia articles
> current level of sophistication of translation tools, especially of
> languages that do not belong to the same group as English, German,
> French, etc. is completely useless.
>
> Machine translations into Slavic languages are to be deleted from wiki
> immediately.
>
> masti
>
Just to confirm, yesterday I needed to translate a piece of a Bulgarian
Wikipedia article into Russian. I ended up translating it manually
even though I do not speak a word of Bulgarian (Russian is my
mother tongue). The output of Google Language Tools (Bulgarian into
English) was at a substandard level.

Cheers
Yaroslav


Re: Google Translate now assists with human translations of Wikipedia articles
On Wed, Jun 10, 2009 at 06:22, David Goodman<dgoodmanny@gmail.com> wrote:
> On Tue, Jun 9, 2009 at 6:01 PM, Amir E. Aharoni<amir.aharoni@gmail.com> wrote:
>
>> An unedited machine-translated text is likely to be speedily deleted
>> as patent nonsense, before copyvio is even considered.
>
> If it is deleted as nonsense, that will be a gross error by the
> administrator, at least in enWP.  It is usually possible to roughly
> understand what is meant in a Google translation. That's enough to
> defeat speedy deletion. What these texts need is revision. I think of
> them essentially as an automated dictionary.

According to the dry letter of the policies it may be an error, but
the deletion logs show that it happens quite often.

--
אמיר אלישע אהרוני
Amir Elisha Aharoni

http://aharoni.wordpress.com

"We're living in pieces,
I want to live in peace." - T. Moore

Re: Google Translate now assists with human translations of Wikipedia articles
Such an approach has a critical flaw. I don’t know whether this
applies to, say, English-French translations, but it is known to be
present for Cyrillic languages. A statistical approach sometimes
discovers false connections that result in factual errors. Examples of
“translating”, say, “50 USD” as “50 000 UAH” within a particular
context are known; more such things can arise unexpectedly. So, at
least a good understanding of both the topic and the source language
is a crucial prerequisite, and there should be a warning about it.

I really don’t like the way they write “Wikipedia™” instead of simply
“Wikipedia” — do they really have to emphasize the trademark status?

Perhaps, after some time goes by, I will be able to make a tool to
select all translations made that way on a wiki, which may help with
deleting purely nonsensical ones.

— Kalan

Re: Google Translate now assists with human translations of Wikipedia articles
Of course these are now things that you are able to fix and which can be
shared with everyone.

On Wed, Jun 10, 2009 at 9:32 AM, Mark Williamson <node.ue@gmail.com> wrote:

> Sometimes cities are "translated" - "Koper" was translated to English
> from Slovene as "Chicago" and "Kranj" as "Miami"... of course Kranj is
> 100km inland and Miami is largely beachfront and the opposite with
> Chicago and Koper.
>
> "Ljubljana" was translated to English in earlier phases of the
> software as "rape"... In Italian to English, "L'Italia" became
> "Canada"; in Tagalog to English, "Pilipinas" became "Japan" - when
> they first debuted the Tagalog language capability, I tested it with
> the tl.wp article on Manila which informed me that Manila is the
> capital of Japan...
>
> Mark
>
> On Wed, Jun 10, 2009 at 7:33 AM, Nikola Smolenski<smolensk@eunet.yu>
> wrote:
> > Kalan wrote:
> >> present for Cyrillic languages. Statistical approach sometimes
> >> discovers false connections that result in factual errors. Examples of
> >> “translating”, say, “50 USD” as “50 000 UAH” within a particular
> >> context are known; more of such things can arise unexpectedly. So, at
> >
> > The funniest example I noticed is that "flew" was translated to Serbian
> > as "MaudDib" :) (this has been corrected since).
> >
> > And yet I can not stress enough how much I find this service useful,
> > both for personal use and to ease translation.
Re: Google Translate now assists with human translations of Wikipedia articles
On Wed, Jun 10, 2009 at 19:29, Brian<Brian.Mingus@colorado.edu> wrote:
> Of course these are now things that you are able to fix and which can be
> shared with everyone.

Unfortunately it's Google, not Wikipedia. There's mysterious Google
code behind it all; not MediaWiki, whose code everyone is free to
study and fix.

Not evil - just mysterious. And overhyped.

--
אמיר אלישע אהרוני
Amir Elisha Aharoni

http://aharoni.wordpress.com

"We're living in pieces,
I want to live in peace." - T. Moore

Re: Google Translate now assists with human translations of Wikipedia articles
On Wednesday 10 June 2009 17:32:00 Mark Williamson wrote:
> "Ljubljana" was translated to English in earlier phases of the
> software as "rape"... In Italian to English, "L'Italia" became

Well that is a correct translation :)

Re: Google Translate now assists with human translations of Wikipedia articles
Thanks Nikola, I just laughed enough to last me for the rest of the week.

Mark



On Wed, Jun 10, 2009 at 9:49 AM, Nikola Smolenski<smolensk@eunet.yu> wrote:
> On Wednesday 10 June 2009 17:32:00 Mark Williamson wrote:
>> "Ljubljana" was translated to English in earlier phases of the
>> software as "rape"... In Italian to English, "L'Italia" became
>
> Well that is a correct translation :)
Re: Google Translate now assists with human translations of Wikipedia articles
> Sometimes cities are "translated" - "Koper" was translated to English
> from Slovene as "Chicago" and "Kranj" as "Miami"... of course Kranj is
> 100km inland and Miami is largely beachfront and the opposite with
> Chicago and Koper.
>
> "Ljubljana" was translated to English in earlier phases of the
> software as "rape"... In Italian to English, "L'Italia" became
> "Canada"; in Tagalog to English, "Pilipinas" became "Japan" - when
> they first debuted the Tagalog language capability, I tested it with
> the tl.wp article on Manila which informed me that Manila is the
> capital of Japan...
>
> Mark
>

I have got Нови Трг (Novy Trg) as New York from Bulgarian. Looks like a
systematic error.

Cheers
Yaroslav

