Re: What's appropriate attribution?
On Fri, Oct 24, 2008 at 1:18 PM, Nikola Smolenski <smolensk@eunet.yu> wrote:
> On Friday 24 October 2008 01:19:20 phoebe ayers wrote:
>> Pity the person who wants to reprint [[George W. Bush]] from en:wp...
>> it has 13228 authors (6366 IP addresses!) Sure, most of them are
>> vandalism, but I haven't seen any tool to pull out significant
>> revisions. Does anyone know of such a tool or script?
>
> On Wikitech-l we just had thread WikiTrust and authorship that discussed how
> such a tool could be made. It is doable.

For copyright attribution purposes? Show me.

Most greedy "auto-attributing" code I've seen has a tendency to
incorrectly attribute text in cases of simple re-ordering. It's
reasonable enough for measuring the text churn rate in articles, and
it may be good enough as a starting point for attribution, but if
there is no way to correct it when it's wrong then it probably can't
be used for that purpose. (Also, consider the case where half of an
article is a copy-and-paste move from another article.) Not that it
shouldn't be done, but I don't expect it could replace past
proposals such as adding a second 'talk' page entitled "Credits".

_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: What's appropriate attribution?
2008/10/24 Gregory Maxwell <gmaxwell@gmail.com>:
> More importantly: You've picked a extreme corner case. Extreme corner
> cases shouldn't be neglected completely, but they are bad places to
> start policy discussions.

Please do take into account that the most popular and "interesting"
articles are also often the most likely to have a very large history,
though. So if you're compiling a collection that's not focused on
fringe subjects, you're likely to hit some articles that have very
many authors.
--
Erik Möller
Deputy Director, Wikimedia Foundation

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate

Re: What's appropriate attribution?
On Fri, Oct 24, 2008 at 2:47 PM, Erik Moeller <erik@wikimedia.org> wrote:
> 2008/10/24 Gregory Maxwell <gmaxwell@gmail.com>:
>> More importantly: You've picked a extreme corner case. Extreme corner
>> cases shouldn't be neglected completely, but they are bad places to
>> start policy discussions.
>
> Please do take into account that the most popular and "interesting"
> articles are also often the most likely to have a very large history,
> though. So if you're compiling a collection that's not focused on
> fringe subjects, you're likely to hit some articles that have very
> many authors.

That is a fair point.

Though you could still find a more representative article than GWB, even
among popular articles: at least at one point in time it had the
longest revision history of any article. It's also unusually long, and
atypically popular. It's probably a worst case, or close to it, in
terms of both possible and actual author count.

I don't know that "fringe" is really the right word either. There are
many subject areas which are not at all fringe, things which get whole
sections in libraries, where none of the articles are massively
multi-authored. So I'd probably reverse the sense of your point: If
you're working on anything on a popular media subject you'll certainly
come across some articles with long lists of authors.

The end result of both outlooks is, I suppose, the same but I think
the notion that most (or all) Wikipedia articles are massively
multi-authored is fairly widespread, and that's not a correct position
on an article by article basis most of the time (while it's quite true
for Wikipedia as a whole), so I like to take the opportunity to point
that out.

Re: What's appropriate attribution?
On Fri, Oct 24, 2008 at 12:10 PM, Gregory Maxwell <gmaxwell@gmail.com> wrote:
> On Fri, Oct 24, 2008 at 2:47 PM, Erik Moeller <erik@wikimedia.org> wrote:
>> 2008/10/24 Gregory Maxwell <gmaxwell@gmail.com>:
>>> More importantly: You've picked a extreme corner case. Extreme corner
>>> cases shouldn't be neglected completely, but they are bad places to
>>> start policy discussions.
>>
>> Please do take into account that the most popular and "interesting"
>> articles are also often the most likely to have a very large history,
>> though. So if you're compiling a collection that's not focused on
>> fringe subjects, you're likely to hit some articles that have very
>> many authors.
>
> That is a fair point.
>
> Though you could still find more representative article than GWB, even
> among popular articles: at least at one point in time it had the
> longest revision history of any article. It's also unusually long, and
> atypically popular. It's probably a worst case, or close to it, in
> terms of both possible and actual author count.

The original example was [[France]], with 4077 authors, which is still
9-10 pages of authors in 10pt type. And I don't think [[France]] is a
corner case for reprinting at all -- I would hope that it and its
fellow country articles would get included in any typical educational
compilation, atlas, children's encyclopedia, etc. based on Wikipedia
content that got put out.

Yes, [[George Bush]] is atypical, but the chances of someone wanting
to reprint it -- again, for any educational compilation with
biographies it seems like a fair choice -- seem pretty high. I think
any attribution rule that gets made has to take these cases as well as
"more typical" 10-author articles into account.

-- phoebe

Re: What's appropriate attribution?
On Friday 24 October 2008 20:44:31 Gregory Maxwell wrote:
> On Fri, Oct 24, 2008 at 1:18 PM, Nikola Smolenski <smolensk@eunet.yu> wrote:
> > On Friday 24 October 2008 01:19:20 phoebe ayers wrote:
> >> Pity the person who wants to reprint [[George W. Bush]] from en:wp...
> >> it has 13228 authors (6366 IP addresses!) Sure, most of them are
> >> vandalism, but I haven't seen any tool to pull out significant
> >> revisions. Does anyone know of such a tool or script?
> >
> > On Wikitech-l we just had thread WikiTrust and authorship that discussed
> > how such a tool could be made. It is doable.
>
> For copyright attribution purposes? Show me.
>
> Most greedy "auto-attributing" code I've seen has a tendency to
> incorrectly attribute text in cases of simple re-ordering. It's

That isn't the biggest of our concerns: an occasional false positive (a
person who didn't make significant edits is listed among the authors) is
acceptable, whereas a false negative (a person who did make significant
edits is not listed among the authors) is not.

A suggestion by Tei is simple and promising: make a list of all the
words in each version, sort it alphabetically, and make a diff. The number
of changed lines is the number of changed words. Edits that changed only a
few words are not significant for our purpose.
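
As a rough illustration of the sorted-word-list idea, here is a minimal
Python sketch. It is not Tei's code or anything from the Wikitech-l thread.
The function names, the tokenizing regex, and the threshold of 10 words are
all assumptions made for the example. Counting the lines of a minimal diff
between two sorted word lists gives the same number as comparing the two
versions as multisets of words, so the sketch uses collections.Counter
instead of an actual diff.

```python
import re
from collections import Counter


def word_counts(text):
    """Count each word in a revision (tokenizing on \\w+ is an assumption)."""
    return Counter(re.findall(r"\w+", text))


def changed_word_count(old_text, new_text):
    """Number of words added plus words removed between two revisions.

    This equals the number of changed lines in a minimal diff of the two
    alphabetically sorted word lists, so word order is ignored.
    """
    old, new = word_counts(old_text), word_counts(new_text)
    removed = old - new  # words present more often in the old revision
    added = new - old    # words present more often in the new revision
    return sum(removed.values()) + sum(added.values())


def is_significant(old_text, new_text, threshold=10):
    """Treat an edit as significant if it changes at least `threshold` words.

    The default threshold is a hypothetical value, not one from the thread.
    """
    return changed_word_count(old_text, new_text) >= threshold
```

With this measure a pure re-ordering such as "the quick brown fox" to
"fox brown quick the" counts as zero changed words, so it would not be
credited as a new contribution. The sketch compares only two adjacent
revisions and does nothing about reverts or text moved in from another
article.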

> be used for that purpose. (Also, consider the case where half of an
> article is copy and paste moved from another article.) Not that it

And even that could be mostly identifiable, though it would use a lot of
resources. Fortunately, it happens relatively rarely.

Re: What's appropriate attribution?
On Fri, Oct 24, 2008 at 1:15 PM, Nikola Smolenski <smolensk@eunet.yu> wrote:

> On Thursday 23 October 2008 22:22:26 Anthony wrote:
> > I'm sorry, your numbers are pulled too wildly from the air to be useful.
> > A 300 page book about 200 countries? You're better off rewriting
> > everything "ab initio" than copying from Wikipedia for that. The work to
> > cull down the information into that small of a format is going to far
> > outweigh the savings from plagiarizing the content anyway.
>
> He's referring to possibility to create a book that would have only the
> introduction from each article, yet it would have to list all authors
> (because you can't determine who was writing in the introduction and who
> wasn't).
>

Sometimes, sadly, it's not possible to get something for nothing. Is it
really part of the mission of the Foundation to allow publishers to create
such books? Would such a book even be worth more than the paper it's
printed on? It seems like one of those books I can get for $0.10 at the
thrift store, or $2.00 at the bargain bin section of a bookstore.

Then again, this whole thread seems to be leading to the conclusion that
Wikipedia and the right to attribution are incompatible. If that's truly
the case, the only fair thing to do is to start over from scratch under
terms that make it clear to all contributors that they have no right to
attribution. I honestly hope it isn't the case, and that I'm just missing
something.

Anthony