Mailing List Archive

copyright question about data
hey all,

sorry if this is a FAQ - and I'm not sure if this question belongs in this
mailing list - but since it concerns the legal side of the fence wrt
wikipedia contributions here goes:

What's the legal status of data retrieved from non-public domain sources?

I understand that text that is retrieved from copyrighted materials is
copyrighted, but how about data and figures that deal with common interest
topics? Can you really copyright the amount of wheat grown in a year in
bangladesh, or the number of accidents in a year on california roads?


And how about graphs? Is data that is extrapolated from graphs and
used in derivative graphs considered a 'creative work' of its own?


I'd think so, but I just want to be sure..

Ed
Re: copyright question about data
As far as I understand it, these figures would be derivative works from
public domain works... and their author would get copyright!

Regards,

Jean-Baptiste Soufron, Doctorant
CERSA - CNRS, Paris 2
http://soufron.free.fr

On 11 Apr 05, at 23:42, Edward Peschko wrote:

> hey all,
>
> sorry if this is a FAQ - and I'm not sure if this question belongs in
> this
> mailing list - but since it concerns the legal side of the fence wrt
> wikipedia contributions here goes:
>
> What's the legal status of data retrieved from non-public domain
> sources?
>
> I understand that text that is retrieved from copyrighted materials is
> copyrighted, but how about data and figures that deal with common
> interest
> topics? Can you really copyright the amount of wheat grown in a year in
> bangladesh, or the number of accidents in a year on california roads?
>
>
> And how about graphs? Is data that is extrapolated from graphs and
> used in derivative graphs considered a 'creative work' of its own?
>
>
> I'd think so, but I just want to be sure..
>
> Ed
> _______________________________________________
> foundation-l mailing list
> foundation-l@wikimedia.org
> http://mail.wikipedia.org/mailman/listinfo/foundation-l
>
Re: copyright question about data
On Tue, Apr 12, 2005 at 12:42:31AM +0200, Jean-Baptiste Soufron wrote:
> As far as I understand it, these figures would be derivative works from
> public domain works... and their author would get copyright !
>

But that's what I don't understand. Look at the back of any decent
research book. Take, for example, 'The Prize'. It acknowledges over 500
sources, copyrighted and not, and includes many graphs built from what I
assume is data that came from them.

If the strict interpretation of copyright here is true, then nobody could
get anything done. There has to be a clause allowing for derivative works
that contain elements of original works, but have value added to them. But I'm
having a very tough time putting my finger on the line between what is allowable
and what is not..

Ed
Re: copyright question about data
On Apr 11, 2005 11:42 PM, Jean-Baptiste Soufron <jbsoufron@gmail.com> wrote:
> As far as I understand it, these figures would be derivative works from
> public domain works... and their author would get copyright !
>
> Regards,
>
> Jean-Baptiste Soufron, Doctorant
> CERSA - CNRS, Paris 2
> http://soufron.free.fr
>

Yes, that is right: the person/organisation that compiled the data
gets credit for their work. The data is in the public domain, but we still
need to reference the source. The same goes for graphs: a
representation of data is itself copyrighted, though you can make your
own graph from the data and refer to the source, e.g. World Bank, UNDP
or whatever. Correct me if I am wrong..
Cormac


> On 11 Apr 05, at 23:42, Edward Peschko wrote:
>
> > hey all,
> >
> > sorry if this is a FAQ - and I'm not sure if this question belongs in
> > this
> > mailing list - but since it concerns the legal side of the fence wrt
> > wikipedia contributions here goes:
> >
> > What's the legal status of data retrieved from non-public domain
> > sources?
> >
> > I understand that text that is retrieved from copyrighted materials is
> > copyrighted, but how about data and figures that deal with common
> > interest
> > topics? Can you really copyright the amount of wheat grown in a year in
> > bangladesh, or the number of accidents in a year on california roads?
> >
> >
> > And how about graphs? Is data that is extrapolated from graphs and
> > used in derivative graphs considered a 'creative work' of its own?
> >
> >
> > I'd think so, but I just want to be sure..
> >
> > Ed
Re: copyright question about data
Well, the new work is copyrighted but the works it's using are still
public domain and can be re-used at will.

Regards,

Jean-Baptiste Soufron, Doctorant
CERSA - CNRS, Paris 2
http://soufron.free.fr

On 12 Apr 05, at 01:21, Cormac Lawler wrote:

> On Apr 11, 2005 11:42 PM, Jean-Baptiste Soufron <jbsoufron@gmail.com>
> wrote:
>> As far as I understand it, these figures would be derivative works
>> from
>> public domain works... and their author would get copyright !
>>
>> Regards,
>>
>> Jean-Baptiste Soufron, Doctorant
>> CERSA - CNRS, Paris 2
>> http://soufron.free.fr
>>
>
> Yes, that is right, the person/organisation that compiled the data
> gets credit for their work - it is in the public domain but we still
> need to reference the source. The same goes for graphs - a
> representation of data is itself copyright, though you can make your
> own graph from the data and refer to the source, ie. World Bank, UNDP
> or whatever. Correct me if I am wrong..
> Cormac
>
>
>> On 11 Apr 05, at 23:42, Edward Peschko wrote:
>>
>>> hey all,
>>>
>>> sorry if this is a FAQ - and I'm not sure if this question belongs in
>>> this
>>> mailing list - but since it concerns the legal side of the fence wrt
>>> wikipedia contributions here goes:
>>>
>>> What's the legal status of data retrieved from non-public domain
>>> sources?
>>>
>>> I understand that text that is retrieved from copyrighted materials
>>> is
>>> copyrighted, but how about data and figures that deal with common
>>> interest
>>> topics? Can you really copyright the amount of wheat grown in a year
>>> in
>>> bangladesh, or the number of accidents in a year on california roads?
>>>
>>>
>>> And how about graphs? Is data that is extrapolated from graphs and
>>> used in derivative graphs considered a 'creative work' of its own?
>>>
>>>
>>> I'd think so, but I just want to be sure..
>>>
>>> Ed
Re: copyright question about data
On Apr 11, 2005 11:42 PM, Edward Peschko <esp5@pge.com> wrote:
> hey all,
>
> sorry if this is a FAQ - and I'm not sure if this question belongs in this
> mailing list - but since it concerns the legal side of the fence wrt
> wikipedia contributions here goes:
>
> What's the legal status of data retrieved from non-public domain sources?
>
> I understand that text that is retrieved from copyrighted materials is
> copyrighted, but how about data and figures that deal with common interest
> topics? Can you really copyright the amount of wheat grown in a year in
> bangladesh, or the number of accidents in a year on california roads?

No, you cannot copyright the data itself. What is copyrighted is the
*representation* of the data, while the *selection* of the data MIGHT
be copyrighted.

> And how about graphs? Is data that is extrapolated from graphs and
> used in derivative graphs considered a 'creative work' of its own?

Yes. I do not see what 'creative work' from making the graph would be
included in such data. The copyright on the graph is not on the data
represented, but in the representation (e.g. the width-to-height
ratio, the colours used, etcetera).

Note: I am not even close to being a lawyer

Andre Engels
Re: copyright question about data
Andre Engels wrote:

>On Apr 11, 2005 11:42 PM, Edward Peschko <esp5@pge.com> wrote:
>
>
>>What's the legal status of data retrieved from non-public domain sources?
>>
>>I understand that text that is retrieved from copyrighted materials is
>>copyrighted, but how about data and figures that deal with common interest
>>topics? Can you really copyright the amount of wheat grown in a year in
>>bangladesh, or the number of accidents in a year on california roads?
>>
>>
>No, you cannot copyright the data itself. What is copyrighted is the
>*representation* of the data, while the *selection* of the data MIGHT
>be copyrighted.
>
This is a very important distinction. The selection issue can be
difficult, and is most applicable when you are using the same subset of
data as someone else. If you and the other person are both providing
complete data, that is not a breach, since there is only one way to have
everything. :-) Also an obvious form of representation of the material
(such as alphabetical order) is not copyrightable.

>>And how about graphs? Is data that is extrapolated from graphs and
>>used in derivative graphs considered a 'creative work' of its own?
>>
>>
>Yes, I would not see what 'creative work' in making the graph would be
>included in such data. The copyright on the graph is not on the data
>represented, but in the representation (e.g. the width-to-height
>ratio, the colours used, etcetera).
>
An easy way to avoid copyright infringement in this case is to use a
different form of graph, such as replacing a bar graph with a pie chart.

>Note: I am not even close to being a lawyer
>
People say this all the time. I prefer to treat lawyers in the same way
as experts in other fields.

Ec
Re: copyright question about data
On Mon, Apr 11, 2005 at 05:38:42PM -0700, Ray Saintonge wrote:
> Andre Engels wrote:
>
> >On Apr 11, 2005 11:42 PM, Edward Peschko <esp5@pge.com> wrote:
> >
> >
> >>What's the legal status of data retrieved from non-public domain sources?
> >>
> >>I understand that text that is retrieved from copyrighted materials is
> >>copyrighted, but how about data and figures that deal with common interest
> >>topics? Can you really copyright the amount of wheat grown in a year in
> >>bangladesh, or the number of accidents in a year on california roads?
> >>
> >>
> >No, you cannot copyright the data itself. What is copyrighted is the
> >*representation* of the data, while the *selection* of the data MIGHT
> >be copyrighted.
> >
> This is a very important distinction. The selection issue can be
> difficult, and is most applicable when you are using the same subset of
> data as someone else. If you and the other person are providing
> complete data that is not a breach since there is only one way to have
> everything. :-) Also an obvious form of representation of the material
> (such as alphabetical order) is not copyrightable.


How about augmented data? Ie: say someone has a set of data that you'd like
to keep in its entirety, but you add some features that text cannot possibly
have (like, say links to supporting papers for important datapoints,
or zoom-in on graphs). Is that considered copyright infringement?

Ed
Re: copyright question about data
Edward Peschko wrote:

>On Mon, Apr 11, 2005 at 05:38:42PM -0700, Ray Saintonge wrote:
>
>
>>Andre Engels wrote:
>>
>>
>>>On Apr 11, 2005 11:42 PM, Edward Peschko <esp5@pge.com> wrote:
>>>
>>>
>>>>What's the legal status of data retrieved from non-public domain sources?
>>>>
>>>>I understand that text that is retrieved from copyrighted materials is
>>>>copyrighted, but how about data and figures that deal with common interest
>>>>topics? Can you really copyright the amount of wheat grown in a year in
>>>>bangladesh, or the number of accidents in a year on california roads?
>>>>
>>>>
>>>No, you cannot copyright the data itself. What is copyrighted is the
>>>*representation* of the data, while the *selection* of the data MIGHT
>>>be copyrighted.
>>>
>>>
>>This is a very important distinction. The selection issue can be
>>difficult, and is most applicable when you are using the same subset of
>>data as someone else. If you and the other person are providing
>>complete data that is not a breach since there is only one way to have
>>everything. :-) Also an obvious form of representation of the material
>>(such as alphabetical order) is not copyrightable.
>>
>>
>How about augmented data? Ie: say someone has a set of data that you'd like
>to keep in its entirety, but you add some features that text cannot possibly
>have (like, say links to supporting papers for important datapoints,
>or zoom-in on graphs). Is that considered copyright infringement?
>
>
Augmenting data helps to establish the fact that you are not limiting
yourself to the original author's selection process. In many of these
cases determining whether there has been a breach of copyright will
never be a black and white situation. We really are looking at a
balance of probabilities.

Ec
Re: copyright question about data
Ray Saintonge wrote:

> Edward Peschko wrote:
>
>> On Mon, Apr 11, 2005 at 05:38:42PM -0700, Ray Saintonge wrote:
>>
>>
>>> Andre Engels wrote:
>>>
>>>
>>>> On Apr 11, 2005 11:42 PM, Edward Peschko <esp5@pge.com> wrote:
>>>>
>>>>
>>>>> What's the legal status of data retrieved from non-public domain
>>>>> sources?
>>>>>
>>>>> I understand that text that is retrieved from copyrighted
>>>>> materials is
>>>>> copyrighted, but how about data and figures that deal with common
>>>>> interest
>>>>> topics? Can you really copyright the amount of wheat grown in a
>>>>> year in
>>>>> bangladesh, or the number of accidents in a year on california roads?
>>>>>
>>>>
>>>> No, you cannot copyright the data itself. What is copyrighted is the
>>>> *representation* of the data, while the *selection* of the data MIGHT
>>>> be copyrighted.
>>>>
>>>
>>> This is a very important distinction. The selection issue can be
>>> difficult, and is most applicable when you are using the same subset
>>> of data as someone else. If you and the other person are providing
>>> complete data that is not a breach since there is only one way to
>>> have everything. :-) Also an obvious form of representation of the
>>> material (such as alphabetical order) is not copyrightable.
>>>
>>
>> How about augmented data? Ie: say someone has a set of data that
>> you'd like
>> to keep in its entirety, but you add some features that text cannot
>> possibly
>> have (like, say links to supporting papers for important datapoints,
>> or zoom-in on graphs). Is that considered copyright infringement?
>>
>>
> Augmenting data helps to establish the fact that you are not limiting
> yourself to the original author's selection process. In many of
> these cases determining whether there has been a breach of copyright
> will never be a black and white situation. We really are looking at a
> balance of probabilities.
>
> Ec
>
>
I am not a lawyer, but, in the United States at least, isn't Feist v.
Rural relevant?

-- Neil
Re: copyright question about data
> >>
> >Augmenting data helps to establish the fact that you are not limiting
> >yourself to the original author's selection process. In many of
> >these cases determining whether there has been a breach of copyright
> >will never be a black and white situation. We really are looking at a
> >balance of probabilities.
> >
> >Ec
> >
> >
> I am not a lawyer, but, in the United States at least, isn't Feist v.
> Rural relevant?
>

very relevant.. thanks much for the pointer.

just for the record:

http://www.law.cornell.edu/copyright/cases/499_US_340.html

for once, the law seems to be in line with common sense.. ;-)

Ed
Re: copyright question about data
Neil Harris wrote:

> Ray Saintonge wrote:
>
>> Edward Peschko wrote:
>>
>>> On Mon, Apr 11, 2005 at 05:38:42PM -0700, Ray Saintonge wrote:
>>>
>>>> Andre Engels wrote:
>>>>
>>>>> On Apr 11, 2005 11:42 PM, Edward Peschko <esp5@pge.com> wrote:
>>>>>
>>>>>> What's the legal status of data retrieved from non-public domain
>>>>>> sources?
>>>>>>
>>>>>> I understand that text that is retrieved from copyrighted
>>>>>> materials is
>>>>>> copyrighted, but how about data and figures that deal with common
>>>>>> interest
>>>>>> topics? Can you really copyright the amount of wheat grown in a
>>>>>> year in
>>>>>> bangladesh, or the number of accidents in a year on california
>>>>>> roads?
>>>>>
>>>>> No, you cannot copyright the data itself. What is copyrighted is the
>>>>> *representation* of the data, while the *selection* of the data MIGHT
>>>>> be copyrighted.
>>>>
>>>> This is a very important distinction. The selection issue can be
>>>> difficult, and is most applicable when you are using the same
>>>> subset of data as someone else. If you and the other person are
>>>> providing complete data that is not a breach since there is only
>>>> one way to have everything. :-) Also an obvious form of
>>>> representation of the material (such as alphabetical order) is not
>>>> copyrightable.
>>>
>>> How about augmented data? Ie: say someone has a set of data that
>>> you'd like
>>> to keep in its entirety, but you add some features that text cannot
>>> possibly
>>> have (like, say links to supporting papers for important datapoints,
>>> or zoom-in on graphs). Is that considered copyright infringement?
>>
>> Augmenting data helps to establish the fact that you are not limiting
>> yourself to the original author's selection process. In many of
>> these cases determining whether there has been a breach of copyright
>> will never be a black and white situation. We really are looking at
>> a balance of probabilities.
>
> I am not a lawyer, but, in the United States at least, isn't Feist v.
> Rural relevant?

Definitely. For those interested see
http://caselaw.lp.findlaw.com/scripts/getcase.pl?court=US&vol=499&invol=340

Ec