Mailing List Archive: Project Proposal: Wikicat

Project Proposal: Wikicat

Jul 27, 2006, 11:33 PM

Post #1 of 17 (3217 views)

All-

I hereby propose for your consideration Wikicat, a
project to create an open bibliographic catalog. The
purpose of Wikicat is both to lay the groundwork for a
scholarly apparatus to be used within Wikipedia and to
create a unique and valuable information resource in
its own right. In particular, Wikicat will:

* facilitate the process of citation by automatically
fetching bibliographic data based upon unique keys
such as ISBN, ISSN, and LCCN
* allow users to more easily navigate between
information resources by grouping them in a
functionally significant manner (in particular,
according to the principles of [[w:FRBR]]) so that,
for example, different editions, translations, etc.
are all joined together
* apply Wikipedia's collaborative content creation
model to bibliographic data, resulting in a catalog of
unprecedented detail
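
Fetching by key presumably starts with checking that the key is well-formed. As a rough sketch only (this is not Wikicat code, and the function names are made up for illustration), the standard ISBN-10 and ISBN-13 check-digit rules could be written like this:

```python
def _strip_separators(isbn: str) -> str:
    """Drop the hyphens and spaces that printed ISBNs usually carry."""
    return isbn.replace("-", "").replace(" ", "").upper()


def is_valid_isbn10(isbn: str) -> bool:
    """ISBN-10: weights 10..1, and the weighted sum must be divisible by 11."""
    chars = _strip_separators(isbn)
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # 'X' stands for 10 and is only legal as the check digit
        elif "0" <= char <= "9":
            value = int(char)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


def is_valid_isbn13(isbn: str) -> bool:
    """ISBN-13: alternating weights 1 and 3, sum must be divisible by 10."""
    chars = _strip_separators(isbn)
    if len(chars) != 13 or not all("0" <= c <= "9" for c in chars):
        return False
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(chars))
    return total % 10 == 0
```

A check like this only catches mistyped keys; it says nothing about whether a catalog actually holds a record for the number.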

In terms of implementation, Wikicat will be defined
like any other [[m:Wikidata]] dataset and will
integrate with other datasets such as WiktionaryZ to
share common entities and perhaps someday support
something along the lines of Semantic MediaWiki.
Because Wikidata is not yet code complete, though,
Wikicat will be deployed in stages. During the first
stage it will exist as a read-only database that
populates itself on an as-needed/"as-cited" basis by
importing data from the open catalog servers of
institutions such as the Library of Congress, the
University of California library system, and the U.S.
National Library of Medicine.
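
The "as-cited" population described above resembles a fill-on-first-use store. A minimal sketch, assuming a caller-supplied fetch function that stands in for a query to a library's catalog server (the class name, `fetch_remote`, and the record shape are all hypothetical, not part of the proposal):

```python
from typing import Callable, Dict, Optional

Record = Dict[str, str]


class AsCitedCatalog:
    """Local store that imports a record the first time its key is cited."""

    def __init__(self, fetch_remote: Callable[[str], Optional[Record]]):
        # fetch_remote is an assumed stand-in for a remote catalog lookup.
        self._fetch_remote = fetch_remote
        self._records: Dict[str, Record] = {}

    def cite(self, key: str) -> Optional[Record]:
        """Return the record for key, importing it on first use."""
        if key not in self._records:
            record = self._fetch_remote(key)
            if record is None:
                return None  # unknown to the remote catalog; nothing stored
            self._records[key] = record
        return self._records[key]

    def __len__(self) -> int:
        return len(self._records)
```

Citing the same key twice reaches the remote source only once; every later citation is served from the local copy.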

Details about the project, in increasing technical
detail, are available on the following pages:

http://meta.wikimedia.org/wiki/Proposals_for_new_projects#Wikicat
http://meta.wikimedia.org/wiki/Wikicat
http://meta.wikimedia.org/wiki/Wikicat_Technical_Design

Coding of the first stage of the project is nearly
complete, and a list of its operational requirements
will follow soon. Here is a demo of Wikicat
integration with the Cite/<ref> extension:

http://meta.wikimedia.org/wiki/Image:Wikicat_Cite_screenshoot.png

Thank you for your time and I look forward to your
comments.

Jonathan Leybovich

_______________________________________________
foundation-l mailing list
foundation-l@wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/foundation-l

Re: Project Proposal: Wikicat

oldakquill at gmail

Jul 28, 2006, 3:10 AM

Post #2 of 17 (3198 views)

I'm normally against the creation of new projects, but this sounds
like a pretty good idea. Presumably it'll be a little like Commons,
but handling citations instead of images. I suppose other Wikimedia
projects will make use of this; do you hope to allow non-Wikimedia
projects to use it as well?

The project will aim to catalogue books, news, journals, what else? Film?

How will different referencing styles be handled?
--
Oldak Quill (oldakquill@gmail.com)

Re: Project Proposal: Wikicat

jleybov at yahoo

Jul 28, 2006, 8:23 AM

Post #3 of 17 (3179 views)

>
> I'm normally against the creation of new projects,
> but this sounds
> like a pretty good idea. Presumably, it'll be a
> little like Commons
> but instead of images, would handle citations. I
> suppose other
> Wikimedia projects will make use of this, do you
> hope to allow
> non-WikiMedia projects to use it?

Strictly speaking, citation is its own project (coming
very shortly, I hope):

http://meta.wikimedia.org/wiki/Wikicite

Wikicat will simply act as the data back-end to
support citation, though the two projects are closely
interrelated. For example, citing a work for the
first time will import it into Wikicat, while editing
a work within Wikicat will affect the data displayed
by all Wikipedia articles citing it.

The software for this is being written as an extension,
so it should be usable by any MediaWiki installation.

>
> The project will aim to catalogue books, news,
> journals, what else? Film?

Yes, everything that is currently catalogable will be
supported: books, journals, film, artwork and
artifacts, maps, electronic resources, recorded sound,
natural specimens and realia, etc. It will be like an
IMDb for everything. What will separate it from
existing catalogs, hopefully, is its detail: entries
for a particular journal will describe every article
contained within its issues, an entry for a movie will
show every song or piece of music used on its
soundtrack, etc. Wikicat will also use the model
proposed in Functional Requirements for Bibliographic
Records (FRBR) so that the same or related content can
be easily found, no matter what form it's published in:

http://www.frbr.org/eg/hp-goblet-1.html
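
For readers unfamiliar with FRBR, its first three levels (work, expression, manifestation) can be sketched as nested records. This is only an illustration of the grouping idea, not the Wikicat schema; the field names are assumptions and the item level is left out:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Manifestation:
    """One published embodiment, e.g. a particular printing or release."""
    publisher: str
    year: int
    isbn: str = ""  # may be empty; not everything has an ISBN


@dataclass
class Expression:
    """One realization of a work, e.g. a translation or an audio reading."""
    language: str
    form: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract creation that all expressions and editions hang off."""
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)

    def all_manifestations(self) -> List[Manifestation]:
        """Every edition of the work, whatever its language or form."""
        return [m for e in self.expressions for m in e.manifestations]

    def languages(self) -> List[str]:
        """Languages in which the work has been expressed, sorted."""
        return sorted({e.language for e in self.expressions})
```

With records shaped like this, "show me every edition and translation" becomes a walk down from one Work rather than a search across unrelated catalog entries.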

>
> How will different referencing styles be handled?

This is still an open issue, but the idea is to use a
single, very compact style so that every citation can
be captured as structured data and used to populate a
"text relationship" database (the focus of a third
project: http://meta.wikimedia.org/wiki/WikiTextrose).
This database will contain all citations, from
Wikipedia articles to published works, and from those
published works to other published works, including
perhaps someday back to Wikipedia articles :) This
will allow lots of useful functionality. For example,
users will be able to follow a
citation from a Wikipedia article to a work, see all
the works which it cites, and then perhaps improve the
article by using more specialized material than the
work which the article originally cited.
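
The "follow a citation, then see what that work cites" idea amounts to a two-step walk over a directed graph. A small sketch under that assumption (the class and method names are invented here, and node identifiers are plain strings for simplicity):

```python
from collections import defaultdict
from typing import Dict, List, Set


class CitationGraph:
    """Directed graph: an edge source -> target means 'source cites target'."""

    def __init__(self) -> None:
        self._cites: Dict[str, Set[str]] = defaultdict(set)

    def add_citation(self, source: str, target: str) -> None:
        self._cites[source].add(target)

    def cited_by(self, source: str) -> List[str]:
        """Works that source cites directly."""
        return sorted(self._cites.get(source, set()))

    def second_hop(self, article: str) -> List[str]:
        """Works cited by the article's sources but not yet by the article."""
        direct = self._cites.get(article, set())
        found: Set[str] = set()
        for work in direct:
            found |= self._cites.get(work, set())
        return sorted(found - direct - {article})
```

Here `second_hop` returns the candidates an editor might consult to replace a general source with a more specialized one.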


Re: Project Proposal: Wikicat

gerard.meijssen at gmail

Jul 28, 2006, 8:34 AM

Post #4 of 17 (3195 views)

Hoi,
The project that you propose has a very large overlap with the WikiAuthors
project. As described on Meta, that project is about, among other things,
the disambiguation of PubMed articles. It is extremely likely to
be realised. The functionality that you describe can and will by and large
be modelled using Wikidata technology.

As this project does use Wikidata technology, it would make a lot of
sense to collaborate, share our efforts, and make one big and beautiful
project.

Please let us discuss how we can and will collaborate. For your information,
WiktionaryZ's codebase is updated daily. This means, for instance, that as of
today people will have definitions shown in the language of their user
interface. If no definition exists in that language, the English one will be
shown. When there is no English definition either, any available definition
will be shown.
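
The fallback order just described (interface language, then English, then anything) is simple enough to sketch. This is not WiktionaryZ code; the function name and the dictionary keyed by language code are assumptions made for illustration:

```python
from typing import Dict, Optional


def pick_definition(definitions: Dict[str, str], ui_language: str) -> Optional[str]:
    """Choose a definition: UI language first, then English, then any."""
    if ui_language in definitions:
        return definitions[ui_language]
    if "en" in definitions:
        return definitions["en"]
    # Last resort: whichever definition happens to come first, if any exist.
    return next(iter(definitions.values()), None)
```

For example, a Dutch-interface user would see the Dutch text when it exists, the English text otherwise, and some other language only when neither is available.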

Thanks,
GerardM


Re: Project Proposal: Wikicat

eloquence at gmail

Jul 28, 2006, 10:20 AM

Post #5 of 17 (3193 views)

WiktionaryZ currently supports multilingual free-text attributes and
will support other data type attributes in the future. The data you
can associate with a WiktionaryZ entity will likely depend on what
class that entity belongs to. For instance, if an entity is of type
person, you could associate a birthdate, death date, place of birth,
place of death, first name/last name (though that might be handled on
another level), etc.

If the entity belongs to the class "publication", you could associate
references to specific editions; if it belongs to class "author", you
could associate publications, and so on.

In this generic model, a reference is just one instance of an entity
whose associated data you might want to display in a Wikipedia
article. So we would probably come up with a special type of template
that can fetch data from WiktionaryZ and insert it into a page layout.
Then you could, for instance, use

<<person:Jimmy Wales>>

to display the person data about Jimbo, or

<<country:Germany>>

to make a nice country infobox. Then again, you could also do

<<ref:The Origin of Species|author=Charles Darwin|class=book>>

to show _any_ book edition of Darwin's work, or

<<ref:The Origin of Species|ISBN=whatever>>

to cite a specific edition. If the title is not ambiguous, you might
even only have to do something like

<<ref:The Cosmic and the Comic: Einstein's Scientific Spirituality>>

and all the properties would be derived automatically from that. In
fact, the WiktionaryZ model very much matches the work/expression
distinction, though we use the terminology the other way around: an
expression refers to _any_ possible meaning of a string of characters,
whereas each meaning ("work" in this context) has its own defined
meaning ID and can also be disambiguated by its relations and other
associated data (for publications: author, year, various codes, etc.).

One advantage, in my view, is that we do not require people to pass
around codes and numbers unless they really need to and want to, and
even then, we retain the expression (work title) in the wiki source
text, making it easy to see what a particular reference is about.
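
To make the proposed tag shape concrete, here is a rough sketch of how a `<<kind:name|key=value>>` tag might be split into its parts. The syntax is only the hypothetical one floated in this message, and the parser below (its name, return shape, and error handling) is an assumption, not existing MediaWiki or WiktionaryZ code:

```python
import re
from typing import Dict, Tuple

# kind runs up to the first colon; the rest may itself contain colons.
TAG = re.compile(r"<<([^:<>|]+):([^<>]*)>>")


def parse_tag(tag: str) -> Tuple[str, str, Dict[str, str]]:
    """Split a tag into (kind, name, attributes)."""
    match = TAG.fullmatch(tag.strip())
    if match is None:
        raise ValueError(f"not a recognised tag: {tag!r}")
    kind = match.group(1).strip()
    name, *raw_attrs = match.group(2).split("|")
    attrs: Dict[str, str] = {}
    for raw in raw_attrs:
        key, sep, value = raw.partition("=")
        if not sep:
            raise ValueError(f"attribute without '=': {raw!r}")
        attrs[key.strip()] = value.strip()
    return kind, name.strip(), attrs
```

Because only the first colon separates kind from name, a title such as "The Cosmic and the Comic: Einstein's Scientific Spirituality" survives intact.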

Erik

Re: Project Proposal: Wikicat

jmerkey at wolfmountaingroup

Jul 28, 2006, 10:59 AM

Post #6 of 17 (3180 views)

Erik Moeller wrote:

>WiktionaryZ currently supports multilingual free-text attributes and
>will support other data type attributes in the future. The data you
>can associate with a WiktionaryZ entity will likely depend on what
>class that entity belongs to. For instance, if an entity is of type
>person, you could associate a birthdate, death date, place of birth,
>place of death, first name/last name (though that might be handled on
>another level), etc.
>
>If the entity belongs to the class "publication", you could associate
>references to specific editions; if it belongs to class "author", you
>could associate publications, and so on.
>
>In this generic model, a reference is just one instance of an entity
>whose associated data you might want to display in a Wikipedia
>article. So we would probably come up with a special type of template
>that can fetch data from WiktionaryZ and insert it into a page layout.
>Then you could, for instance, use
>
><<person:Jimmy Wales>>
>
>to display the person data about Jimbo, or
>
><<country:Germany>>
>
>to make a nice country infobox. Then again, you could also do
>
><<ref:The Origin of Species|author=Charles Darwin|class=book>>
>
>to show _any_ book edition of Darwin's work, or
>
><<ref:The Origin of Species|ISBN=whatever>>
>
>to cite a specific edition. If the title is not ambiguous, you might
>even only have to do something like
>
><<ref:The Cosmic and the Comic: Einstein's Scientific Spirituality>>
>
>and all the properties would be derived automatically from that. In
>fact, the WiktionaryZ model very much matches the work/expression
>distinction, though we use the terminology the other way around: an
>expression refers to _any_ possible meaning of a string of characters,
>whereas each meaning ("work" in this context) has its own defined
>meaning ID and can also be disambiguated by its relations and other
>associated data (for publications: author, year, various codes, etc.).
>
>One advantage, in my view, is that we do not require people to pass
>around codes and numbers unless they really need to and want to, and
>even then, we retain the expression (work title) in the wiki source
>text, making it easy to see what a particular reference is about.
>
>Erik
>
Erik,

Anything you can do to reduce the dependencies on extensions is a great
move. You should incorporate as many of them as you can into the main
MediaWiki tree. You also need to set up the exportDump.php program to
output dynamically generated image names as the actual names.

www.wikipedia.org/wiki/Chess

is a great example of HOW NOT to code a template for image generation.
Outputting the actual names makes it a lot easier to keep images synced
up between distributed Wikipedia mirrors. For now, I have modified your
PHP code to create a collision listing of missing image names when the
page is converted into HTML for the first time, so the wikix tools can
grab the images and input them into the database.

Jeff

Re: Project Proposal: Wikicat

jleybov at yahoo

Jul 28, 2006, 1:37 PM

Post #7 of 17 (3184 views)

GerardM wrote:

> The project that you propose has a very large overlap with the
> WikiAuthors project.

Yes, a little bit of déjà vu here ;) :

http://meta.wikimedia.org/wiki/Talk:WikiAuthors#Wikicat

>
> This project is as it is described on Meta about among other things
> the disambiguation of PubMed articles. This project is extremely
> likely to be realised. The functionality that you describe can and
> will by and large be modelled using Wikidata technology.
>
> As this project does use the Wikidata technology, it would make very
> much sense to collaborate and share our efforts and make one big and
> beautiful project.

Absolutely. The issue WikiAuthors is trying to
address ([[w:authority control]]) ideally should not
be occurring in a professionally run catalog, and
there may still be the possibility that beneath the
PubMed web interface enough information is being
recorded in the bibliographic records to uniquely
distinguish authors. But in any case the ability to
correct or enhance bibliographic data is something
Wikicat will definitely need to support, and this
hopefully will meet all the needs of WikiAuthors.

> Please let us discuss how we can / will collaborate.

Sure. Let's take this discussion off the foundation
list, just you, Erik, and me, so as not to bother
everyone with unnecessary technical details.


Re: Project Proposal: Wikicat

jleybov at yahoo

Jul 28, 2006, 3:06 PM

Post #8 of 17 (3182 views)

Erik-

I'll discuss technical integration details with you
and Gerard offline. However, in a more general vein,
I'm not sure the lexical construct WiktionaryZ wishes
to impose on every sort of Wikidata entity makes
sense. For example, multilingualism is certainly
important in a lexicographic context, but it does not
apply to a catalog. A catalog has language-specific
data, for sure, but this is not multilingual data:
the language(s) in which a book's title is
historically expressed by the author or publisher is
important, and you cannot just do your own translation
into an arbitrary language and say that is also the
book's title. Similarly, films are given multiple
titles by their distributors for different markets, yet
these are often very different from what a direct
translation would look like. Here is more detail on
these issues:

http://meta.wikimedia.org/wiki/Multilingual_Wikidata

This also does not touch on the performance/scalability
issues of storing all text data, all numeric data,
etc. in one table.

Regarding different referencing styles, I'm open to
anything, though I think you'll find that in practice
standard numbers like ISBN are less cumbersome to use
than titles. For example, <<ref:The Da Vinci Code>>:
does this mean the book, the movie, the audio book, or
"The Da Vinci Code: Fact or Fiction?" ?

Also, citation is not just fetching bibliographic data
for the purposes of displaying it in an info box like
other information. It is fundamentally about
associating an assertion with evidence or support, and
so must capture the cited "text" as well as the
paraphrase text. Here is a mock-up of this idea in
the context of an enhanced article validation feature:

http://meta.wikimedia.org/wiki/Image:Wikicite_spider_review_mockup.jpg


Re: Project Proposal: Wikicat

eloquence at gmail

Jul 28, 2006, 3:47 PM

Post #9 of 17 (3173 views)

On 7/29/06, Jonathan Leybovich <jleybov@yahoo.com> wrote:
> A catalog has language-specific
> data, for sure, but this is not multi-lingual data-
> the language(s) in which a book's title is
> historically expressed by the author or publisher is
> important, and you cannot just do your own translation
> into an arbitrary language and say that is also the
> book's title.

A translated version of the book should probably have its own
DefinedMeaning, since you will want to relate all kinds of information
specifically on that level. It could be linked to the original edition
not so much through the synonyms/translations, but through a special
relation type, e.g. "is translated edition of." Even with that
information, you may _still_ want to translate the title to languages
where no actual edition is available, so you can cite a book e.g. with
both its official title, and an unofficial translated title in the
language of the Wikipedia edition you're using.

> This also does not touch the performance/scalability
> issues of storing all text data, all numeric data,
> etc. in one table.

It won't be quite as simple as that, but let's discuss that.

> Regarding different referencing styles, I'm open to
> anything though I think you'll find that in practice
> standard numbers like ISBN are less cumbersome to use
> than titles. For example, <<ref:The Da Vinci Code>>-
> does this mean the book, the movie, the audio book, or
> "The Da Vinci Code: Fact or Fiction?" ?

Referencing books that also have movie adaptations doesn't seem quite
as common. If I look at a real-world example, e.g. [[Emu]], I'll find
references like

* The heat load from solar radiation on a large, diurnally active
bird, the emu (Dromaius novaehollandiae)
* Ventilatory accommodation of oxygen demand and respiratory water
loss in a large bird, the emu (Dromaius novaehollandiae), and a
re-examination of ventilatory allometry for birds
* Endocrine and testicular changes in a short-day seasonally breeding
bird, the emu (Dromaius novaehollandiae), in southwestern Australia.

I don't think any of these will be turned into movies soon. ;-) In
addition, you could even capture the type of reference with the
template name, e.g. <<book:The Da Vinci Code>> would only refer to an
entity which has the class membership "book". You could also do
pre-save transformations, that is, the user types <<book:Some title>>,
and if the expression unambiguously refers to one publication, a
unique identifier is automatically inserted into the reference in
addition to the title.
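
A pre-save transformation of this kind could look roughly like the sketch below. Everything in it is an assumption for illustration: the `id=` attribute, the catalog as a plain title-to-identifiers mapping, and the function name are not an existing MediaWiki hook or API:

```python
import re
from typing import Dict, List

# Only bare tags match; tags that already carry attributes are left alone.
BOOK_TAG = re.compile(r"<<book:([^<>|]+)>>")


def pre_save_transform(wikitext: str, catalog: Dict[str, List[str]]) -> str:
    """Add an identifier to each <<book:Title>> that matches exactly one entry."""

    def expand(match: "re.Match[str]") -> str:
        title = match.group(1).strip()
        identifiers = catalog.get(title, [])
        if len(identifiers) == 1:
            return f"<<book:{title}|id={identifiers[0]}>>"
        return match.group(0)  # unknown or ambiguous: leave for the editor

    return BOOK_TAG.sub(expand, wikitext)
```

Because expanded tags contain a `|`, running the transformation a second time leaves them unchanged, and the title stays readable in the wiki source either way.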

Of course, the ideal user interface would probably give you a little
pop-up when you click on a toolbar icon, let you search (or add!) the
reference information, and insert the right tags into the wiki source
text.

> Also, citation is not just fetching bibliographic data
> for the purposes of displaying it in an info box like
> other information. It is fundamentally about
> associating an assertion with evidence or support, and
> so must capture the cited "text" as well as the
> paraphrase text. Here is a mock-up of this idea in
> the context of an enhanced article validation feature:
>
> http://meta.wikimedia.org/wiki/Image:Wikicite_spider_review_mockup.jpg

:-) I see we have indeed been thinking about very similar problem
areas. You'll find a mock-up of a simple scoped citation syntax on
p. 190 of the first edition of my book:
http://medienrevolution.dpunkt.de/files/Medienrevolution-1.pdf

I like your systematic source review, though I'm not sure adding this
additional UI layer is necessary. One thing to keep in mind: In a wiki
review process, you'll probably almost never flag things as
"misleading" or "made-up" -- instead, you should encourage direct
editing of the content.

I think we'll have lots to talk about.

Erik

Re: Project Proposal: Wikicat

saintonge at telus

Jul 29, 2006, 2:11 PM

Post #10 of 17 (3185 views)

Jonathan Leybovich wrote:

>However, in a more general vein,
>I'm not sure the lexical construct WiktionaryZ wishes
>to impose on every sort of Wikidata entity makes
>sense. For example, multi-lingualism is certainly
>important in a lexicographic context, but it does not
>apply to a catalog. A catalog has language-specific
>data, for sure, but this is not multi-lingual data-
>the language(s) in which a book's title is
>historically expressed by the author or publisher is
>important, and you cannot just do your own translation
>into an arbitrary language and say that is also the
>book's title. Similarly, films are given multiple
>titles by their distributors for different markets yet
>often these are very different from what a direct
>translation would look like. Here is more detail on
>these issues:
>
>http://meta.wikimedia.org/wiki/Multilingual_Wikidata
>
The common name of a taxonomic entity IS a translation of the formal
Latin name. Why should it be treated as a separate data type? The
endangered species Microhexura montivaga from North Carolina and
Tennessee is known in English as the spruce-fir moss spider. It may
have a name in the first nations languages of the area, but can it
really have a name in any other language? You can invent a name in
every other language, but these would be purely hypothetical, and not
supported by actual usage. This would amount to unverifiable original
research.

Similarly, arbitrary translations of book titles would be unacceptable.
For a translated title to be meaningful it must have in fact been
translated that way. The title of Camus' famous book "L'Étranger" has
been translated both with the literal "The Stranger" and the literary
"The Outsider"; these would both be valid entries for the respective
translations of the book, each of which would then require a
"Translation of ..." entry.

>Regarding different referencing styles, I'm open to
>anything though I think you'll find that in practice
>standard numbers like ISBN are less cumbersome to use
>than titles. For example, <<ref:The Da Vinci Code>>-
>does this mean the book, the movie, the audio book, or
>"The Da Vinci Code: Fact or Fiction?" ?
>
ISBNs have their limitations too. My copy of the 1977, 22nd edition of
"Dorland's Pocket Medical Dictionary" has two different ISBNs depending
on whether there are index tabs on the fore-edge. My earlier 1922, 11th
edition, "The American Illustrated Medical Dictionary", (edited by
Dorland) does not have an ISBN. I bought that specific edition to
enable me to check out potential copyright problems. My more recent
2003, 30th edition, "Dorland's illustrated medical dictionary", has four
different ISBNs depending on whether it's a standard, deluxe, trade or
international edition. There are 27 other editions, and a distinction
still needs to be made between pocket and full-size editions. Sometimes
the difference is important; other times it isn't. The issue is not so
simple that it can be solved by simply using an ISBN number.
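
One way to picture the point: an ISBN can serve as a lookup key into edition records, but it cannot be the record's identity, because one edition may carry several ISBNs and another none. A small sketch under that assumption (the `Edition` type and the index function are invented here, and any ISBN strings passed in are up to the caller):

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Edition:
    title: str
    year: int
    edition: str
    isbns: Tuple[str, ...] = ()  # zero, one, or several ISBNs


def build_isbn_index(editions: Iterable[Edition]) -> Dict[str, Edition]:
    """Map every known ISBN to the edition record that carries it."""
    index: Dict[str, Edition] = {}
    for edition in editions:
        for isbn in edition.isbns:
            index[isbn] = edition
    return index


def without_isbn(editions: Iterable[Edition]) -> List[Edition]:
    """Editions that can only be found by title, year, or other data."""
    return [e for e in editions if not e.isbns]
```

Under this shape the two ISBNs of the tabbed and untabbed 1977 printing would resolve to the same record, while the 1922 edition would only be reachable through its other attributes.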

>Also, citation is not just fetching bibliographic data
>for the purposes of displaying it in an info box like
>other information. It is fundamentally about
>associating an assertion with evidence or support, and
>so must capture the cited "text" as well as the
>paraphrase text. Here is a mock-up of this idea in
>the context of an enhanced article validation feature:
>
>http://meta.wikimedia.org/wiki/Image:Wikicite_spider_review_mockup.jpg
>
That simple example does well to illustrate some of the difficulties
that are faced in scientific description. My first inclination would be
to ask whether the description is consistent (but not necessarily
identical) with the description that is officially accepted by the
relevant international society. Would a spider be better described as a
kind of arachnid arthropod rather than just an invertebrate? The first
citation is very poor because it uses a simile. Saying that it is
"like" most invertebrates does not imply that it "is" an invertebrate.
Citations sometimes need to be rigorously applicable.

Ec


Re: Project Proposal: Wikicat

lars at aronsson

Jul 29, 2006, 3:12 PM

Post #11 of 17 (3178 views)

Ray Saintonge wrote:

> ISBNs have their limitations too. My copy of the 1977, 22nd
> edition of "Dorland's Pocket Medical Dictionary" has two
> different ISBNs depending on whether there are index tabs on the
> fore-edge. My earlier 1922, 11th edition, "The American
> Illustrated Medical Dictionary", (edited by Dorland) does not
> have an ISBN.

These comments, while accurate, are on the absolute layman level.

Even though I think Wikicat is one of the most promising project
proposals in the last few years, I fear it will have a hard time
explaining, over and over again, the basics of bibliography to all
newcomers who think they are experts. How can this project sort
out the beginners and give them easy but meaningful tasks where
they can be productive, without their ignorance causing damage?

Could we perhaps have a separate mailing list (wikicat-l) for
discussions about Wikicat, even before the project is formally
established? That would be an opportunity to establish some level
of common knowledge and "get to know" each other.

--
Lars Aronsson (lars@aronsson.se)
Aronsson Datateknik - http://aronsson.se

Re: Project Proposal: Wikicat

meta.sj at gmail

Jul 29, 2006, 4:07 PM

Post #12 of 17 (3183 views)

On Sun, 30 Jul 2006, Lars Aronsson wrote:

>> ISBNs have their limitations too. My copy of the 1977, 22nd
>> edition of "Dorland's Pocket Medical Dictionary" has two
>> different ISBNs depending on whether there are index tabs on the
>> fore-edge. My earlier 1922, 11th edition, "The American
>> Illustrated Medical Dictionary", (edited by Dorland) does not
>> have an ISBN.
>
> These comments, while accurate, are on the absolute layman level.
>
> Even though I think Wikicat is one of the most promising project
> proposals in the last few years, I fear it will have a hard time
> explaining, over and over again, the basics of bibliography to all
> newcomers who think they are experts. How can this project sort
> out the beginners and give them easy but meaningful tasks where
> they can be productive, without their ignorance causing damage?
>
> Could we perhaps have a separate mailing list (wikicat-l) for
> discussions about Wikicat, even before the project is formally
> established? That would be an opportunity to establish some level
> of common knowledge and "get to know" each other.

YES. Please.

Re: Project Proposal: Wikicat [ In reply to ]

jleybov at yahoo

Jul 29, 2006, 6:44 PM

Post #13 of 17 (3180 views)

Lars Aronsson wrote:
>
>
> Even though I think Wikicat is one of the most promising project
> proposals in the last few years, I fear it will have a hard time
> explaining, over and over again, the basics of bibliography to all
> newcomers who think they are experts. How can this project sort
> out the beginners and give them easy but meaningful tasks where
> they can be productive, without their ignorance causing damage?

One of the soft deliverables of the project proposal
is documentation and hopefully a training regime to
orient new users. The hope is that appropriate groups
like Wikiproject Librarians could take up the
responsibility once Wikicat became world-editable.

http://meta.wikimedia.org/wiki/Wikicat#Stage_1.0

>
> Could we perhaps have a separate mailing list (wikicat-l) for
> discussions about Wikicat, even before the project is formally
> established? That would be an opportunity to establish some level
> of common knowledge and "get to know" each other.
>

It would be very useful, I think, to invite the
members of Wikiproject Librarians to this discussion
from the outset. If anyone knows of similar groups on
the non-English Wikipedias please invite them as well.


Re: Project Proposal: Wikicat [ In reply to ]

saintonge at telus

Jul 29, 2006, 9:25 PM

Post #14 of 17 (3184 views)

Jonathan Leybovich wrote:

>All-
>
>I hereby propose for your consideration Wikicat, a
>project to create an open, bibliographic catalog. The
>purpose of Wikicat is to both lay the groundwork for a
>scholarly apparatus to be used within Wikipedia as
>well as create a unique and valuable information
>resource in its own right. In particular, Wikicat
>will:
>
>* facilitate the process of citation by automatically
>fetching bibliographic data based upon unique keys
>such as ISBN, ISSN, and LCCN
>* allow users to more easily navigate between
>information resources by grouping them in a
>functionally significant manner (in particular,
>according to the principles of [[w:FRBR]]) so that,
>for example, different editions, translations, etc.
>are all joined together
>* apply Wikipedia's collaborative content creation
>model to bibliographic data, resulting in a catalog of
>unprecedented detail
>
>In terms of implementation, Wikicat will be defined
>like any other [[m:Wikidata]] dataset and will
>integrate with other datasets such as WiktionaryZ to
>share common entities and perhaps someday support
>something along the lines of a Semantic Mediawiki. As
>Wikidata is currently not code complete, though,
>Wikicat will be deployed in stages, during the first
>of which it will exist as a read-only database that
>populates itself on an as-needed/"as-cited" basis by
>importing data from the open catalog servers of such
>institutions as the Library of Congress, the
>University of California library system, the U.S.
>National Library of Medicine, etc.
>
>Details about the project, in increasing technical
>detail, are available on the following pages:
>
>http://meta.wikimedia.org/wiki/Proposals_for_new_projects#Wikicat
>http://meta.wikimedia.org/wiki/Wikicat
>http://meta.wikimedia.org/wiki/Wikicat_Technical_Design
>
>Coding of the first stage of the project is nearly
>complete and a list of its operational requirements
>will soon be forthcoming. Here is a demo of Wikicat
>integration with the Cite/<ref> extension:
>
>http://meta.wikimedia.org/wiki/Image:Wikicat_Cite_screenshoot.png
>
>Thank you for your time and I look forward to your
>comments.
>
While I have deep sympathy for the intentions of this proposal, I also
find that the kind of theoretical discussion linked to the
proposal offers very little encouragement to the average contributor.

Citations and verifiability are absolutely essential to the credibility
of Wikipedia and its sister projects. Nevertheless, a person
undertaking to substantiate his contributions should not need a
professional librarianship background to do so. Any manner of clearly
identifying the source should be acceptable. If someone else considers
it important to bring the format of citations up to modern library
standards, he should feel free to do that without blame being attributable to
the original contributor.

There is also a need to begin referencing the material that is
relatively easy to access on line or in other relatively inexpensive
sources of public domain material, like CDs sold for $5.00 each that can
each easily contain 100 books or more. In the last few years this
material has been produced at a phenomenal rate. These are available in
image, ASCII plain Jane or more scholarly annotatable formats. We could
begin by including our own Wikisource material in the catalogue.

Ec


Re: Project Proposal: Wikicat [ In reply to ]

eloquence at gmail

Jul 29, 2006, 9:53 PM

Post #15 of 17 (3179 views)

On 7/30/06, Ray Saintonge <saintonge@telus.net> wrote:

> Citations and verifiability are absolutely essential to the credibility
> of Wikipedia and its sister projects. Nevertheless, a person
> undertaking to substantiate his contributions should not need a
> professional librarianship background to do so. Any manner of clearly
> identifying the source should be acceptable.

Absolutely. However, we should also make the tools of professional
referencing as easy to use as possible. You're right that freely
available (not necessarily freely licensed) content will be the first
to be referenced. Thus, it is likely that Wikimedia will become both a
primary beneficiary and driving force of the open access movement.

What is saddening to me is that even better referencing tools and
systematic source checking processes will likely not be sufficient to
deal adequately with the vast amounts of knowledge that is _not_ free
or not even digital. Indeed, already today, I've seen quite a lot of
cases where Wikipedians have reacted with intense frustration to the
citation of sources that they could not verify simply by following a
link.

One of my great hopes is that a broad international coalition of NGOs
will eventually emerge to call for harmonization of copyright terms to
a reasonable length. Perhaps Wikimedia could be part of such a
coalition. If I look at the fantastic work Project Gutenberg is doing
on even the most obscure publications, I cannot begin to imagine the
profound effects on our culture it would have if copyright would last,
say, 14 years, with the option to renew for another 14:
http://creativecommons.org/projects/founderscopyright/

Erik

Project Proposal: Wikicat [ In reply to ]

jleybov at yahoo

Jul 30, 2006, 9:51 AM

Post #16 of 17 (3196 views)

Ray Saintonge wrote:
> Citations and verifiability are absolutely essential to the credibility
> of Wikipedia and its sister projects. Nevertheless, a person
> undertaking to substantiate his contributions should not need a
> professional librarianship background to do so. Any manner of clearly
> identifying the source should be acceptable.

Absolutely. The catalog that is the subject of this
project proposal is distinct from, though obviously
closely interrelated with, the issue of citation.
Whatever is most convenient for actual editors
should be supported, though I suspect that ambiguous
citation styles, where the user must take an extra step
to disambiguate the work/edition he actually meant,
will not be all that popular in practice, in addition
to being more involved to implement within the
software. But again, we should support whatever is
most useful to users.

> There is also a need to begin referencing the material that is
> relatively easy to access on line or in other relatively inexpensive
> sources of public domain material, like CDs sold for $5.00 each that can
> each easily contain 100 books or more. In the last few years this
> material has been produced at a phenomenal rate. These are available in
> image, ASCII plain Jane or more scholarly annotatable formats. We could
> begin by including our own Wikisource material in the catalogue.

Yes, all these types of items will eventually be
supported by the catalog. But this will not be
possible during the bootstrap/read-only phase of the
catalog, where the only data within it will be
imported from other catalogs like the Library of
Congress. Not only do certain technical pre-reqs need
to be met before Wikicat is made world editable (i.e.
the completion of Wikidata), but, as Lars pointed
out, soft pre-reqs need to be met as well,
particularly the creation of cataloging standards and
some sort of training/orientation regime for new editors.


Re: Project Proposal: Wikicat [ In reply to ]

saintonge at telus

Jul 30, 2006, 11:20 PM

Post #17 of 17 (3193 views)

Erik Moeller wrote:

>On 7/30/06, Ray Saintonge <saintonge@telus.net> wrote:
>
>
>>Citations and verifiability are absolutely essential to the credibility
>>of Wikipedia and its sister projects. Nevertheless, a person
>>undertaking to substantiate his contributions should not need a
>>professional librarianship background to do so. Any manner of clearly
>>identifying the source should be acceptable.
>>
>>
>Absolutely. However, we should also make the tools of professional
>referencing as easy to use as possible. You're right that freely
>available (not necessarily freely licensed) content will be the first
>to be referenced. Thus, it is likely that Wikimedia will become both a
>primary beneficiary and driving force of the open access movement.
>
I can be patient.

>What is saddening to me is that even better referencing tools and
>systematic source checking processes will likely not be sufficient to
>deal adequately with the vast amounts of knowledge that is _not_ free
>or not even digital. Indeed, already today, I've seen quite a lot of
>cases where Wikipedians have reacted with intense frustration to the
>citation of sources that they could not verify simply by following a
>link.
>
That's not just sad; it's scary. It's on a par with saying, "If it's
on the internet it must be true." It reflects a series of tendencies in
the developed world with profound societal effects. When the most
important factor for gaining knowledge is convenience it puts us on
track for a Fahrenheit-451 kind of world. And I'm sure there is a
certain segment of society that will be quite happy to encourage the
people to take their new form of opium.

>One of my great hopes is that a broad international coalition of NGOs
>will eventually emerge to call for harmonization of copyright terms to
>a reasonable length. Perhaps Wikimedia could be part of such a
>coalition. If I look at the fantastic work Project Gutenberg is doing
>on even the most obscure publications, I cannot begin to imagine the
>profound effects on our culture it would have if copyright would last,
>say, 14 years, with the option to renew for another 14:
>http://creativecommons.org/projects/founderscopyright/
>
Reducing the terms of copyright back to a reasonable level will be an
uphill fight, and the way some of our colleagues bend over backwards to
prevent the least suggestion of a copyright violation and stay law abiding
does not give me a lot of hope.

Ec
