Mailing List Archive: moving forward on article validation

moving forward on article validation

Jun 13, 2006, 4:50 PM

Post #1 of 8 (838 views)

(Mainly concerning wikipedia, but cross-posting to foundation-l because
of some discussion of committees; see the end.)

We've discussed on and off that it'd be nice to vet specific revisions
of Wikipedia articles so readers can either choose to read only quality
articles, or at least have an indication of how good an article is.
This is an obvious prerequisite for a Wikipedia 1.0 print edition, and
would be nice on the website as well.

There is a lengthy list of proposals here:
http://meta.wikimedia.org/wiki/Article_validation_proposals

I wanted to try to rekindle the process by summarizing some of the
proposals, which I think can be grouped into three main types, and then
suggest some ideas on where to go from there.

Proposal #1: Fork or freeze, then bring up to our quality standards.
---
Wikipedians would look around for articles that look reasonably good
(perhaps starting with feature articles) and nominate them to be worked
on. Then either freeze them (by placing a notice or some sort of
technical measure), or else fork them off to a copy. The articles would
then be checked for referencing, accuracy, grammar, and so on, possibly
only by users who've met some bar for participation in the clean-up
process, resulting in an article suitable for publication. Forking or
freezing is to ensure the cleanup process actually terminates rather
than trying to clean up a moving target; there are of course pros and
cons to forking vs. freezing.

Some pros: Fairly straightforward; follows successful methods of "stable
release" management in the software-development world; allows a certain
amount of editorial work not normally suitable for an in-progress
encyclopedia (like cutting out an entire section because it's too far
from being done to go in the print version); is easy to integrate
"expert review" into as a last vetting step before it goes out the door.

Some cons: Either disrupts normal editing through a freeze, or results
in duplicated effort with a fork. Also is likely to result in a fairly
slow process, so the reviewed version of each article may be replaced
with an updated version quite infrequently; most articles will have no
reviewed version, so doesn't do much for increasing the typical quality
of presentation on the website.

Proposal #2: Institute a rating and trust-metric system
---
Wikipedians rate revisions, perhaps on some scale from "complete crap"
to "I'm an expert in this field and am confident of its accuracy and
high quality". Then there is some way of coming up with a score for
that revision, perhaps based on the trustworthiness of the raters
themselves (determined through some method). Once that's done, the
interface can do things like display the last version of an article over
some score, if any, or a big warning that the article sucks otherwise
(and so on).

Some pros: Distributed; no duplicated effort; good revisions are marked
good as soon as enough people have vetted them; humans review the
articles, but the "process" itself is done automatically; most articles
will have some information about their quality to present to a reader

Some cons: Gameing-proof trust metric systems are notoriously hard to
design.

Proposal #3: Extend a feature-article-like process
---
Extend a feature-article type process to work on revisions rather than
articles. For example, nominate revision X of an article as a featured
article; improve it during the process until it gets to a revision Y
that people agree is good. Then sometime later, nominate a new revision
Z, explain what the differences are, and discuss whether this should
supercede the old featured version. Can also have sub-featured statuses
like "good" or "mediocre, but at least nothing is outright wrong". In
principle can be done with no code changes, though there are some that
could ease things along greatly.

Some pros: Gets at the effect of proposal #2 but with a flexible
human-run system instead of an automatic system, and therefore less
likely to be brittle.

Some cons: Will need carefully-designed software assistance to keep all
the information and discussion manageable and avoid descending into a
morass of thousands upon thousands of messy talk pages

---

These are not necessarily mutually exclusive. In my opinion, something
like #3 would be best suited to marking quality of revisions on the
website, and then the best of these could feed into a process like #1
that would do final vetting and cleanup before a print publication (in
addition to print-specific things like editing for space, formatting,
image resolution, etc.).

In any case, obviously proposals can come and go forever. None are
implemented, but that's partly because nobody wants to sink a bunch of
time into implementing a system when there's no guarantee it will even
be used. My hope is to condense the discussion so we choose some
high-level ideas on how to proceed before moving on to the inevitable
details, and then move to implementation once we've agreed what we
actually want.

On an organizational level, it may be useful to have a working group
sorting this out to focus the process. It may be useful, in my opinion,
for the Foundation to make it an official committee of sorts and
indicate at least informally that it'll support getting its
recommendations enacted (e.g. paying another developer if development
resources are the bottleneck). I would be willing to devote a
significant amount of time to such a committee, since I think this is
the single biggest problem holding back Wikipedia's usefulness to the
general public, and I'm sure there are at least several other people
with ideas and expertise in this area who would be willing to do so as well.

Thoughts?

-Mark

_______________________________________________
foundation-l mailing list
foundation-l@wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/foundation-l