Mailing List Archive

Re: Long-term archiving of Wikimedia content
Additionally, a simple lens can be ground with hand tools, and with no
preexisting technology except glassmaking, to produce magnification of
several hundred power.


David Goodman, Ph.D, M.L.S.
http://en.wikipedia.org/wiki/User_talk:DGG



On Wed, May 6, 2009 at 11:01 AM, Milos Rancic <millosh@gmail.com> wrote:
> On Wed, May 6, 2009 at 12:50 PM, Thomas Dalton <thomas.dalton@gmail.com> wrote:
>> What are you using to compare the quality of optics to the quality of computers?
>
> * For example, we have had optics for 2700 years and computers for
> somewhat more than 50 years.
> * Optics has been able to help humans with disabilities for a long time,
> while computers are just starting on that task.
> * Optics is able to reach events around the beginning of the Universe,
> while computers are only now starting to do really big things (for
> example, a really useful Internet has existed for a lot of humans for
> only a couple of years).
> * A lot of basic cures were invented thanks to optics.
> * Thanks to optics we are able to make computers.
> * Generally, everything that distinguishes a simple technology from
> complex contemporary technology depends on optics.
>
> Of course, it is [still] not possible to compare such things exactly,
> and such a comparison may be a good source for parody. And, of
> course, computers have started to help in all of those fields, but,
> generally, most ordinary things around us still depend much
> more on optics than on computers.
>
> _______________________________________________
> foundation-l mailing list
> foundation-l@lists.wikimedia.org
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
>

_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: Long-term archiving of Wikimedia content
Aryeh Gregor wrote:
> Actually, there are more assumptions: you have to assume that humanity
> *ever* recovers, and within a period of time when people will still
> understand written English. You'd have to calibrate the magnitude of
> a catastrophe *very* carefully to get a situation where civilization
> collapses, to the extent that none of the hundreds of millions of
> computers on the planet remains functional for long enough to print
> out any needed info (even using wood/biomass-powered backup
> generators, or emergency fuel supplies) . . . but you still have
> people who can read English around. People are more fragile than
> computers, and not much more numerous.

In that futuristic approach I find it more likely that there will be no
paper / printer, but instead everything will be stored on
computers/PDAs and transferred between them. So in the event of the
catastrophe you'd only be able to access it with the surviving devices.

Suppose that it does happen *today*.
All electronic systems collapse but the ones at your home.
Also, you cannot produce new ones.
You have a copy of wikipedia on your hard disk. You can access it.
But your computer lifetime is finite. And you also don't know for how
much time you'll still have electric current.
What do you do?

Were I in such a situation, I wouldn't have enough paper or ink to print
it all. Nor would I be able to finish in the time available before the
electric current is cut. Also, I don't think my hardware is prepared to
handle such an amount of work.


From [[en:Printer_(computing)]]
> However, printers are generally slow devices (30 pages per minute is
> considered fast; and many inexpensive consumer printers are far
> slower than that), and the cost per page is actually relatively high.

That's 0.5 pages per second.
Let's assume our printer is able to print a page per second.

Suppose that on each page we can print 120 lines, with 200 characters
per line.
Thus we can store 24,000 bytes per page and each second save 24,000 bytes
from oblivion.
eswiki-20090504-pages-meta-current is 4,721,723,604 bytes.
So it would take 196,738.4835 seconds to print it with such a
configuration: 2 days, 6 hours, 39 minutes.

Doable in terms of time, especially if the printer is faster.
You would still need 196,739 sheets, though.
And they would weigh 983 kg!
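
The estimate above can be reproduced with a few lines of Python. This is
only a sketch: the dump size, page geometry and print speed come from the
message, while the figure of about 5 g per sheet (roughly an A4 sheet of
80 g/m² paper, printed on one side) is an assumption chosen to match the
quoted weight, and the function names are illustrative.

```python
import math

# Figures taken from the message above.
DUMP_BYTES = 4_721_723_604      # eswiki-20090504-pages-meta-current
LINES_PER_PAGE = 120
CHARS_PER_LINE = 200
PAGES_PER_SECOND = 1.0

# Assumption, not stated in the thread: about 5 g per single-sided sheet.
GRAMS_PER_SHEET = 5.0


def print_estimate(dump_bytes, pages_per_second=PAGES_PER_SECOND,
                   grams_per_sheet=GRAMS_PER_SHEET):
    """Return (seconds, sheets, kilograms) needed to print dump_bytes."""
    bytes_per_page = LINES_PER_PAGE * CHARS_PER_LINE   # 24,000 bytes
    pages = dump_bytes / bytes_per_page
    seconds = pages / pages_per_second
    sheets = math.ceil(pages)   # the last partial page still needs a sheet
    kilograms = sheets * grams_per_sheet / 1000
    return seconds, sheets, kilograms


def split_duration(seconds):
    """Break a duration into whole (days, hours, minutes, seconds)."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return days, hours, minutes, secs


seconds, sheets, kilograms = print_estimate(DUMP_BYTES)
print(split_duration(seconds), sheets, round(kilograms, 1))
```

Passing `pages_per_second=0.5` models the "fast" 30 pages per minute
printer from the quoted article and doubles the time; the sheet count and
weight do not change.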



Re: Long-term archiving of Wikimedia content
On Wed, May 6, 2009 at 8:13 PM, Platonides <Platonides@gmail.com> wrote:

> Suppose that it does happen *today*.
> All electronic systems collapse but the ones at your home.
> Also, you cannot produce new ones.
> You have a copy of wikipedia on your hard disk. You can access it.
> But your computer lifetime is finite. And you also don't know for how
> much time you'll still have electric current.
> What do you do?


Sell my computer for food, water, guns, and ammunition?

Re: Long-term archiving of Wikimedia content
On Wed, May 6, 2009 at 8:46 PM, Anthony <wikimail@inbox.org> wrote:
> On Wed, May 6, 2009 at 8:13 PM, Platonides <Platonides@gmail.com> wrote:
>
>> Suppose that it does happen *today*.
>> All electronic systems collapse but the ones at your home.
>> Also, you cannot produce new ones.
>> You have a copy of wikipedia on your hard disk. You can access it.
>> But your computer lifetime is finite. And you also don't know for how
>> much time you'll still have electric current.
>> What do you do?
>
>
> Sell my computer for food, water, guns, and ammunition?

I was going to say +5 insightful, but really, it's common sense.
The world is going to hell, and your priority is saving Wikipedia?
My priority is saving my ass. :)

-Chad

Re: Long-term archiving of Wikimedia content
On Wed, May 6, 2009 at 5:46 PM, Anthony <wikimail@inbox.org> wrote:
> On Wed, May 6, 2009 at 8:13 PM, Platonides <Platonides@gmail.com> wrote:
>
>> Suppose that it does happen *today*.
>> All electronic systems collapse but the ones at your home.
>> Also, you cannot produce new ones.
>> You have a copy of wikipedia on your hard disk. You can access it.
>> But your computer lifetime is finite. And you also don't know for how
>> much time you'll still have electric current.
>> What do you do?
>
>
> Sell my computer for food, water, guns, and ammunition?

Actually, if the world is at this stage, you will want to heed
Einstein's advice and get some sticks and stones, rather...(or maybe a
bow and sword, if you prefer the sophisticated version)

Michael


--
Michael Bimmler
mbimmler@gmail.com

Re: Long-term archiving of Wikimedia content
On Wed, May 6, 2009 at 6:54 PM, Chad <innocentkiller@gmail.com> wrote:

>
> My priority is saving my ass. :)
>
> -Chad



Perhaps a tattoo there is the safest place for Wikipedia then!

Re: Long-term archiving of Wikimedia content
Aryeh Gregor wrote:
> Yeah, I'm still going to say the entire idea is ridiculous.

I wouldn't go quite that far. The idea of doing it (or having done it)
makes people feel good, due to the collective sci-fi-like fantasy
implicitly promulgated by the project itself -- a future world of
poverty and decay, saved by the serendipitous discovery of a
time-capsule sent from the past. It's a spectacle, a stunt, and it has
PR value.

I certainly don't begrudge the Long Now Foundation for having done
this with the Rosetta Project, since their primary goal is to
encourage long-term thinking, and expensive stunts are obviously a key
part of that.

But Wikimedia's goals are somewhat different, and we could probably
find some stunts which are more relevant to our mission.

-- Tim Starling


Re: Long-term archiving of Wikimedia content
Making people feel good is ultimately the best reason for archiving the data
- I would agree. And that is synergistic with what I think is the best
strategy for long term archiving, which is giving a complete copy to every
single person in the world. If we were to invest in a class of technologies
it would have to be those that allow for widespread dissemination. A working
dump process is the logical next step..:)

On Wed, May 6, 2009 at 10:16 PM, Tim Starling <tstarling@wikimedia.org> wrote:

> Aryeh Gregor wrote:
> > Yeah, I'm still going to say the entire idea is ridiculous.
>
> I wouldn't go quite that far. The idea of doing it (or having done it)
> makes people feel good, due to the collective sci-fi-like fantasy
> implicitly promulgated by the project itself -- a future world of
> poverty and decay, saved by the serendipitous discovery of a
> time-capsule sent from the past. It's a spectacle, a stunt, and it has
> PR value.
>
> I certainly don't begrudge the Long Now Foundation for having done
> this with the Rosetta Project, since their primary goal is to
> encourage long-term thinking, and expensive stunts are obviously a key
> part of that.
>
> But Wikimedia's goals are somewhat different, and we could probably
> find some stunts which are more relevant to our mission.
>
> -- Tim Starling

Re: Long-term archiving of Wikimedia content
David Goodman wrote:
> additionally, a simple lens can be ground with hand tools and no
> preexisting technology except glassmaking to produce several hundred
> power magnification
>
>

Not that I want to denigrate glass makers, but what is
wrong with naturally occurring clear material such as
rock crystal and the like?


Yours,

Jussi-Ville Heiskanen



Re: Long-term archiving of Wikimedia content
On Thu, May 7, 2009 at 12:16 AM, Tim Starling <tstarling@wikimedia.org> wrote:
>
> I wouldn't go quite that far. The idea of doing it (or having done it)
> makes people feel good, due to the collective sci-fi-like fantasy
> implicitly promulgated by the project itself -- a future world of
> poverty and decay, saved by the serendipitous discovery of a
> time-capsule sent from the past. It's a spectacle, a stunt, and it has
> PR value.

Producing long-lived snapshots of important projects, and preserving
them for posterity, is more than a feel-good effort -- it is good
practice. Most of the layout work needed to produce this sort of
copy must be done for any sort of print copy, which is certainly a
useful class of dump.

If this were simply a stunt, then it would be worth doing a few times
for impact. In this case I think it's a valid and beautiful way to
store data in its own right. And it is getting cheap enough that
individuals might want to get offline copies in this format. I
checked in with Alexander Rose of the Rosetta Project, and here's the
latest news:


They are working with their etchers to lower the cost of the process.
The current amortized cost of making 10 nickel discs (each with 10,000
pages in a 100x100 grid) is around $500 each. They can also make
polymer copies for much less that are likely stable for at least a
century. As they standardize the process, the price may continue to
drop.

The specific process they use involves a few steps: material is
rendered at 300-600 dpi (text and images) and laid out in 11x11"
pages. These are saved as separate image files, numbered 1 to 100,000
in 100 directories, and sent to the etcher, which fits each directory
of 100 images onto one row. You need a microscope to read the result,
but a decent USB microscope could do it.
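
The directory-per-row layout can be sketched as a mapping from a page
number to a position on one disc. This is a minimal sketch under stated
assumptions: one disc holds 100 directories of 100 images each (the
100x100 grid, so 10,000 pages), directories map to rows in order, and
images fill each row left to right. The function name `page_location`
and the 1-based numbering are hypothetical, not details confirmed by the
Rosetta Project.

```python
GRID_SIZE = 100  # rows and columns per disc, from the 100x100 grid


def page_location(page_number, grid_size=GRID_SIZE):
    """Map a 1-based page number on one disc to a 1-based (row, column).

    Assumes row N holds the images from directory N, in file order.
    """
    pages_per_disc = grid_size * grid_size
    if not 1 <= page_number <= pages_per_disc:
        raise ValueError(f"page_number must be in 1..{pages_per_disc}")
    row, column = divmod(page_number - 1, grid_size)
    return row + 1, column + 1


# Page 101 is the first image of the second directory, hence the second row.
print(page_location(101))
```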



> I certainly don't begrudge the Long Now Foundation for having done
> this with the Rosetta Project, since their primary goal is to encourage
> long-term thinking, and expensive stunts are obviously a key part of that.
>
> But Wikimedia's goals are somewhat different, and we could probably
> find some stunts which are more relevant to our mission.

Are the goals so different? It seems to me long-term thinking is part
and parcel of comprehensively realizing Wikimedia's goals, from
licensing and access to archival revision tracking and
multilingualism. We should probably discuss this in the original
thread about strategic planning.

Rose says they would be glad to include something like a "top 10,000
articles" selection from as many languages as possible in the Rosetta
library. (Thanks!) He suggests this would be a sought-after gift for
major donors. I think that's probably true, and could pay for such an
initiative, but encouraging long-term thinking and long-term valuation
of knowledge around Wikimedia (not just in general) is a more
important reason to consider it.

SJ

Re: Long-term archiving of Wikimedia content
On Thu, May 7, 2009 at 5:17 AM, Samuel Klein <meta.sj@gmail.com> wrote:

> On Thu, May 7, 2009 at 12:16 AM, Tim Starling <tstarling@wikimedia.org>
> wrote:
> >
> > I wouldn't go quite that far. The idea of doing it (or having done it)
> > makes people feel good, due to the collective sci-fi-like fantasy
> > implicitly promulgated by the project itself -- a future world of
> > poverty and decay, saved by the serendipitous discovery of a
> > time-capsule sent from the past. It's a spectacle, a stunt, and it has
> > PR value.
>
> Producing long-lived snapshots of important projects, and preserving
> them for posterity, is more than a feel-good effort -- it is good
> practice.


Why?

On Thu, May 7, 2009 at 12:16 AM, Tim Starling <tstarling@wikimedia.org> wrote:
>
> I wouldn't go quite that far. The idea of doing it (or having done it)
> makes people feel good, due to the collective sci-fi-like fantasy
> implicitly promulgated by the project itself -- a future world of
> poverty and decay, saved by the serendipitous discovery of a
> time-capsule sent from the past.


I'd say that story is completely unrealistic, among other problems. If the
people of the future world of poverty and decay want to learn, they'll
learn. If they don't want to learn, they'll burn the discoverer of the
time-capsule at the stake and then destroy the time-capsule along with it -
or worse, they'll misapply the knowledge contained in the capsule and use it
for their own evil purposes. Thomas Dalton talks about dark ages in
history, but in each of these dark ages there were individuals who had far
greater knowledge than most of the rest of the world, and simply possessing
that knowledge didn't manage to save the world, not for a long time. I'd
write the ending of your sci-fi-like fantasy novel as a tragedy - a
Cassandra-like figure who discovers the capsule and possesses all the
knowledge in the world, but is doomed to watch the world destroy itself
because no one will listen. "I see disaster. I see catastrophe. Worse, I
see lawyers!" (and Pokemon characters! and Encyclopedia Dramatica!)

If you're going to engage in this fantasy, it might be better if the time
capsule is intentionally made difficult to read. Then at least it'd only be
usable by a society that is ready to find it. But personally, I think there
are much more important things to do - things that help us here in our own
world, during our own lifetimes.

Quote from Woody Allen's Mighty Aphrodite, cribbed from Wikipedia article
"Cassandra".

Re: Long-term archiving of Wikimedia content
On Wed, May 6, 2009 at 8:13 PM, Platonides <Platonides@gmail.com> wrote:
> In that futuristic approach I find it more likely that there will be no
> paper / printer, but instead everything will be stored on
> computers/PDAs and transferred between them. So in the event of the
> catastrophe you'd only be able to access it with the surviving devices.

In such a futuristic world, I would expect that the major sources of
power would be things like solar and geothermal that don't require
long-distance supply chains. Then even if the world falls into
anarchy, some well-stocked parts will still have power for a good long
while. So you wouldn't need to actually print it out, you'd have
computers running continuously in some places.

Even if 95% of humanity was wiped out, you'd still have a few hundred
million people. Not one of them is going to be in a position to save
some computers? Even militaries, which are prepared for all sorts of
disasters -- some of which will have computers in multiple
geographically distributed bunkers deep underground with enough fuel
on-site to keep them running for days to years?

> You have a copy of wikipedia on your hard disk. You can access it.
> But your computer lifetime is finite. And you also don't know for how
> much time you'll still have electric current.
> What do you do?

Screw Wikipedia. If I want to preserve useful knowledge, I'll make
sure to safeguard my textbooks. In terms of utility for rebuilding
society, the value of Wikipedia is zero compared to even a tiny
university library. And there are many thousands of university
libraries already conveniently scattered around the world, not a few
of them in subbasements where they'll be resistant to nasty things
happening on the surface.

On Thu, May 7, 2009 at 12:16 AM, Tim Starling <tstarling@wikimedia.org> wrote:
> I wouldn't go quite that far. The idea of doing it (or having done it)
> makes people feel good, due to the collective sci-fi-like fantasy
> implicitly promulgated by the project itself -- a future world of
> poverty and decay, saved by the serendipitous discovery of a
> time-capsule sent from the past. It's a spectacle, a stunt, and it has
> PR value.
>
> I certainly don't begrudge the Long Now Foundation for having done
> this with the Rosetta Project, since their primary goal is to
> encourage long-term thinking, and expensive stunts are obviously a key
> part of that.
>
> But Wikimedia's goals are somewhat different, and we could probably
> find some stunts which are more relevant to our mission.

Okay, I can agree with that.

Re: Long-term archiving of Wikimedia content
I don't want to restart this rather long (but very interesting)
topic, but I'd like to point out / remind people that a couple of
well-placed fires could wipe out most of wikipedia et al. as we
currently know it - surely the first priority, before thinking about
the real long term, is to sort that out? Remember the Library of
Alexandria...

Mike

On 7 May 2009, at 15:21, Aryeh Gregor wrote:

> On Wed, May 6, 2009 at 8:13 PM, Platonides <Platonides@gmail.com>
> wrote:
>> In that futuristic approach I find it more likely that there will be no
>> paper / printer, but instead everything will be stored on
>> computers/PDAs and transferred between them. So in the event of the
>> catastrophe you'd only be able to access it with the surviving devices.
>
> In such a futuristic world, I would expect that the major sources of
> power would be things like solar and geothermal that don't require
> long-distance supply chains. Then even if the world falls into
> anarchy, some well-stocked parts will still have power for a good long
> while. So you wouldn't need to actually print it out, you'd have
> computers running continuously in some places.
>
> Even if 95% of humanity was wiped out, you'd still have a few hundred
> million people. Not one of them is going to be in a position to save
> some computers? Even militaries, which are prepared for all sorts of
> disasters -- some of which will have computers in multiple
> geographically distributed bunkers deep underground with enough fuel
> on-site to keep them running for days to years?
>
>> You have a copy of wikipedia on your hard disk. You can access it.
>> But your computer lifetime is finite. And you also don't know for how
>> much time you'll still have electric current.
>> What do you do?
>
> Screw Wikipedia. If I want to preserve useful knowledge, I'll make
> sure to safeguard my textbooks. In terms of utility for rebuilding
> society, the value of Wikipedia is zero compared to even a tiny
> university library. And there are many thousands of university
> libraries already conveniently scattered around the world, not a few
> of them in subbasements where they'll be resistant to nasty things
> happening on the surface.
>
> On Thu, May 7, 2009 at 12:16 AM, Tim Starling
> <tstarling@wikimedia.org> wrote:
>> I wouldn't go quite that far. The idea of doing it (or having done
>> it)
>> makes people feel good, due to the collective sci-fi-like fantasy
>> implicitly promulgated by the project itself -- a future world of
>> poverty and decay, saved by the serendipitous discovery of a
>> time-capsule sent from the past. It's a spectacle, a stunt, and it
>> has
>> PR value.
>>
>> I certainly don't begrudge the Long Now Foundation for having done
>> this with the Rosetta Project, since their primary goal is to
>> encourage long-term thinking, and expensive stunts are obviously a
>> key
>> part of that.
>>
>> But Wikimedia's goals are somewhat different, and we could probably
>> find some stunts which are more relevant to our mission.
>
> Okay, I can agree with that.

Re: Long-term archiving of Wikimedia content
2009/5/10 Michael Peel <email@mikepeel.net>:

> I don't want to restart this rather long (but very interesting)
> topic, but I'd like to point out / remind people that a couple of
> well-placed fires could wipe out most of wikipedia et al. as we
> currently know it - surely the first priority, before thinking about
> the real long term, is to sort that out? Remember the Library of
> Alexandria...


The new dumps are progressing very well. Presumably when they're done
we can give the Internet Archive and any similar archivists a yell.


- d.

Re: Long-term archiving of Wikimedia content
On Sun, May 10, 2009 at 5:06 PM, David Gerard <dgerard@gmail.com> wrote:

> 2009/5/10 Michael Peel <email@mikepeel.net>:
>
> > I don't want to restart this rather long (but very interesting)
> > topic, but I'd like to point out / remind people that a couple of
> > well-placed fires could wipe out most of wikipedia et al. as we
> > currently know it - surely the first priority, before thinking about
> > the real long term, is to sort that out? Remember the Library of
> > Alexandria...
>
>
> The new dumps are progressing very well. Presumably when they're done
> we can give the Internet Archive and any similar archivists a yell.


The private parts of the database are probably more valuable than the public
ones, though.

Re: Long-term archiving of Wikimedia content
2009/5/10 Anthony <wikimail@inbox.org>:
> The private parts of the database are probably more valuable than the public
> ones, though.

Why? The private parts are just deleted stuff. The deleted stuff isn't
generally very valuable, that's why it was deleted.

Re: Long-term archiving of Wikimedia content
On Sun, May 10, 2009 at 5:14 PM, Thomas Dalton <thomas.dalton@gmail.com> wrote:

> 2009/5/10 Anthony <wikimail@inbox.org>:
> > The private parts of the database are probably more valuable than the
> > public ones, though.
>
> Why? The private parts are just deleted stuff. The deleted stuff isn't
> generally very valuable, that's why it was deleted.


Mostly I meant the user data (especially the passwords). The relative value
of them compared to the rest can be shown by anyone who tries to create a
fork.

Re: Long-term archiving of Wikimedia content
On 10 May 2009, at 22:06, David Gerard wrote:

> 2009/5/10 Michael Peel <email@mikepeel.net>:
>
>> I don't want to restart this rather long (but very interesting)
>> topic, but I'd like to point out / remind people that a couple of
>> well-placed fires could wipe out most of wikipedia et al. as we
>> currently know it - surely the first priority, before thinking about
>> the real long term, is to sort that out? Remember the Library of
>> Alexandria...
>
>
> The new dumps are progressing very well. Presumably when they're done
> we can give the Internet Archive and any similar archivists a yell.

I'll believe that when the dump's finished running... (or is the dump
process recoverable now?)

Personally, I'd like to see much more mirroring of the live
databases, spread around as many countries/continents as possible, in
addition to dumps being made available regularly.

Does the WMF have a disaster recovery plan?

Mike


Re: Long-term archiving of Wikimedia content
On Sun, May 10, 2009 at 5:16 PM, Anthony <wikimail@inbox.org> wrote:

> On Sun, May 10, 2009 at 5:14 PM, Thomas Dalton <thomas.dalton@gmail.com>wrote:
>
>> 2009/5/10 Anthony <wikimail@inbox.org>:
>> > The private parts of the database are probably more valuable than the
>> > public ones, though.
>>
>> Why? The private parts are just deleted stuff. The deleted stuff isn't
>> generally very valuable, that's why it was deleted.
>
>
> Mostly I meant the user data (especially the passwords). The relative
> value of them compared to the rest can be shown by anyone who tries to
> create a fork.
>

By the way, the same argument could be made about all the prior versions -
it "generally isn't very valuable", and that's why it was edited.

The current versions, or something close to them, are widespread enough to
do in case of catastrophe. Google cache, answers.com, etc. At least for
the English language version, anyway.

Re: Long-term archiving of Wikimedia content
On Sun, May 10, 2009 at 5:16 PM, Anthony <wikimail@inbox.org> wrote:
> Mostly I meant the user data (especially the passwords).  The relative value
> of them compared to the rest can be shown by anyone who tries to create a
> fork.

In the dumps, these are always done first:
<http://download.wikimedia.org/enwiki/20090506/>

--
Casey Brown
Cbrown1023

---
Note: This e-mail address is used for mailing lists. Personal emails sent to
this address will probably get lost.

Re: Long-term archiving of Wikimedia content
2009/5/10 Casey Brown <cbrown1023.ml@gmail.com>:
> On Sun, May 10, 2009 at 5:16 PM, Anthony <wikimail@inbox.org> wrote:
>> Mostly I meant the user data (especially the passwords).  The relative value
>> of them compared to the rest can be shown by anyone who tries to create a
>> fork.
>
> In the dumps, these are always done first:
> <http://download.wikimedia.org/enwiki/20090506/>

They are produced, but they aren't distributed (that's what "private"
means!). Do they get stored anywhere other than the file server?

Re: Long-term archiving of Wikimedia content
On Sun, May 10, 2009 at 5:25 PM, Casey Brown <cbrown1023.ml@gmail.com> wrote:

> On Sun, May 10, 2009 at 5:16 PM, Anthony <wikimail@inbox.org> wrote:
> > Mostly I meant the user data (especially the passwords). The relative
> > value of them compared to the rest can be shown by anyone who tries to
> > create a fork.
>
> In the dumps, these are always done first:
> <http://download.wikimedia.org/enwiki/20090506/>


Pretty much in order of importance (except that I don't know what "Update
dataset for OAI updater system" means).

I suppose the stub dump is also important as it's not widely replicated and
contains the author information, but in a month or so the GFDL is going to
be dropped anyway so that wouldn't be such a huge loss I guess.

In any case, the Library of Alexandria analogy kind of forgets about the "no
original research" policy. Even if the current versions of all articles
were lost, it wouldn't be all that horrible. It might even be a good
thing. Consider Citizendium's decision not to go the route of the fork but
instead to start from scratch.

Re: Long-term archiving of Wikimedia content
The OAI updater is for incremental updates of search indexes using MWSearch.

On Sun, May 10, 2009 at 3:34 PM, Anthony <wikimail@inbox.org> wrote:

> On Sun, May 10, 2009 at 5:25 PM, Casey Brown <cbrown1023.ml@gmail.com> wrote:
>
> > On Sun, May 10, 2009 at 5:16 PM, Anthony <wikimail@inbox.org> wrote:
> > > Mostly I meant the user data (especially the passwords). The relative
> > > value of them compared to the rest can be shown by anyone who tries to
> > > create a fork.
> >
> > In the dumps, these are always done first:
> > <http://download.wikimedia.org/enwiki/20090506/>
>
>
> Pretty much in order of importance (except that I don't know what "Update
> dataset for OAI updater system" means).
>
> I suppose the stub dump is also important as it's not widely replicated and
> contains the author information, but in a month or so the GFDL is going to
> be dropped anyway so that wouldn't be such a huge loss I guess.
>
> In any case, the Library of Alexandria analogy kind of forgets about the
> "no original research" policy. Even if the current versions of all articles
> were lost, it wouldn't be all that horrible. It might even be a good
> thing. Consider Citizendium's decision not to go the route of the fork but
> instead to start from scratch.

Re: Long-term archiving of Wikimedia content
A brief update:

On Thu, May 7, 2009 at 5:17 AM, Samuel Klein
> The current amortized cost of making 10 nickel
> discs (each with 10,000 pages in a 100x100 grid) is
> around $500 each.   They can also make
> polymer copies for much less that are likely stable
> for at least a century.

The amortized cost is closer to $600 for nickel discs and $150 for polymer.
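
For comparison, those figures can be put on a per-page basis. This is a
rough sketch, assuming each disc (nickel or polymer) carries the 10,000
pages of the 100x100 grid quoted above; the thread does not confirm the
page count for polymer copies, and the names below are illustrative.

```python
PAGES_PER_DISC = 10_000  # assumption: the 100x100 grid applies to both materials

# Amortized cost per disc in US dollars, from the update above.
DISC_COST_USD = {"nickel": 600, "polymer": 150}


def cost_per_page(disc_cost_usd, pages=PAGES_PER_DISC):
    """Return the amortized cost of one page in US dollars."""
    return disc_cost_usd / pages


for material, cost in DISC_COST_USD.items():
    print(f"{material}: ${cost_per_page(cost):.3f} per page")
```

Under these assumptions a nickel page costs about 6 cents and a polymer
page about 1.5 cents.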

Back to the near term: can someone tell me the rough cost per GB and
expected lifetime of the backups our colo uses? (I would guess $1 and
20 years.) Are there full offsite backups?

I would feel better if people would help update
http://meta.wikimedia.org/wiki/Contingency_planning
...

SJ
