Mailing List Archive

Database Tables
Hello,
a closer look in the maintenance directory from cvs esp.
BuildTables.[inc|sql] showed differencs between those two files.
(e.g. table user, user_touched).

Possibly someone can tell me how the real tables look like, it will make
porting Wikipedia to e.g postgeSQL easier.

Smurf
--
--- Anthill Inside! ---
Re: Database Tables [ In reply to ]
> (Thomas Corell <tcorell@igd.fhg.de>):
> Hello,
> a closer look in the maintenance directory from cvs esp.
> BuildTables.[inc|sql] showed differencs between those two files.
> (e.g. table user, user_touched).
>
> Possibly someone can tell me how the real tables look like, it will make
> porting Wikipedia to e.g postgeSQL easier.

buildTables.inc, which is included by createdb.php, is always the
most up-to-date schema. Older databases can be updated using the
valious "patch" files.

--
Lee Daniel Crocker <lee@piclab.com> <http://www.piclab.com/lee/>
"All inventions or works of authorship original to me, herein and past,
are placed irrevocably in the public domain, and may be used or modified
for any purpose, without permission, attribution, or notification."--LDC
Re: Database Tables [ In reply to ]
On Fri, 28 Mar 2003, Thomas Corell wrote:
> a closer look in the maintenance directory from cvs esp.
> BuildTables.[inc|sql] showed differencs between those two files.
> (e.g. table user, user_touched).

buildTables.inc is currently canonical. It consists of a bunch of SQL
statements which are executed in turn, which strikes me as kind of silly.

I intended that buildTables.sql replace it, but it hasn't been kept up to
date and isn't yet used by anything.

-- brion vibber (brion @ pobox.com)
Re: Database Tables [ In reply to ]
> (Brion Vibber <vibber@aludra.usc.edu>):
> On Fri, 28 Mar 2003, Thomas Corell wrote:
> > a closer look in the maintenance directory from cvs esp.
> > BuildTables.[inc|sql] showed differencs between those two files.
> > (e.g. table user, user_touched).
>
> buildTables.inc is currently canonical. It consists of a bunch of SQL
> statements which are executed in turn, which strikes me as kind of silly.
>
> I intended that buildTables.sql replace it, but it hasn't been kept up to
> date and isn't yet used by anything.

Originally it was the .sql, but I changed it to PHP code because
proper initialization of the database requires filling in some data
from variables defined in LocalSettings.php. The only reasonable
way to do that is with PHP code.

Of course, we could change the PHP code to read all the CREATE
TABLE/CREATE INDEX type statements from a .sql file.

--
Lee Daniel Crocker <lee@piclab.com> <http://www.piclab.com/lee/>
"All inventions or works of authorship original to me, herein and past,
are placed irrevocably in the public domain, and may be used or modified
for any purpose, without permission, attribution, or notification."--LDC
Re: Database Tables [ In reply to ]
Lee Daniel Crocker wrote:
>>(Brion Vibber <vibber@aludra.usc.edu>):
>>On Fri, 28 Mar 2003, Thomas Corell wrote:
>>
>
>>buildTables.inc is currently canonical. It consists of a bunch of SQL
>>statements which are executed in turn, which strikes me as kind of silly.
>>I intended that buildTables.sql replace it, but it hasn't been kept up to
>>date and isn't yet used by anything.
> Originally it was the .sql, but I changed it to PHP code because
> proper initialization of the database requires filling in some data
> from variables defined in LocalSettings.php. The only reasonable
> way to do that is with PHP code.
> Of course, we could change the PHP code to read all the CREATE
> TABLE/CREATE INDEX type statements from a .sql file.

Ok, usually the next question is: any further 'useless' files? Possibly
deleting this file (cvs usually still keeps a copy as far as I know)
will cause less confusion ;) Ot at least, if someone plans to use it in
the future, add a appropriate comment.

Sorry for asking such a lot silly questions, but it is very hard to get
a overview of the code for me. No problem for someone coded and cared
for it a long time, of course, but for someone you to it :(

Smurf
--
------------------------- Anthill inside! ---------------------------
Re: Database Tables [ In reply to ]
On Sat, 29 Mar 2003, Thomas Corell wrote:
> Ok, usually the next question is: any further 'useless' files?

Yes, plenty. Start with the documentation, it's all wrong. ;)

Not even sure what these are:
SkinFramed.php
SpecialNeglectedpages.php
SpecialVote.php
sql/


Not yet used:
stylesheets/wikibits.js
Javascript code for enhanced recentchanges (use this instead
of pumping it into the page header), plus unfinished code
for checking client's timezone.

SpecialIntl.php
New interlanguage link code. Experimental, not yet enabled.

-- brion vibber (brion @ pobox.com)
Re: Database Tables [ In reply to ]
Brion Vibber wrote:
> On Sat, 29 Mar 2003, Thomas Corell wrote:
>>Ok, usually the next question is: any further 'useless' files?
> Yes, plenty. Start with the documentation, it's all wrong. ;)
*sight* I understood sofar: RTFS ;)

Yust another thing to ask, to avoid mistakes. I hope I got this right:

On the real server there exists for each language a database, in the
creation scripts usually called wikidb. The php-scripts exists only one
time (I remember Brion told me that not long ago) and there is some
function which selects the right database. I think depending on the
domain name or so. (I will find this in the near future, but it's not
important for database design)

So in the start of it, a connection to the appropriate database is
opened, and the thing starts running.

Are there any bad mistakes, I made in this statements?

Just because I ran over it: There was the idea of having a unique userid
in all wikis. Login once, registerd everywhere ;)
There is the hard way: everybody register again, or the soft way: we
design a 'master'-userid, and the users can 'attach' language specific
UID's and Names to this 'master-UID'. (thats just an idea wandering
around in my brain, may be impossible to implement ;)

Ok, I will download one of the database dumps tomorrow, after getting 3
to 4 hours of sleep ;)

Smurf
--
------------------------- Anthill inside! ---------------------------
Re: Database Tables [ In reply to ]
On Sat, 29 Mar 2003, Thomas Corell wrote:
> Yust another thing to ask, to avoid mistakes. I hope I got this right:
>
> On the real server there exists for each language a database, in the
> creation scripts usually called wikidb. The php-scripts exists only one
> time (I remember Brion told me that not long ago) and there is some
> function which selects the right database. I think depending on the
> domain name or so. (I will find this in the near future, but it's not
> important for database design)

Mmm, not exactly. They all use the same base source, but they're
duplicated in a series of directories. Each has various settings in
LocalSettings.php different, such as language selection, name of database
to use, the URL to use for self-links, etc.

The Apache webserver is configured with name-based virtual hosts to use
this or that directory as the web root depending on the domain name used
to access it (so thus the language / meta / wiktionary / test).

> Are there any bad mistakes, I made in this statements?
>
> Just because I ran over it: There was the idea of having a unique userid
> in all wikis. Login once, registerd everywhere ;)
> There is the hard way: everybody register again,

Ouch! :)

> or the soft way: we
> design a 'master'-userid, and the users can 'attach' language specific
> UID's and Names to this 'master-UID'. (thats just an idea wandering
> around in my brain, may be impossible to implement ;)

My present master plan is that we'll reassign all user_ids. This will be
an automatic process, for the most part: no one needs to know their
user_id (the auto-incremented integer key), we just need to match up all
the identical accounts, merge them into a central user database, and
swap out all the numbers in the cur & old tables.

Everyone needs new login cookies at that point, of course, since it would
run over all subdomains of .wikipedia.org, so no problem that the id is
different.

However, there are two potential pitfalls:
- People using different login _names_ on different wikis would have to
settle on one.

- There might be some cases of duplicate names on different wikis which
belong to different people. We can verify if they're the same by checking
that passwords match, but some people may use different passwords, so we
can't be 100% sure in that case.

> Ok, I will download one of the database dumps tomorrow, after getting 3
> to 4 hours of sleep ;)

'Night. :)

-- brion vibber (brion @ pobox.com)
Re: Database Tables [ In reply to ]
Brion Vibber wrote:
> Mmm, not exactly. They all use the same base source, but they're
> duplicated in a series of directories. Each has various settings in
> LocalSettings.php different, such as language selection, name of database
> to use, the URL to use for self-links, etc.

Just to be sure, you could change some code in any of the php files in
the e.g. german wikipedia only? I though the only file which is language
dependend are the Language*.php ones and Localsettings.php.

> However, there are two potential pitfalls:
> - People using different login _names_ on different wikis would have to
> settle on one.
> - There might be some cases of duplicate names on different wikis which
> belong to different people. We can verify if they're the same by checking
> that passwords match, but some people may use different passwords, so we
> can't be 100% sure in that case.

Well, that are problems you got, if you take the easy way. First of all,
you save the passwords in clear text in the db? *running to the source*
found md5 *wipe* But no salt. Only the hash! *shiver*. I see the urgent
reason to get new user accounts and passworts.

I still not shure why we can't have both, Single sign on (I like those
buzzwords ;) and different names. A table like this:

create table userid_at_wiki (
newuserid int references user(userid),
wikidb varchar(25),
olduid int,
oldname varchar(255)
);

SELECT userid_at_wiki.newuserid,
userid_at_wiki.wikidb,
userid_at_wiki.olduid,
userid_at_wiki.oldname,
user.name
FROM userid_at_wiki JOIN user
ON userid_at_wiki.newuserid = user.newuserid
WHERE user.name = 'Smurf';
Result:
11245, 'meta', 1565, 'Smurf', 'Smurf'
11245, 'de', 756, 'Smurf', 'Smurf'
11245, 'foo', 345, 'yawn', 'Smurf'
(3 rows)

will get around some of you problems brion, IMHO .

Smurf
--
Examples and code are fully experimental and untested, use without
warranty.
------------------------- Anthill inside! ---------------------------
Re: Database Tables [ In reply to ]
On Sat, 2003-03-29 at 00:06, Thomas Corell wrote:
> Brion Vibber wrote:
> > Mmm, not exactly. They all use the same base source, but they're
> > duplicated in a series of directories. Each has various settings in
> > LocalSettings.php different, such as language selection, name of database
> > to use, the URL to use for self-links, etc.
>
> Just to be sure, you could change some code in any of the php files in
> the e.g. german wikipedia only?

One could, but one would only rarely want to.

There are a few performance tweaks that have been hacked into the
English wiki specifically (page counters disabled; redirect search to
google during peak hours), and the Chinese and Meta stylesheets have
been tweaked a little, but other than they all should be identical aside
from LocalSettings.php.

> I though the only file which is language
> dependend are the Language*.php ones and Localsettings.php.

That's right. The above are installation-dependent, not
language-dependent.

> Well, that are problems you got, if you take the easy way. First of all,
> you save the passwords in clear text in the db? *running to the source*
> found md5 *wipe* But no salt. Only the hash! *shiver*. I see the urgent
> reason to get new user accounts and passworts.

Suggestions on improving security are always welcome...

> I still not shure why we can't have both, Single sign on (I like those
> buzzwords ;) and different names.

Mainly because that would be strange and inconsistent. :) We need
single-name ability to prevent name-snatching. For instance, someone
other than Aoineko signed up the Aoineko name on meta.wikipedia.org in
order to slyly misattribute posts under his name.

Now, hypothetically we could _also_ allow other names to be hung onto
that account, but that seems excessive IMHO.

-- brion vibber (brion @ pobox.com)
Re: Database Tables [ In reply to ]
Brion Vibber wrote:

>>Well, that are problems you got, if you take the easy way. First of all,
>>you save the passwords in clear text in the db? *running to the source*
>>found md5 *wipe* But no salt. Only the hash! *shiver*. I see the urgent
>>reason to get new user accounts and passworts.
> Suggestions on improving security are always welcome...

Something like $hash = md5( $userid . $passwort );
I think thats a common way for md5 passwords.


> Mainly because that would be strange and inconsistent. :) We need
> single-name ability to prevent name-snatching. For instance, someone
> other than Aoineko signed up the Aoineko name on meta.wikipedia.org in
> order to slyly misattribute posts under his name.

Oh, well this was only intended to be a solution for old users. I
possibly missed to write this.

Make a list of uid's and names of all wiki's with the wiki name. Block
all the used names for a while (any idea about removing/timing accounts
out in mind?) and every old user can 'promote' an old uid to a new uid
with the right passwort, or add it to the 'alternate names list', or
possibly remove the old name from the blocking list with the right
passwort, of course.

That will be a script totaly outside normal wiki pages, becouse after
some month you can throw it in the virtual paper basket ;)

But, of course we can start although a run to the new names ;)
Smurf
--
------------------------- Anthill inside! ---------------------------