Mailing List Archive

Alternative proposal for interlanguage links redesign and a few other issues
1. move everything to Postgres
2. move everything to a common database, with tables foo_cur, foo_old etc.,
where foo is the language name
3. make a single user table (needs some tweaking to allow slightly different
preferences), a single logging system, a single recent changes, and all the other
nice things we can do with that
4. move everything to UTF-8, so we don't have to use %escapes in the English Wikipedia
5. create table interwiki (source_lang, target_lang, source_title, target_title)
6. convert all cur_text by removing interwiki links from page tops,
and add appropriate entries to interwiki.
7. compute the transitive symmetric closure of foo_interwiki, and display that
as interwiki links. If it contains many articles in the same language,
display it like: "English (Astronomy), English (Astrophysics), German"
In practice it shouldn't be such a big problem.
8. when editing, add interwiki links at the top of the page or in a separate box.
add a Javascript button to add all links from the transitive symmetric closure
of interwiki
9. add some magic functionality that allows changing many interwiki links at a time.
It is needed because the transitive symmetric closure often contains many
copies of the same link. "delete all interwiki links to this page" and
"change all interwiki links from X to Y" would probably be enough.
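
To make points 5 and 9 concrete, here is a rough sketch, not a finished design.
It uses Python's sqlite3 module only so that it runs anywhere (the proposal
itself targets Postgres). The column names come from point 5; the function
names, the primary key, and the sample titles are my own assumptions.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding the proposed interwiki table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE interwiki ("
        " source_lang TEXT, target_lang TEXT,"
        " source_title TEXT, target_title TEXT,"
        # Assumption: one row per distinct link, so duplicates are rejected.
        " PRIMARY KEY (source_lang, source_title, target_lang, target_title))"
    )
    return db


def add_link(db, source, target):
    """Record one link; source and target are (lang, title) pairs."""
    db.execute(
        "INSERT OR IGNORE INTO interwiki"
        " (source_lang, source_title, target_lang, target_title)"
        " VALUES (?, ?, ?, ?)",
        (source[0], source[1], target[0], target[1]),
    )


def delete_links_to(db, lang, title):
    """'Delete all interwiki links to this page'; returns rows removed."""
    cur = db.execute(
        "DELETE FROM interwiki WHERE target_lang = ? AND target_title = ?",
        (lang, title),
    )
    return cur.rowcount


def retarget_links(db, lang, old_title, new_title):
    """'Change all interwiki links from X to Y' within one language."""
    cur = db.execute(
        # OR REPLACE drops an already existing identical link instead of failing.
        "UPDATE OR REPLACE interwiki SET target_title = ?"
        " WHERE target_lang = ? AND target_title = ?",
        (new_title, lang, old_title),
    )
    return cur.rowcount
```

Both bulk operations touch only rows whose target matches, so each is a single
statement however many pages link to the article.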

Computing the symmetric transitive closure will require a bit of magic.
I think we should keep the results in the database and update them only on
edits. Editing may then be a bit slower, as long as viewing is no worse than now.
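
The "bit of magic" could look roughly like the sketch below. It treats every
link as undirected and groups pages with a union-find structure, so each group
is one class of the symmetric transitive closure. This is only an illustration:
it recomputes everything in memory instead of storing results in the database,
and the function names, the lang_names mapping, and the sample titles are
assumptions of mine.

```python
from collections import defaultdict


def closure_groups(links):
    """Map every page to the set of pages reachable through links in either direction.

    links is an iterable of ((lang, title), (lang, title)) pairs.
    """
    parent = {}

    def find(page):
        parent.setdefault(page, page)
        root = page
        while parent[root] != root:
            root = parent[root]
        while parent[page] != root:  # path compression
            parent[page], page = root, parent[page]
        return root

    for source, target in links:
        root_a, root_b = find(source), find(target)
        parent[root_a] = root_b  # merge the two groups

    groups = defaultdict(set)
    for page in list(parent):
        groups[find(page)].add(page)
    return {page: group for group in groups.values() for page in group}


def interwiki_line(page, links, lang_names):
    """Format the links shown on a page, as described in point 7."""
    group = closure_groups(links).get(page, {page})
    others = [p for p in group if p != page]
    per_lang = defaultdict(int)
    for lang, _title in others:
        per_lang[lang] += 1
    labels = []
    for lang, title in sorted(others, key=lambda p: (lang_names[p[0]], p[1])):
        name = lang_names[lang]
        # Show the title only when a language appears more than once.
        labels.append(f"{name} ({title})" if per_lang[lang] > 1 else name)
    return ", ".join(labels)


links = [
    (("pl", "Astronomia"), ("en", "Astronomy")),
    (("de", "Astronomie"), ("en", "Astrophysics")),
    (("de", "Astronomie"), ("pl", "Astronomia")),
]
names = {"en": "English", "de": "German", "pl": "Polish"}
print(interwiki_line(("pl", "Astronomia"), links, names))
# prints: English (Astronomy), English (Astrophysics), German
```

A stored version would keep a group id per page and merge or split groups when
an edit adds or removes a link, so that viewing stays a simple lookup.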
Re: Alternative proposal for interlanguage links redesign and a few other issues
"Tomasz Wegrzanowski" wrote:

> 1. move everything to Postgres
> 2. move everything to common database, with tables foo_cur, foo_old etc.,
> where foo are language names

So, a separate table for each language.
Why?
Wouldn't a language field in one
article table be enough?

Pauxlo
Re: Re: Alternative proposal for interlanguage links redesign and a few other issues
On Tue, Jan 07, 2003 at 11:40:10PM +0100, Paul Ebermann wrote:
> So, a separate table for each language.
> Why?
> Wouldn't a language field in one
> article table be enough?

It should be faster that way, because locking issues on English tables
won't cause any problems for us.

And too many English articles won't cause selects to be any slower
for other languages that way.
Re: Re: Alternative proposal for interlanguage links redesign and a few other issues
On Tue, 2003-01-07 at 15:06, Tomasz Wegrzanowski wrote:
> On Tue, Jan 07, 2003 at 11:40:10PM +0100, Paul Ebermann wrote:
> > So, a separate table for each language. Why?
> > Wouldn't a language field in one article table be enough?
>
> It should be faster that way, because locking issues on English tables
> won't cause any problems for us.
>
> And too many English articles won't cause selects to be any slower
> for other languages that way.

I think a better approach is to make the system work more smoothly so
that neither the other languages _nor_ the English section have to
suffer from the English section's large number of articles.

Separate tables in the same database won't be any better than separate
databases on the same server, which presently causes problems because
the English tables will get locked up and additional requests get backed
up until the server hits its open connection limit -- then other
languages can't connect either, and you just have to wait it out.

If we can iron out the locks, I believe a combined table system will be
both more flexible and easier to program against.

-- brion vibber (brion @ pobox.com)
Re: Re: Alternative proposal for interlanguage links redesign and a few other issues
On Tue, Jan 07, 2003 at 03:58:35PM -0800, Brion Vibber wrote:
> On Tue, 2003-01-07 at 15:06, Tomasz Wegrzanowski wrote:
> > On Tue, Jan 07, 2003 at 11:40:10PM +0100, Paul Ebermann wrote:
> > > So, a separate table for each language. Why?
> > > Wouldn't a language field in one article table be enough?
> >
> > It should be faster that way, because locking issues on English tables
> > won't cause any problems for us.
> >
> > And too many English articles won't cause selects to be any slower
> > for other languages that way.
>
> I think a better approach is to make the system work more smoothly so
> that neither the other languages _nor_ the English section have to
> suffer from the English section's large number of articles.
>
> Separate tables in the same database won't be any better than separate
> databases on the same server, which presently causes problems because
> the English tables will get locked up and additional requests get backed
> up until the server hits its open connection limit -- then other
> languages can't connect either, and you just have to wait it out.

Separate tables in a single database are completely equivalent to separate
tables in separate databases (databases aren't real, tables are real).
I simply don't see any point in changing that.

Some "fair" system would be nice, with "other" languages having higher
priorities than English ...

> If we can iron out the locks, I believe a combined table system will be
> both more flexible and easier to program against.

That would be nice, but we should try to make the minimal number of
changes necessary for a given goal.

Having only 'interwiki' and 'users' in common is enough; we can merge the rest
later if it won't result in any performance problems.
Re: Re: Alternative proposal for interlanguage links redesign and a few other issues
On Tue, 2003-01-07 at 16:08, Tomasz Wegrzanowski wrote:
> Separate tables in a single database are completely equivalent to separate
> tables in separate databases (databases aren't real, tables are real).
> I simply don't see any point in changing that.

That was my point -- your suggestion (separate tables, one db) will be
no faster than the present situation (separate tables, separate dbs),
and little faster than Paul's suggestion (combined tables, one db), as
logjams on the English wiki will continue to affect both the many
anglophone users and the users on other languages in all three cases.

> Having only 'interwiki' and 'users' common is enough, we can merge the rest
> later if it won't result in any performance problems.

True enough.

-- brion vibber (brion @ pobox.com)
Re: Re: Alternative proposal for interlanguage links redesign and a few other issues
On Tue, Jan 07, 2003 at 04:36:32PM -0800, Brion Vibber wrote:
> On Tue, 2003-01-07 at 16:08, Tomasz Wegrzanowski wrote:
> > Separate tables in a single database are completely equivalent to separate
> > tables in separate databases (databases aren't real, tables are real).
> > I simply don't see any point in changing that.
>
> That was my point -- your suggestion (separate tables, one db) will be
> no faster than the present situation (separate tables, separate dbs),
> and little faster than Paul's suggestion (combined tables, one db), as
> logjams on the English wiki will continue to affect both the many
> anglophone users and the users on other languages in all three cases.

It wasn't about performance but about having common interwiki and users tables.

Having a script open two databases would require two connections, wouldn't it?
That would be slower.
Re: Alternative proposal for interlanguage links redesign and a few other issues
On Tue, 2003-01-07 at 19:07, Tomasz Wegrzanowski wrote:
> 1. move everything to Postgres
> 2. move everything to a common database, with tables foo_cur, foo_old etc.,
> where foo is the language name
> 3. make a single user table (needs some tweaking to allow slightly different
> preferences), a single logging system, a single recent changes, and all the other
> nice things we can do with that
> 4. move everything to UTF-8, so we don't have to use %escapes in the English Wikipedia
> 5. create table interwiki (source_lang, target_lang, source_title, target_title)
> 6. convert all cur_text by removing interwiki links from page tops,
> and add appropriate entries to interwiki.
> 7. compute the transitive symmetric closure of foo_interwiki, and display that
> as interwiki links. If it contains many articles in the same language,
> display it like: "English (Astronomy), English (Astrophysics), German"
> In practice it shouldn't be such a big problem.
> 8. when editing, add interwiki links at the top of the page or in a separate box.
> add a Javascript button to add all links from the transitive symmetric closure
> of interwiki
> 9. add some magic functionality that allows changing many interwiki links at a time.
> It is needed because the transitive symmetric closure often contains many
> copies of the same link. "delete all interwiki links to this page" and
> "change all interwiki links from X to Y" would probably be enough.

Have you read my proposal, Tomasz? You seem to be saying the same thing,
only that you want to do other things at the same time (move to
Postgres, have more shared tables etc.).

Regards,

Erik
--
FOKUS - Fraunhofer Institute for Open Communication Systems
Project BerliOS - http://www.berlios.de