Mailing List Archive

Re: [Wikipedia-l] LAG
Jonathan Walther wrote:
> On Mon, Nov 11, 2002 at 06:38:41PM -0500, The Cunctator wrote:
> >I'm always suspicious when someone makes the assertion that Language or
> >Database X is vastly inferior to Language or Database Y, especially when
>
> I'm not saying Postgres would result in dramatic speedups; it's about
> equal to MySQL in speed. But I believe the slowdown caused by locks
> WOULD go away. I agree with you that better indexing could speed things

This is just like an edit war, so let's apply NPOV. That is,
let's agree that we are working towards a faster, smoother operation
of the system, regardless of which combination of tools eventually
achieves that goal. Until we have tried Postgres, let's refrain from
claiming that it has "obvious" advantages or drawbacks. Otherwise we
risk getting entrenched in preferences held for prestige's sake, and
few things can be more destructive.

To assess whether an alternative solution really is better or worse, I
think we should begin by measuring the current performance, then make
the change, then measure again. Luckily, I've been running
response time measurements continuously since we were running the
phase II software. And yes, response times have been worse in the
last 3-4 weeks than before.

Week       Percentage of samples when [[Chemistry]]
starting   took more than 5 seconds to retrieve
---------  ----------------------------------------
13 May 02   9%
20 May 02   8%
27 May 02   8%
 3 Jun 02  11%
10 Jun 02   8%
17 Jun 02  n/a
24 Jun 02  n/a
 1 Jul 02  n/a
 8 Jul 02  13%
15 Jul 02   8%
22 Jul 02   4%  <-- move to phase III software, all gets faster
29 Jul 02   5%
 5 Aug 02   0%
12 Aug 02  n/a
19 Aug 02  n/a
26 Aug 02   1%
 2 Sep 02  n/a
 9 Sep 02  n/a
16 Sep 02  n/a
23 Sep 02  n/a  <-- n/a means my measurement script was broken
30 Sep 02   1%
 7 Oct 02   3%  <-- still pretty good
14 Oct 02   6%  <-- worse
21 Oct 02  17%  <-- bad
28 Oct 02  12%  <-- bad
 4 Nov 02   8%  <-- bad
11 Nov 02  11%  (Mon-Wed)

(Yes, in Sweden weeks start on Monday)

Let's get those numbers below 5% again.
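
As an illustration of how a weekly figure like the ones in the table could be derived, here is a minimal Python sketch. It is not the script used for the measurements above. It assumes the samples are available as (timestamp, elapsed seconds) pairs, and the function name and the 5-second default are chosen only for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def weekly_slow_percentages(samples, threshold_seconds=5.0):
    """Map each week's Monday to the percentage of samples slower than the threshold.

    `samples` is an iterable of (datetime, elapsed_seconds) pairs (assumed format).
    """
    totals = defaultdict(int)
    slow = defaultdict(int)
    for when, elapsed in samples:
        # weekday() is 0 for Monday, so this steps back to the start of the week.
        monday = when.date() - timedelta(days=when.weekday())
        totals[monday] += 1
        if elapsed > threshold_seconds:
            slow[monday] += 1
    return {week: 100.0 * slow[week] / totals[week] for week in sorted(totals)}
```

Weeks with no samples at all simply do not appear in the result, which corresponds to the "n/a" rows.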

My script only tries to retrieve (read) pages. It doesn't try
to submit updates. At regular intervals, it accesses a URL for a
page, and outputs a log of the date and time, the URL, the HTTP status
code, the number of bytes retrieved, and the amount of elapsed time.
If you have a better idea for how to extract useful conclusions from
such data, please let me know.
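
A read-only probe of the kind described above might look roughly like the following Python sketch. It is an assumption-laden illustration, not the original script: the function names, the tab-separated log format, the 10-minute interval, and the log file name are all hypothetical.

```python
import time
import urllib.error
import urllib.request
from datetime import datetime


def fetch(url, timeout=30.0):
    """Do one GET and return (status, byte_count); status is None if no HTTP reply arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, len(response.read())
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code, 0
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None, 0


def probe(url, fetcher=fetch, clock=time.monotonic):
    """Time one retrieval and return a tab-separated log line."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    start = clock()
    status, size = fetcher(url)
    elapsed = clock() - start
    return f"{stamp}\t{url}\t{status}\t{size}\t{elapsed:.2f}"


def run(url, interval_seconds=600, log_path="probe.log"):
    """Append one log line per interval until interrupted (interval and path are placeholders)."""
    while True:
        with open(log_path, "a") as log:
            log.write(probe(url) + "\n")
        time.sleep(interval_seconds)
```

Passing the fetcher and clock as parameters keeps `probe` testable without network access. A failed request is still logged, so gaps in the log indicate that the probe itself was down and not that the site was.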


--
Lars Aronsson (lars@aronsson.se)
tel +46-70-7891609
http://aronsson.se/ http://elektrosmog.nu/ http://susning.nu/
Re: Re: [Wikipedia-l] LAG
On Wed, Nov 13, 2002 at 08:25:22PM +0100, Lars Aronsson wrote:
> Jonathan Walther wrote:
> > On Mon, Nov 11, 2002 at 06:38:41PM -0500, The Cunctator wrote:
> > >I'm always suspicious when someone makes the assertion that Language or
> > >Database X is vastly inferior to Language or Database Y, especially when
> >
> > I'm not saying Postgres would result in dramatic speedups; it's about
> > equal to MySQL in speed. But I believe the slowdown caused by locks
> > WOULD go away. I agree with you that better indexing could speed things
>
> This is just like an edit war, so let's apply the NPOV. That is,
> let's agree that we are working towards a faster, smoother operation
> of the system, regardless of which combination of tools eventually
> achieve that goal. Before we have tried Postgres, let's refrain from
> claiming that it has "obvious" advantages or drawbacks. Otherwise we
> risk getting entrenched in prestigious preferences, and few things can
> be more destructive.

What about trying MySQL's other storage backends?
It has at least two besides the default one: AFAIR, BerkeleyDB and
InnoDB.

It shouldn't be much work, since it's still MySQL, and it would give us
some data on which of the three is best for Wikipedia.
Re: Re: [Wikipedia-l] LAG
On Thu, Nov 14, 2002 at 12:19:27AM +0100, Tomasz Wegrzanowski wrote:
>> achieve that goal. Before we have tried Postgres, let's refrain from
>> claiming that it has "obvious" advantages or drawbacks. Otherwise we
>> risk getting entrenched in prestigious preferences, and few things can
>> be more destructive.
>
>What about trying other backends of MySQL ?
>It has at least two other than the default one - afair BerkeleyDB and
>InnoDB.

Experience with InnoDB on Kuro5hin.org shows it to be problematic. And it
still doesn't give you subselects. Subselects can mean some big savings in
terms of query times.

Jonathan

--
Geek House Productions, Ltd.

Providing Unix & Internet Contracting and Consulting,
QA Testing, Technical Documentation, Systems Design & Implementation,
General Programming, E-commerce, Web & Mail Services since 1998

Phone: 604-435-1205
Email: djw@reactor-core.org
Webpage: http://reactor-core.org
Address: 2459 E 41st Ave, Vancouver, BC V5R2W2
Re: Re: [Wikipedia-l] LAG
On Wed, Nov 13, 2002 at 05:46:25PM -0800, Jonathan Walther wrote:
> On Thu, Nov 14, 2002 at 12:19:27AM +0100, Tomasz Wegrzanowski wrote:
> >>achieve that goal. Before we have tried Postgres, let's refrain from
> >>claiming that it has "obvious" advantages or drawbacks. Otherwise we
> >>risk getting entrenched in prestigious preferences, and few things can
> >>be more destructive.
> >
> >What about trying other backends of MySQL ?
> >It has at least two other than the default one - afair BerkeleyDB and
> >InnoDB.
>
> Experience with InnoDB on Kuro5hin.org shows it to be problematic. And it
> still doesn't give you subselects. Subselects can mean some big savings in
> terms of query times.

Well, I don't remember Kuro5hin ever having problems as serious as
Wikipedia's.

I mean, if there are things that we can try right now with very
little work, then let's do it. In the worst case, if they don't improve
anything or even turn out worse than the current MySQL setup, we will
just be back where we started, with little effort wasted.