Hi
I'm fairly new to Wikipedia and think it's an amazing project. Congrats to
all involved! I've been hit by the speed problems though, so I've been
following the optimization discussions from afar and may be able to
contribute somewhat. I'm hoping to have some more time to help out with
optimizing, as I've got quite a bit of experience in the field (at least I've
just written a MySQL book, so people think I do).
I was quite surprised to read that the web and database servers are on the
same machine. Simply separating the two (even onto two equivalent, slower
machines), which it looks like you're going to do, usually makes a
noticeable difference, since each machine can do its one job optimally
rather than hopping about trying to do both. Definitely put the MySQL server
on the better machine.
Another fairly easy thing you can do to help cut out the slow queries is to
activate slow query logging. This logs only the queries that take longer
than a set number of seconds (long_query_time), so you can quickly tell
quite a lot from the log and work on those queries as a priority.
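As a rough illustration of what turning this on looks like, here is a
minimal my.cnf sketch. The log path and the 5-second threshold are
assumptions chosen for the example (the server default threshold is 10
seconds), and the option names depend on the server version, so check
them against the manual for the version actually installed.

```ini
[mysqld]
# MySQL 5.1 and later: switch the slow query log on and name its file.
# The path is an assumed example; any file mysqld can write to will do.
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/slow.log

# Queries that run longer than this many seconds are written to the log.
long_query_time     = 5

# Older servers (3.23 / 4.x) use this single option in place of the two
# slow_query_log* lines above:
# log-slow-queries  = /var/log/mysql/slow.log
```

The mysqldumpslow script that ships with MySQL can then summarize the log
so the worst offenders stand out.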
MySQL 4 would make sense as an upgrade. It's more than stable enough on any
*nix platform, and has some substantial advantages, as some of you have
pointed out already.
I haven't really had a look at the code, so I'm not sure how relevant all
this is, but mirroring pages as static files makes a lot of sense, and takes
unnecessary load off the database. The front page, and all articles, should
be available at high speed (this at least gives newcomers something to see
even if the db is churning in the background). If the front page needs to do
any database queries, then mirror it as a static page every x minutes, which
is much more efficient than doing the same queries hundreds of times a
minute. You could mirror individual articles too, but I'd need to look at
the code and usage to see if this will help.
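To make the "mirror it every x minutes" idea concrete, here is a minimal
Python sketch. It is not taken from the Wikipedia code: the function name
refresh_static_page, the render callback (standing in for whatever runs
the database queries and builds the HTML) and the 300-second default are
all hypothetical. It regenerates the static file only when the existing
copy is missing or too old, so the queries run once per interval rather
than once per request.

```python
import os
import tempfile
import time


def refresh_static_page(path, render, max_age_seconds=300, now=None):
    """Regenerate the static file at `path` if it is missing or stale.

    `render` is a hypothetical callback that runs the expensive queries
    and returns the page as a string. Returns True if the file was
    rewritten, False if the existing copy was still fresh.
    """
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except FileNotFoundError:
        age = None  # no static copy yet

    if age is not None and age < max_age_seconds:
        return False  # fresh enough: no database work at all

    html = render()

    # Write to a temporary file in the same directory, then rename it over
    # the old copy, so the web server never serves a half-written page.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(html)
    os.chmod(tmp_path, 0o644)  # mkstemp creates owner-only files
    os.replace(tmp_path, path)
    return True
```

This could be run from cron every few minutes, or called at the start of
a request, with the web server configured to serve the resulting file
directly.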
Hopefully I'll be able to help out more, but these are just some thoughts
that may be useful to think about for now. Apologies if they're not relevant
because I haven't taken a close enough look at the details. I'll hopefully
be able to dive in soon. I'd also like to take a look at the my.cnf file, if
that's possible, and to know the hardware of the db server. The server
configuration is something that can be optimized quite quickly, and may help
a bit, if it's not been done already.
regards,
ian gilfillan