Many thanks to Neil Harris for the bots; they've been pounding on the
site for a while now, and here's what we've learned:
- The size of the database isn't an issue. If Wikipedia doubles, or
more, performance won't be affected at all. This is what I would have
expected.
- Concurrent access does slow things down, but not pathologically.
I've got 16 bots running over two high-speed connections right now,
and the server isn't swapping. However, some pages do take longer to
serve than when load is light.
- Some special pages are still moderately slow (particularly "wanted"
and "random page"), but the real time hogs now are very long pages
with lots of links. Some particular hogs are "Current events",
"Chinese sovereign", "List of rare diseases" and long history pages
with lots of changes, like those of the main page and bug reports. The
sample Scrabble game is the longest page, but it has very few links so
it's not as much of a hog, though it's still a problem.
"Current events" strikes me as a particularly big, yet solvable,
problem. We should come up with a way of breaking it into manageable
pieces. "Chinese sovereign" can clearly be broken up as well, and we
can do things like replace the Scrabble diagrams with images.