Mailing List Archive

ConnectionPools and removeSelf
I'd appreciate help squeezing a bit more performance out of my
backhand-assisted setup. I'm running a heavily trafficked two-tier setup
(single master proxying to several slaves). I currently have
BackhandConnectionPools off on the slave servers.

My first question: Whenever I turn BackhandConnectionPools off on the master
server, all 1024 apache processes get eaten up within a minute and the
server dies. I don't think that apache is tearing down the connections in a
timely manner. Is there anything I can do to correct this? With connection
pools on, I'm still getting my apache error logs stuffed with the "connect()
time out" error messages.

Second question: I have RemoveSelf set on the master server. However, it
appears that whenever a slave server uses up all of its Apache processes,
the master server ignores RemoveSelf and tries to serve pages anyway. Is
there any way to circumvent this behavior so that the master server will
absolutely NEVER try to serve pages itself (and perhaps redirect to another
slave server)? I believe that this behavior was raised by another user, but I
don't believe I saw an answer.

Thanks in advance for any help.
ConnectionPools and removeSelf
On Saturday, Oct 18, 2003, at 18:55 US/Eastern, Stephen Wang wrote:

> I'd appreciate help squeezing a bit more performance out of my
> backhand-assisted setup. I'm running a heavily trafficked two-tier
> setup (single master proxying to several slaves). I currently have
> BackhandConnectionPools off on the slave servers.

BackhandConnectionPools (on|off) shouldn't have any effect on the
slaves at all. The slaves don't have the opportunity to redirect
connections from clients, so to use connection pooling or not isn't
relevant.

> My first question: Whenever I turn BackhandConnectionPools off on the
> master server, all 1024 apache processes get eaten up within a minute
> and the server dies. I don't think that apache is tearing down the
> connections in a timely manner. Is there anything I can do to correct
> this? With connection pools on, I'm still getting my apache error logs
> stuffed with the "connect() time out" error messages.

This is really odd. Usually we see the opposite: things tend
to be a bit less stable under heavy load with connection pooling on.
The reason is that the moderator process becomes essential to
serving each request. The moderator establishes the connections; an
Apache child requests an open connection from the moderator, the
moderator hands it over, the child uses it, and then the child
hands it back. With a vast number of children, the moderator
process becomes highly contended.

As you said, perhaps it isn't tearing down the connections correctly.
Assuming you want to disable connection pooling on the front end box, I
would suggest turning keep-alives off on the backend boxes as well as
force-downgrading all client connections to HTTP 1.0. The protocol is
a bit simpler, as it doesn't use chunked encoding, and there is less
plumbing required to proxy it back to a slave server. There is a force
downgrade example in the default Apache httpd.conf file (for MSIE); you
should be able to generalize it to downgrade everything.
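
A minimal sketch of that generalization, assuming Apache 1.3 with
mod_setenvif loaded on the front end. The stock httpd.conf line targets a
single MSIE build; widening the pattern to ".*" so that it catches every
client is an assumption and should be tested before relying on it:

```apache
# Front end (master) httpd.conf fragment.
# Disable mod_backhand connection pooling, as discussed above.
BackhandConnectionPools Off

# Stock httpd.conf example, limited to one MSIE beta:
#   BrowserMatch "MSIE 4\.0b2;" nokeepalive downgrade-1.0 force-response-1.0
# Generalized form: match any User-Agent string so that every request is
# treated as HTTP/1.0 and keep-alive is refused.
BrowserMatch ".*" nokeepalive downgrade-1.0 force-response-1.0
```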

If you aren't going to use connection pools (which may achieve higher
performance), then I would recommend setting keep-alives off EVERYWHERE.
Most big (10 million+ hits / day) sites running Apache 1.3.x turn off
keep-alives to achieve more client concurrency at the cost of (hardly
noticeable) degraded client-perceived performance.
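
In Apache 1.3 this comes down to one core directive. The fragment below is
a sketch that assumes it is placed in the main server section of httpd.conf
on the front end and on each slave:

```apache
# httpd.conf fragment for the front end and every slave.
# Close each client connection after a single request.
KeepAlive Off
```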

> Second question: I have RemoveSelf set on the master server. However,
> it appears that whenever a slave server uses up all of its Apache
> processes, the master server ignores RemoveSelf and tries to serve
> pages anyway. Is there any way to circumvent this behavior so that the
> master server will absolutely NEVER try to serve pages itself (and
> perhaps redirect to another slave server)? I believe that this behavior
> was raised by another user, but I don't believe I saw an answer.

The answer is clear cut. The problem is that the client has a
connection to a server and has asked a question. So, what if that
server is absolutely unable to make a request to a slave server? It
has to serve the page from somewhere (even if it is an error).

The quick answer is that you can use the two-argument version of
MulticastStats to advertise the front end machine as "someone else". This
means the front end box will see all slave servers and itself as usual,
but it will see "itself" as the IP:port of some other machine, whether
that is one of the slave servers or a little thttpd instance that serves
"Sorry, the system is overloaded" pages. That setup combined with the
BackhandSelfRedirect On option should solve your problem.

BackhandSelfRedirect On tells mod_backhand that even if it chooses
itself, it should proxy there rather than let Apache serve it internally.
The caveat here is that mod_backhand needs to be able to choose itself.
If it can't choose anyone, it will serve the request internally. So, assuming
you have a cluster of 8 or so machines, I would use something like:

MulticastStats fallbackip:fallbackport multicastip:port,ttl
BackhandSelfRedirect On

Backhand byAge
Backhand removeSelf
Backhand byRandom
Backhand byLogWindow
Backhand byBusyChildren
Backhand addSelf

This will give you a list of servers (only the first of which should be
used), guaranteeing that the last server in the list is "yourself". But
"yourself" is really some fallback server, and you will redirect there.
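
For illustration only, here is the same configuration with the placeholders
filled in. Every address, port, and path below is hypothetical: 10.0.0.50:8080
stands in for the fallback server, 239.255.0.1:4445 with a TTL of 1 for the
cluster's multicast group, and the Directory path for whatever tree is being
balanced. A working setup needs the other mod_backhand directives (such as
UnixSocketDir and AcceptStats) that are left out of this sketch.

```apache
# Front end httpd.conf fragment. All addresses and paths are hypothetical.
<IfModule mod_backhand.c>
    # First argument: the address this box advertises as "itself"
    # (here, the fallback server).
    # Second argument: the multicast group, port, and TTL used by the cluster.
    MulticastStats 10.0.0.50:8080 239.255.0.1:4445,1
    BackhandSelfRedirect On
</IfModule>

<Directory "/usr/local/apache/htdocs">
    # Candidacy functions run in order; addSelf puts the advertised
    # "self" (the fallback server) at the end of the list.
    Backhand byAge
    Backhand removeSelf
    Backhand byRandom
    Backhand byLogWindow
    Backhand byBusyChildren
    Backhand addSelf
</Directory>
```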

> Thanks in advance for any help.
>
> _______________________________________________
> backhand-users mailing list
> backhand-users@lists.backhand.org
> http://lists.backhand.org/mailman/listinfo/backhand-users
>
// Theo Schlossnagle
// Principal Engineer -- http://www.omniti.com/~jesus/
// Postal Engine -- http://www.postalengine.com/
// Ecelerity: fastest MTA on earth