Mailing List Archive

[mod_backhand-users] frontend server unavailable under heavy usage
hi there.

under heavy usage (around 190 req/sec) mod_backhand started to throw
these errors:
[Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
2274 failed to establish umbilical to moderator!
[Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
2273 failed to establish umbilical to moderator!
[Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
2272 failed to establish umbilical to moderator!
[Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
2264 failed to establish umbilical to moderator!
[Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
2262 failed to establish umbilical to moderator!
[Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
2261 failed to establish umbilical to moderator!
[Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
2260 failed to establish umbilical to moderator!
[Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
2259 failed to establish umbilical to moderator!
[Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
2258 failed to establish umbilical to moderator!

and the frontend server was completely unavailable.

system load was below 1.00. where's the bottleneck? what resources
should be increased to overcome this problem?

in the backhand-users list archive i found a suggestion to increase the
listen backlog for "connection refused" errors.
i set that to 255 in the source. i also played with the system
tcp_max_syn_backlog variable, increasing and decreasing it,
but that did not seem to help much.


18:12:44 glen@heart[2:364] logs$ cat /proc/sys/net/ipv4/tcp_max_syn_backlog
384

i'm using the mod_backhand 1.2.0 release.

ps:
what permissions should UnixSocketDir have?
i saw some files owned by root there (apache itself runs under uid www,
initially started by root to be able to bind() to port 80),
and UnixSocketDir is /var/tmp, which has the sticky (+t) bit set. can that
be a problem?

--
glen
[mod_backhand-users] frontend server unavailable under heavy usage
On Monday, November 26, 2001, at 11:23 AM, glen wrote:
> under heavy usage (around 190 req/sec) mod_backhand started to throw
> these errors:
> [Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
> 2274 failed to establish umbilical to moderator!
> and the frontend server was completely unavailable.
>
> system load was below 1.00. where's the bottleneck? what resources
> should be increased to overcome this problem?

The bottleneck is the moderator process. I don't have a problem
sustaining around 190 reqs/second on my server, but there are two things
you can try:

(1) renice the moderator process to -20 (the CVS version has a PID file
option which makes this easier).
(2) Using the CVS release, turn connection pooling off.
"BackhandConnectionPools off"

> in the backhand-users list archive i found a suggestion to increase the
> listen backlog for "connection refused" errors.
> i set that to 255 in the source. i also played with the system
> tcp_max_syn_backlog variable, increasing and decreasing it,
> but that did not seem to help much.
>
> 18:12:44 glen@heart[2:364] logs$ cat /proc/sys/net/ipv4/tcp_max_syn_backlog
> 384

The backlog that is having problems is the bparent AF_UNIX socket in the
UnixSocketDir. Many OSs don't allow you to increase that queue at all.
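
For context, the queue in question is the backlog argument passed to
listen() on the UNIX-domain socket, which is separate from the TCP SYN queue
that tcp_max_syn_backlog controls. The following is a minimal sketch of how
such a listening socket is typically created. It is not mod_backhand's actual
code; the helper name listen_unix() and any path passed to it are
hypothetical. On current Linux kernels the requested backlog is silently
capped at net.core.somaxconn, so raising the number in the source alone may
have no effect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a listening AF_UNIX stream socket at `path`
 * with the requested accept-queue length.
 * Returns the socket fd, or -1 on error. */
int listen_unix(const char *path, int backlog)
{
    struct sockaddr_un addr;
    int fd;

    if (path == NULL || strlen(path) >= sizeof(addr.sun_path))
        return -1;                  /* path would not fit in sun_path */

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);    /* length was checked above */

    unlink(path);                   /* remove a stale socket file, if any */

    /* The kernel may clamp `backlog` to its own maximum. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would use it as `int fd = listen_unix(socket_path, 255);` and then
accept() connections on fd; whether 255 is honoured depends on the OS limit.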

> i'm using the mod_backhand 1.2.0 release.

> what permissions should UnixSocketDir have?
> i saw some files owned by root there (apache itself runs under uid www,
> initially started by root to be able to bind() to port 80),
> and UnixSocketDir is /var/tmp, which has the sticky (+t) bit set. can that
> be a problem?

Any permissions/ownership that allow the apache user (nobody in my
case, www in yours) to read, write, and delete files there.

--
Theo Schlossnagle
1024D/82844984/95FD 30F1 489E 4613 F22E 491A 7E88 364C 8284 4984
2047R/33131B65/71 F7 95 64 49 76 5D BA 3D 90 B9 9F BE 27 24 E7
[mod_backhand-users] frontend server unavailable under heavy usage
> > under heavy usage (around 190 req/sec) mod_backhand started to throw
> > these errors:
> > [Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
> > 2274 failed to establish umbilical to moderator!
> > and the frontend server was completely unavailable.
> >
> > system load was below 1.00. where's the bottleneck? what resources
> > should be increased to overcome this problem?
>
> The bottleneck is the moderator process. I don't have a problem
> sustaining around 190 reqs/second on my server, but there are two things
> you can try:
>
> (1) renice the moderator process to -20 (the CVS version has a PID file
> option which makes this easier).

This is pretty critical on FreeBSD for some reason. Would it be
possible to add a new configuration knob that'll have the moderator
renice itself to -20 before it drops root privs? -sc
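
A minimal sketch of what such a knob might do internally, assuming the
moderator calls it while it is still running as root. The helper names
renice_self() and current_nice() are hypothetical and not part of
mod_backhand; setpriority() and getpriority() are the standard calls.

```c
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical helper: set the calling process's nice value.
 * Returns 0 on success, -1 on failure (errno is set by setpriority()).
 * Lowering the value (for example to -20) normally requires root, so this
 * would have to run before the process gives up root privileges. */
int renice_self(int nice_value)
{
    /* PRIO_PROCESS with who == 0 targets the calling process. */
    return setpriority(PRIO_PROCESS, 0, nice_value);
}

/* Hypothetical helper: read back the current nice value into *out.
 * getpriority() can legitimately return -1, so errno has to be cleared
 * before the call and checked afterwards. Returns 0 on success. */
int current_nice(int *out)
{
    int v;

    errno = 0;
    v = getpriority(PRIO_PROCESS, 0);
    if (v == -1 && errno != 0)
        return -1;
    *out = v;
    return 0;
}
```

In a real module the call would be `renice_self(-20)` placed just before the
code that switches to the unprivileged user, with a failure logged rather than
treated as fatal.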

--
Sean Chittenden
[mod_backhand-users] frontend server unavailable under heavy usage
On Monday, November 26, 2001, at 03:53 PM, Sean Chittenden wrote:
>> (1) renice the moderator process to -20 (the CVS version has a PID file
>> option which makes this easier).
>
> This is pretty critical on FreeBSD for some reason. Would it be
> possible to add a new configuration knob that'll have the moderator
> renice itself to -20 before it drops root privs? -sc

Using the CVS version, I add the following line to httpd.conf:

BackhandModeratorPIDFile /var/run/moderator.pid

And the following lines to my apachectl right before the ";;" of the
start) clause:

sleep 2
renice -20 -p `cat /var/apache/backhand/moderator.pid`

When I start my server, it starts up, sleeps two seconds to make sure
everything is up, and then renices the moderator automatically. This seems
to work well for me.

--
Theo Schlossnagle
1024D/82844984/95FD 30F1 489E 4613 F22E 491A 7E88 364C 8284 4984
2047R/33131B65/71 F7 95 64 49 76 5D BA 3D 90 B9 9F BE 27 24 E7
[mod_backhand-users] frontend server unavailable under heavy usage
> >> (1) renice the moderator process to -20 (the CVS version has a PID file
> >> option which makes this easier).
> >
> > This is pretty critical on FreeBSD for some reason. Would it be
> > possible to add a new configuration knob that'll have the moderator
> > renice itself to -20 before it drops root privs? -sc
>
> Using the CVS version, I add the following line to httpd.conf:
>
> BackhandModeratorPIDFile /var/run/moderator.pid
>
> And the following lines to my apachectl right before the ";;" of the
> start) clause:
>
> sleep 2
> renice -20 -p `cat /var/apache/backhand/moderator.pid`
>
> When I start my server, it starts up, sleeps two seconds to make sure
> everything is up, and then renices the moderator automatically. This seems
> to work well for me.

Ooh! I like. I'll tweak apachectl via the port so that it does this.
Nice tip, thanks! -sc

--
Sean Chittenden
[mod_backhand-users] frontend server unavailable under heavy usage
Today, Theo Schlossnagle wrote:

> > under heavy usage (around 190 req/sec) mod_backhand started to throw
> > these errors:
> > [Mon Nov 26 16:42:24 2001] [error] (4)Interrupted system call: Child
> > 2274 failed to establish umbilical to moderator!
> > and the frontend server was completely unavailable.
> >
> > system load was below 1.00. where's the bottleneck? what resources
> > should be increased to overcome this problem?
>
> The bottleneck is the moderator process. I don't have a problem
> sustaining around 190 reqs/second on my server, but there are two things
> you can try:
>
> (1) renice the moderator process to -20 (the CVS version has a PID file
> option which makes this easier).

i tried that. but it did not seem to help.
hmm, i have 1.2.0 version and the pidfile option is supported :)

> (2) Using the CVS release, turn connection pooling off.
> "BackhandConnectionPools off"
i'll try it tomorrow :)


> > in the backhand-users list archive i found a suggestion to increase the
> > listen backlog for "connection refused" errors.
> > i set that to 255 in the source. i also played with the system
> > tcp_max_syn_backlog variable, increasing and decreasing it,
> > but that did not seem to help much.
> >
> > 18:12:44 glen@heart[2:364] logs$ cat /proc/sys/net/ipv4/tcp_max_syn_backlog
> > 384
>
> The backlog that is having problems is the bparent AF_UNIX socket in the
> UnixSocketDir. Many OSs don't allow you to increase that queue at all.

so the EINTR error is directly related to too many processes trying to
open that socket?

shouldn't the right behaviour be to retry on an EINTR error? currently
the code calls exit(), so apache keeps spawning children and mod_backhand
keeps exiting.
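
The retry idea can be sketched roughly as below. This is a generic
illustration, not a patch against mod_backhand: the names op_fn,
retry_on_eintr() and flaky_op() are hypothetical, and flaky_op() only
simulates an interrupted call. One caveat applies to connect() specifically:
POSIX says an interrupted connect() keeps going asynchronously, so calling it
again may return EALREADY or EISCONN instead of succeeding, and a real fix
would need to handle that case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* An operation that follows the usual syscall convention:
 * returns -1 and sets errno on failure. */
typedef int (*op_fn)(void *arg);

/* Hypothetical helper: call `op` until it succeeds, fails with an error
 * other than EINTR, or has been attempted `max_tries` times. */
int retry_on_eintr(op_fn op, void *arg, int max_tries)
{
    int rc;
    int tries;

    assert(op != NULL);
    for (tries = 0; tries < max_tries; tries++) {
        rc = op(arg);
        if (rc != -1 || errno != EINTR)
            return rc;      /* success, or a real (non-EINTR) error */
    }
    errno = EINTR;          /* gave up: every attempt was interrupted */
    return -1;
}

/* Stand-in for an interruptible call: fails with EINTR until the counter
 * pointed to by `arg` reaches zero, then succeeds. */
int flaky_op(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0) {
        (*remaining)--;
        errno = EINTR;
        return -1;
    }
    return 0;
}
```

Bounding the number of attempts keeps a child from spinning forever if the
moderator never becomes reachable; after the limit it can still log the error
and exit as it does today.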


> > i'm using the mod_backhand 1.2.0 release.
>
> > what permissions should UnixSocketDir have?
> > i saw some files owned by root there (apache itself runs under uid www,
> > initially started by root to be able to bind() to port 80),
> > and UnixSocketDir is /var/tmp, which has the sticky (+t) bit set. can that
> > be a problem?
>
> Any permissions/ownership that allow the apache user (nobody in my
> case, www in yours) to read, write, and delete files there.

now that i changed the socket dir to be OWNED by the apache user (not just
writable by it), i don't see bchild-* files anymore:

total 4.0k
srwx------ 1 root root 0 Nov 26 11:05 bchild-00501=
srwx------ 1 root root 0 Nov 26 16:14 bchild-05452=
srwx------ 1 www www 0 Nov 19 15:54 bchild-21983=
srwx------ 1 www www 0 Nov 19 15:54 bchild-21984=
srwx------ 1 root root 0 Nov 26 09:42 bchild-32618=
srwxrwxr-x 1 www www 0 Nov 26 16:14 bparent=


--
glen