Mailing List Archive

[mod_backhand-users] (12)Not enough space: mod_backhand: Child 1000 disconnected.
What could cause the following to show up in the error log of a
backhanded server?

[Wed Apr 19 18:17:56 2000] [error] (9)Bad file number: mod_backhand:
MBCSP error (making request)
[Wed Apr 19 18:17:56 2000] [error] mod_backhand: could not get valid
connection
[Wed Apr 19 18:17:56 2000] [error] [client 149.48.24.32] File does not
exist: /cgi-bin/test.pl

This server is backhanded along with another one...the requests to the
other server go through just fine, no odd messages in the error log.
Some of the requests to the flaky server are OK (with some being
executed locally, others going off to the other server) but many result
in a 404 and the above error.

And could it be related to another error which shows up in the same
error log that looks like this:
(12)Not enough space: mod_backhand: Child 1000 disconnected.

The second error does not appear to be directly related to the other one
(they don't show up together all the time) but both errors are only
occurring on one of the two servers.

The server with the errors is a Sparc 1000 with 6 processors and just
under a gig of RAM. The one that's working fine is a Sparc 5 with 128
megs, so the "Not enough space" seems unlikely. Both are running Solaris
2.6, and identical (NFS mounted from the same place) versions of apache
1.3.12 and mod_backhand 1.0.9pre1.

--
Mike Cramer
PBS ONLINE -- http://www.pbs.org/
(703) 739-5019
[mod_backhand-users] (12)Not enough space: mod_backhand: Child 1000 disconnected.
OK, this is a tough one to say for sure. But the fact that those are
powerful machines, together with the "Child 1000 disconnected" message,
tells me that you have HARD_SERVER_LIMIT set way up there. My guess is
that you are running out of file descriptors.

Compare the /etc/system file on your flaky server to your good server
and see if there are any disturbing differences.
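
One way to make that comparison less error-prone is to compare only the
"set" lines of the two files. The following is a minimal sketch, not
anything shipped with mod_backhand: the function names and the sample file
contents are made up for illustration, and it assumes the usual /etc/system
conventions (a leading "*" marks a comment, tunables look like
"set name = value").

```python
import re

# Matches lines such as "set rlim_fd_max = 4096" or "set module:name=value".
SET_LINE = re.compile(r"^\s*set\s+([\w:]+)\s*=\s*(\S+)")


def parse_system(text):
    """Return {tunable: value} for every 'set' line, skipping '*' comments."""
    settings = {}
    for line in text.splitlines():
        if line.lstrip().startswith("*"):
            continue
        match = SET_LINE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def diff_settings(good, flaky):
    """Map each tunable whose value differs to (good_value, flaky_value)."""
    names = sorted(set(good) | set(flaky))
    return {
        name: (good.get(name), flaky.get(name))
        for name in names
        if good.get(name) != flaky.get(name)
    }


# Hypothetical file contents, for illustration only.
good_text = """* good server
set rlim_fd_max = 4096
set rlim_fd_cur = 1024
"""
flaky_text = """* flaky server
set rlim_fd_cur = 1024
"""

differences = diff_settings(parse_system(good_text), parse_system(flaky_text))
for name, (good_value, flaky_value) in differences.items():
    print(f"{name}: good={good_value} flaky={flaky_value}")
```

A value of None on one side means the tunable is not set in that file at
all, which is itself a difference worth looking at.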

The default hard limit on file descriptors in Solaris is 1024. If you
set your HARD_SERVER_LIMIT to that, then mod_backhand will have
problems. The mod_backhand server currently uses select(), which is
limited to 1024 file descriptors in Solaris. Luckily I don't pay
attention to connections to other servers in the select() call, so
that limits you to 1024 minus a few for statistics broadcasting and
collection and log writing... I think 1015 would be safe.
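
The arithmetic behind that estimate can be written down as a quick check.
This is only a sketch of the reasoning above: the reserve of 9 descriptors
is an assumption backed out of the "1015 would be safe" figure, not a number
taken from the mod_backhand source, and the function names are made up.

```python
SELECT_LIMIT = 1024  # select() ceiling on Solaris, as stated above
RESERVED = 9         # assumption: 1024 - 1015, for stats and log descriptors


def max_safe_children(select_limit=SELECT_LIMIT, reserved=RESERVED):
    """Largest child count that still leaves the reserved descriptors free."""
    return select_limit - reserved


def fits(children, select_limit=SELECT_LIMIT, reserved=RESERVED):
    """True if this many children stays within the select() budget."""
    return children <= max_safe_children(select_limit, reserved)


for children in (1000, 1015, 1024):
    verdict = "ok" if fits(children) else "too many"
    print(f"{children} children: {verdict}")
```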

So, as a side note, you want to add:
set rlim_fd_max = 4096
to your /etc/system to raise the hard limit, and then MAKE SURE you
unlimit your file descriptor usage before you run Apache. In csh: unlimit
descriptors, and in sh: ulimit -n 4096
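
To confirm what limits a process actually ends up with, the same check can
be done programmatically. The sketch below uses Python's standard resource
module (Unix only); raise_soft_limit is a made-up helper name and 4096 is
just the value suggested above. It only affects the Python process itself
and anything that process launches, so it illustrates what the shell
commands do rather than replacing them.

```python
import resource


def raise_soft_limit(target=4096):
    """Raise the soft descriptor limit toward target, never past the hard limit.

    Returns the (soft, hard) limits in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft, hard  # already at or above what was asked for
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        pass  # the platform refused the change; report the limits as they are
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = raise_soft_limit(4096)
print(f"soft limit: {soft}, hard limit: {hard}")
```

If the hard limit printed here is still 1024, the soft limit cannot go any
higher until the hard limit itself is raised.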

You need more file descriptors, because backhand has many more open than
it is selecting on... so set your hard limit to 4096, unlimit your usage
from the shell, and I would set HARD_SERVER_LIMIT to 1024, but set
MaxClients to 1000 in your conf file.

This hopefully will help ;)

Mike Cramer wrote:
> [Wed Apr 19 18:17:56 2000] [error] (9)Bad file number: mod_backhand:
> MBCSP error (making request)
> [Wed Apr 19 18:17:56 2000] [error] mod_backhand: could not get valid
> connection
> [Wed Apr 19 18:17:56 2000] [error] [client 149.48.24.32] File does not
> exist: /cgi-bin/test.pl
> This server is backhanded along with another one...the requests to the
> other server go through just fine, no odd messages in the error log.
> Some of the requests to the flaky server are OK (with some being
> executed locally, others going off to the other server) but many result
> in a 404 and the above error.
> And could it be related to another error which shows up in the same
> error log that looks like this:
> (12)Not enough space: mod_backhand: Child 1000 disconnected.
> The second error does not appear to be directly related to the other one
> (they don't show up together all the time) but both errors are only
> occurring on one of the two servers.
> The server with the errors is a Sparc 1000 with 6 processors and just
> under a gig of RAM. The one that's working fine is a Sparc 5 with 128
> megs, so the "Not enough space" seems unlikely. Both are running Solaris
> 2.6, and identical (NFS mounted from the same place) versions of apache
> 1.3.12 and mod_backhand 1.0.9pre1.

--
Theo Schlossnagle
33131B65/2047/71 F7 95 64 49 76 5D BA 3D 90 B9 9F BE 27 24 E7