Mailing List Archive

(fwd) Re: Solaris 2.4 http server hangs on accept() during high load
Seen this?

>From: payne@OpenMarket.com (Andrew Payne)
>Date: 20 Jun 1995 15:08:11 -0400

In article <3s4uhi$kl3@news.xs4all.nl> erwin@dds.nl (Erwin Bolwidt) writes:

>>>We're running the Apache 0.6.2 http server on a dual sparc Sparcserver
>>>1000. The server is getting 150,000 - 160,000 hits a day and the number
>>>is still growing.
>
>>I ran into a similar problem with Apache 0.6.2 & Solaris 2.4 on a Sparc 20.
>>Any large number of connects (~8K in 15 seconds) will cause terrible problems
>>throughout the TCP networking code. The problem also happened with NCSA
>>httpd 1.3, 1.4, and Netscape's Commerce Server.
>
>>Unfortunately, I can't seem to convince Sun that there is a problem with their
>>OS, so as far as I know there is no official bug.

This is a long-shot diagnosis, because I don't know the internal
architectures of the various servers, but Solaris does not support
multi-process accept() calls on the same file descriptor.

If the server does the following:

bind()
listen()
fork()
accept() /* in each listener process */

you can {panic,crash,hang} the Solaris kernel under very heavy load. We
ran into this problem when we were stress-testing our own WebServer.

The official workaround from Sun is to serialize the calls to accept().
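
One way to serialize accept() across pre-forked processes is sketched below.
The post does not say which mechanism Sun recommended, so this is only an
assumption: it uses an fcntl() advisory lock on a lock file, so that only the
process holding the lock is ever inside accept(). The function names
(open_accept_lock, accept_lock, accept_unlock, serialized_accept) and the
lock-file path are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open (or create) the lock file. Call once in the parent, before fork(),
 * so every child inherits the descriptor. fcntl() locks are owned per
 * process, so each child still locks independently. */
int open_accept_lock(const char *path)
{
    return open(path, O_CREAT | O_WRONLY, 0600);
}

/* Apply a blocking whole-file lock operation, retrying if interrupted. */
static int set_lock(int lock_fd, short type)
{
    struct flock fl;
    int rc;

    assert(lock_fd >= 0);
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;        /* F_WRLCK to acquire, F_UNLCK to release */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "to end of file": the whole file */

    do {
        rc = fcntl(lock_fd, F_SETLKW, &fl);
    } while (rc == -1 && errno == EINTR);
    return rc;
}

int accept_lock(int lock_fd)   { return set_lock(lock_fd, F_WRLCK); }
int accept_unlock(int lock_fd) { return set_lock(lock_fd, F_UNLCK); }

/* Accept one connection while holding the lock, so at most one process
 * is blocked in accept() on listen_fd at any time.
 * Returns the connected descriptor, or -1 with errno set. */
int serialized_accept(int listen_fd, int lock_fd)
{
    int conn, saved_errno;

    if (accept_lock(lock_fd) == -1)
        return -1;

    do {
        conn = accept(listen_fd, NULL, NULL);
    } while (conn == -1 && errno == EINTR);

    saved_errno = errno;       /* keep accept()'s errno across the unlock */
    accept_unlock(lock_fd);
    errno = saved_errno;
    return conn;
}
```

In this sketch the parent would call open_accept_lock() after bind() and
before fork(), and each listener process would loop on serialized_accept()
instead of calling accept() directly. The lock is released as soon as
accept() returns, so request handling itself still runs in parallel.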

--
Andrew Payne http://www.openmarket.com/personal/payne
Open Market, Inc.

--
| Ade Rixon, Elsevier Science Ltd | http://www.dcs.aber.ac.uk/~ajr/ |

"Gentlemen!! Let's broaden our minds!"
- The Joker


----- End Included Message -----