Hi,
I've attached four files: two debug dumps, the conf file, and a test
case that consistently fails on my end. I hope this gives a little
more information.
Furthermore, the dumps are from 3.17.5 which is the "closest" version
to 3.18.0 that I was able to find.
Both failures occur when a large number of messages are sent to
rsyslogd at a very fast rate (see logtest.c). In the run captured in
my_arm_rsyslog_suicide_debug.txt, rsyslogd died with a SIGSEGV. In the
run captured in my_arm_rsyslog_sh_cannot_fork, so many "Z [rsyslogd]"
threads were created, and so much memory was used, that even a command
as simple as 'ls -l' would not run from the command line. I think the
number of threads grew roughly in step with the number of messages. In
the latter scenario, after I killed logtest, those zombie threads did
not appear to go away until I pressed CTRL+C on rsyslogd, which was
running in the foreground because I use the "-dn" option.
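As general background on the "Z" entries (not a diagnosis of what
rsyslogd is doing here): on Linux, a task is shown in state Z when it
has exited but its parent has not yet collected its exit status. The
sketch below is an illustration only and is not rsyslog code. It forks a
child that exits immediately, reads the child's state from /proc before
reaping it, and then reaps it with waitpid(). The names proc_state and
zombie_state_before_reap are made up for this example, and it assumes a
Linux /proc filesystem.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the one-letter state (R, S, Z, ...) from /proc/<pid>/stat.
 * Returns '?' if /proc is unavailable or the entry cannot be parsed. */
static char proc_state(pid_t pid)
{
    char path[64], buf[512];
    FILE *f;
    char *p;
    size_t n;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return '?';
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    p = strrchr(buf, ')');           /* layout: "pid (comm) state ..." */
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '?';
    return p[2];
}

/* Fork a child that exits at once, look at it before reaping, then reap.
 * Returns the state seen before reaping, or -1 on error. */
int zombie_state_before_reap(void)
{
    siginfo_t info;
    char state;
    pid_t pid;

    /* With SIGCHLD ignored, exited children are discarded automatically;
     * restore the default so the parent has something to reap. */
    signal(SIGCHLD, SIG_DFL);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);                    /* child: terminate immediately */

    /* Block until the child has exited, but leave it unreaped. */
    memset(&info, 0, sizeof info);
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
        return -1;
    state = proc_state(pid);         /* normally 'Z' at this point */

    /* Reaping is what removes the Z entry from the process list. */
    if (waitpid(pid, NULL, 0) != pid)
        return -1;
    /* A second wait fails with ECHILD: the child is really gone now. */
    assert(waitpid(pid, NULL, 0) == -1 && errno == ECHILD);
    return state;
}
```

Calling zombie_state_before_reap() from a small main() and printing the
result as a character shows the state the child was in while it was
still waiting to be reaped.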
This is on an embedded system that runs significantly slower than a
desktop or laptop, so it may be harder to reproduce on a regular
computer. I looked at all the parameters that I believe could affect
this, and I believe that for the most part the defaults are more than
adequate. The main message queue never appeared to hit the high water
mark, but it did hit the low one. So I don't think messages were being
dropped or that an overflow occurred, though I am not sure.
The processor is ARM-based and runs Linux kernel 2.6.16.12. Everything
is compiled using GCC and the standard GNU C libraries, version 3.4.5.
Rsyslog source code is cross-compiled using the following configure
line:
    ./configure --disable-zlib --disable-largefile \
        --enable-share=yes \
        --prefix=/ \
        --host=arm-unknown-linux-gnu \
        ac_cv_func_malloc_0_nonnull=yes \
        ac_cv_func_realloc_0_nonnull=yes \
        ac_cv_func_lstat_dereferences_slashed_symlink=yes \
        ac_cv_func_stat_empty_string_bug=no \
        enable_debug=no \
        enable_rtinst=no
Lastly, logtest was run with just the "-s" parameter. It is a simple C
program that I wrote myself.
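For readers without the attachment, a flood test of this general kind
could look like the sketch below. It is not the attached logtest.c: the
function name flood_syslog, the ident string and the message text are
assumptions for illustration, and the meaning of the "-s" parameter is
not modelled. It only uses the standard syslog(3) calls to send
messages to the local syslog socket as fast as a loop allows.

```c
#include <assert.h>
#include <stddef.h>
#include <syslog.h>

/* Hypothetical helper: send `count` messages to the local syslog
 * socket back to back. Returns the number of syslog() calls made. */
int flood_syslog(const char *ident, int count)
{
    int i;

    assert(ident != NULL && count >= 0);

    /* LOG_NDELAY opens the connection now instead of on first use. */
    openlog(ident, LOG_PID | LOG_NDELAY, LOG_USER);
    for (i = 0; i < count; i++)
        syslog(LOG_INFO, "flood test message %d of %d", i + 1, count);
    closelog();
    return i;
}
```

Calling flood_syslog("logtest-sketch", 100000) from a small main() is
one way to generate this sort of load; the count is arbitrary and
would need tuning for a slow target.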
I took a look at the debug messages and it does not appear that new
threads are created via calls to wtpStartWrkr in wtp.c.
If there is anything I can do to help solve this issue, please let me
know. I hope I am not doing anything wrong here.
Thanks,
Scott
>
> On Wed, Jul 2, 2008 at 12:04 PM, Scott Phuong <mycleanjunk at gmail.com> wrote:
>> Hi Rainer,
>>
>> Thanks for your reply. Looking at the default settings (from the
>> online help's configuration page), they are what I wanted. The main
>> message queue is set to a fixed-size array with at most 1 worker
>> thread, and action queues are in direct mode which, according to the
>> queue documentation page, means that no worker thread will be
>> created. Is my understanding correct? If yes, how do I quickly check,
>> without using the -d option, whether the defaults are set correctly?
>> Or what do I look for in the debug messages that get printed out to
>> confirm this?
>>
>> You also mentioned that version 3.18.0 is probably going to be
>> released as the stable version next week. I see on the webpage there
>> is a 3.17.4 and 3.17.5. Are these two versions similar to 3.18.0?
>>
>> Also, how come I did not get your reply in my email inbox? My account
>> settings look correct.
>>
>> Thanks,
>>
>> Scott Phuong
>>
>> As for the syslog buffer size, that applies to syslogd and does not
>> apply to rsyslog.
>>
>>
>>
>> My configuration files do not change the Action queue or Worker queue
>> parameters at all. Looking at
>> On Wed, 2008-07-02 at 01:15 -0700, Scott Phuong wrote:
>>> Hi,
>>>
>>> I have 3.16.2 which was recently released. I see that under certain
>>> conditions rsyslogd spawns a lot of threads:
>>> 5949 root 11216 S rsyslogd
>>> 5950 root 11216 S rsyslogd
>>> 5951 root 11216 S rsyslogd
>>> 5952 root 11216 S rsyslogd
>>> 5953 root 11216 S rsyslogd
>>> 5954 root 11216 S rsyslogd
>>> 5985 root Z [rsyslogd]
>>> 6445 root Z [rsyslogd]
>>>
>>> I had to kill rsyslogd and restart it. The first invocation had a
>>> pid of 219 before it had to be killed. The second invocation, which
>>> you see above, starts at pid 5949. The difference is the number of
>>> zombie threads that were spawned by rsyslogd before I had to kill the
>>> first invocation of it.
>>
>> I have no explanation yet for the zombies. They should not happen and so
>> far I have never seen them. We may need to go through a debug log (which
>> will become very large) to find out what's going on.
>>
>>> The question is under what conditions does rsyslogd spawn a new
>>> thread/process and why was it a zombie?
>>
>> Unfortunately, there is no quick answer. A quick one may be: when it
>> needs them, based on queue watermark settings and based on your
>> configuration. But to really understand it, you need to read this doc:
>>
>> http://www.rsyslog.com/doc-queues.html
>>
>> The doc also describes all the knobs that you can use to control thread
>> creation. There are many ;)
>>
>>> I am running rsyslogd in an
>>> embedded environment and not a regular laptop/desktop.
>>
>> Interesting use case...
>>
>>> In addition, I
>>> am using busybox and I believe the syslog buffer size is set to
>>
>> What do you mean by "syslog buffer size"? The length of a receive buffer?
>> It is 2K, thus single messages up to 2K are supported. It can be changed
>> by modifying the MAXLINE define. Note that stock syslogd (and RFC3164)
>> support only up to 1K.
>>
>>> something very low or perhaps none at all. Would this be a factor?
>>> Furthermore, I ran rsyslogd with -c3 and also without -c3, and the
>>> problem occurs in both cases.
>>
>> The compatibility modes do not affect queue operation.
>>
>>> Are these issues already known and fixed in a later version? Sorry if
>>> I am asking the same questions or have the same issues as previous
>>> people, but without the ability to search the archive (or at least, I
>>> don't know how to), I can't tell whether my problem has already
>>> been seen and/or resolved.
>>
>> If we need to find out about the zombies, we need to move on to the
>> current devel version. So I would give that a try in any case. 3.16.2
>> will (most probably) be replaced by 3.18.0 (based on the current beta)
>> next week. So I won't touch it any longer.
>>
>> Looking forward to your feedback,
>> Rainer
>>
>>>
>>> Thank you very much for your support.
>>>
>>> Scott
>>> _______________________________________________
>>> rsyslog mailing list
>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>
>