Mailing List Archive

v1.19.1 is crashing
v1.19.1 hasn't been totally stable for us on RHEL5.. I think it's crashed
a couple of times this week. Here's the latest log entry from the crash:

*** glibc detected *** rsyslogd: corrupted double-linked list: 0xb3427a98 ***
======= Backtrace: =========
/lib/libc.so.6[0x4152cda9]
/lib/libc.so.6(cfree+0x90)[0x415305d0]
rsyslogd[0x804ddba]
rsyslogd(llExecFunc+0x3f)[0x805e86f]
rsyslogd[0x804d80a]
rsyslogd[0x804d938]
/lib/libpthread.so.0[0x416112db]
/lib/libc.so.6(clone+0x5e)[0x4159414e]

Are there any tricks for getting a core dump? I've now started it with
unlimited core size, in case it goes down again.



-jf
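
On the core dump question above, here is a minimal sketch. It assumes
rsyslogd is launched from a bash or dash init script, so that a limit set
in that script is inherited by the daemon. The helper name
core_limit_status is made up for illustration; only "ulimit -c" is a real
shell builtin option here.

```shell
#!/bin/sh
# Sketch: lift the core-file size limit before the daemon is started and
# report what the shell ended up with.

core_limit_status() {
    # $1: a value as printed by "ulimit -c"
    # Prints "unlimited", "disabled" (limit is 0) or "limited".
    case "$1" in
        unlimited) echo "unlimited" ;;
        0)         echo "disabled" ;;
        *)         echo "limited" ;;
    esac
}

# Try to raise the soft limit; this fails quietly if the hard limit is lower.
ulimit -c unlimited 2>/dev/null || true
echo "core limit is now: $(core_limit_status "$(ulimit -c)")"
```

If the status printed is "disabled", no core file will be written no matter
how the daemon dies. By default the kernel writes the core file into the
working directory of the crashing process, so that directory also has to be
writable.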
v1.19.1 is crashing [ In reply to ]
On 8/31/07, Jan-Frode Myklebust <janfrode at tanso.net> wrote:
> v1.19.1 hasn't been totally stable for us on RHEL5.. I think it's crashed
> a couple of times this week. Here's the latest log entry from the crash:

rsyslog has never been quite stable for me since I
started using it in production (around version 1.17.x).
It's annoying, but I'm too lazy to debug it right now.
Every version is crashing every now and again. Under
load it can stay up a few hours or a few days, but I've
never seen it work for more than 3-4 days on end.
v1.19.1 is crashing [ In reply to ]
Jan-Frode Myklebust wrote:
> v1.19.1 hasn't been totally stable for us on RHEL5.. I think it's crashed
> a couple of times this week. Here's the latest log entry from the crash:
>
> *** glibc detected *** rsyslogd: corrupted double-linked list: 0xb3427a98 ***
> ======= Backtrace: =========
> /lib/libc.so.6[0x4152cda9]
> /lib/libc.so.6(cfree+0x90)[0x415305d0]
> rsyslogd[0x804ddba]
> rsyslogd(llExecFunc+0x3f)[0x805e86f]
> rsyslogd[0x804d80a]
> rsyslogd[0x804d938]
> /lib/libpthread.so.0[0x416112db]
> /lib/libc.so.6(clone+0x5e)[0x4159414e]
>
> Are there any tricks for getting a core dump? I've now started it with
> unlimited core size, in case it goes down again.
>
Hi,

Could you please provide some more info on your configuration?
Configuration file, options used, log entries preceding the crash, ...
If you are logging forwarded messages, is the remote logger also rsyslog?
Does it use any templates?
v1.19.1 is crashing [ In reply to ]
On 2007-08-31, theinric at redhat.com <theinric at redhat.com> wrote:
>
> could you please provide some more info on your configuration?
> Configuration file,

#################################################################################
$ grep -v ^# /etc/rsyslog.conf|grep -v ^$
$template DailyPerHostLogs,"/var/log/syslog/%$YEAR%/%$MONTH%/%$DAY%/%HOSTNAME%.log"
*.* -?DailyPerHostLogs
$template MaillogTemplate,"%timegenerated::fulltime% %HOSTNAME% %syslogtag%: %msg%\n"
$template HourlyMaillog,"/var/log/syslog/maillog/%$YEAR%/%$MONTH%/%$DAY%/maillog-%$YEAR%%$MONTH%%$DAY%%$HOUR%.log"
mail.* -?HourlyMaillog;MaillogTemplate
$template precise,"%timegenerated::fulltime% %HOSTNAME% %syslogfacility-text%/%syslogseverity-text% %syslogtag% %msg%\n"
*.* -/var/log/syslog/everything;precise
mail.* ~
$template PerAppLogs,"/var/log/syslog/apps/%programname%.log"
*.* -?PerAppLogs
:msg, contains, "ServeRAID" -/var/log/syslog/apps/serveraid.log
:HOSTNAME, !isequal, "loghost1" ~
*.info;mail.none;authpriv.none;cron.none /var/log/messages
authpriv.* /var/log/secure
mail.* -/var/log/maillog
cron.* /var/log/cron
*.emerg *
uucp,news.crit /var/log/spooler
local7.* /var/log/boot.log
#################################################################################


> options used,

$ grep -v ^# /etc/sysconfig/rsyslog
SYSLOGD_OPTIONS="-m 0 -r514"
KLOGD_OPTIONS="-x"
SYSLOG_UMASK=077

> log entries preceding the crash, ...

It's quite a busy log server, with about 70 active old-style syslog servers
sending logs to it. In the second it crashed it wrote 111 log messages (273
in the second before), mostly from various postfix daemons, and I'd need to
anonymize them before sharing. I can't see anything special in them.

> If logging forwarded messages, is the remote logger also rsyslog?

No, all are RHEL3/4/5 with their default syslogd server.


-jf
v1.19.1 is crashing [ In reply to ]
Another one of these, this time with v1.19.3:

*** glibc detected *** rsyslogd: corrupted double-linked list: 0xae3fa998 ***
======= Backtrace: =========
/lib/libc.so.6[0x4152ce3e]
/lib/libc.so.6(cfree+0x90)[0x415305d0]
rsyslogd(MsgDestruct+0x73)[0x8057393]
rsyslogd[0x804de0a]
rsyslogd(llExecFunc+0x3f)[0x805ea3f]
rsyslogd[0x804d86a]
rsyslogd[0x804d997]
/lib/libpthread.so.0[0x416112db]
/lib/libc.so.6(clone+0x5e)[0x4159414e]

And I didn't get any core file, maybe because the v1.19.3 upgrade overwrote
my "ulimit -c unlimited" change to the initscript... Oops :-)


-jf
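
Without a core file, the unnamed rsyslogd frames in a backtrace like the
one above can sometimes still be mapped to function names. The following
is a sketch under assumptions: the helper name rsyslogd_frames is
hypothetical, the binary path /sbin/rsyslogd is a guess, and the lookup
only gives useful names if the installed binary is the same build that
crashed and has not been stripped.

```shell
#!/bin/sh
# Sketch: pull the addresses of rsyslogd's own frames out of a glibc
# backtrace read from stdin, one address per line.

rsyslogd_frames() {
    # Matches lines such as "rsyslogd[0x804ddba]" and
    # "rsyslogd(llExecFunc+0x3f)[0x805e86f]"; libc and libpthread
    # frames are skipped.
    sed -n 's/^rsyslogd.*\[\(0x[0-9a-f]*\)\]$/\1/p'
}

# Usage (paths are assumptions; addr2line is part of GNU binutils):
#   rsyslogd_frames < backtrace.txt | xargs addr2line -f -e /sbin/rsyslogd
```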
v1.19.1 is crashing [ In reply to ]
Hi,

I have just released 1.19.4 and hope that the fixes also address your
problem. I'd appreciate it if you could try it out.

Rainer

> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
v1.19.1 is crashing [ In reply to ]
Hi,

I noticed that one problem I received patches for (which are now
included in 1.19.4) is rooted in something that is not fully patched. I
think I've now found that root cause (thanks to the info that came with
those patches). The bottom line, however, is that 1.19.4 may still have
some stability issues, though they should now surface only in very
obscure cases (but what is obscure...).

I'd still appreciate it if you could apply 1.19.4 and tell me the outcome.
I am now working on fixing the root cause. That might take a short
while, as I am thinking about the best *design* to fix the issue.

Rainer

v1.19.1 is crashing [ In reply to ]
On 2007-09-04, Rainer Gerhards <rgerhards at hq.adiscon.com> wrote:
>
> I'd still appreciate if you could apply 1.19.4 and tell me the outcome.
> I am now working on fixing the root cause. That might take a short
> while, as I am thinking about the best *design* to fix the issue.

OK, thanks. I've upgraded my loghost to 1.19.4 now, will let you
know if it fails again.


-jf
v1.19.1 is crashing [ In reply to ]
You have an error in your config file, but it's probably harmless:
%timegenerated::fulltime%

The option fulltime doesn't exist, and it looks like it never did.
Additionally, a colon is missing. I've noticed that this definition is
actually present in sample.conf, so you probably picked it up there. (It
looks like it has been there since at least 0.8.1.)
I did some testing, and it probably doesn't have any impact except for a
warning message in debug mode.
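
To see which templates carry the stray option, something like the
following sketch could be used. The helper name find_fulltime is made up
for illustration, and the pattern only looks at "$template" lines, which
is an assumption about where the property is used.

```shell
#!/bin/sh
# Sketch: list the $template lines of an rsyslog configuration file that
# use the nonexistent "fulltime" option, prefixed with their line numbers.

find_fulltime() {
    # $1: path to the configuration file, e.g. /etc/rsyslog.conf
    grep -n '^\$template.*%[^%]*:fulltime%' "$1"
}

# Usage:
#   find_fulltime /etc/rsyslog.conf
```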

v1.19.1 is crashing [ In reply to ]
I have checked on fulltime: it is an error in sample.conf. There is no
need for this option, and I think it was never actually present. I'll
remove that sample.

Rainer

v1.19.1 is crashing [ In reply to ]
On Tuesday, 2007-09-04 at 18:58 +0200, Rainer Gerhards wrote:
> Hi,
>
> I noticed that one problem I received patches for (and that are now
> included in 1.19.4) is rooted in something that is not fully patched. I
> think I've found that root cause now (thanks to the info with those
> patches). The bottom line, however, is that 1.19.4 may still have some
> stability issues. However, they should surface now only in very obscure
> cases (but what is obscure...).
Well, at least it was running for almost two hours.....

I'm running RHEL5. Here is my config:
$AllowedSender UDP, 1.2.3.0/24
$AllowedSender TCP, 1.2.3.0/24
$ModLoad MySQL

$template clamavFile,"/home/local/log/%$YEAR%/%$MONTH%/%$DAY%/%HOSTNAME%/clamav"
$template eximFile,"/home/local/log/%$YEAR%/%$MONTH%/%$DAY%/%HOSTNAME%/exim"
$template avFile,"/home/local/log/%$YEAR%/%$MONTH%/%$DAY%/%HOSTNAME%/av"


*.* /home/local/log/all
!rsyslogd
:programname, contains, "rsyslogd" /home/local/log/rsyslogd

!spamd
:msg, contains, "prefork: child states" ~
:msg, contains, "spamd: got connection over /opt/antispam/var/socket/spamd" ~
:msg, contains, "spamd: checking message (unknown) for exim:427" ~
:msg, contains, "spamd: handled cleanup of child pid" ~
mail.* /home/local/log/mail

!clamd
:msg, contains, "No stats for Database check - forcing reload" ~
:msg, contains, "Reading databases from /var/clamav" ~
:msg, contains, "Database correctly reloaded" ~
:msg, contains, "SelfCheck: Database status OK." ~
local5.* ?clamavFile
!exim
*.* ?eximFile
:msg, contains, "malware detected" ?avFile


--
CU,
Patrick.
v1.19.1 is crashing [ In reply to ]
On 2007-09-05, Jan-Frode Myklebust <janfrode at tanso.net> wrote:
> On 2007-09-04, Rainer Gerhards <rgerhards at hq.adiscon.com> wrote:
>>
>> I'd still appreciate if you could apply 1.19.4 and tell me the outcome.
>> I am now working on fixing the root cause. That might take a short
>> while, as I am thinking about the best *design* to fix the issue.
>
> OK, thanks. I've upgraded my loghost to 1.19.4 now, will let you
> know if it fails again.

It failed again yesterday:

*** glibc detected *** rsyslogd: corrupted double-linked list: 0xb7209028 ***
======= Backtrace: =========
/lib/libc.so.6[0x4152ce3e]
/lib/libc.so.6(cfree+0x90)[0x415305d0]
rsyslogd(MsgDestruct+0x73)[0x8057e93]
rsyslogd[0x804de4a]
rsyslogd(llExecFunc+0x3f)[0x805eb0f]
rsyslogd[0x804d8aa]
rsyslogd[0x804d9d7]
/lib/libpthread.so.0[0x416112db]
/lib/libc.so.6(clone+0x5e)[0x4159414e]

I have "mon" monitoring that rsyslogd is running, and restarting it when
it fails. "mon" restarted rsyslogd twice (Thu Sep 6 20:38 and Fri
Sep 7 03:27), but I can't find any backtrace from the second crash.


-jf
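
The kind of liveness check described above can be sketched as follows.
This is not the actual mon configuration, which was not posted: the
helper name is_running, the pidfile path and the restart command in the
usage comment are all assumptions.

```shell
#!/bin/sh
# Sketch: succeed if the process recorded in a pidfile is still alive.

is_running() {
    # $1: path to a pidfile
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    # "kill -0" sends no signal; it only checks that the pid exists
    # and that we are allowed to signal it.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Usage (paths are assumptions):
#   is_running /var/run/rsyslogd.pid || /etc/init.d/rsyslog restart
```

A check like this only notices that the daemon is gone. It says nothing
about why, so the backtrace or a core file still has to be collected
separately.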
v1.19.1 is crashing [ In reply to ]
Thanks for the feedback. I will probably release a new version today. It
has an important fix, which hopefully solves this issue. The bad thing
is that I cannot reproduce the problem in my lab, so I am basically
back to reviewing code and listening to your feedback ;) I have one more
area (in the same class) under suspicion, but I may hold off on changing
it until the current code change has been tried out.

As a side note, the *actual* root cause was an overly complex internal
API, which led to wrong calling sequences in some parts of the code. I
have now restructured the API and revisited all places where it was
called. There is another, similar API, and this is what I am currently
reviewing. I am not sure I want to change that API without real need,
because it is used a lot and any such change of course has new bug
potential.

Rainer
