Mailing List Archive

Re: abort case
Lorenzo,

I have made a change to the message handler: it now uses an atomic
memory access operation where it previously did not. I have just pushed
that change as part of the helgrind branch.

I would appreciate it if you could give it a try and send me (via
private mail) a helgrind log.

I don't think that this change is sufficient to resolve the issue. But I
would like to take a granular approach so that I can be sure to find the
real root cause.

Thanks,
Rainer

On Tue, 2008-09-23 at 14:04 +0200, Rainer Gerhards wrote:
> ---------- Forwarded message ----------
> From: Lorenzo M. Catucci <lorenzo@sancho.ccd.uniroma2.it>
> Date: Tue, Sep 23, 2008 at 2:00 PM
> Subject: Re: Fwd: [Valgrind-users] helgrind points to race in
> pthread_mutex_lock?
> To: Rainer Gerhards <rgerhards@gmail.com>
>
> RG> Lorenzo,
> RG>
> RG> any chance you could do what is described below? (I guess "no", but I
> RG> thought I ask ;)).
> RG>
>
> It's running! I didn't recompile, since I didn't find anything new within
> the helgrind branch.
>
> I'm enclosing the very first "race" message, let me know if this does seem
> better.
>
> Thank you, and keep on course. Yours,
>
> lorenzo


_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
Re: abort case [ In reply to ]
On Tue, 23 Sep 2008, Rainer Gerhards wrote:

RG> Lorenzo,
RG>
RG> I have done a change to the message handler, it is now using an atomic
RG> memory access operation where it previously did not. I have just pushed
RG> that change as part of the helgrind branch.
RG>

Recompiled/reinstalled/restarted. make did recompile msg.c; I don't know
why rsyslogd didn't get relinked, but I forced a relink by removing
rsyslogd from tools.

This time, the server crashed even though it was being helground...

Find enclosed the full debug log.

Yours,

lorenzo

+-------------------------+----------------------------------------------+
| Lorenzo M. Catucci | Centro di Calcolo e Documentazione |
| catucci@ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
| | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY |
| Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 |
+-------------------------+----------------------------------------------+
Re: abort case [ In reply to ]
Lorenzo,

I have created a new version with one slight change, to be found in the helgrind branch.

There is also a new valgrind tool called drd inside the valgrind development tree. I think you already downloaded that tree. If so, could you please replace

valgrind --tool=helgrind .. rsyslogd ...

with

valgrind --tool=drd .. rsyslogd ...

drd does an even better job than helgrind. I also changed the source to remove some debug-system-related warnings, which would otherwise clutter up the error messages. Unfortunately, in my lab I did not find any more problems, except for the small change I mentioned. That one affects program termination (and in a very subtle way), so it should not change anything for you. I'd still be interested in a new run from you, including debug info.

Thanks,
Rainer

> -----Original Message-----
> From: rsyslog-bounces@lists.adiscon.com [mailto:rsyslog-bounces@lists.adiscon.com] On Behalf Of Lorenzo M. Catucci
> Sent: Tuesday, September 23, 2008 3:09 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] abort case
> [...]
Re: abort case [ In reply to ]
[going back to the list with Lorenzo's permission]

Hi Lorenzo,

finally... I did not realize that the queue was running in direct mode, but now that I have seen it I can reproduce a failure. That should make things much easier. Still, it is very unusual for the main message queue to run in direct mode, so I guess it is a side effect of something in a config file. I don't see it in the files you sent me, but I see that a .d directory is included. Might there be something in there? In any case, I hope I will be able to fix the bug now that I can reproduce it.

Looks like a productive day today :)

Thanks for your persistence,
Rainer

> -----Original Message-----
> From: Lorenzo M. Catucci [mailto:lorenzo@sancho.ccd.uniroma2.it]
> Sent: Wednesday, October 01, 2008 1:46 PM
> To: Rainer Gerhards
> Subject: Re: [rsyslog] abort case
>
> Sorry for the delay, I've just run helgrind's head, but as you can see,
> it crashed almost as soon as I started.
>
> The log is enclosed; I'm now checking-out head and will retry.
>
> Yours,
>
> lorenzo
Re: abort case [ In reply to ]
As I thought... a stupid error. Please pull the helgrind branch again and give it another try ;)

Rainer

> -----Original Message-----
> From: rsyslog-bounces@lists.adiscon.com [mailto:rsyslog-bounces@lists.adiscon.com] On Behalf Of Rainer Gerhards
> Sent: Wednesday, October 01, 2008 6:52 PM
> To: Lorenzo M. Catucci
> Cc: rsyslog-users
> Subject: Re: [rsyslog] abort case
> [...]
Re: abort case [ In reply to ]
On Wed, 1 Oct 2008, Rainer Gerhards wrote:

RG> [going back to the list with Lorenzo's permission]
RG>
[...]
RG>
RG> finally... I did not realize that the queue is running in direct mode,
RG> but now I saw it and can reproduce a failure. That should make things
RG> much better. Anyhow, it is very unusual that the main message queue
RG> runs in direct mode, so I guess it is a side-effect of a config file.
RG>

No, they are really the only config files present on the server
(rsyslog.conf is in /etc, the other one is the only thing inside
/etc/rsyslog.d/)

Really, I wouldn't go back to a saner configuration until we have zeroed in
on the bug... still, nowhere did I touch the MainQueue options!

Hope to hear from you soon! Yours,

lorenzo

Re: abort case [ In reply to ]
It just crashed under drd (sending both direct and through the list, it
should get to you!). I've restarted under helgrind, just for the sake of
not wasting the upcoming night...

Hear you tomorrow! Yours,

lorenzo

On Wed, 1 Oct 2008, Rainer Gerhards wrote:

RG> As I thought... stupid error. Please pull helgrind branch again and give it another try ;)
RG>
RG> Rainer
Re: abort case [ In reply to ]
This time I have attached the file! Sorry!
