Mailing List Archive

Segfault after recursion too deeply
Hi there.

Just as an interesting fact, after a recent change in configuration I
started receiving segfaults.

segfault at 7fb3ed112ff8 ip 00007fb4323946fe sp 00007fb3ed113000 error 6
in libfastjson.so.4.2.0[7fb432392000+a000]

After some digging I pinpointed the culprit (but I must say that the
origin of the segfault - libfastjson was puzzling here).

My configuration is relatively complicated (over 6,5k lines, including
lookups) so I won't go into much detail here but the main thing is that
in the course of message processing I do a lookup on a specific part of
the message to decide which ruleset to run. Something like this:

ruleset(name="do_a_lookup")

{

    set $.ruleset = lookup ("hostname-to-ruleset", $fromhost-ip);

    call_indirect $.ruleset;

}

And unfortunately for some messages my lookup returned a "do_a_lookup"
ruleset name to run again and again (of course, since between subsequent
calls the origin of the message didn't change).

It was definitely my mistake and I was quite lucky that it occured quite
quickly after I introduced the change (if it happened with messages I
get just once a day or even once a week, it'd be very hard to troubleshoot).

You might want to think about a ruleset call "depth counter" in order to
at least emit a warning if the recursion gets too deep. But then again -
my case is very unusual and it might not be worth introducing such
change into the code.


Best regards

--
Mariusz Kruk
Ekspert ds. Bezpiecze?stwa IT
COMP S.A.
Pion Cyberbezpiecze?stwa i Zarz?dzania Ryzykiem
e-mail: mariusz.kruk@comp.com.pl
e-mail: mariusz.kruk@safecomp.com
tel: +48 608 623 299

_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: Segfault after recursion too deeply [ In reply to ]
a simple test to prevent calling the current ruleset seems like it ould be
doable, but when you start talking chains of rulesets, tracking the full depth
to prevent more indirect loops may be more difficult (or it may not be, Rainer
would need to comment on this)

I think it would be reasonable to have the default be 'no recursion', only allow
calling a ruleset once, with a possible config option to allow a configurable
recursion depth.

when the depth is exceeded, I would have rsyslog log a message and treat the
call as a noop (the other possible option would be to treat it as a stop), and
it would be fantastic if there was a 'bad logs' file that could be defined that
we could have messages that trigger bad things like this written to (in json
format with all metadata, etc) for debugging purposes

David Lang

On Fri, 16 Jul 2021, Mariusz Kruk via rsyslog
wrote:

> Date: Fri, 16 Jul 2021 14:23:32 +0200
> From: Mariusz Kruk via rsyslog <rsyslog@lists.adiscon.com>
> To: rsyslog@lists.adiscon.com
> Cc: Mariusz Kruk <mkr@safecomp.com>
> Subject: [rsyslog] Segfault after recursion too deeply
>
> Hi there.
>
> Just as an interesting fact, after a recent change in configuration I
> started receiving segfaults.
>
> segfault at 7fb3ed112ff8 ip 00007fb4323946fe sp 00007fb3ed113000 error 6
> in libfastjson.so.4.2.0[7fb432392000+a000]
>
> After some digging I pinpointed the culprit (but I must say that the
> origin of the segfault - libfastjson was puzzling here).
>
> My configuration is relatively complicated (over 6,5k lines, including
> lookups) so I won't go into much detail here but the main thing is that
> in the course of message processing I do a lookup on a specific part of
> the message to decide which ruleset to run. Something like this:
>
> ruleset(name="do_a_lookup")
>
> {
>
>     set $.ruleset = lookup ("hostname-to-ruleset", $fromhost-ip);
>
>     call_indirect $.ruleset;
>
> }
>
> And unfortunately for some messages my lookup returned a "do_a_lookup"
> ruleset name to run again and again (of course, since between subsequent
> calls the origin of the message didn't change).
>
> It was definitely my mistake and I was quite lucky that it occured quite
> quickly after I introduced the change (if it happened with messages I
> get just once a day or even once a week, it'd be very hard to troubleshoot).
>
> You might want to think about a ruleset call "depth counter" in order to
> at least emit a warning if the recursion gets too deep. But then again -
> my case is very unusual and it might not be worth introducing such
> change into the code.
>
>
> Best regards
>
>
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: Segfault after recursion too deeply [ In reply to ]
A wasn't even thinking about testing for calling current ruleset since
it seems overly complicated.

Just a simple counter increased by one on every call/call_indirect and a
check whether or not it exceeds a given (configurable?) limit.

But then again - as I said - my use case is quite complicated and the
whole first mail was more of a way to archive the problem since someone
might encounter it on their own and not be able to pinpont the cause
quickly. The case of ruleset looping might be so rare that it's not
worth adding such check.

Best regards

MK


On 16.07.2021 19:15, David Lang wrote:
> a simple test to prevent calling the current ruleset seems like it
> ould be doable, but when you start talking chains of rulesets,
> tracking the full depth to prevent more indirect loops may be more
> difficult (or it may not be, Rainer would need to comment on this)
>
> I think it would be reasonable to have the default be 'no recursion',
> only allow calling a ruleset once, with a possible config option to
> allow a configurable recursion depth.
>
> when the depth is exceeded, I would have rsyslog log a message and
> treat the call as a noop (the other possible option would be to treat
> it as a stop), and it would be fantastic if there was a 'bad logs'
> file that could be defined that we could have messages that trigger
> bad things like this written to (in json format with all metadata,
> etc) for debugging purposes
>
> David Lang
>
>  On Fri, 16 Jul 2021, Mariusz Kruk via rsyslog wrote:
>
>> Date: Fri, 16 Jul 2021 14:23:32 +0200
>> From: Mariusz Kruk via rsyslog <rsyslog@lists.adiscon.com>
>> To: rsyslog@lists.adiscon.com
>> Cc: Mariusz Kruk <mkr@safecomp.com>
>> Subject: [rsyslog] Segfault after recursion too deeply
>>
>> Hi there.
>>
>> Just as an interesting fact, after a recent change in configuration I
>> started receiving segfaults.
>>
>> segfault at 7fb3ed112ff8 ip 00007fb4323946fe sp 00007fb3ed113000
>> error 6 in libfastjson.so.4.2.0[7fb432392000+a000]
>>
>> After some digging I pinpointed the culprit (but I must say that the
>> origin of the segfault - libfastjson was puzzling here).
>>
>> My configuration is relatively complicated (over 6,5k lines,
>> including lookups) so I won't go into much detail here but the main
>> thing is that in the course of message processing I do a lookup on a
>> specific part of the message to decide which ruleset to run.
>> Something like this:
>>
>> ruleset(name="do_a_lookup")
>>
>> {
>>
>>     set $.ruleset = lookup ("hostname-to-ruleset", $fromhost-ip);
>>
>>     call_indirect $.ruleset;
>>
>> }
>>
>> And unfortunately for some messages my lookup returned a
>> "do_a_lookup" ruleset name to run again and again (of course, since
>> between subsequent calls the origin of the message didn't change).
>>
>> It was definitely my mistake and I was quite lucky that it occured
>> quite quickly after I introduced the change (if it happened with
>> messages I get just once a day or even once a week, it'd be very hard
>> to troubleshoot).
>>
>> You might want to think about a ruleset call "depth counter" in order
>> to at least emit a warning if the recursion gets too deep. But then
>> again - my case is very unusual and it might not be worth introducing
>> such change into the code.
>>
>>
>> Best regards
>>
>>
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: Segfault after recursion too deeply [ In reply to ]
I think that should be doable without too much trouble. I would
appreciate a github issue tracker.

Side-Note: I never intended to support general purpose loops, but of
course there is always a way to do it ;-)

Rainer

El sáb, 17 jul 2021 a las 16:17, Mariusz Kruk via rsyslog
(<rsyslog@lists.adiscon.com>) escribió:
>
> A wasn't even thinking about testing for calling current ruleset since
> it seems overly complicated.
>
> Just a simple counter increased by one on every call/call_indirect and a
> check whether or not it exceeds a given (configurable?) limit.
>
> But then again - as I said - my use case is quite complicated and the
> whole first mail was more of a way to archive the problem since someone
> might encounter it on their own and not be able to pinpont the cause
> quickly. The case of ruleset looping might be so rare that it's not
> worth adding such check.
>
> Best regards
>
> MK
>
>
> On 16.07.2021 19:15, David Lang wrote:
> > a simple test to prevent calling the current ruleset seems like it
> > ould be doable, but when you start talking chains of rulesets,
> > tracking the full depth to prevent more indirect loops may be more
> > difficult (or it may not be, Rainer would need to comment on this)
> >
> > I think it would be reasonable to have the default be 'no recursion',
> > only allow calling a ruleset once, with a possible config option to
> > allow a configurable recursion depth.
> >
> > when the depth is exceeded, I would have rsyslog log a message and
> > treat the call as a noop (the other possible option would be to treat
> > it as a stop), and it would be fantastic if there was a 'bad logs'
> > file that could be defined that we could have messages that trigger
> > bad things like this written to (in json format with all metadata,
> > etc) for debugging purposes
> >
> > David Lang
> >
> > On Fri, 16 Jul 2021, Mariusz Kruk via rsyslog wrote:
> >
> >> Date: Fri, 16 Jul 2021 14:23:32 +0200
> >> From: Mariusz Kruk via rsyslog <rsyslog@lists.adiscon.com>
> >> To: rsyslog@lists.adiscon.com
> >> Cc: Mariusz Kruk <mkr@safecomp.com>
> >> Subject: [rsyslog] Segfault after recursion too deeply
> >>
> >> Hi there.
> >>
> >> Just as an interesting fact, after a recent change in configuration I
> >> started receiving segfaults.
> >>
> >> segfault at 7fb3ed112ff8 ip 00007fb4323946fe sp 00007fb3ed113000
> >> error 6 in libfastjson.so.4.2.0[7fb432392000+a000]
> >>
> >> After some digging I pinpointed the culprit (but I must say that the
> >> origin of the segfault - libfastjson was puzzling here).
> >>
> >> My configuration is relatively complicated (over 6,5k lines,
> >> including lookups) so I won't go into much detail here but the main
> >> thing is that in the course of message processing I do a lookup on a
> >> specific part of the message to decide which ruleset to run.
> >> Something like this:
> >>
> >> ruleset(name="do_a_lookup")
> >>
> >> {
> >>
> >> set $.ruleset = lookup ("hostname-to-ruleset", $fromhost-ip);
> >>
> >> call_indirect $.ruleset;
> >>
> >> }
> >>
> >> And unfortunately for some messages my lookup returned a
> >> "do_a_lookup" ruleset name to run again and again (of course, since
> >> between subsequent calls the origin of the message didn't change).
> >>
> >> It was definitely my mistake and I was quite lucky that it occured
> >> quite quickly after I introduced the change (if it happened with
> >> messages I get just once a day or even once a week, it'd be very hard
> >> to troubleshoot).
> >>
> >> You might want to think about a ruleset call "depth counter" in order
> >> to at least emit a warning if the recursion gets too deep. But then
> >> again - my case is very unusual and it might not be worth introducing
> >> such change into the code.
> >>
> >>
> >> Best regards
> >>
> >>
> _______________________________________________
> rsyslog mailing list
> https://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: Segfault after recursion too deeply [ In reply to ]
Well. I understand that it's always a delicate balance between speed
(and thus implementing only the important and efficient functionalities)
and flexibility.

For now I managed with what is built-in. The case I'm talking about is
not even a loop as such. The loop occured as a side effect of a slight
misconfiguration (as I wrote earlier - the lookup was _not_ supposed to
point back to the calling ruleset). The general idea was to have a
branching execution path. The loop was an accident. But yes, I do have a
loop mechanism in one of my installation but luckily I process only
dozens of events per second there, not thousands. ;-)

I'll create an issue in a minute.

Best regards,

MK

On 18.07.2021 19:16, Rainer Gerhards wrote:
> I think that should be doable without too much trouble. I would
> appreciate a github issue tracker.
>
> Side-Note: I never intended to support general purpose loops, but of
> course there is always a way to do it ;-)
>
> Rainer
>
> El sáb, 17 jul 2021 a las 16:17, Mariusz Kruk via rsyslog
> (<rsyslog@lists.adiscon.com>) escribió:
>> A wasn't even thinking about testing for calling current ruleset since
>> it seems overly complicated.
>>
>> Just a simple counter increased by one on every call/call_indirect and a
>> check whether or not it exceeds a given (configurable?) limit.
>>
>> But then again - as I said - my use case is quite complicated and the
>> whole first mail was more of a way to archive the problem since someone
>> might encounter it on their own and not be able to pinpont the cause
>> quickly. The case of ruleset looping might be so rare that it's not
>> worth adding such check.
>>
>> Best regards
>>
>> MK
>>
>>
>> On 16.07.2021 19:15, David Lang wrote:
>>> a simple test to prevent calling the current ruleset seems like it
>>> ould be doable, but when you start talking chains of rulesets,
>>> tracking the full depth to prevent more indirect loops may be more
>>> difficult (or it may not be, Rainer would need to comment on this)
>>>
>>> I think it would be reasonable to have the default be 'no recursion',
>>> only allow calling a ruleset once, with a possible config option to
>>> allow a configurable recursion depth.
>>>
>>> when the depth is exceeded, I would have rsyslog log a message and
>>> treat the call as a noop (the other possible option would be to treat
>>> it as a stop), and it would be fantastic if there was a 'bad logs'
>>> file that could be defined that we could have messages that trigger
>>> bad things like this written to (in json format with all metadata,
>>> etc) for debugging purposes
>>>
>>> David Lang
>>>
>>> On Fri, 16 Jul 2021, Mariusz Kruk via rsyslog wrote:
>>>
>>>> Date: Fri, 16 Jul 2021 14:23:32 +0200
>>>> From: Mariusz Kruk via rsyslog <rsyslog@lists.adiscon.com>
>>>> To: rsyslog@lists.adiscon.com
>>>> Cc: Mariusz Kruk <mkr@safecomp.com>
>>>> Subject: [rsyslog] Segfault after recursion too deeply
>>>>
>>>> Hi there.
>>>>
>>>> Just as an interesting fact, after a recent change in configuration I
>>>> started receiving segfaults.
>>>>
>>>> segfault at 7fb3ed112ff8 ip 00007fb4323946fe sp 00007fb3ed113000
>>>> error 6 in libfastjson.so.4.2.0[7fb432392000+a000]
>>>>
>>>> After some digging I pinpointed the culprit (but I must say that the
>>>> origin of the segfault - libfastjson was puzzling here).
>>>>
>>>> My configuration is relatively complicated (over 6,5k lines,
>>>> including lookups) so I won't go into much detail here but the main
>>>> thing is that in the course of message processing I do a lookup on a
>>>> specific part of the message to decide which ruleset to run.
>>>> Something like this:
>>>>
>>>> ruleset(name="do_a_lookup")
>>>>
>>>> {
>>>>
>>>> set $.ruleset = lookup ("hostname-to-ruleset", $fromhost-ip);
>>>>
>>>> call_indirect $.ruleset;
>>>>
>>>> }
>>>>
>>>> And unfortunately for some messages my lookup returned a
>>>> "do_a_lookup" ruleset name to run again and again (of course, since
>>>> between subsequent calls the origin of the message didn't change).
>>>>
>>>> It was definitely my mistake and I was quite lucky that it occured
>>>> quite quickly after I introduced the change (if it happened with
>>>> messages I get just once a day or even once a week, it'd be very hard
>>>> to troubleshoot).
>>>>
>>>> You might want to think about a ruleset call "depth counter" in order
>>>> to at least emit a warning if the recursion gets too deep. But then
>>>> again - my case is very unusual and it might not be worth introducing
>>>> such change into the code.
>>>>
>>>>
>>>> Best regards
>>>>
>>>>
>> _______________________________________________
>> rsyslog mailing list
>> https://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com/professional-services/
>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: Segfault after recursion too deeply [ In reply to ]
On Sun, 18 Jul 2021, Rainer Gerhards via rsyslog wrote:

> I think that should be doable without too much trouble. I would
> appreciate a github issue tracker.
>
> Side-Note: I never intended to support general purpose loops, but of
> course there is always a way to do it ;-)

that's why I suggested making the default trigger at the first attempt to loop
:-)

David Lang


> Rainer
>
> El sáb, 17 jul 2021 a las 16:17, Mariusz Kruk via rsyslog
> (<rsyslog@lists.adiscon.com>) escribió:
>>
>> A wasn't even thinking about testing for calling current ruleset since
>> it seems overly complicated.
>>
>> Just a simple counter increased by one on every call/call_indirect and a
>> check whether or not it exceeds a given (configurable?) limit.
>>
>> But then again - as I said - my use case is quite complicated and the
>> whole first mail was more of a way to archive the problem since someone
>> might encounter it on their own and not be able to pinpont the cause
>> quickly. The case of ruleset looping might be so rare that it's not
>> worth adding such check.
>>
>> Best regards
>>
>> MK
>>
>>
>> On 16.07.2021 19:15, David Lang wrote:
>> > a simple test to prevent calling the current ruleset seems like it
>> > ould be doable, but when you start talking chains of rulesets,
>> > tracking the full depth to prevent more indirect loops may be more
>> > difficult (or it may not be, Rainer would need to comment on this)
>> >
>> > I think it would be reasonable to have the default be 'no recursion',
>> > only allow calling a ruleset once, with a possible config option to
>> > allow a configurable recursion depth.
>> >
>> > when the depth is exceeded, I would have rsyslog log a message and
>> > treat the call as a noop (the other possible option would be to treat
>> > it as a stop), and it would be fantastic if there was a 'bad logs'
>> > file that could be defined that we could have messages that trigger
>> > bad things like this written to (in json format with all metadata,
>> > etc) for debugging purposes
>> >
>> > David Lang
>> >
>> > On Fri, 16 Jul 2021, Mariusz Kruk via rsyslog wrote:
>> >
>> >> Date: Fri, 16 Jul 2021 14:23:32 +0200
>> >> From: Mariusz Kruk via rsyslog <rsyslog@lists.adiscon.com>
>> >> To: rsyslog@lists.adiscon.com
>> >> Cc: Mariusz Kruk <mkr@safecomp.com>
>> >> Subject: [rsyslog] Segfault after recursion too deeply
>> >>
>> >> Hi there.
>> >>
>> >> Just as an interesting fact, after a recent change in configuration I
>> >> started receiving segfaults.
>> >>
>> >> segfault at 7fb3ed112ff8 ip 00007fb4323946fe sp 00007fb3ed113000
>> >> error 6 in libfastjson.so.4.2.0[7fb432392000+a000]
>> >>
>> >> After some digging I pinpointed the culprit (but I must say that the
>> >> origin of the segfault - libfastjson was puzzling here).
>> >>
>> >> My configuration is relatively complicated (over 6,5k lines,
>> >> including lookups) so I won't go into much detail here but the main
>> >> thing is that in the course of message processing I do a lookup on a
>> >> specific part of the message to decide which ruleset to run.
>> >> Something like this:
>> >>
>> >> ruleset(name="do_a_lookup")
>> >>
>> >> {
>> >>
>> >> set $.ruleset = lookup ("hostname-to-ruleset", $fromhost-ip);
>> >>
>> >> call_indirect $.ruleset;
>> >>
>> >> }
>> >>
>> >> And unfortunately for some messages my lookup returned a
>> >> "do_a_lookup" ruleset name to run again and again (of course, since
>> >> between subsequent calls the origin of the message didn't change).
>> >>
>> >> It was definitely my mistake and I was quite lucky that it occured
>> >> quite quickly after I introduced the change (if it happened with
>> >> messages I get just once a day or even once a week, it'd be very hard
>> >> to troubleshoot).
>> >>
>> >> You might want to think about a ruleset call "depth counter" in order
>> >> to at least emit a warning if the recursion gets too deep. But then
>> >> again - my case is very unusual and it might not be worth introducing
>> >> such change into the code.
>> >>
>> >>
>> >> Best regards
>> >>
>> >>
>> _______________________________________________
>> rsyslog mailing list
>> https://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com/professional-services/
>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
> _______________________________________________
> rsyslog mailing list
> https://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.