Mailing List Archive

Memory Leak?
Good day

I am trying to diagnose and resolve an issue whereby the memory consumed by the rsyslog daemon increases linearly over time. This continues until it consumes most of the memory (including swap) of the system and the service has to be restarted to free up memory. There are two servers with identical configurations. What I noticed is that the server receiving a higher volume of messages also consumes memory at a higher rate. In other words, it appears as if the message rate, or message volume, is directly proportional to the rate at which memory is consumed.

Below is the version information for the rsyslogd daemon:
rsyslogd 8.2310.0 (aka 2023.10) compiled with:
PLATFORM: x86_64-pc-linux-gnu
PLATFORM (lsb_release -d):
FEATURE_REGEXP: Yes
GSSAPI Kerberos 5 support: No
FEATURE_DEBUG (debug build, slow code): No
32bit Atomic operations supported: Yes
64bit Atomic operations supported: Yes
memory allocator: system default
Runtime Instrumentation (slow code): No
uuid support: Yes
systemd support: Yes
Config file: /etc/rsyslog.conf
PID file: /var/run/rsyslogd.pid
Number of Bits in RainerScript integers: 64

It is running on Debian 12 servers.

To provide you with more background detail, initially I configured three listeners: one UDP (port 514), one TCP (port 514) and one TLS (port 6514). A single system was configured to push logs to the TLS port and that worked well (no increase in memory usage over time). Recently I added another UDP listener (port 10514) and started configuring a number of systems to push their logs to this port, but since then I've observed the described gradual memory increase.

This new listener is configured as follows: A ruleset was created and bound to this listener (the ruleset doesn't have its own queue). The ruleset first runs the mmutf8fix action, then calls a different ruleset (named "normalise"), which normalises the data (it just sets specific variables that are later used in a template to construct a JSON object). After the call to the "normalise" ruleset returns, a mmnormalize action is performed and some additional variables are set. Lastly, the ruleset bound to the listener calls yet another ruleset (named "kafka_output"), which constructs a JSON object from the various variables and uses the omkafka action to push it to a Kafka cluster.

The flow of the above can be visualised as:
Source -> Syslog Server [10514/UDP] -> [listener ruleset] -> [normalise ruleset] -> [kafka_output ruleset]
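
For illustration, a minimal sketch of how such a chain can be wired up in rsyslog configuration. The ruleset names, variable names, rulebase path, broker, topic and template below are placeholders rather than the actual config:

module(load="imudp")
module(load="mmutf8fix")
module(load="mmnormalize")
module(load="omkafka")

# placeholder template that renders the JSON subtree built up in the rulesets
template(name="json_out" type="string" string="%$!%")

ruleset(name="normalise") {
    # only sets variables that the JSON template uses later
    set $!event!host = $hostname;
    set $!event!message = $msg;
}

ruleset(name="kafka_output") {
    action(type="omkafka" broker=["kafka01:9092"] topic="syslog" template="json_out")
}

ruleset(name="udp_10514") {
    action(type="mmutf8fix")
    call normalise
    action(type="mmnormalize" rulebase="/etc/rsyslog.d/normalise.rb")
    set $!meta!listener = "udp_10514";
    call kafka_output
}

input(type="imudp" port="10514" ruleset="udp_10514")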

It should also be noted that the original listeners are configured in much the same way, apart from calling even more rulesets. I haven't tested whether the UDP listener on port 514 exhibits the same behaviour (it isn't currently being used).

This rsyslog daemon also captures locally generated logs, and the statistics (impstats) module is loaded.

What can I do to troubleshoot what's causing this "memory leak"?

Kind Regards
---

_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: Memory Leak? [ In reply to ]
Look at the queue sizes in impstats. Are they ever-increasing?
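
For reference, impstats can be loaded with something along these lines (the interval and output target here are only examples):

module(load="impstats" interval="60" severity="7" log.syslog="off" log.file="/var/log/rsyslog-stats.log")

The per-queue "size" counters it reports, for the main Q and any action queues, are the values to watch over time.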

Rainer

Re: Memory Leak? [ In reply to ]
Good day

Looking at the main Q statistics, the size remains mostly constant around 30. The max queue size currently sits at 400. There is also a queue (linked list + disk assisted) configured for the omkafka action, with the size not really going above single digits (and the DA queue stats remain at 0). Also note I completely disabled the omkafka action's queue previously as a test, but that didn't make a difference. There are no other queues.
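
For context, that action queue is declared along these lines (the values shown are illustrative rather than the exact settings):

action(type="omkafka" broker=["kafka01:9092"] topic="syslog" template="json_out"
       queue.type="LinkedList" queue.filename="kafka_q"
       queue.maxdiskspace="1g" queue.saveonshutdown="on")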

Kind Regards
---

Re: Memory Leak? [ In reply to ]
Great.

Do you know valgrind? If so, you could run rsyslog under valgrind control,
best in the foreground. When you terminate rsyslog, valgrind will show leak
stats, if any.
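
A rough way to do that from a shell (stop the service first; the exact valgrind options are up to you):

systemctl stop rsyslog
valgrind --leak-check=full /usr/sbin/rsyslogd -n

rsyslogd -n keeps the daemon in the foreground; on a clean shutdown valgrind then prints the per-callstack leak records.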

Rainer

Sent from phone, thus brief.

Re: Memory Leak? [ In reply to ]
Good day

I ran rsyslog under Valgrind for a while and got the following output:

==31532== Memcheck, a memory error detector
==31532== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==31532== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==31532== Command: /usr/sbin/rsyslogd -n
==31532== Parent PID: 1
==31532==
==31532==
==31532== HEAP SUMMARY:
==31532== in use at exit: 55,152,508 bytes in 215,413 blocks
==31532== total heap usage: 36,459,066 allocs, 36,243,653 frees, 10,340,685,855 bytes allocated
==31532==
==31532== 32 bytes in 1 blocks are definitely lost in loss record 35 of 101
==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==31532== by 0x54FE7B6: ???
==31532== by 0x54FE8D4: ???
==31532== by 0x550796F: ???
==31532== by 0x54FC760: ???
==31532== by 0x545EE78: ???
==31532== by 0x534271E: ???
==31532== by 0x5311A00: ???
==31532== by 0x5313067: ???
==31532== by 0x53133F8: ???
==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
==31532==
==31532== 48 bytes in 1 blocks are definitely lost in loss record 53 of 101
==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==31532== by 0x16DBF2: propConstruct (in /usr/sbin/rsyslogd)
==31532== by 0x52E4199: ???
==31532== by 0x52E63DF: ???
==31532== by 0x52E6C32: ???
==31532== by 0x4ACB043: start_thread (pthread_create.c:442)
==31532== by 0x4B4A87F: clone (clone.S:100)
==31532==
==31532== 48 bytes in 1 blocks are definitely lost in loss record 54 of 101
==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==31532== by 0x16DBF2: propConstruct (in /usr/sbin/rsyslogd)
==31532== by 0x52E41D5: ???
==31532== by 0x52E63DF: ???
==31532== by 0x52E6C32: ???
==31532== by 0x4ACB043: start_thread (pthread_create.c:442)
==31532== by 0x4B4A87F: clone (clone.S:100)
==31532==
==31532== 104 bytes in 1 blocks are definitely lost in loss record 62 of 101
==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==31532== by 0x5500EDE: ???
==31532== by 0x54FC78D: ???
==31532== by 0x545EE78: ???
==31532== by 0x534271E: ???
==31532== by 0x5311A00: ???
==31532== by 0x5313067: ???
==31532== by 0x53133F8: ???
==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
==31532==
==31532== 128 (16 direct, 112 indirect) bytes in 1 blocks are definitely lost in loss record 69 of 101
==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==31532== by 0x5507861: ???
==31532== by 0x5507B49: ???
==31532== by 0x54FC760: ???
==31532== by 0x545EE78: ???
==31532== by 0x534271E: ???
==31532== by 0x5311A00: ???
==31532== by 0x5313067: ???
==31532== by 0x53133F8: ???
==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
==31532==
==31532== 205 bytes in 10 blocks are definitely lost in loss record 74 of 101
==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==31532== by 0x4AE07F9: strdup (strdup.c:42)
==31532== by 0x132E7B: yylex (in /usr/sbin/rsyslogd)
==31532== by 0x12ED47: yyparse (in /usr/sbin/rsyslogd)
==31532== by 0x1459AC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x12130A: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x11F0E8: main (in /usr/sbin/rsyslogd)
==31532==
==31532== 205 bytes in 10 blocks are definitely lost in loss record 75 of 101
==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==31532== by 0x4AE07F9: strdup (strdup.c:42)
==31532== by 0x150E0C: msgPropDescrFill (in /usr/sbin/rsyslogd)
==31532== by 0x13B71A: cnffuncexistsNew (in /usr/sbin/rsyslogd)
==31532== by 0x12F585: yyparse (in /usr/sbin/rsyslogd)
==31532== by 0x1459AC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x12130A: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x11F0E8: main (in /usr/sbin/rsyslogd)
==31532==
==31532== 404 (392 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 82 of 101
==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==31532== by 0x54FEFA6: ???
==31532== by 0x54FF22E: ???
==31532== by 0x54FC70D: ???
==31532== by 0x545EE78: ???
==31532== by 0x534271E: ???
==31532== by 0x5311A00: ???
==31532== by 0x5313067: ???
==31532== by 0x53133F8: ???
==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
==31532==
==31532== 512 bytes in 2 blocks are possibly lost in loss record 85 of 101
==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
==31532== by 0x5A51864: ???
==31532== by 0x5A516B2: ???
==31532== by 0x5A516B2: ???
==31532== by 0x5A51D8A: ???
==31532== by 0x5A44B8B: ???
==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
==31532==
==31532== 904 (32 direct, 872 indirect) bytes in 1 blocks are definitely lost in loss record 88 of 101
==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==31532== by 0x54FC6A8: ???
==31532== by 0x545EE78: ???
==31532== by 0x534271E: ???
==31532== by 0x5311A00: ???
==31532== by 0x5313067: ???
==31532== by 0x53133F8: ???
==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x164FB0: wtiWorker (in /usr/sbin/rsyslogd)
==31532==
==31532== 1,792 bytes in 7 blocks are possibly lost in loss record 90 of 101
==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
==31532== by 0x5A51864: ???
==31532== by 0x5A51D8A: ???
==31532== by 0x5A44B8B: ???
==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x11FD1B: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
==31532==
==31532== 4,112 bytes in 1 blocks are definitely lost in loss record 93 of 101
==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==31532== by 0x54FA8CA: ???
==31532== by 0x54FF24A: ???
==31532== by 0x54FC70D: ???
==31532== by 0x545EE78: ???
==31532== by 0x534271E: ???
==31532== by 0x5311A00: ???
==31532== by 0x5313067: ???
==31532== by 0x53133F8: ???
==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
==31532==
==31532== 8,364 (1,000 direct, 7,364 indirect) bytes in 1 blocks are definitely lost in loss record 96 of 101
==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==31532== by 0x535E8FB: ???
==31532== by 0x5360158: ???
==31532== by 0x530FDF2: ???
==31532== by 0x5310E65: ???
==31532== by 0x5313481: ???
==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x164FB0: wtiWorker (in /usr/sbin/rsyslogd)
==31532== by 0x1637A9: ??? (in /usr/sbin/rsyslogd)
==31532==
==31532== 6,657,280 bytes in 26,005 blocks are definitely lost in loss record 100 of 101
==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
==31532== by 0x5A51864: ???
==31532== by 0x5A516B2: ???
==31532== by 0x5A516B2: ???
==31532== by 0x5A51D8A: ???
==31532== by 0x5A44B8B: ???
==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
==31532==
==31532== 48,420,608 bytes in 189,143 blocks are definitely lost in loss record 101 of 101
==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
==31532== by 0x5A51864: ???
==31532== by 0x5A51D8A: ???
==31532== by 0x5A44B8B: ???
==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x11FD1B: ??? (in /usr/sbin/rsyslogd)
==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
==31532==
==31532== LEAK SUMMARY:
==31532== definitely lost: 55,084,082 bytes in 215,177 blocks
==31532== indirectly lost: 8,360 bytes in 68 blocks
==31532== possibly lost: 2,304 bytes in 9 blocks
==31532== still reachable: 57,762 bytes in 159 blocks
==31532== suppressed: 0 bytes in 0 blocks
==31532== Reachable blocks (those to which a pointer was found) are not shown.
==31532== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==31532==
==31532== For lists of detected and suppressed errors, rerun with: -s
==31532== ERROR SUMMARY: 15 errors from 15 contexts (suppressed: 0 from 0)

If you need me to run this for longer I can do that as well; let me know if there is anything else I should do.

***
As an additional test, I stopped shipping logs to the relevant listener (10514/UDP) for a few hours and monitored the memory usage, which remained constant. The syslog-TLS (port 6514) listener remained in use throughout, with one system shipping logs to it. This seems to indicate that the problem lies somewhere in the processing behind the UDP listener on port 10514.

Kind Regards
---
Re: Memory Leak? [ In reply to ]
Thx, that is useful. Obviously we have a leak:

> 48,420,608 bytes in 189,143 blocks are definitely lost

Unfortunately, we do not have symbols, so we do not know exactly where
it happens. Can you install debug symbols? Usually there is a package
rsyslog-debuginfo (or the like), which should at least contain the
symbols for the static (non-plugin) parts of rsyslog. That would help
with the stacktrace and ease bug hunting.
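
On Debian 12 the symbols usually come from the separate debug archive as "-dbgsym" packages; roughly, as root (the repository line and package names may need adjusting):

echo 'deb http://deb.debian.org/debian-debug bookworm-debug main' > /etc/apt/sources.list.d/debug.list
apt update
apt install rsyslog-dbgsym

The plugin packages (rsyslog-kafka, rsyslog-gnutls, ...) have matching -dbgsym packages in the same archive.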

Rainer

Re: Memory Leak? [ In reply to ]
Good day

I found debug symbol packages for:

* rsyslog
* rsyslog-gnutls
* rsyslog-imptcp
* rsyslog-kafka

All of these were installed, and I re-ran rsyslog under Valgrind for a while; this is the result:

==48067== Memcheck, a memory error detector
==48067== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==48067== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==48067== Command: /usr/sbin/rsyslogd -n
==48067== Parent PID: 1
==48067==
==48067==
==48067== HEAP SUMMARY:
==48067== in use at exit: 3,569,788 bytes in 13,918 blocks
==48067== total heap usage: 3,724,046 allocs, 3,710,128 frees, 1,141,461,392 bytes allocated
==48067==
==48067== 32 bytes in 1 blocks are definitely lost in loss record 35 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FE7B6: ???
==48067== by 0x54FE8D4: ???
==48067== by 0x550796F: ???
==48067== by 0x54FC760: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067==
==48067== 48 bytes in 1 blocks are definitely lost in loss record 53 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x16DBF2: propConstruct (prop.c:56)
==48067== by 0x52E4199: ???
==48067== by 0x52E63DF: ???
==48067== by 0x52E6C32: ???
==48067== by 0x4ACB043: start_thread (pthread_create.c:442)
==48067== by 0x4B4A87F: clone (clone.S:100)
==48067==
==48067== 48 bytes in 1 blocks are definitely lost in loss record 54 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x16DBF2: propConstruct (prop.c:56)
==48067== by 0x52E41D5: ???
==48067== by 0x52E63DF: ???
==48067== by 0x52E6C32: ???
==48067== by 0x4ACB043: start_thread (pthread_create.c:442)
==48067== by 0x4B4A87F: clone (clone.S:100)
==48067==
==48067== 104 bytes in 1 blocks are definitely lost in loss record 62 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x5500EDE: ???
==48067== by 0x54FC78D: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067==
==48067== 128 (16 direct, 112 indirect) bytes in 1 blocks are definitely lost in loss record 69 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x5507861: ???
==48067== by 0x5507B49: ???
==48067== by 0x54FC760: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067==
==48067== 205 bytes in 10 blocks are definitely lost in loss record 74 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x4AE07F9: strdup (strdup.c:42)
==48067== by 0x132E7B: yylex (lexer.l:297)
==48067== by 0x12ED47: yyparse (grammar.c:1535)
==48067== by 0x1459AC: load (rsconf.c:1494)
==48067== by 0x12130A: initAll (rsyslogd.c:1636)
==48067== by 0x11F0E8: main (rsyslogd.c:2337)
==48067==
==48067== 205 bytes in 10 blocks are definitely lost in loss record 75 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x4AE07F9: strdup (strdup.c:42)
==48067== by 0x150E0C: msgPropDescrFill (msg.c:5431)
==48067== by 0x13B71A: cnffuncexistsNew (rainerscript.c:5351)
==48067== by 0x12F585: yyparse (grammar.y:222)
==48067== by 0x1459AC: load (rsconf.c:1494)
==48067== by 0x12130A: initAll (rsyslogd.c:1636)
==48067== by 0x11F0E8: main (rsyslogd.c:2337)
==48067==
==48067== 404 (392 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 81 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FEFA6: ???
==48067== by 0x54FF22E: ???
==48067== by 0x54FC70D: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067==
==48067== 904 (32 direct, 872 indirect) bytes in 1 blocks are definitely lost in loss record 86 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FC6A8: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067== by 0x164FB0: wtiWorker (wti.c:430)
==48067==
==48067== 4,112 bytes in 1 blocks are definitely lost in loss record 90 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FA8CA: ???
==48067== by 0x54FF24A: ???
==48067== by 0x54FC70D: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067==
==48067== 8,364 (1,000 direct, 7,364 indirect) bytes in 1 blocks are definitely lost in loss record 93 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x535E8FB: ???
==48067== by 0x5360158: ???
==48067== by 0x530FDF2: ???
==48067== by 0x5310E65: ???
==48067== by 0x5313481: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067== by 0x164FB0: wtiWorker (wti.c:430)
==48067== by 0x1637A9: wtpWorker (wtp.c:435)
==48067==
==48067== 207,872 bytes in 812 blocks are definitely lost in loss record 97 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
==48067== by 0x5A51864: ???
==48067== by 0x5A516B2: ???
==48067== by 0x5A516B2: ???
==48067== by 0x5A51D8A: ???
==48067== by 0x5A44B8B: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x175478: processMsgMain (action.c:1678)
==48067== by 0x175478: doSubmitToActionQ (action.c:1855)
==48067== by 0x16C82E: execAct (ruleset.c:209)
==48067== by 0x16C82E: scriptExec (ruleset.c:599)
==48067== by 0x16D4F4: processBatch (ruleset.c:660)
==48067==
==48067== 3,289,856 bytes in 12,851 blocks are definitely lost in loss record 98 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
==48067== by 0x5A51864: ???
==48067== by 0x5A51D8A: ???
==48067== by 0x5A44B8B: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x175478: processMsgMain (action.c:1678)
==48067== by 0x175478: doSubmitToActionQ (action.c:1855)
==48067== by 0x16C82E: execAct (ruleset.c:209)
==48067== by 0x16C82E: scriptExec (ruleset.c:599)
==48067== by 0x16D4F4: processBatch (ruleset.c:660)
==48067== by 0x11FD1B: msgConsumer (rsyslogd.c:779)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067==
==48067== LEAK SUMMARY:
==48067== definitely lost: 3,503,922 bytes in 13,692 blocks
==48067== indirectly lost: 8,360 bytes in 68 blocks
==48067== possibly lost: 0 bytes in 0 blocks
==48067== still reachable: 57,506 bytes in 158 blocks
==48067== suppressed: 0 bytes in 0 blocks
==48067== Reachable blocks (those to which a pointer was found) are not shown.
==48067== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==48067==
==48067== For lists of detected and suppressed errors, rerun with: -s
==48067== ERROR SUMMARY: 13 errors from 13 contexts (suppressed: 0 from 0)
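
For reference, the output above came from a foreground Valgrind run along these lines (a sketch only; the exact options may have differed slightly, and the systemctl step is an assumption about how the service is normally managed):

    # stop the service so the foreground Valgrind run owns the listener sockets
    systemctl stop rsyslog
    # start rsyslogd in the foreground (-n) under Valgrind with full leak checking;
    # the flags mirror the "rerun with" hints printed in the summaries
    valgrind --leak-check=full --show-leak-kinds=all -s /usr/sbin/rsyslogd -n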

Kind Regards
---

________________________________
From: Rainer Gerhards <rgerhards@hq.adiscon.com>
Sent: 07 December 2023 13:39
To: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>
Cc: rsyslog-users <rsyslog@lists.adiscon.com>
Subject: Re: [rsyslog] Memory Leak?

Thx, that is useful. Obviously we have a leak:

> 48,420,608 bytes in 189,143 blocks are definitely lost

unfortunately, we do not have symbols, so we do not know exactly where
it happens. Can you install debug symbols? Usually there is a package
rsyslog-debuginfo (or the like), which should at least contain the
symbols for the static (non-plugin) parts of rsyslog. That would help
with the stacktrace and ease bug hunting.

Rainer
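
On Debian 12 the symbols come from the separate debug archive rather than a -debuginfo package; a rough sketch of pulling them in follows (the repository line and the -dbgsym package names are assumptions based on Debian's usual naming, so verify them against the actual package index):

    # enable the Debian debug-symbol archive for bookworm (assumption)
    echo 'deb http://deb.debian.org/debian-debug/ bookworm-debug main' > /etc/apt/sources.list.d/debug.list
    apt update
    # install the dbgsym packages for rsyslog and the plugins in use (names are assumptions)
    apt install rsyslog-dbgsym rsyslog-gnutls-dbgsym rsyslog-kafka-dbgsym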

El jue, 7 dic 2023 a las 11:41, Adriaan de Waal
(<adriaan.dewaal@nymbis.cloud>) escribió:
>
> Good day
>
> I ran rsyslog under Valgrind for a while and got the following output:
>
> ==31532== Memcheck, a memory error detector
> ==31532== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
> ==31532== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
> ==31532== Command: /usr/sbin/rsyslogd -n
> ==31532== Parent PID: 1
> ==31532==
> ==31532==
> ==31532== HEAP SUMMARY:
> ==31532== in use at exit: 55,152,508 bytes in 215,413 blocks
> ==31532== total heap usage: 36,459,066 allocs, 36,243,653 frees, 10,340,685,855 bytes allocated
> ==31532==
> ==31532== 32 bytes in 1 blocks are definitely lost in loss record 35 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FE7B6: ???
> ==31532== by 0x54FE8D4: ???
> ==31532== by 0x550796F: ???
> ==31532== by 0x54FC760: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 48 bytes in 1 blocks are definitely lost in loss record 53 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x16DBF2: propConstruct (in /usr/sbin/rsyslogd)
> ==31532== by 0x52E4199: ???
> ==31532== by 0x52E63DF: ???
> ==31532== by 0x52E6C32: ???
> ==31532== by 0x4ACB043: start_thread (pthread_create.c:442)
> ==31532== by 0x4B4A87F: clone (clone.S:100)
> ==31532==
> ==31532== 48 bytes in 1 blocks are definitely lost in loss record 54 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x16DBF2: propConstruct (in /usr/sbin/rsyslogd)
> ==31532== by 0x52E41D5: ???
> ==31532== by 0x52E63DF: ???
> ==31532== by 0x52E6C32: ???
> ==31532== by 0x4ACB043: start_thread (pthread_create.c:442)
> ==31532== by 0x4B4A87F: clone (clone.S:100)
> ==31532==
> ==31532== 104 bytes in 1 blocks are definitely lost in loss record 62 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x5500EDE: ???
> ==31532== by 0x54FC78D: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 128 (16 direct, 112 indirect) bytes in 1 blocks are definitely lost in loss record 69 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x5507861: ???
> ==31532== by 0x5507B49: ???
> ==31532== by 0x54FC760: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 205 bytes in 10 blocks are definitely lost in loss record 74 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x4AE07F9: strdup (strdup.c:42)
> ==31532== by 0x132E7B: yylex (in /usr/sbin/rsyslogd)
> ==31532== by 0x12ED47: yyparse (in /usr/sbin/rsyslogd)
> ==31532== by 0x1459AC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x12130A: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11F0E8: main (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 205 bytes in 10 blocks are definitely lost in loss record 75 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x4AE07F9: strdup (strdup.c:42)
> ==31532== by 0x150E0C: msgPropDescrFill (in /usr/sbin/rsyslogd)
> ==31532== by 0x13B71A: cnffuncexistsNew (in /usr/sbin/rsyslogd)
> ==31532== by 0x12F585: yyparse (in /usr/sbin/rsyslogd)
> ==31532== by 0x1459AC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x12130A: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11F0E8: main (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 404 (392 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 82 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FEFA6: ???
> ==31532== by 0x54FF22E: ???
> ==31532== by 0x54FC70D: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 512 bytes in 2 blocks are possibly lost in loss record 85 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 904 (32 direct, 872 indirect) bytes in 1 blocks are definitely lost in loss record 88 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FC6A8: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x164FB0: wtiWorker (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 1,792 bytes in 7 blocks are possibly lost in loss record 90 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11FD1B: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 4,112 bytes in 1 blocks are definitely lost in loss record 93 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FA8CA: ???
> ==31532== by 0x54FF24A: ???
> ==31532== by 0x54FC70D: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 8,364 (1,000 direct, 7,364 indirect) bytes in 1 blocks are definitely lost in loss record 96 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x535E8FB: ???
> ==31532== by 0x5360158: ???
> ==31532== by 0x530FDF2: ???
> ==31532== by 0x5310E65: ???
> ==31532== by 0x5313481: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x164FB0: wtiWorker (in /usr/sbin/rsyslogd)
> ==31532== by 0x1637A9: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 6,657,280 bytes in 26,005 blocks are definitely lost in loss record 100 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 48,420,608 bytes in 189,143 blocks are definitely lost in loss record 101 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11FD1B: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== LEAK SUMMARY:
> ==31532== definitely lost: 55,084,082 bytes in 215,177 blocks
> ==31532== indirectly lost: 8,360 bytes in 68 blocks
> ==31532== possibly lost: 2,304 bytes in 9 blocks
> ==31532== still reachable: 57,762 bytes in 159 blocks
> ==31532== suppressed: 0 bytes in 0 blocks
> ==31532== Reachable blocks (those to which a pointer was found) are not shown.
> ==31532== To see them, rerun with: --leak-check=full --show-leak-kinds=all
> ==31532==
> ==31532== For lists of detected and suppressed errors, rerun with: -s
> ==31532== ERROR SUMMARY: 15 errors from 15 contexts (suppressed: 0 from 0)
>
> If you need me to run this for longer I can do that as well; please also let me know if there is anything else I should do.
>
> ***
> As an additional test, I stopped shipping logs to the relevant listener (10514/UDP) for a few hours and monitored the memory usage, which remained constant. The syslog-TLS (port 6514) listener remained in use throughout with one system shipping logs to it. This seems to show that something related to the UDP listener on port 10514 is where the problem lies.
>
> Kind Regards
> ---
> ________________________________
> From: Rainer Gerhards <rgerhards@hq.adiscon.com>
> Sent: 06 December 2023 21:12
> To: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>
> Cc: rsyslog-users <rsyslog@lists.adiscon.com>
> Subject: Re: [rsyslog] Memory Leak?
>
> Great.
>
> Do you know valgrind? If so, you could run rsyslog under valgrind control, best in the foreground. When you terminate rsyslog, valgrind will show leak stats, if any.
>
> Rainer
>
> Sent from phone, thus brief.
>
> Adriaan de Waal <adriaan.dewaal@nymbis.cloud> schrieb am Mi., 6. Dez. 2023, 19:56:
>
> Good day
>
> Looking at the main Q statistics, the size remains mostly constant around 30. The max queue size currently sits at 400. There is also a queue (linked list + disk assisted) configured for the omkafka action, with the size not really going above single digits (and the DA queue stats remain at 0). Also note I completely disabled the omkafka action's queue previously as a test, but that didn't make a difference. There are no other queues.
>
> Kind Regards
> ---
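
Those figures come from impstats; a quick way to keep an eye on the queue counters over time might be the following (a sketch: the log path, the "rsyslogd-pstats" tag and the queue names are assumptions that depend on the local configuration):

    # pull recent impstats queue counters out of the local syslog stream
    grep 'rsyslogd-pstats' /var/log/syslog | grep -E 'main Q|omkafka' | tail -n 20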
>
> ________________________________
> From: Rainer Gerhards <rgerhards@hq.adiscon.com>
> Sent: 06 December 2023 17:05
> To: rsyslog-users <rsyslog@lists.adiscon.com>
> Cc: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>
> Subject: Re: [rsyslog] Memory Leak?
>
> Look at the queue sizes in impstats. Are they ever-increasing?
>
> Rainer
>
> El mié, 6 dic 2023 a las 14:30, Adriaan de Waal via rsyslog
> (<rsyslog@lists.adiscon.com>) escribió:
> >
> > Good day
> >
> > I am trying to diagnose and resolve an issue whereby the memory consumed by the rsyslog daemon increases linearly over time. This continues until it consumes most of the memory (including swap) of the system and the service has to be restarted to free up memory. There are two servers with identical configurations. What I noticed is that the server receiving a higher volume of messages also consumes memory at a higher rate. In other word it appears as if the message rate, or message volume, is directly proportional to the rate at which memory is consumed.
> >
> > Below is the version information for the rsyslogd daemon:
> > rsyslogd 8.2310.0 (aka 2023.10) compiled with:
> > PLATFORM: x86_64-pc-linux-gnu
> > PLATFORM (lsb_release -d):
> > FEATURE_REGEXP: Yes
> > GSSAPI Kerberos 5 support: No
> > FEATURE_DEBUG (debug build, slow code): No
> > 32bit Atomic operations supported: Yes
> > 64bit Atomic operations supported: Yes
> > memory allocator: system default
> > Runtime Instrumentation (slow code): No
> > uuid support: Yes
> > systemd support: Yes
> > Config file: /etc/rsyslog.conf
> > PID file: /var/run/rsyslogd.pid
> > Number of Bits in RainerScript integers: 64
> >
> > It is running on Debian 12 servers.
> >
> > To provide you with more background detail, initially I configured three listeners: one UDP (port 514), one TCP (port 514) and one TLS (port 6514). A single system was configured to push logs to the TLS port and that worked well (no increase in memory usage over time). Recently I added another UDP listener (port 10514) and started configured a number of systems to push their logs to this port, but since then I've observed the described gradual memory increase.
> >
> > This new listener is configured as follows: A ruleset was created and bound to this listener (the ruleset doesn't have its own queue). The ruleset first runs the mmutf8fix action then calls a different ruleset (named "normalise"), which normalises the data (just sets specific variables that is later used in a template to construct a JSON object). After the call to the "normalise" ruleset returns, a mmnormalize action is performed and some additional variables are set. Lastly the ruleset (the one bound to the listener) then calls yet another ruleset (named "kafka_output"), which is used to construct a JSON object from the various variables and uses the omkafka action to push this to a Kafka cluster.
> >
> > The flow of the above can be visualised as:
> > Source -> Syslog Server [10514/UDP] -> [listener ruleset] -> [normalise ruleset] -> [kafka_output ruleset]
> >
> > It should also be noted the original listeners are configured in much the same way, apart from having calls to even more rulesets. I haven't tested if the UDP listener on port 514 exhibits the same behaviour (it isn't currently being used).
> >
> > This rsyslog daemon is also used to capture locally generated logs and the statistics (impstats) module is also loaded.
> >
> > What can I do to troubleshoot what's causing this "memory leak"?
> >
> > Kind Regards
> > ---
> >
> > _______________________________________________
> > rsyslog mailing list
> > https://lists.adiscon.net/mailman/listinfo/rsyslog
> > http://www.rsyslog.com/professional-services/
> > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: Memory Leak? [ In reply to ]
Good day

I'd just like to follow up: is there anything else I should try?

Kind Regards
---
________________________________
From: rsyslog <rsyslog-bounces@lists.adiscon.com> on behalf of Adriaan de Waal via rsyslog <rsyslog@lists.adiscon.com>
Sent: 07 December 2023 14:56
To: Rainer Gerhards <rgerhards@hq.adiscon.com>
Cc: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>; rsyslog-users <rsyslog@lists.adiscon.com>
Subject: Re: [rsyslog] Memory Leak?

Good day

I found debug symbol packages for:

* rsyslog
* rsyslog-gnutls
* rsyslog-imptcp
* rsyslog-kafka

All of these were installed, and I re-ran rsyslog under Valgrind for a while; this is the result:

==48067== Memcheck, a memory error detector
==48067== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==48067== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==48067== Command: /usr/sbin/rsyslogd -n
==48067== Parent PID: 1
==48067==
==48067==
==48067== HEAP SUMMARY:
==48067== in use at exit: 3,569,788 bytes in 13,918 blocks
==48067== total heap usage: 3,724,046 allocs, 3,710,128 frees, 1,141,461,392 bytes allocated
==48067==
==48067== 32 bytes in 1 blocks are definitely lost in loss record 35 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FE7B6: ???
==48067== by 0x54FE8D4: ???
==48067== by 0x550796F: ???
==48067== by 0x54FC760: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067==
==48067== 48 bytes in 1 blocks are definitely lost in loss record 53 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x16DBF2: propConstruct (prop.c:56)
==48067== by 0x52E4199: ???
==48067== by 0x52E63DF: ???
==48067== by 0x52E6C32: ???
==48067== by 0x4ACB043: start_thread (pthread_create.c:442)
==48067== by 0x4B4A87F: clone (clone.S:100)
==48067==
==48067== 48 bytes in 1 blocks are definitely lost in loss record 54 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x16DBF2: propConstruct (prop.c:56)
==48067== by 0x52E41D5: ???
==48067== by 0x52E63DF: ???
==48067== by 0x52E6C32: ???
==48067== by 0x4ACB043: start_thread (pthread_create.c:442)
==48067== by 0x4B4A87F: clone (clone.S:100)
==48067==
==48067== 104 bytes in 1 blocks are definitely lost in loss record 62 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x5500EDE: ???
==48067== by 0x54FC78D: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067==
==48067== 128 (16 direct, 112 indirect) bytes in 1 blocks are definitely lost in loss record 69 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x5507861: ???
==48067== by 0x5507B49: ???
==48067== by 0x54FC760: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067==
==48067== 205 bytes in 10 blocks are definitely lost in loss record 74 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x4AE07F9: strdup (strdup.c:42)
==48067== by 0x132E7B: yylex (lexer.l:297)
==48067== by 0x12ED47: yyparse (grammar.c:1535)
==48067== by 0x1459AC: load (rsconf.c:1494)
==48067== by 0x12130A: initAll (rsyslogd.c:1636)
==48067== by 0x11F0E8: main (rsyslogd.c:2337)
==48067==
==48067== 205 bytes in 10 blocks are definitely lost in loss record 75 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x4AE07F9: strdup (strdup.c:42)
==48067== by 0x150E0C: msgPropDescrFill (msg.c:5431)
==48067== by 0x13B71A: cnffuncexistsNew (rainerscript.c:5351)
==48067== by 0x12F585: yyparse (grammar.y:222)
==48067== by 0x1459AC: load (rsconf.c:1494)
==48067== by 0x12130A: initAll (rsyslogd.c:1636)
==48067== by 0x11F0E8: main (rsyslogd.c:2337)
==48067==
==48067== 404 (392 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 81 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FEFA6: ???
==48067== by 0x54FF22E: ???
==48067== by 0x54FC70D: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067==
==48067== 904 (32 direct, 872 indirect) bytes in 1 blocks are definitely lost in loss record 86 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FC6A8: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067== by 0x164FB0: wtiWorker (wti.c:430)
==48067==
==48067== 4,112 bytes in 1 blocks are definitely lost in loss record 90 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FA8CA: ???
==48067== by 0x54FF24A: ???
==48067== by 0x54FC70D: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067==
==48067== 8,364 (1,000 direct, 7,364 indirect) bytes in 1 blocks are definitely lost in loss record 93 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x535E8FB: ???
==48067== by 0x5360158: ???
==48067== by 0x530FDF2: ???
==48067== by 0x5310E65: ???
==48067== by 0x5313481: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067== by 0x164FB0: wtiWorker (wti.c:430)
==48067== by 0x1637A9: wtpWorker (wtp.c:435)
==48067==
==48067== 207,872 bytes in 812 blocks are definitely lost in loss record 97 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
==48067== by 0x5A51864: ???
==48067== by 0x5A516B2: ???
==48067== by 0x5A516B2: ???
==48067== by 0x5A51D8A: ???
==48067== by 0x5A44B8B: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x175478: processMsgMain (action.c:1678)
==48067== by 0x175478: doSubmitToActionQ (action.c:1855)
==48067== by 0x16C82E: execAct (ruleset.c:209)
==48067== by 0x16C82E: scriptExec (ruleset.c:599)
==48067== by 0x16D4F4: processBatch (ruleset.c:660)
==48067==
==48067== 3,289,856 bytes in 12,851 blocks are definitely lost in loss record 98 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
==48067== by 0x5A51864: ???
==48067== by 0x5A51D8A: ???
==48067== by 0x5A44B8B: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x175478: processMsgMain (action.c:1678)
==48067== by 0x175478: doSubmitToActionQ (action.c:1855)
==48067== by 0x16C82E: execAct (ruleset.c:209)
==48067== by 0x16C82E: scriptExec (ruleset.c:599)
==48067== by 0x16D4F4: processBatch (ruleset.c:660)
==48067== by 0x11FD1B: msgConsumer (rsyslogd.c:779)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067==
==48067== LEAK SUMMARY:
==48067== definitely lost: 3,503,922 bytes in 13,692 blocks
==48067== indirectly lost: 8,360 bytes in 68 blocks
==48067== possibly lost: 0 bytes in 0 blocks
==48067== still reachable: 57,506 bytes in 158 blocks
==48067== suppressed: 0 bytes in 0 blocks
==48067== Reachable blocks (those to which a pointer was found) are not shown.
==48067== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==48067==
==48067== For lists of detected and suppressed errors, rerun with: -s
==48067== ERROR SUMMARY: 13 errors from 13 contexts (suppressed: 0 from 0)

Kind Regards
---

________________________________
From: Rainer Gerhards <rgerhards@hq.adiscon.com>
Sent: 07 December 2023 13:39
To: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>
Cc: rsyslog-users <rsyslog@lists.adiscon.com>
Subject: Re: [rsyslog] Memory Leak?

Thx, that is useful. Obviously we have a leak:

> 48,420,608 bytes in 189,143 blocks are definitely lost

unfortunately, we do not have symbols, so we do not know exactly where
it happens. Can you install debug symbols? Usually there is a package
rsyslog-debuginfo (or the like), which should at least contain the
symbols for the static (non-plugin) parts of rsyslog. That would help
with the stacktrace and ease bug hunting.

Rainer

El jue, 7 dic 2023 a las 11:41, Adriaan de Waal
(<adriaan.dewaal@nymbis.cloud>) escribió:
>
> Good day
>
> I ran rsyslog under Valgrind for a while and got the following output:
>
> ==31532== Memcheck, a memory error detector
> ==31532== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
> ==31532== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
> ==31532== Command: /usr/sbin/rsyslogd -n
> ==31532== Parent PID: 1
> ==31532==
> ==31532==
> ==31532== HEAP SUMMARY:
> ==31532== in use at exit: 55,152,508 bytes in 215,413 blocks
> ==31532== total heap usage: 36,459,066 allocs, 36,243,653 frees, 10,340,685,855 bytes allocated
> ==31532==
> ==31532== 32 bytes in 1 blocks are definitely lost in loss record 35 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FE7B6: ???
> ==31532== by 0x54FE8D4: ???
> ==31532== by 0x550796F: ???
> ==31532== by 0x54FC760: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 48 bytes in 1 blocks are definitely lost in loss record 53 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x16DBF2: propConstruct (in /usr/sbin/rsyslogd)
> ==31532== by 0x52E4199: ???
> ==31532== by 0x52E63DF: ???
> ==31532== by 0x52E6C32: ???
> ==31532== by 0x4ACB043: start_thread (pthread_create.c:442)
> ==31532== by 0x4B4A87F: clone (clone.S:100)
> ==31532==
> ==31532== 48 bytes in 1 blocks are definitely lost in loss record 54 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x16DBF2: propConstruct (in /usr/sbin/rsyslogd)
> ==31532== by 0x52E41D5: ???
> ==31532== by 0x52E63DF: ???
> ==31532== by 0x52E6C32: ???
> ==31532== by 0x4ACB043: start_thread (pthread_create.c:442)
> ==31532== by 0x4B4A87F: clone (clone.S:100)
> ==31532==
> ==31532== 104 bytes in 1 blocks are definitely lost in loss record 62 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x5500EDE: ???
> ==31532== by 0x54FC78D: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 128 (16 direct, 112 indirect) bytes in 1 blocks are definitely lost in loss record 69 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x5507861: ???
> ==31532== by 0x5507B49: ???
> ==31532== by 0x54FC760: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 205 bytes in 10 blocks are definitely lost in loss record 74 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x4AE07F9: strdup (strdup.c:42)
> ==31532== by 0x132E7B: yylex (in /usr/sbin/rsyslogd)
> ==31532== by 0x12ED47: yyparse (in /usr/sbin/rsyslogd)
> ==31532== by 0x1459AC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x12130A: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11F0E8: main (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 205 bytes in 10 blocks are definitely lost in loss record 75 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x4AE07F9: strdup (strdup.c:42)
> ==31532== by 0x150E0C: msgPropDescrFill (in /usr/sbin/rsyslogd)
> ==31532== by 0x13B71A: cnffuncexistsNew (in /usr/sbin/rsyslogd)
> ==31532== by 0x12F585: yyparse (in /usr/sbin/rsyslogd)
> ==31532== by 0x1459AC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x12130A: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11F0E8: main (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 404 (392 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 82 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FEFA6: ???
> ==31532== by 0x54FF22E: ???
> ==31532== by 0x54FC70D: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 512 bytes in 2 blocks are possibly lost in loss record 85 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 904 (32 direct, 872 indirect) bytes in 1 blocks are definitely lost in loss record 88 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FC6A8: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x164FB0: wtiWorker (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 1,792 bytes in 7 blocks are possibly lost in loss record 90 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11FD1B: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 4,112 bytes in 1 blocks are definitely lost in loss record 93 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FA8CA: ???
> ==31532== by 0x54FF24A: ???
> ==31532== by 0x54FC70D: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 8,364 (1,000 direct, 7,364 indirect) bytes in 1 blocks are definitely lost in loss record 96 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x535E8FB: ???
> ==31532== by 0x5360158: ???
> ==31532== by 0x530FDF2: ???
> ==31532== by 0x5310E65: ???
> ==31532== by 0x5313481: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x164FB0: wtiWorker (in /usr/sbin/rsyslogd)
> ==31532== by 0x1637A9: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 6,657,280 bytes in 26,005 blocks are definitely lost in loss record 100 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 48,420,608 bytes in 189,143 blocks are definitely lost in loss record 101 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11FD1B: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== LEAK SUMMARY:
> ==31532== definitely lost: 55,084,082 bytes in 215,177 blocks
> ==31532== indirectly lost: 8,360 bytes in 68 blocks
> ==31532== possibly lost: 2,304 bytes in 9 blocks
> ==31532== still reachable: 57,762 bytes in 159 blocks
> ==31532== suppressed: 0 bytes in 0 blocks
> ==31532== Reachable blocks (those to which a pointer was found) are not shown.
> ==31532== To see them, rerun with: --leak-check=full --show-leak-kinds=all
> ==31532==
> ==31532== For lists of detected and suppressed errors, rerun with: -s
> ==31532== ERROR SUMMARY: 15 errors from 15 contexts (suppressed: 0 from 0)
>
> If you need me to run this for longer I can do that as well; please also let me know if there is anything else I should do.
>
> ***
> As an additional test, I stopped shipping logs to the relevant listener (10514/UDP) for a few hours and monitored the memory usage, which remained constant. The syslog-TLS (port 6514) listener remained in use throughout with one system shipping logs to it. This seems to show that something related to the UDP listener on port 10514 is where the problem lies.
>
> Kind Regards
> ---
> ________________________________
> From: Rainer Gerhards <rgerhards@hq.adiscon.com>
> Sent: 06 December 2023 21:12
> To: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>
> Cc: rsyslog-users <rsyslog@lists.adiscon.com>
> Subject: Re: [rsyslog] Memory Leak?
>
> Great.
>
> Do you know valgrind? If so, you could run rsyslog under valgrind control, best in the foreground. When you terminate rsyslog, valgrind will show leak stats, if any.
>
> Rainer
>
> Sent from phone, thus brief.
>
> Adriaan de Waal <adriaan.dewaal@nymbis.cloud> schrieb am Mi., 6. Dez. 2023, 19:56:
>
> Good day
>
> Looking at the main Q statistics, the size remains mostly constant around 30. The max queue size currently sits at 400. There is also a queue (linked list + disk assisted) configured for the omkafka action, with the size not really going above single digits (and the DA queue stats remain at 0). Also note I completely disabled the omkafka action's queue previously as a test, but that didn't make a difference. There are no other queues.
>
> Kind Regards
> ---
>
> ________________________________
> From: Rainer Gerhards <rgerhards@hq.adiscon.com>
> Sent: 06 December 2023 17:05
> To: rsyslog-users <rsyslog@lists.adiscon.com>
> Cc: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>
> Subject: Re: [rsyslog] Memory Leak?
>
> Look at the queue sizes in impstats. Are they ever-increasing?
>
> Rainer
>
> El mié, 6 dic 2023 a las 14:30, Adriaan de Waal via rsyslog
> (<rsyslog@lists.adiscon.com>) escribió:
> >
> > Good day
> >
> > I am trying to diagnose and resolve an issue whereby the memory consumed by the rsyslog daemon increases linearly over time. This continues until it consumes most of the memory (including swap) of the system and the service has to be restarted to free up memory. There are two servers with identical configurations. What I noticed is that the server receiving a higher volume of messages also consumes memory at a higher rate. In other word it appears as if the message rate, or message volume, is directly proportional to the rate at which memory is consumed.
> >
> > Below is the version information for the rsyslogd daemon:
> > rsyslogd 8.2310.0 (aka 2023.10) compiled with:
> > PLATFORM: x86_64-pc-linux-gnu
> > PLATFORM (lsb_release -d):
> > FEATURE_REGEXP: Yes
> > GSSAPI Kerberos 5 support: No
> > FEATURE_DEBUG (debug build, slow code): No
> > 32bit Atomic operations supported: Yes
> > 64bit Atomic operations supported: Yes
> > memory allocator: system default
> > Runtime Instrumentation (slow code): No
> > uuid support: Yes
> > systemd support: Yes
> > Config file: /etc/rsyslog.conf
> > PID file: /var/run/rsyslogd.pid
> > Number of Bits in RainerScript integers: 64
> >
> > It is running on Debian 12 servers.
> >
> > To provide you with more background detail, initially I configured three listeners: one UDP (port 514), one TCP (port 514) and one TLS (port 6514). A single system was configured to push logs to the TLS port and that worked well (no increase in memory usage over time). Recently I added another UDP listener (port 10514) and started configured a number of systems to push their logs to this port, but since then I've observed the described gradual memory increase.
> >
> > This new listener is configured as follows: A ruleset was created and bound to this listener (the ruleset doesn't have its own queue). The ruleset first runs the mmutf8fix action then calls a different ruleset (named "normalise"), which normalises the data (just sets specific variables that is later used in a template to construct a JSON object). After the call to the "normalise" ruleset returns, a mmnormalize action is performed and some additional variables are set. Lastly the ruleset (the one bound to the listener) then calls yet another ruleset (named "kafka_output"), which is used to construct a JSON object from the various variables and uses the omkafka action to push this to a Kafka cluster.
> >
> > The flow of the above can be visualised as:
> > Source -> Syslog Server [10514/UDP] -> [listener ruleset] -> [normalise ruleset] -> [kafka_output ruleset]
> >
> > It should also be noted the original listeners are configured in much the same way, apart from having calls to even more rulesets. I haven't tested if the UDP listener on port 514 exhibits the same behaviour (it isn't currently being used).
> >
> > This rsyslog daemon is also used to capture locally generated logs and the statistics (impstats) module is also loaded.
> >
> > What can I do to troubleshoot what's causing this "memory leak"?
> >
> > Kind Regards
> > ---
> >
> > _______________________________________________
> > rsyslog mailing list
> > https://lists.adiscon.net/mailman/listinfo/rsyslog
> > http://www.rsyslog.com/professional-services/
> > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: Memory Leak? [ In reply to ]
Good day

I'm still experiencing the situation where the OOM killer kills the rsyslogd process for consuming too much memory. What do you suggest as the next step to address this memory leak?

Kind Regards
---
________________________________
From: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>
Sent: 12 December 2023 06:12
To: Rainer Gerhards <rgerhards@hq.adiscon.com>; rsyslog-users <rsyslog@lists.adiscon.com>
Subject: Re: [rsyslog] Memory Leak?

Good day

I'd just like to follow up: is there anything else I should try?

Kind Regards
---
________________________________
From: rsyslog <rsyslog-bounces@lists.adiscon.com> on behalf of Adriaan de Waal via rsyslog <rsyslog@lists.adiscon.com>
Sent: 07 December 2023 14:56
To: Rainer Gerhards <rgerhards@hq.adiscon.com>
Cc: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>; rsyslog-users <rsyslog@lists.adiscon.com>
Subject: Re: [rsyslog] Memory Leak?

Good day

I found debug symbol packages for:

* rsyslog
* rsyslog-gnutls
* rsyslog-imptcp
* rsyslog-kafka

All of these were installed, and I re-ran rsyslog under Valgrind for a while; this is the result:

==48067== Memcheck, a memory error detector
==48067== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==48067== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==48067== Command: /usr/sbin/rsyslogd -n
==48067== Parent PID: 1
==48067==
==48067==
==48067== HEAP SUMMARY:
==48067== in use at exit: 3,569,788 bytes in 13,918 blocks
==48067== total heap usage: 3,724,046 allocs, 3,710,128 frees, 1,141,461,392 bytes allocated
==48067==
==48067== 32 bytes in 1 blocks are definitely lost in loss record 35 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FE7B6: ???
==48067== by 0x54FE8D4: ???
==48067== by 0x550796F: ???
==48067== by 0x54FC760: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067==
==48067== 48 bytes in 1 blocks are definitely lost in loss record 53 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x16DBF2: propConstruct (prop.c:56)
==48067== by 0x52E4199: ???
==48067== by 0x52E63DF: ???
==48067== by 0x52E6C32: ???
==48067== by 0x4ACB043: start_thread (pthread_create.c:442)
==48067== by 0x4B4A87F: clone (clone.S:100)
==48067==
==48067== 48 bytes in 1 blocks are definitely lost in loss record 54 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x16DBF2: propConstruct (prop.c:56)
==48067== by 0x52E41D5: ???
==48067== by 0x52E63DF: ???
==48067== by 0x52E6C32: ???
==48067== by 0x4ACB043: start_thread (pthread_create.c:442)
==48067== by 0x4B4A87F: clone (clone.S:100)
==48067==
==48067== 104 bytes in 1 blocks are definitely lost in loss record 62 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x5500EDE: ???
==48067== by 0x54FC78D: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067==
==48067== 128 (16 direct, 112 indirect) bytes in 1 blocks are definitely lost in loss record 69 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x5507861: ???
==48067== by 0x5507B49: ???
==48067== by 0x54FC760: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067==
==48067== 205 bytes in 10 blocks are definitely lost in loss record 74 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x4AE07F9: strdup (strdup.c:42)
==48067== by 0x132E7B: yylex (lexer.l:297)
==48067== by 0x12ED47: yyparse (grammar.c:1535)
==48067== by 0x1459AC: load (rsconf.c:1494)
==48067== by 0x12130A: initAll (rsyslogd.c:1636)
==48067== by 0x11F0E8: main (rsyslogd.c:2337)
==48067==
==48067== 205 bytes in 10 blocks are definitely lost in loss record 75 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x4AE07F9: strdup (strdup.c:42)
==48067== by 0x150E0C: msgPropDescrFill (msg.c:5431)
==48067== by 0x13B71A: cnffuncexistsNew (rainerscript.c:5351)
==48067== by 0x12F585: yyparse (grammar.y:222)
==48067== by 0x1459AC: load (rsconf.c:1494)
==48067== by 0x12130A: initAll (rsyslogd.c:1636)
==48067== by 0x11F0E8: main (rsyslogd.c:2337)
==48067==
==48067== 404 (392 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 81 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FEFA6: ???
==48067== by 0x54FF22E: ???
==48067== by 0x54FC70D: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067==
==48067== 904 (32 direct, 872 indirect) bytes in 1 blocks are definitely lost in loss record 86 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FC6A8: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067== by 0x164FB0: wtiWorker (wti.c:430)
==48067==
==48067== 4,112 bytes in 1 blocks are definitely lost in loss record 90 of 98
==48067== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==48067== by 0x54FA8CA: ???
==48067== by 0x54FF24A: ???
==48067== by 0x54FC70D: ???
==48067== by 0x545EE78: ???
==48067== by 0x534271E: ???
==48067== by 0x5311A00: ???
==48067== by 0x5313067: ???
==48067== by 0x53133F8: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067==
==48067== 8,364 (1,000 direct, 7,364 indirect) bytes in 1 blocks are definitely lost in loss record 93 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x535E8FB: ???
==48067== by 0x5360158: ???
==48067== by 0x530FDF2: ???
==48067== by 0x5310E65: ???
==48067== by 0x5313481: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x17564F: processMsgMain (action.c:1678)
==48067== by 0x17564F: processBatchMain (action.c:1722)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067== by 0x164FB0: wtiWorker (wti.c:430)
==48067== by 0x1637A9: wtpWorker (wtp.c:435)
==48067==
==48067== 207,872 bytes in 812 blocks are definitely lost in loss record 97 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
==48067== by 0x5A51864: ???
==48067== by 0x5A516B2: ???
==48067== by 0x5A516B2: ???
==48067== by 0x5A51D8A: ???
==48067== by 0x5A44B8B: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x175478: processMsgMain (action.c:1678)
==48067== by 0x175478: doSubmitToActionQ (action.c:1855)
==48067== by 0x16C82E: execAct (ruleset.c:209)
==48067== by 0x16C82E: scriptExec (ruleset.c:599)
==48067== by 0x16D4F4: processBatch (ruleset.c:660)
==48067==
==48067== 3,289,856 bytes in 12,851 blocks are definitely lost in loss record 98 of 98
==48067== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
==48067== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
==48067== by 0x5A51864: ???
==48067== by 0x5A51D8A: ???
==48067== by 0x5A44B8B: ???
==48067== by 0x173891: actionCallDoAction (action.c:1276)
==48067== by 0x173891: actionProcessMessage.part.0 (action.c:1318)
==48067== by 0x1751BC: actionProcessMessage (action.c:1689)
==48067== by 0x1751BC: processMsgMain.part.0 (action.c:1688)
==48067== by 0x175478: processMsgMain (action.c:1678)
==48067== by 0x175478: doSubmitToActionQ (action.c:1855)
==48067== by 0x16C82E: execAct (ruleset.c:209)
==48067== by 0x16C82E: scriptExec (ruleset.c:599)
==48067== by 0x16D4F4: processBatch (ruleset.c:660)
==48067== by 0x11FD1B: msgConsumer (rsyslogd.c:779)
==48067== by 0x1683A1: ConsumerReg (queue.c:2158)
==48067==
==48067== LEAK SUMMARY:
==48067== definitely lost: 3,503,922 bytes in 13,692 blocks
==48067== indirectly lost: 8,360 bytes in 68 blocks
==48067== possibly lost: 0 bytes in 0 blocks
==48067== still reachable: 57,506 bytes in 158 blocks
==48067== suppressed: 0 bytes in 0 blocks
==48067== Reachable blocks (those to which a pointer was found) are not shown.
==48067== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==48067==
==48067== For lists of detected and suppressed errors, rerun with: -s
==48067== ERROR SUMMARY: 13 errors from 13 contexts (suppressed: 0 from 0)

Kind Regards
---

________________________________
From: Rainer Gerhards <rgerhards@hq.adiscon.com>
Sent: 07 December 2023 13:39
To: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>
Cc: rsyslog-users <rsyslog@lists.adiscon.com>
Subject: Re: [rsyslog] Memory Leak?

Thx, that is useful. Obviously we have a leak:

> 48,420,608 bytes in 189,143 blocks are definitely lost

Unfortunately, we do not have symbols, so we do not know exactly where
it happens. Can you install debug symbols? Usually there is a package
rsyslog-debuginfo (or the like), which should at least contain the
symbols for the static (non-plugin) parts of rsyslog. That would help
with the stacktrace and ease bug hunting.
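
On Debian 12 the symbols usually come as the automatically built dbgsym packages from the debian-debug archive rather than a -debuginfo package. A minimal sketch of enabling that archive, installing the symbols and rerunning (the exact package names, rsyslog-dbgsym and libfastjson4-dbgsym, are an assumption and worth checking with apt-cache search):

  # enable the Debian debug-symbol archive (bookworm = Debian 12)
  echo 'deb http://deb.debian.org/debian-debug/ bookworm-debug main' > /etc/apt/sources.list.d/debug.list
  apt-get update
  # symbols for the rsyslog core plus libfastjson, which shows up in the traces
  apt-get install rsyslog-dbgsym libfastjson4-dbgsym
  # rerun in the foreground under valgrind, as before
  valgrind --leak-check=full /usr/sbin/rsyslogd -n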

Rainer

El jue, 7 dic 2023 a las 11:41, Adriaan de Waal
(<adriaan.dewaal@nymbis.cloud>) escribió:
>
> Good day
>
> I ran rsyslog under Valgrind for a while and got the following output:
>
> ==31532== Memcheck, a memory error detector
> ==31532== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
> ==31532== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
> ==31532== Command: /usr/sbin/rsyslogd -n
> ==31532== Parent PID: 1
> ==31532==
> ==31532==
> ==31532== HEAP SUMMARY:
> ==31532== in use at exit: 55,152,508 bytes in 215,413 blocks
> ==31532== total heap usage: 36,459,066 allocs, 36,243,653 frees, 10,340,685,855 bytes allocated
> ==31532==
> ==31532== 32 bytes in 1 blocks are definitely lost in loss record 35 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FE7B6: ???
> ==31532== by 0x54FE8D4: ???
> ==31532== by 0x550796F: ???
> ==31532== by 0x54FC760: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 48 bytes in 1 blocks are definitely lost in loss record 53 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x16DBF2: propConstruct (in /usr/sbin/rsyslogd)
> ==31532== by 0x52E4199: ???
> ==31532== by 0x52E63DF: ???
> ==31532== by 0x52E6C32: ???
> ==31532== by 0x4ACB043: start_thread (pthread_create.c:442)
> ==31532== by 0x4B4A87F: clone (clone.S:100)
> ==31532==
> ==31532== 48 bytes in 1 blocks are definitely lost in loss record 54 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x16DBF2: propConstruct (in /usr/sbin/rsyslogd)
> ==31532== by 0x52E41D5: ???
> ==31532== by 0x52E63DF: ???
> ==31532== by 0x52E6C32: ???
> ==31532== by 0x4ACB043: start_thread (pthread_create.c:442)
> ==31532== by 0x4B4A87F: clone (clone.S:100)
> ==31532==
> ==31532== 104 bytes in 1 blocks are definitely lost in loss record 62 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x5500EDE: ???
> ==31532== by 0x54FC78D: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 128 (16 direct, 112 indirect) bytes in 1 blocks are definitely lost in loss record 69 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x5507861: ???
> ==31532== by 0x5507B49: ???
> ==31532== by 0x54FC760: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 205 bytes in 10 blocks are definitely lost in loss record 74 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x4AE07F9: strdup (strdup.c:42)
> ==31532== by 0x132E7B: yylex (in /usr/sbin/rsyslogd)
> ==31532== by 0x12ED47: yyparse (in /usr/sbin/rsyslogd)
> ==31532== by 0x1459AC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x12130A: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11F0E8: main (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 205 bytes in 10 blocks are definitely lost in loss record 75 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x4AE07F9: strdup (strdup.c:42)
> ==31532== by 0x150E0C: msgPropDescrFill (in /usr/sbin/rsyslogd)
> ==31532== by 0x13B71A: cnffuncexistsNew (in /usr/sbin/rsyslogd)
> ==31532== by 0x12F585: yyparse (in /usr/sbin/rsyslogd)
> ==31532== by 0x1459AC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x12130A: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11F0E8: main (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 404 (392 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 82 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FEFA6: ???
> ==31532== by 0x54FF22E: ???
> ==31532== by 0x54FC70D: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 512 bytes in 2 blocks are possibly lost in loss record 85 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 904 (32 direct, 872 indirect) bytes in 1 blocks are definitely lost in loss record 88 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FC6A8: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x164FB0: wtiWorker (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 1,792 bytes in 7 blocks are possibly lost in loss record 90 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11FD1B: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 4,112 bytes in 1 blocks are definitely lost in loss record 93 of 101
> ==31532== at 0x48407B4: malloc (vg_replace_malloc.c:381)
> ==31532== by 0x54FA8CA: ???
> ==31532== by 0x54FF24A: ???
> ==31532== by 0x54FC70D: ???
> ==31532== by 0x545EE78: ???
> ==31532== by 0x534271E: ???
> ==31532== by 0x5311A00: ???
> ==31532== by 0x5313067: ???
> ==31532== by 0x53133F8: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 8,364 (1,000 direct, 7,364 indirect) bytes in 1 blocks are definitely lost in loss record 96 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x535E8FB: ???
> ==31532== by 0x5360158: ???
> ==31532== by 0x530FDF2: ???
> ==31532== by 0x5310E65: ???
> ==31532== by 0x5313481: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x17564F: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x164FB0: wtiWorker (in /usr/sbin/rsyslogd)
> ==31532== by 0x1637A9: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 6,657,280 bytes in 26,005 blocks are definitely lost in loss record 100 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A516B2: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== 48,420,608 bytes in 189,143 blocks are definitely lost in loss record 101 of 101
> ==31532== at 0x48455EF: calloc (vg_replace_malloc.c:1328)
> ==31532== by 0x495BF12: fjson_object_new_object (in /usr/lib/x86_64-linux-gnu/libfastjson.so.4.3.0)
> ==31532== by 0x5A51864: ???
> ==31532== by 0x5A51D8A: ???
> ==31532== by 0x5A44B8B: ???
> ==31532== by 0x173891: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1751BC: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x175478: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16C82E: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x16D4F4: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x11FD1B: ??? (in /usr/sbin/rsyslogd)
> ==31532== by 0x1683A1: ??? (in /usr/sbin/rsyslogd)
> ==31532==
> ==31532== LEAK SUMMARY:
> ==31532== definitely lost: 55,084,082 bytes in 215,177 blocks
> ==31532== indirectly lost: 8,360 bytes in 68 blocks
> ==31532== possibly lost: 2,304 bytes in 9 blocks
> ==31532== still reachable: 57,762 bytes in 159 blocks
> ==31532== suppressed: 0 bytes in 0 blocks
> ==31532== Reachable blocks (those to which a pointer was found) are not shown.
> ==31532== To see them, rerun with: --leak-check=full --show-leak-kinds=all
> ==31532==
> ==31532== For lists of detected and suppressed errors, rerun with: -s
> ==31532== ERROR SUMMARY: 15 errors from 15 contexts (suppressed: 0 from 0)
>
> If you need me to run this for longer I can do that as well; just let me know if there is anything else I should do.
>
> ***
> As an additional test, I stopped shipping logs to the relevant listener (10514/UDP) for a few hours and monitored the memory usage, which remained constant. The syslog-TLS listener (port 6514) remained in use throughout, with one system shipping logs to it. This suggests that the problem lies somewhere in the processing path behind the UDP listener on port 10514.
>
> Kind Regards
> ---
> ________________________________
> From: Rainer Gerhards <rgerhards@hq.adiscon.com>
> Sent: 06 December 2023 21:12
> To: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>
> Cc: rsyslog-users <rsyslog@lists.adiscon.com>
> Subject: Re: [rsyslog] Memory Leak?
>
> Great.
>
> Do you know valgrind? If so, you could run rsyslog under valgrind control, best in the foreground. When you terminate rsyslog, valgrind will show leak stats, if any.
>
> Rainer
>
> Sent from phone, thus brief.
>
> Adriaan de Waal <adriaan.dewaal@nymbis.cloud> schrieb am Mi., 6. Dez. 2023, 19:56:
>
> Good day
>
> Looking at the main Q statistics, the size remains mostly constant at around 30. The max queue size currently sits at 400. There is also a queue (linked list + disk-assisted) configured for the omkafka action, whose size rarely goes above single digits (and the DA queue stats remain at 0). Also note that I previously disabled the omkafka action's queue completely as a test, but that didn't make a difference. There are no other queues.
>
> Kind Regards
> ---
>
> ________________________________
> From: Rainer Gerhards <rgerhards@hq.adiscon.com>
> Sent: 06 December 2023 17:05
> To: rsyslog-users <rsyslog@lists.adiscon.com>
> Cc: Adriaan de Waal <adriaan.dewaal@nymbis.cloud>
> Subject: Re: [rsyslog] Memory Leak?
>
> Look at the queue sizes in impstats. Are they ever-increasing?
>
> Rainer
>
> El mié, 6 dic 2023 a las 14:30, Adriaan de Waal via rsyslog
> (<rsyslog@lists.adiscon.com>) escribió:
> >
> > Good day
> >
> > I am trying to diagnose and resolve an issue whereby the memory consumed by the rsyslog daemon increases linearly over time. This continues until it consumes most of the memory (including swap) of the system and the service has to be restarted to free up memory. There are two servers with identical configurations. What I noticed is that the server receiving a higher volume of messages also consumes memory at a higher rate. In other word it appears as if the message rate, or message volume, is directly proportional to the rate at which memory is consumed.
> >
> > Below is the version information for the rsyslogd daemon:
> > rsyslogd 8.2310.0 (aka 2023.10) compiled with:
> > PLATFORM: x86_64-pc-linux-gnu
> > PLATFORM (lsb_release -d):
> > FEATURE_REGEXP: Yes
> > GSSAPI Kerberos 5 support: No
> > FEATURE_DEBUG (debug build, slow code): No
> > 32bit Atomic operations supported: Yes
> > 64bit Atomic operations supported: Yes
> > memory allocator: system default
> > Runtime Instrumentation (slow code): No
> > uuid support: Yes
> > systemd support: Yes
> > Config file: /etc/rsyslog.conf
> > PID file: /var/run/rsyslogd.pid
> > Number of Bits in RainerScript integers: 64
> >
> > It is running on Debian 12 servers.
> >
> > To provide you with more background detail, initially I configured three listeners: one UDP (port 514), one TCP (port 514) and one TLS (port 6514). A single system was configured to push logs to the TLS port and that worked well (no increase in memory usage over time). Recently I added another UDP listener (port 10514) and started configured a number of systems to push their logs to this port, but since then I've observed the described gradual memory increase.
> >
> > This new listener is configured as follows: A ruleset was created and bound to this listener (the ruleset doesn't have its own queue). The ruleset first runs the mmutf8fix action then calls a different ruleset (named "normalise"), which normalises the data (just sets specific variables that is later used in a template to construct a JSON object). After the call to the "normalise" ruleset returns, a mmnormalize action is performed and some additional variables are set. Lastly the ruleset (the one bound to the listener) then calls yet another ruleset (named "kafka_output"), which is used to construct a JSON object from the various variables and uses the omkafka action to push this to a Kafka cluster.
> >
> > The flow of the above can be visualised as:
> > Source -> Syslog Server [10514/UDP] -> [listener ruleset] -> [normalise ruleset] -> [kafka_output ruleset]
> >
> > It should also be noted the original listeners are configured in much the same way, apart from having calls to even more rulesets. I haven't tested if the UDP listener on port 514 exhibits the same behaviour (it isn't currently being used).
> >
> > This rsyslog daemon is also used to capture locally generated logs and the statistics (impstats) module is also loaded.
> >
> > What can I do to troubleshoot what's causing this "memory leak"?
> >
> > Kind Regards
> > ---
> >
> > _______________________________________________
> > rsyslog mailing list
> > https://lists.adiscon.net/mailman/listinfo/rsyslog
> > http://www.rsyslog.com/professional-services/
> > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.