Mailing List Archive

high mainQ size on busy rsyslog system
Hello (again)

I have a central rsyslog system that receives logs from 10 syslog relays via imptcp. After enabling impstats, a few facts surfaced: the MainQ on the central syslog system constantly holds about 150k objects on average.

In fact:
Messages/sec: ~45k
MainQ size: ~150k with spikes of 350k per minute (screenshot attached)


While looking through the documentation for more information I got a bit confused, so any help with the following questions is appreciated.


1. How should one interpret a MainQ size that constantly fluctuates between 150k and 350k at ~45k messages per second?

Should the MainQ always be at zero? Is a MainQ size of 150k a problem?


2. I was thinking of tweaking the dequeue_batch_size and worker_threads parameters to bring the MainQ size down, but I am not sure what the current defaults are. As a reference I used the code on GitHub for the version I am running (8.2204):
https://github.com/rsyslog/rsyslog/blob/v8.2204.1/runtime/rsconf.c


globals.mainQ.iMainMsgQHighWtrMark = 80000;
globals.mainQ.iMainMsgQueDeqBatchSize = 256;
globals.mainQ.iMainMsgQueueNumWorkers = 2;
globals.mainQ.iMainMsgQLowWtrMark = 20000;

Is the above correct, or does rsyslog use various factors at start-up to adjust these main queue parameters dynamically?

The documentation* states that the default dequeue batch size for ruleset queues is 1024.

Which value is correct, 256 (github) or 1024 (Documentation)?
In general, where is the best place to find rsyslog's default values?


3. Is there a way to detect throttling?

According to the documentation**, when a queue is full, rsyslogd throttles the submitter (in my case, the syslog relays). Is there a way to detect if and when my central syslog system is throttling my relays?

This is the config on the busy central syslog system:


*****************rsyslog.conf

module(
load="impstats"
interval="10"
resetCounters="on"
format="json"
ruleset="impstats"
)

module(load="omprog")
$LocalHostName central-syslog.mydomain.net

$MaxMessageSize 32k

# CIS
$umask 0000
$FileCreateMode 0640

$FileOwner root
$FileGroup logroup


$DirCreateMode 0750
$DirOwner root
$DirGroup logroup


$ModLoad imuxsock
$ModLoad imklog
$ModLoad immark

$SystemLogRateLimitInterval 0
$SystemLogRateLimitBurst 0


$ModLoad imudp
$UDPServerRun 514

$ModLoad imtcp
$InputTCPServerRun 514

$ModLoad imptcp
$InputPTCPServerRun 6514

$PreserveFQDN on
$MaxOpenFiles 16384
$ActionFileDefaultTemplate RSYSLOG_FileFormat
$IncludeConfig /etc/rsyslog.d/*.conf


:msg, regex, ".* audit: .*" stop
:msg, regex, ".* kauditd_printk_skb: .*" stop


template(name="APPone_plain" type="string"
string="/logs/app1-logs.log")

template(name="APPone_archive" type="string"
string="/logs/archive/app1-logs.log.gz")

if $syslogtag contains 'app-one' then {
action(type="omfile" DynaFile="APPone_plain" Template="RSYSLOG_SyslogProtocol23Format"
DirCreateMode="0750" dirOwner="root" dirGroup="logroup"
FileCreateMode="0640" fileOwner="root" fileGroup="logroup")


action(type="omfile" DynaFile="APPone_archive" Template="RSYSLOG_SyslogProtocol23Format"
DirCreateMode="0750" dirOwner="root" dirGroup="logroup"
FileCreateMode="0640" fileOwner="root" fileGroup="logroup")
stop
}


template(name="APPzero_plain" type="string"
string="/logs/app_zero-logs.log")

template(name="APPzero_archive" type="string"
string="/logs/archive/app_zero-logs.log.gz")

if $syslogtag contains 'app-zero' then {
action(type="omfile" DynaFile="APPzero_plain" Template="RSYSLOG_SyslogProtocol23Format"
DirCreateMode="0750" dirOwner="root" dirGroup="logroup"
FileCreateMode="0640" fileOwner="root" fileGroup="logroup")


action(type="omfile" DynaFile="APPzero_archive" Template="RSYSLOG_SyslogProtocol23Format"
DirCreateMode="0750" dirOwner="root" dirGroup="logroup"
FileCreateMode="0640" fileOwner="root" fileGroup="logroup")
stop
}


*.info;mail.none;authpriv.none;cron.none /logs/messages

# Log anything (except mail) of level info or higher.
# The authpriv file has restricted access.
authpriv.* /logs/auth

# Log all the mail messages in one place.
mail.* /logs/maillog

# Log Kernel messages
kern.* /logs/kernel

# Log cron stuff
cron.* /logs/cron

# Everybody gets emergency messages
*.emerg /logs/messages

# Save news errors of level crit and higher in a special file.
uucp,news.crit /logs/spooler

lpr.debug /logs/boot.log

# Save boot messages also to boot.log
local7.* /logs/boot.log
local6.* /logs/devs.log
local5.* /logs/various.log
local4.* /logs/session.log
local3.* /logs/messages
local2.* /logs/messages
local1.* /logs/history.log


$OMFileZipLevel 6
$template MAINARC,"/logs/archive/%$YEAR%/%$MONTH%/%$DAY%/all.log.gz"
*.* -?MAINARC



*********************************************


Thanks
D.




* https://www.rsyslog.com/doc/master/rainerscript/queue_parameters.html

** https://www.rsyslog.com/doc/v8-stable/concepts/queues.html

Re: high mainQ size on busy rsyslog system
omfile should be very fast, but you are using dynafile. Without seeing your
stats data I can't be sure, but I'd lay odds that your eviction counts are very
high. If you increase your dynafilecachesize until it's larger than the number
of files you are writing to at any one time, performance will probably
skyrocket.
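
As a rough sketch of the cache-size suggestion above, here is the first
app-one action from the posted config with dynaFileCacheSize set. The value 100
is an assumed placeholder, not a recommendation; it only needs to exceed the
number of distinct files the DynaFile template expands to at any one time.

```
if $syslogtag contains 'app-one' then {
    # dynaFileCacheSize: how many dynafiles this action keeps open at once.
    # 100 is a placeholder; choose a value larger than the number of files
    # written concurrently by this action.
    action(type="omfile" DynaFile="APPone_plain"
           Template="RSYSLOG_SyslogProtocol23Format"
           dynaFileCacheSize="100"
           DirCreateMode="0750" dirOwner="root" dirGroup="logroup"
           FileCreateMode="0640" fileOwner="root" fileGroup="logroup")
    stop
}
```

The setting is per action, so each dynafile action would need its own value.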

Multiple worker threads are generally not useful when you are writing to files
(as you can only have one thread writing to a file at a time).

The other thing to do is to run top and press H to see per-thread stats. If
one thread is maxing out a core (100% CPU), then you have a bottleneck to
address.

I think the default batch size is 100 or 128, which is generally enough to keep
you out of trouble with lock contention. I will sometimes increase it to ~1000,
but it's unlikely to make a big difference when you are writing out to files.

Note that the high/low watermark values are telling rsyslog to throw away logs
when it hits those levels (this shows up as discarded messages in the pstats
output).

The pstats output will also tell you the max queue size and how many times the
queue was full when a new message was submitted (although with it configured to
discard messages, you probably aren't hitting full).
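
One way to get at those counters is to have impstats write to its own file
instead of the syslog stream. The sketch below would take the place of the
existing impstats module() statement; the file path is a hypothetical choice.

```
# Sketch: write impstats records straight to a file.
# /var/log/rsyslog-stats.log is an assumed path; adjust as needed.
module(
    load="impstats"
    interval="10"
    resetCounters="on"
    format="json"
    log.syslog="off"
    log.file="/var/log/rsyslog-stats.log"
)
```

In the "main Q" record, the fields to watch are size, maxqsize, full,
discarded.full and discarded.nf. The "dynafile cache" records report requests,
missed and evicted for each dynafile action.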

David Lang


On Thu, 2 Mar 2023, Dimi Onobodies via rsyslog wrote:

> Date: Thu, 2 Mar 2023 13:08:11 +0000
> From: Dimi Onobodies via rsyslog <rsyslog@lists.adiscon.com>
> To: rsyslog-users <rsyslog@lists.adiscon.com>
> Cc: Dimi Onobodies <dimi_kdj@hotmail.com>
> Subject: [rsyslog] high mainQ size on busy rsyslog system
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.