
Problems setting up RELP, losing messages
Hi all

I'm trying to set up RELP for the following use case:

- All clients are configured with RELP and try to send messages to the
primary rsyslog server; if that fails, they send to the secondary; if that
fails too, they retry the primary forever, using a disk queue.
I'm using the following configuration, at the end of rsyslog.conf, on the
client machines:

*.* action(name="relp_1514_send_sec1" type="omrelp"
    target="server_1_ip"
    port="1514"
    queue.workerThreads="1"
    queue.saveonshutdown="on"
    action.resumeRetryCount="1"
    action.ResumeInterval="5"
    action.reportSuspension="on"
    action.reportSuspensionContinuation="on"
    Timeout="5"
)
& action(name="relp_1514_send_sec2" type="omrelp"
    action.execOnlyWhenPreviousIsSuspended="on"
    target="server_2_ip"
    port="1514"
    queue.workerThreads="1514"
    queue.saveonshutdown="on"
    action.resumeRetryCount="1"
    action.ResumeInterval="5"
    action.reportSuspension="on"
    action.reportSuspensionContinuation="on"
    Timeout="5"
)
& action(name="relp_1514_send_sec1-forever" type="omrelp"
    action.execOnlyWhenPreviousIsSuspended="on"
    target="server_1_ip"
    port="1514"
    queue.type="Disk"
    queue.size="1000000"
    queue.highWatermark="5"
    queue.filename="relp-persistente-queue"
    queue.spoolDirectory="/var/log/spool"
    queue.workerThreads="1"
    queue.saveonshutdown="on"
    queue.maxdiskspace="5G"
    queue.syncqueuefiles="on"
    queue.checkpointInterval="5"
    action.resumeRetryCount="-1"
    action.ResumeInterval="60"
    action.reportSuspension="on"
    action.reportSuspensionContinuation="on"
    Timeout="5"
)

- On the server side, messages are written to disk and forwarded to the SIEM
(collectors 1 and 2, with the same failover strategy as above). The queue
should be saved when rsyslog is stopped with "systemctl stop rsyslog" and
resumed when it starts again. I'm using the following configuration on
server 1 and server 2:

module(load="imrelp")
input(type="imrelp"
    port="1514"
    ruleset="relp_1514"
    KeepAlive="off"
)

ruleset(name="relp_1514")
{
    action(name="relp_1514_save_to_disk"
        type="omfile"
        sync="on"
        queue.spoolDirectory="/var/log/spool"
        queue.filename="local-save-queue"
        queue.saveonshutdown="on"
        dynaFile="RemoteLogSavePath"
        dynaFileCacheSize="2000"
        asyncWriting="on"
        flushInterval="5"
        fileCreateMode="0644"
        dirCreateMode="0755"
    )
    action(name="relp_1514_send_siem1" type="omrelp"
        target="siem1_collector_ip"
        port="1514"
        queue.workerThreads="2"
        queue.saveonshutdown="on"
        action.resumeRetryCount="2"
        action.ResumeInterval="5"
        action.reportSuspension="on"
        action.reportSuspensionContinuation="on"
        Timeout="5"
    )
    action(name="relp_1514_send_siem2" type="omrelp"
        action.execOnlyWhenPreviousIsSuspended="on"
        target="siem2_collector_ip"
        port="1514"
        queue.workerThreads="2"
        queue.saveonshutdown="on"
        action.resumeRetryCount="2"
        action.ResumeInterval="5"
        action.reportSuspension="on"
        action.reportSuspensionContinuation="on"
        Timeout="5"
    )
    action(name="relp_1514_send_prx1-forever" type="omrelp"
        action.execOnlyWhenPreviousIsSuspended="on"
        target="siem1_collector_ip"
        port="1514"
        queue.type="Disk"
        queue.size="1000000"
        queue.highWatermark="1"
        queue.filename="fwd-relp-persistent-queue-sec"
        queue.spoolDirectory="/var/log/spool"
        queue.workerThreads="2"
        queue.saveonshutdown="on"
        queue.maxdiskspace="5G"
        queue.syncqueuefiles="on"
        queue.checkpointInterval="5"
        action.resumeRetryCount="-1"
        action.ResumeInterval="10"
        action.reportSuspension="on"
        action.reportSuspensionContinuation="on"
        Timeout="5"
    )
}

I tested this configuration by starting a flow of logs from a client
machine and then, on each rsyslog server in turn, stopping rsyslog with
"systemctl stop rsyslog" and rebooting the OS, to simulate an upgrade, for
example.
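
A minimal sketch of how such a test flow could be generated so that losses
are easy to spot afterwards. It assumes Python 3 on the client; the message
format (the run= and seq= fields) and both function names are hypothetical
choices for illustration, not part of the setup described here.

```python
from typing import Iterable, Iterator


def make_test_messages(count: int, run_id: str) -> Iterator[str]:
    """Yield `count` messages, each tagged with a run id and a sequence number."""
    for seq in range(1, count + 1):
        # Hypothetical format: any text with a unique, parseable counter would do.
        yield f"relp-loss-test run={run_id} seq={seq}"


def send_via_local_syslog(messages: Iterable[str]) -> int:
    """Hand each message to the local syslog daemon; return how many were submitted."""
    import syslog  # Unix-only standard-library module, imported lazily

    sent = 0
    for msg in messages:
        syslog.syslog(syslog.LOG_INFO, msg)
        sent += 1
    return sent
```

Calling send_via_local_syslog(make_test_messages(200000, "run1")) on a client
would submit 200k numbered messages to the local syslog daemon, which then
forwards them according to the client configuration.
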
The rsyslog unit file is:

[Unit]
Description=System Logging Service
;Requires=syslog.socket
Wants=network.target network-online.target
After=network.target network-online.target
Documentation=man:rsyslogd(8)
Documentation=http://www.rsyslog.com/doc/

[Service]
Type=notify
EnvironmentFile=-/etc/sysconfig/rsyslog
ExecStart=/usr/sbin/rsyslogd -n $SYSLOGD_OPTIONS
Restart=on-failure
UMask=0066
StandardOutput=null
KillMode=control-group
TimeoutSec=900

[Install]
WantedBy=multi-user.target


When I send 200k messages in a 5-minute window without any reboots, all
200k messages arrive at both the rsyslog server and the SIEM.
If I stop and reboot the rsyslog servers one at a time, waiting for each to
start up again before rebooting the next, I lose between 20 and 2000
messages on both.
Is this the expected behaviour?
What can I do to get zero loss in this setup?
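
A minimal sketch of how the loss could be counted on the receiving side,
assuming every test message carries a unique sequence number in the form
seq=N, numbered from 1. That format and the function name are assumptions
made for illustration; they are not taken from the test described above.

```python
import re
from typing import Iterable, List

# Assumed marker inside each test message, e.g. "... seq=42".
SEQ_RE = re.compile(r"\bseq=(\d+)\b")


def missing_sequences(lines: Iterable[str], expected: int) -> List[int]:
    """Return the sequence numbers in 1..expected that never appear in `lines`."""
    seen = set()
    for line in lines:
        match = SEQ_RE.search(line)
        if match:
            # Duplicates (possible after a resend) collapse into one entry.
            seen.add(int(match.group(1)))
    return [n for n in range(1, expected + 1) if n not in seen]
```

Running this over the file written by the server, and again over what the
SIEM received, would show which messages were lost and whether the gaps line
up with the times of the stops and reboots.
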
I'm using:
rsyslogd 8.24.0-41.el7_7, compiled with:
PLATFORM: x86_64-redhat-linux-gnu
PLATFORM (lsb_release -d):
FEATURE_REGEXP: Yes
GSSAPI Kerberos 5 support: Yes
FEATURE_DEBUG (debug build, slow code): No
32bit Atomic operations supported: Yes
64bit Atomic operations supported: Yes
memory allocator: system default
Runtime Instrumentation (slow code): No
uuid support: Yes
Number of Bits in RainerScript integers: 64

That is the version that ships with CentOS 7; I cannot install a version
that is not in the repository.

Regards,

Pedro Reis
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.