Mailing List Archive

Problems with spooling to disk
Hello everybody

Now I'm trying to configure rsyslog to spool log messages to disk, while the
connection to the central log collector is down. But it does not work......

I am using rsyslog v 3.21.2

I have configured the client rsyslog like this:

$ModLoad imuxsock
$ModLoad omrelp
$WorkDirectory /var/spool/rsyslog
$ActionQueueType LinkedList
$ActionQueueFileName rsyslog-buffer
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on
*.* :omrelp:172.16.0.43:2514

Which is similar to the reliable forwarding configuration from the
documentation, except that I prefer to use RELP.

But When I test it, by stopping rsyslog on the central log collector, and
running this small shell script on the client:

#! /bin/sh
for ((i=1;i<=10000;i+=1)); do
logger Log line $i
done

The shell script does not complete until I restart rsyslog, on the central
log collector, and when I look at the timestamps in the logfile on the log
collector, I see a large time difference between log line 3303 and 3304.

My plan is to get rsyslog to spool all log entries locally, while the
connection the the central log collector is interrupted, and then flush all
the spooled log messages to the central log collector when the communication
is restored - while the rest of the client system continues with business as
usual.

I can send you a debug log is needed.


-- David Darville
Problems with spooling to disk [ In reply to ]
From what I see, this should work. So please send the debug log, client
(more important) and server please :)

Thanks,
Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of David Darville
> Sent: Thursday, August 07, 2008 2:06 PM
> To: rsyslog at lists.adiscon.com
> Subject: [rsyslog] Problems with spooling to disk
>
> Hello everybody
>
> Now I'm trying to configure rsyslog to spool log messages to disk,
> while the
> connection to the central log collector is down. But it does not
> work......
>
> I am using rsyslog v 3.21.2
>
> I have configured the client rsyslog like this:
>
> $ModLoad imuxsock
> $ModLoad omrelp
> $WorkDirectory /var/spool/rsyslog
> $ActionQueueType LinkedList
> $ActionQueueFileName rsyslog-buffer
> $ActionResumeRetryCount -1
> $ActionQueueSaveOnShutdown on
> *.* :omrelp:172.16.0.43:2514
>
> Which is similar to the reliable forwarding configuration from the
> documentation, except that I prefer to use RELP.
>
> But When I test it, by stopping rsyslog on the central log collector,
> and
> running this small shell script on the client:
>
> #! /bin/sh
> for ((i=1;i<=10000;i+=1)); do
> logger Log line $i
> done
>
> The shell script does not complete until I restart rsyslog, on the
> central
> log collector, and when I look at the timestamps in the logfile on the
> log
> collector, I see a large time difference between log line 3303 and
> 3304.
>
> My plan is to get rsyslog to spool all log entries locally, while the
> connection the the central log collector is interrupted, and then
flush
> all
> the spooled log messages to the central log collector when the
> communication
> is restored - while the rest of the client system continues with
> business as
> usual.
>
> I can send you a debug log is needed.
>
>
> -- David Darville
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
Problems with spooling to disk [ In reply to ]
Thanks, I now received the log files. And, of course, they made
immediately clear what is going on ;) What you experience is flow
control kicking in. Let me first elaborate a bit:

Rsyslog knows inputs which can be delayed a bit (like tcp receiver AND
the local socket) and those who can not (like UDP). The idea is that we
can slow down senders where this does not result in loss of messages,
leaving room inside the queue on busy systems for those where we can not
do that. We had cases where not delaying caused a lot of grief.

There are also two watermarks, the low and high delay mark. If the low
delay mark is reached, a delayable source is delayed for one second. It
is at 70% of the (in-memory) queue capacity. Again, this is all done for
some very good reasons.

However, when I introduced these features, I did not make them
configurable. Also, I did classify the local unix socket as one who can
be delayed. In your scenario, this does not stop the system, but once
the local log socket is filled up, it needs to wait one second for each
message before it is pulled.

Again, the rate limiting logic is there for good reasons. So what I now
see is that I need to add a few config statements to allow to control it
according to your needs. I am also unsure if I should re-classify
imuxsock as an input which cannot be delayed - but that also can have
some drawbacks. Maybe the best thing to do is make its state
configurable and use "not to be delayed" as default.

Anyone with a thought on that?

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
> Sent: Thursday, August 07, 2008 2:19 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] Problems with spooling to disk
>
> >From what I see, this should work. So please send the debug log,
> client
> (more important) and server please :)
>
> Thanks,
> Rainer
>
> > -----Original Message-----
> > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> > bounces at lists.adiscon.com] On Behalf Of David Darville
> > Sent: Thursday, August 07, 2008 2:06 PM
> > To: rsyslog at lists.adiscon.com
> > Subject: [rsyslog] Problems with spooling to disk
> >
> > Hello everybody
> >
> > Now I'm trying to configure rsyslog to spool log messages to disk,
> > while the
> > connection to the central log collector is down. But it does not
> > work......
> >
> > I am using rsyslog v 3.21.2
> >
> > I have configured the client rsyslog like this:
> >
> > $ModLoad imuxsock
> > $ModLoad omrelp
> > $WorkDirectory /var/spool/rsyslog
> > $ActionQueueType LinkedList
> > $ActionQueueFileName rsyslog-buffer
> > $ActionResumeRetryCount -1
> > $ActionQueueSaveOnShutdown on
> > *.* :omrelp:172.16.0.43:2514
> >
> > Which is similar to the reliable forwarding configuration from the
> > documentation, except that I prefer to use RELP.
> >
> > But When I test it, by stopping rsyslog on the central log
collector,
> > and
> > running this small shell script on the client:
> >
> > #! /bin/sh
> > for ((i=1;i<=10000;i+=1)); do
> > logger Log line $i
> > done
> >
> > The shell script does not complete until I restart rsyslog, on the
> > central
> > log collector, and when I look at the timestamps in the logfile on
> the
> > log
> > collector, I see a large time difference between log line 3303 and
> > 3304.
> >
> > My plan is to get rsyslog to spool all log entries locally, while
the
> > connection the the central log collector is interrupted, and then
> flush
> > all
> > the spooled log messages to the central log collector when the
> > communication
> > is restored - while the rest of the client system continues with
> > business as
> > usual.
> >
> > I can send you a debug log is needed.
> >
> >
> > -- David Darville
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
Problems with spooling to disk [ In reply to ]
David,

I have done a quick and dirty (not to stay) fix, which makes imuxsock a
NO_DELAY input:

http://download.rsyslog.com/rsyslog/rsyslog-3.21.3-Test4.tar.gz

I'd appreciate if you could give it a try and report back.

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
> Sent: Friday, August 08, 2008 12:28 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] Problems with spooling to disk
>
> Thanks, I now received the log files. And, of course, they made
> immediately clear what is going on ;) What you experience is flow
> control kicking in. Let me first elaborate a bit:
>
> Rsyslog knows inputs which can be delayed a bit (like tcp receiver AND
> the local socket) and those who can not (like UDP). The idea is that
we
> can slow down senders where this does not result in loss of messages,
> leaving room inside the queue on busy systems for those where we can
> not
> do that. We had cases where not delaying caused a lot of grief.
>
> There are also two watermarks, the low and high delay mark. If the low
> delay mark is reached, a delayable source is delayed for one second.
It
> is at 70% of the (in-memory) queue capacity. Again, this is all done
> for
> some very good reasons.
>
> However, when I introduced these features, I did not make them
> configurable. Also, I did classify the local unix socket as one who
can
> be delayed. In your scenario, this does not stop the system, but once
> the local log socket is filled up, it needs to wait one second for
each
> message before it is pulled.
>
> Again, the rate limiting logic is there for good reasons. So what I
now
> see is that I need to add a few config statements to allow to control
> it
> according to your needs. I am also unsure if I should re-classify
> imuxsock as an input which cannot be delayed - but that also can have
> some drawbacks. Maybe the best thing to do is make its state
> configurable and use "not to be delayed" as default.
>
> Anyone with a thought on that?
>
> Rainer
>
> > -----Original Message-----
> > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
> > Sent: Thursday, August 07, 2008 2:19 PM
> > To: rsyslog-users
> > Subject: Re: [rsyslog] Problems with spooling to disk
> >
> > >From what I see, this should work. So please send the debug log,
> > client
> > (more important) and server please :)
> >
> > Thanks,
> > Rainer
> >
> > > -----Original Message-----
> > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> > > bounces at lists.adiscon.com] On Behalf Of David Darville
> > > Sent: Thursday, August 07, 2008 2:06 PM
> > > To: rsyslog at lists.adiscon.com
> > > Subject: [rsyslog] Problems with spooling to disk
> > >
> > > Hello everybody
> > >
> > > Now I'm trying to configure rsyslog to spool log messages to disk,
> > > while the
> > > connection to the central log collector is down. But it does not
> > > work......
> > >
> > > I am using rsyslog v 3.21.2
> > >
> > > I have configured the client rsyslog like this:
> > >
> > > $ModLoad imuxsock
> > > $ModLoad omrelp
> > > $WorkDirectory /var/spool/rsyslog
> > > $ActionQueueType LinkedList
> > > $ActionQueueFileName rsyslog-buffer
> > > $ActionResumeRetryCount -1
> > > $ActionQueueSaveOnShutdown on
> > > *.* :omrelp:172.16.0.43:2514
> > >
> > > Which is similar to the reliable forwarding configuration from the
> > > documentation, except that I prefer to use RELP.
> > >
> > > But When I test it, by stopping rsyslog on the central log
> collector,
> > > and
> > > running this small shell script on the client:
> > >
> > > #! /bin/sh
> > > for ((i=1;i<=10000;i+=1)); do
> > > logger Log line $i
> > > done
> > >
> > > The shell script does not complete until I restart rsyslog, on the
> > > central
> > > log collector, and when I look at the timestamps in the logfile on
> > the
> > > log
> > > collector, I see a large time difference between log line 3303 and
> > > 3304.
> > >
> > > My plan is to get rsyslog to spool all log entries locally, while
> the
> > > connection the the central log collector is interrupted, and then
> > flush
> > > all
> > > the spooled log messages to the central log collector when the
> > > communication
> > > is restored - while the rest of the client system continues with
> > > business as
> > > usual.
> > >
> > > I can send you a debug log is needed.
> > >
> > >
> > > -- David Darville
> > > _______________________________________________
> > > rsyslog mailing list
> > > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
Problems with spooling to disk [ In reply to ]
hi,

Rainer Gerhards wrote:
> Thanks, I now received the log files. And, of course, they made
> immediately clear what is going on ;) What you experience is flow
> control kicking in. Let me first elaborate a bit:
>
> Rsyslog knows inputs which can be delayed a bit (like tcp receiver AND
> the local socket) and those who can not (like UDP). The idea is that we
> can slow down senders where this does not result in loss of messages,
> leaving room inside the queue on busy systems for those where we can not
> do that. We had cases where not delaying caused a lot of grief.
>
> There are also two watermarks, the low and high delay mark. If the low
> delay mark is reached, a delayable source is delayed for one second. It
> is at 70% of the (in-memory) queue capacity. Again, this is all done for
> some very good reasons.
>
> However, when I introduced these features, I did not make them
> configurable. Also, I did classify the local unix socket as one who can
> be delayed. In your scenario, this does not stop the system, but once
> the local log socket is filled up, it needs to wait one second for each
> message before it is pulled.
>
> Again, the rate limiting logic is there for good reasons. So what I now
> see is that I need to add a few config statements to allow to control it
> according to your needs. I am also unsure if I should re-classify
> imuxsock as an input which cannot be delayed - but that also can have
> some drawbacks. Maybe the best thing to do is make its state
> configurable and use "not to be delayed" as default.

ok, after reading through your explanation, i (almost) understand the
delay mechanism and that at present it is not configurable.

almost, because i do not get the part where you talk about the "70% of
the (in-memory) queue capacity".

anyways, how does this relate to the original problem, that the
following code stalls the logging process.

>>> #! /bin/sh
>>> for ((i=1;i<=10000;i+=1)); do
>>> logger Log line $i
>>> done

is this a capacity problem? is this another problem?

i really consider this a critical bug, as i yesterday ran out of
diskspace on my remote-rsyslog-and-mysql-on-one-host-testbed and the
rsyslog-client, which logs locally and to the remote destination,
stalled completely.

cheers,
raoul
ps: i am on 3.16.1-1~bpo40+1
--
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc. email. r.bhatia at ipax.at
Technischer Leiter

IPAX - Aloy Bhatia Hava OEG web. http://www.ipax.at
Barawitzkagasse 10/2/2/11 email. office at ipax.at
1190 Wien tel. +43 1 3670030
FN 277995t HG Wien fax. +43 1 3670030 15
____________________________________________________________________
Problems with spooling to disk [ In reply to ]
On Wed, 2008-08-13 at 13:04 +0200, Raoul Bhatia [IPAX] wrote:
> hi,
>
> Rainer Gerhards wrote:
> > Thanks, I now received the log files. And, of course, they made
> > immediately clear what is going on ;) What you experience is flow
> > control kicking in. Let me first elaborate a bit:
> >
> > Rsyslog knows inputs which can be delayed a bit (like tcp receiver AND
> > the local socket) and those who can not (like UDP). The idea is that we
> > can slow down senders where this does not result in loss of messages,
> > leaving room inside the queue on busy systems for those where we can not
> > do that. We had cases where not delaying caused a lot of grief.
> >
> > There are also two watermarks, the low and high delay mark. If the low
> > delay mark is reached, a delayable source is delayed for one second. It
> > is at 70% of the (in-memory) queue capacity. Again, this is all done for
> > some very good reasons.
> >
> > However, when I introduced these features, I did not make them
> > configurable. Also, I did classify the local unix socket as one who can
> > be delayed. In your scenario, this does not stop the system, but once
> > the local log socket is filled up, it needs to wait one second for each
> > message before it is pulled.
> >
> > Again, the rate limiting logic is there for good reasons. So what I now
> > see is that I need to add a few config statements to allow to control it
> > according to your needs. I am also unsure if I should re-classify
> > imuxsock as an input which cannot be delayed - but that also can have
> > some drawbacks. Maybe the best thing to do is make its state
> > configurable and use "not to be delayed" as default.
>
> ok, after reading through your explanation, i (almost) understand the
> delay mechanism and that at present it is not configurable.
>
> almost, because i do not get the part where you talk about the "70% of
> the (in-memory) queue capacity".

An example: if the configured queue capacity is 10,000 (the default
value for the main message queue), than no flow control at all happens
until 7,000 messages are inside the queue. Once they are (and only as
long as they are), flow control is applied to those inputs that permit
it (imuxsock permits it).

It doesn't matter here if the queue is disk-assisted or not. Even with
disk-assisted queues, this setting is very important to prevent an input
(e.g. imfile!) to send massive amounts of data to the queue when there
is no need to (because the input can be flow-controlled).

>
> anyways, how does this relate to the original problem, that the
> following code stalls the logging process.
>
> >>> #! /bin/sh
> >>> for ((i=1;i<=10000;i+=1)); do
> >>> logger Log line $i
> >>> done
>
> is this a capacity problem? is this another problem?

It is going over the flow control watermark, flow control kicks in and
delays imuxsock writes by one second each as long as there are more than
7,000 messages. No problem, intended behavior, but obviously
questionable. In the imuxsock case, there are also some bad cases.

>
> i really consider this a critical bug, as i yesterday ran out of
> diskspace on my remote-rsyslog-and-mysql-on-one-host-testbed and the
> rsyslog-client, which logs locally and to the remote destination,
> stalled completely.

We'll - not completely, one message per second ;) But that's probably
complete enough ;)

As a quick fix, I think I'll make imuxsock a non-flow-controllable input
in the v3-stable (3.18.x). The devel will have a switch to make it flow
controlled if one desires. I'll also add some other knobs to fine-tune
the flow control algo, but that probably needs to wait until after my
summer break. These tweaks could easily break things and I don't like
the idea of doing them under time pressure ;)

HTH
Rainer
>
> cheers,
> raoul
> ps: i am on 3.16.1-1~bpo40+1
Problems with spooling to disk [ In reply to ]
This is the v3-stable fix:

http://git.adiscon.com/?p=rsyslog.git;a=blobdiff;f=plugins/imuxsock/imux
sock.c;h=60ccaffbc526a03011be3ce5869e40bc3e32c319;hp=f8798039c5d53e58fc7
fa17807547862e08c999e;hb=8eb888d049da12e1294a7688432b6325794ade32;hpb=02
f768c37dac9dde424bbd31e378482750fc276c

You can also easily apply it to 3.16 - I just won't do it that version
has been superseded.

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
> Sent: Wednesday, August 13, 2008 2:33 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] Problems with spooling to disk
>
> On Wed, 2008-08-13 at 13:04 +0200, Raoul Bhatia [IPAX] wrote:
> > hi,
> >
> > Rainer Gerhards wrote:
> > > Thanks, I now received the log files. And, of course, they made
> > > immediately clear what is going on ;) What you experience is flow
> > > control kicking in. Let me first elaborate a bit:
> > >
> > > Rsyslog knows inputs which can be delayed a bit (like tcp receiver
> AND
> > > the local socket) and those who can not (like UDP). The idea is
> that we
> > > can slow down senders where this does not result in loss of
> messages,
> > > leaving room inside the queue on busy systems for those where we
> can not
> > > do that. We had cases where not delaying caused a lot of grief.
> > >
> > > There are also two watermarks, the low and high delay mark. If the
> low
> > > delay mark is reached, a delayable source is delayed for one
> second. It
> > > is at 70% of the (in-memory) queue capacity. Again, this is all
> done for
> > > some very good reasons.
> > >
> > > However, when I introduced these features, I did not make them
> > > configurable. Also, I did classify the local unix socket as one
who
> can
> > > be delayed. In your scenario, this does not stop the system, but
> once
> > > the local log socket is filled up, it needs to wait one second for
> each
> > > message before it is pulled.
> > >
> > > Again, the rate limiting logic is there for good reasons. So what
I
> now
> > > see is that I need to add a few config statements to allow to
> control it
> > > according to your needs. I am also unsure if I should re-classify
> > > imuxsock as an input which cannot be delayed - but that also can
> have
> > > some drawbacks. Maybe the best thing to do is make its state
> > > configurable and use "not to be delayed" as default.
> >
> > ok, after reading through your explanation, i (almost) understand
the
> > delay mechanism and that at present it is not configurable.
> >
> > almost, because i do not get the part where you talk about the "70%
> of
> > the (in-memory) queue capacity".
>
> An example: if the configured queue capacity is 10,000 (the default
> value for the main message queue), than no flow control at all happens
> until 7,000 messages are inside the queue. Once they are (and only as
> long as they are), flow control is applied to those inputs that permit
> it (imuxsock permits it).
>
> It doesn't matter here if the queue is disk-assisted or not. Even with
> disk-assisted queues, this setting is very important to prevent an
> input
> (e.g. imfile!) to send massive amounts of data to the queue when there
> is no need to (because the input can be flow-controlled).
>
> >
> > anyways, how does this relate to the original problem, that the
> > following code stalls the logging process.
> >
> > >>> #! /bin/sh
> > >>> for ((i=1;i<=10000;i+=1)); do
> > >>> logger Log line $i
> > >>> done
> >
> > is this a capacity problem? is this another problem?
>
> It is going over the flow control watermark, flow control kicks in and
> delays imuxsock writes by one second each as long as there are more
> than
> 7,000 messages. No problem, intended behavior, but obviously
> questionable. In the imuxsock case, there are also some bad cases.
>
> >
> > i really consider this a critical bug, as i yesterday ran out of
> > diskspace on my remote-rsyslog-and-mysql-on-one-host-testbed and the
> > rsyslog-client, which logs locally and to the remote destination,
> > stalled completely.
>
> We'll - not completely, one message per second ;) But that's probably
> complete enough ;)
>
> As a quick fix, I think I'll make imuxsock a non-flow-controllable
> input
> in the v3-stable (3.18.x). The devel will have a switch to make it
flow
> controlled if one desires. I'll also add some other knobs to fine-tune
> the flow control algo, but that probably needs to wait until after my
> summer break. These tweaks could easily break things and I don't like
> the idea of doing them under time pressure ;)
>
> HTH
> Rainer
> >
> > cheers,
> > raoul
> > ps: i am on 3.16.1-1~bpo40+1
>
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
Problems with spooling to disk [ In reply to ]
Rainer Gerhards wrote:
> This is the v3-stable fix:
>
> http://git.adiscon.com/?p=rsyslog.git;a=blobdiff;f=plugins/imuxsock/imux
> sock.c;h=60ccaffbc526a03011be3ce5869e40bc3e32c319;hp=f8798039c5d53e58fc7
> fa17807547862e08c999e;hb=8eb888d049da12e1294a7688432b6325794ade32;hpb=02
> f768c37dac9dde424bbd31e378482750fc276c

thank you for your prompt fix. i will try to simulate the failure again.

cheers,
raoul
--
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc. email. r.bhatia at ipax.at
Technischer Leiter

IPAX - Aloy Bhatia Hava OEG web. http://www.ipax.at
Barawitzkagasse 10/2/2/11 email. office at ipax.at
1190 Wien tel. +43 1 3670030
FN 277995t HG Wien fax. +43 1 3670030 15
____________________________________________________________________