Mailing List Archive

Disk error when using rsyslog
Hi Rainer,

We're using rsyslog (3.20.0) on our central logging server and on the
clients.

We experienced a disk error on the server last month. At that time, we
were using TCP to transport logs from the clients to the server, and we
had set up the configuration just as described in [1]. Unfortunately,
our central logging server had the disk error for one hour, so we lost
the log files for that period.

I've had a careful look at doc [1]. I guess "RELIABLE" only means that
when the server is offline or rsyslogd on it isn't running, the clients
will save logs in a buffer or write them to a file on disk. If the
server is still online and rsyslogd is running, but with an I/O error or
a full disk, then the client will still transfer logs to the server even
with RELP, because I guess RELP only ensures that logs are transferred
over the network successfully; it doesn't care whether the logs are
written successfully to a file on the server. Am I right?

So I guess that to prevent this, we need to do some work on the server?
Are there any directives or options that would let us forward logs to a
failover server if the local disk fails, or buffer them in memory until
the disk is fixed?

[1]: http://www.rsyslog.com/doc-rsyslog_reliable_forwarding.html

Thanks,

--
Patrick Shen
Operations Engineer

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
Re: Disk error when using rsyslog
On Fri, Dec 12, 2008 at 1:48 AM, Patrick Shen <patrick.shen@net-m.de> wrote:
> I've had a careful look at doc [1]. I guess "RELIABLE" only means that
> when the server is offline or rsyslogd on it isn't running, the clients
> will save logs in a buffer or write them to a file on disk. If the
> server is still online and rsyslogd is running, but with an I/O error or
> a full disk, then the client will still transfer logs to the server even
> with RELP, because I guess RELP only ensures that logs are transferred
> over the network successfully; it doesn't care whether the logs are
> written successfully to a file on the server. Am I right?

Yes. RELP is a protocol for the reliable exchange of event logs over a
network. What the destination daemon does once it has the logs is no
concern of the client's.


> So I guess that to prevent this, we need to do some work on the server?
> Are there any directives or options that would let us forward logs to a
> failover server if the local disk fails, or buffer them in memory until
> the disk is fixed?

Yes: http://wiki.rsyslog.com/index.php/FailoverSyslogServer

-HKS

Re: Disk error when using rsyslog
I seem to have overlooked the initial message (spam filter maybe...).

HKS is right (thx), but I think this looks like a bug in that the output
writer does not care about the write failure (which it should). The output
writer is pretty old legacy code, so that's quite possible. I'll look
into it ASAP, but I got a new machine (hopefully fast enough to display
some troubles) today and currently I am happy that at least mail does
work again (so far it's a mess). So... some time next week ;)

Rainer

Re: Disk error when using rsyslog
Hi HKS,

Thanks for your good advice. It's helpful.

Patrick

Re: Disk error when using rsyslog
Hi Rainer,

Rainer Gerhards wrote:
> I seem to have overlooked the initial message (spam filter maybe...).

Bad luck for my mail account :-(

> HKS is right (thx), but I think this looks like a bug in that the output
> writer does not care about the write failure (which it should). The output
> writer is pretty old legacy code, so that's quite possible. I'll look
> into it ASAP, but I got a new machine (hopefully fast enough to display
> some troubles) today and currently I am happy that at least mail does
> work again (so far it's a mess). So... some time next week ;)

I'd really appreciate it if you could have a look at the write failure.
I think it's a feature that enterprises really need :-)

Many thanks,
Patrick
Re: Disk error when using rsyslog
On Sun, 2008-12-14 at 04:56 +0100, Patrick Shen wrote:
> I'd really appreciate it if you could have a look at the write failure.
> I think it's a feature that enterprises really need :-)

I have now verified that the code (by intention) ignores write errors.
That, of course, is legacy from a long gone era. However, I need to
think a bit about how to handle this most intelligently. The problem is
partial writes. Maybe I just try to write a LF after a failure and, if
that succeeds, simply continue. This results in a partial record being
written and then the same record being "duplicated" (actually, the
partial part being duplicated).
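The LF idea can be sketched roughly as follows. This is only an
illustration in Python, not rsyslog's actual output writer: the function
name write_with_lf_resync and the injected `write` callable (anything
that behaves like os.write bound to one descriptor) are assumptions made
for the example.

```python
def write_with_lf_resync(write, record: bytes) -> bool:
    """Write one record; after a failed or short write, emit LF and resend.

    `write` is a hypothetical callable with os.write-like semantics for a
    fixed descriptor: it takes bytes, returns the number of bytes written,
    and raises OSError on failure.
    """
    try:
        written = write(record)
    except OSError:
        # We cannot tell how much reached the file, so treat it as partial.
        written = 0
    if written == len(record):
        return True

    try:
        # Terminate the (possibly partial) line so the next record starts clean.
        if write(b"\n") != 1:
            return False
        # The LF went through, so the output looks usable again:
        # resend the whole record, which duplicates the partial part.
        return write(record) == len(record)
    except OSError:
        # Still failing; the caller should keep the record queued.
        return False
```

With a real file descriptor, `write` could be functools.partial(os.write, fd).
The file then contains the truncated fragment on its own line, followed
by the complete record.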

Does anyone have a suggestion on how to best handle such a case? Or I
could try to write what could not yet be written. Maybe this is better,
but it won't be able to survive a daemon restart...
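The second option, writing only what could not yet be written, might
look like the sketch below. Again this is an assumption-laden
illustration, not rsyslog code: write_remaining, max_attempts and the
injected `write` callable are made-up names. The offset lives only in
memory, which is why this variant cannot survive a daemon restart.

```python
def write_remaining(write, record: bytes, max_attempts: int = 3) -> int:
    """Try to finish writing `record`, resuming after any partial write.

    `write` is a hypothetical os.write-like callable (bytes in, count of
    bytes written out, OSError on failure). Returns how many bytes of
    `record` were written in total.
    """
    offset = 0  # in-memory only: lost if the process restarts
    for _ in range(max_attempts):
        if offset == len(record):
            break
        try:
            # Send only the part that has not been written yet.
            offset += write(record[offset:])
        except OSError:
            # Nothing was written in this attempt; try again.
            continue
    return offset
```

A return value smaller than len(record) tells the caller that the tail
of the record is still pending. A real implementation would presumably
wait between attempts rather than retry immediately.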

Feedback appreciated.

Rainer

Re: Disk error when using rsyslog
On Mon, 15 Dec 2008, Rainer Gerhards wrote:

> I have now verified that the code (by intention) ignores write errors.
> That, of course, is legacy from a long gone era. However, I need to
> think a bit about how to handle this most intelligently. The problem is
> partial writes. Maybe I just try to write a LF after a failure and, if
> that succeeds, simply continue. This results in a partial record being
> written and then the same record being "duplicated" (actually, the
> partial part being duplicated).
>
> Does anyone have a suggestion on how to best handle such a case? Or I
> could try to write what could not yet be written. Maybe this is better,
> but it won't be able to survive a daemon restart...
>
> Feedback appreciated.

This sounds like a smaller portion of the problem we were discussing when
we were talking about de-queuing multiple messages.

This approach would work, but if you know you have a file, you could also
truncate the file to the end of the previous record (the beginning of the
record you are trying to write).
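A minimal sketch of the truncate idea, assuming the output is a regular
file with a single writer. The function name write_or_truncate and the
injectable `write` parameter are hypothetical and exist only to make the
example testable; this is not how rsyslog implements it.

```python
import os


def write_or_truncate(fd: int, record: bytes, write=os.write) -> bool:
    """Append `record` to `fd`; on failure, cut the file back to where it was.

    Assumes `fd` refers to a regular file that only this process writes to.
    `write` defaults to os.write and can be swapped to simulate failures.
    """
    # End of the previous record, i.e. where the new record will begin.
    start = os.lseek(fd, 0, os.SEEK_END)
    try:
        written = write(fd, record)
    except OSError:
        written = -1
    if written == len(record):
        return True

    try:
        # Drop whatever part of the record made it to disk.
        os.ftruncate(fd, start)
        os.lseek(fd, start, os.SEEK_SET)
    except OSError:
        # The disk may still be unusable; the caller keeps the record queued.
        pass
    return False
```

Compared with writing a LF, this leaves no partial line behind, but it
only applies when the target can be truncated, so pipes and network
outputs would need a different strategy.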

David Lang