On Thu, 2008-07-17 at 09:08 -0600, RB wrote:
> > I actually (really ;)) do not know syslog-ng and keep myself somewhat
> > away from it to avoid accidentally stealing "whatever" from there. But
> > now that you raise the point: could you quickly describe how it works?
> > From your mail, it sounds like what I am thinking about...
>
> I 100% agree with and applaud that separation. As you've stated
> before, they make a good product and don't warrant any interference.
Definitely.
>
> Not touching the docs myself (since I know too much of it by heart
> anyway), flush_lines allows the admin to configure a particular number
> of log entries to buffer before forcing a flush to disk, whereas
> flush_timeout configures the maximum interval between disk flushes.
> Unlike ramlog, it seems to do this with internal allocations and
> doesn't rely on a ramdisk. I usually set it rather conservatively (20
> and 600 respectively), but I can definitely see being more aggressive
> on a dedicated log collector with critical power or a laptop in
> ultra-low power mode.
This begins to make more and more sense. Just one thing to make sure we
are talking along the same lines. The ramdisk approach provides a way to
save the IO while still being able to look at the log lines as fast as
they come in.
If, in rsyslog, I create a memory buffer, I can persist lines to disk
only after it is "time to write them to disk" (whatever triggers that).
So you will not be able to observe the messages while they sit inside
rsyslog's queue. I guess for many cases this is not really relevant. And
of course, it is something that must be configurable on a per-action
basis, so different files may use different settings.
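To make the trade-off concrete, here is a minimal sketch of such a
delayed writer in Python. It is not rsyslog or syslog-ng code; the class
name DelayedFileWriter and its parameters are made up for illustration,
with the two thresholds merely borrowing the flush_lines/flush_timeout
idea described above. Note how nothing reaches the file (and so nothing
is visible to tail) until one of the thresholds is hit.

```python
import time


class DelayedFileWriter:
    """Hypothetical sketch: hold log lines in memory, write them in batches."""

    def __init__(self, path, flush_lines=20, flush_timeout=600.0,
                 clock=time.monotonic):
        self.path = path
        self.flush_lines = flush_lines      # flush once this many lines are pending
        self.flush_timeout = flush_timeout  # ...or once this many seconds have passed
        self.clock = clock                  # injectable so the timeout can be tested
        self.pending = []
        self.last_flush = clock()

    def write(self, line):
        """Queue one line; flush if either threshold has been reached."""
        self.pending.append(line.rstrip("\n") + "\n")
        too_many = len(self.pending) >= self.flush_lines
        too_old = self.clock() - self.last_flush >= self.flush_timeout
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Append all pending lines to the file and reset the timer."""
        if self.pending:
            with open(self.path, "a", encoding="utf-8") as f:
                f.writelines(self.pending)
            self.pending.clear()
        self.last_flush = self.clock()
```

In this sketch the timeout is only checked when a new line arrives, so
an idle writer keeps its last lines in memory; a real implementation
would presumably need a timer for that. Using one writer object per
output file would give the per-action settings mentioned above.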
Thinking more about a potential algorithm, I tend to think it would
probably be useful to write log files in chunks of n bytes instead of n
lines. I am thinking along the lines of matching up the output buffer
with the disk sector size (or any multiple of it) so that partial writes
of the same sector are avoided. Of course, that involves stat()ing the
file to obtain the size of the first buffer to be written (filesize
mod sector [allocation unit] size). It also requires me to find out the
allocation unit size for a given file system. It obviously involves
writing incomplete lines. It requires additional code. So the best
solution may actually be to permit partial writes (as initially thought)
but recommend large buffers. The overall effect could justify the
performance impact from doing non-optimal writes. But I am getting into
too much detail.
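For what it is worth, the alignment arithmetic itself is small. Below is
a rough Python sketch under my own assumptions: the function names are
hypothetical, the first write is taken to be whatever brings the file up
to the next unit boundary, and st_blksize (the "preferred" I/O block
size reported by stat on Unix-like systems) is used as a stand-in for
the allocation unit, which it may or may not match on a given file
system.

```python
import os


def first_chunk_size(file_size, unit):
    """Bytes needed to bring a file of file_size up to the next unit boundary."""
    remainder = file_size % unit
    return unit - remainder if remainder else unit


def split_aligned(data, file_size, unit):
    """Split data into chunks so that each write ends on a unit boundary.

    Only the final chunk may be shorter than needed to reach a boundary;
    a real buffer would probably hold that tail back until more data arrives.
    """
    chunks = []
    offset = 0
    size = first_chunk_size(file_size, unit)
    while offset < len(data):
        chunks.append(data[offset:offset + size])
        offset += size
        size = unit  # after the first write, the file is aligned
    return chunks


def preferred_block_size(path, default=4096):
    """Assumption: use st_blksize as the unit, falling back to a default."""
    st = os.stat(path)
    return getattr(st, "st_blksize", default) or default
```

For example, with a 700-byte file and a 512-byte unit, the first write
would be 324 bytes and every later full write 512 bytes. As said, the
chunks cut through the middle of log lines.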
The actual questions are:
a) is it OK that log data is visible only after the (write) delay?
b) does it sound useful to buffer based on allocation unit sizes?
Thanks,
Rainer
>
> This has been a feature of the public version of syslog-ng for as long
> as I can remember (or four years, whichever is sooner ;). Combined
> with disk queues I can see a very nice tiered approach to handling
> extremely high volumes of log data in a rather reliable manner.
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog