Mailing List Archive: Thinking about syslog forwarding infrastructure

Jul 29, 2008, 3:28 PM

Post #1 of 4 (747 views)

In the interest of brevity, I'm leaving out details...if you need
more, just ask.

We have a security event monitoring system that processes probably in
the neighborhood of 100 million syslog messages per day (I know
precisely how many events it has processed, but it doesn't break them
down by protocol). In some of our WAN sites, I would like to
implement a local system that will receive all the local syslog
messages and ship some/all back to the main collector on our LAN. The
main collector on the LAN would receive the bulk of its messages from
other LAN systems. Then the main collector would ship some/all to the
SEM environment. I'm leaning towards using rsyslog for this task and
have a few questions:

1) What kind of system [rough estimate] would I need for the main
collector if I assume 200 million syslog messages per day and a peak
that is triple the average rate (~7000 eps)?

2) Can I enable rate limiting in a way that will:
A1) start dropping messages beyond a given threshold

A2) start intelligently dropping messages beyond a given threshold
(i.e. start dropping events matching this regex)

B) allow me to alert someone that this is occurring (is written to
log file, etc)
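
As a rough check on the figures in question 1, the arithmetic can be
written out in a few lines of Python. This is only a sketch: the function
name and the peak_factor parameter are made up for illustration, and the
inputs are the 200 million messages per day and the 3x peak assumed above.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def events_per_second(messages_per_day: int, peak_factor: float = 1.0) -> float:
    """Average events per second, scaled by an optional peak multiplier."""
    return messages_per_day / SECONDS_PER_DAY * peak_factor


average_eps = events_per_second(200_000_000)
peak_eps = events_per_second(200_000_000, peak_factor=3)

print(f"average: {average_eps:.0f} eps")  # average: 2315 eps
print(f"peak:    {peak_eps:.0f} eps")     # peak:    6944 eps
```

The peak works out to just under 7000 eps, which matches the estimate in
the question.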

Thinking about syslog forwarding infrastructure [ In reply to ]

rgerhards at hq

Jul 30, 2008, 8:13 AM

Post #2 of 4 (739 views)

Hi Matt,

On Tue, 2008-07-29 at 17:28 -0500, Matt Hellman wrote:
> In the interest of brevity, I'm leaving out details...if you need
> more, just ask.
>
> We have a security event monitoring system that processes probably in
> the neighborhood of 100 million syslog messages per day (I know
> precisely how many events it has processed, but it doesn't break them
> down by protocol). In some of our WAN sites, I would like to
> implement a local system that will receive all the local syslog
> messages and ship some/all back to the main collector on our LAN. The
> main collector on the LAN would receive the bulk of its messages from
> other LAN systems. Then the main collector would ship some/all to the
> SEM environment. I'm leaning towards using rsyslog for this task and
> have a few questions:
>
> 1) What kind of system [rough estimate] would I need for the main
> collector if I assume 200 million syslog messages per day and a peak
> that is triple the average rate (~7000 eps)?

Quite honestly: I don't know. Which rules you carry out has a big
effect. But I have no real good big deployment numbers. The old game:
everyone is interested in them, no-one conveys them (hint: let me know
if you have some ;)).

>
> 2) Can I enable rate limiting in a way that will:
> A1) start dropping messages beyond a given threshold

you can do that

>
> A2) start intelligently dropping messages beyond a given threshold
> (i.e. start dropping events matching this regex)

not yet, but an interesting idea

>
> B) allow me to alert someone that this is occurring (is written to
> log file, etc)

mmhhh... not really. That's another interesting idea, and it should be
simple to enable. It conveys that to the debug log, but does not emit a
user message.

In any case, I think there are a couple of docs you need to read and
*understand* for this scale of deployment. Ask if you do not understand
them - I have written them and may have left too much out just out of
habit ;)

First and foremost, you must understand rsyslog queues. They handle all
queueing and rate limiting. The relevant doc is:

http://www.rsyslog.com/doc-queues.html

Then, I suggest having a quick look at some use cases. It is probably a
good idea to read the queue doc once, go over the use cases, and then
re-read the queue doc with the use cases in mind:

http://www.rsyslog.com/doc-queues.html
http://www.rsyslog.com/doc-rsyslog_reliable_forwarding.html
http://wiki.rsyslog.com/index.php/OffPeakHours

Finally, IMHO you need to think about syslog reliability and the service
level you can expect. This is not only true for rsyslog, but the other
folks don't tell you about the reliability issues. Read this:

http://blog.gerhards.net/2008/04/on-unreliability-of-plain-tcp-syslog.html

I hope this gets you started.

Rainer

Thinking about syslog forwarding infrastructure [ In reply to ]

rgerhards at hq

Jul 30, 2008, 8:15 AM

Post #3 of 4 (745 views)

I was too quick. A small yet important thing to add:

> > A2) start intelligently dropping messages beyond a given threshold
> > (i.e. start dropping events matching this regex)
>
> not yet, but an interesting idea

Well, it is semi-intelligent. Messages are discarded based on severity
level. But you cannot use any other property. The details are in the
queue doc.

Rainer
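
The severity-based discarding described above can be pictured with a toy
model. The following is a minimal Python sketch and not rsyslog code: the
class, its parameter names, and the choice to drop immediately when the
queue is full are all assumptions made for illustration, and the queue doc
remains the authority on the real settings. Severities follow the syslog
convention that a lower number means a more severe message.

```python
from collections import deque


class DiscardingQueue:
    """Toy bounded queue that sheds less important messages when nearly full."""

    def __init__(self, max_size, discard_mark, discard_severity):
        self.max_size = max_size                  # hard capacity
        self.discard_mark = discard_mark          # fill level where shedding starts
        self.discard_severity = discard_severity  # this severity number and above is shed
        self.items = deque()
        self.discarded = 0                        # running count, usable for alerting

    def enqueue(self, severity, message):
        """Return True if the message was queued, False if it was dropped."""
        full = len(self.items) >= self.max_size
        over_mark = len(self.items) >= self.discard_mark
        expendable = severity >= self.discard_severity
        if full or (over_mark and expendable):
            self.discarded += 1
            return False
        self.items.append((severity, message))
        return True


queue = DiscardingQueue(max_size=10, discard_mark=8, discard_severity=6)
for i in range(8):
    queue.enqueue(6, f"info message {i}")  # below the mark, so all are kept

dropped_info = queue.enqueue(6, "one more info message")  # at the mark: shed
kept_error = queue.enqueue(3, "an error message")         # more severe: kept
```

Because the only input to the decision is the severity number, a stream in
which every message carries the same severity is either kept entirely or
shed entirely once the mark is reached.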

Thinking about syslog forwarding infrastructure [ In reply to ]

mattjhell at gmail

Jul 30, 2008, 10:33 AM

Post #4 of 4 (735 views)

Thanks for the reply Rainer.

>> 1) What kind of system [rough estimate] would I need for the main
>> collector if I assume 200 million syslog messages per day and a peak
>> that is triple the average rate (~7000 eps)?
>
> Quite honestly: I don't know. Which rules you carry out has a big
> effect. But I have no real good big deployment numbers. The old game:
> everyone is interested in them, no-one conveys them (hint: let me know
> if you have some ;)).

crap. well, I can probably test this easily enough myself. I just
feel better knowing that someone has already done it.
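
A throughput test of this kind could start from a small load generator.
What follows is a minimal Python sketch under stated assumptions: the
function names, the "loadgen" hostname and "test" tag are invented for
illustration, messages use a simple BSD-style syslog layout, and transport
is plain UDP. Sleep granularity means the pacing is approximate at several
thousand messages per second, so the achieved rate is returned for checking.

```python
import socket
import time


def format_message(pri, hostname, tag, text):
    """Build a BSD-style syslog line: <PRI>Mmm dd hh:mm:ss HOST TAG: TEXT."""
    now = time.localtime()
    stamp = (f"{time.strftime('%b', now)} {now.tm_mday:2d} "
             f"{time.strftime('%H:%M:%S', now)}")
    return f"<{pri}>{stamp} {hostname} {tag}: {text}".encode("ascii", "replace")


def udp_sender(host, port=514):
    """Return a send(payload) callable that writes datagrams to host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    return lambda payload: sock.sendto(payload, (host, port))


def send_at_rate(send, rate_eps, count, make_payload):
    """Call send() count times, paced to roughly rate_eps messages per second."""
    interval = 1.0 / rate_eps
    start = time.monotonic()
    for i in range(count):
        send(make_payload(i))
        # Sleep until this message's scheduled slot so pacing errors do not pile up.
        delay = start + (i + 1) * interval - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    elapsed = max(time.monotonic() - start, 1e-9)
    return count / elapsed  # achieved messages per second


# Example (not run here): ten seconds at the estimated peak rate, sent to a
# hypothetical collector address. PRI 134 is facility local0, severity info.
# send_at_rate(udp_sender("192.0.2.10"), 7000, 70_000,
#              lambda i: format_message(134, "loadgen", "test", f"message {i}"))
```

Comparing the number of messages sent with the number the collector
actually wrote gives a first estimate of what a given machine can sustain.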

>> A2) start intelligently dropping messages beyond a given threshold
>> (i.e. start dropping events matching this regex)
>
> not yet, but an interesting idea

well, regex wouldn't be the only "intelligent" way to drop messages.
I suppose anything that isn't arbitrary might be considered
intelligent. Currently this is done based on priority, which won't
work well for us because we use a product (Snare) that converts
Windows events into syslog messages that all have the same priority. FWIW,
this is a common way for SEM products to collect Windows events.
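
To make the request concrete, here is a minimal Python sketch of what
content-based dropping above a threshold could look like. It does not
describe an rsyslog feature. The class name, the one-second counting
window, and the example pattern are all hypothetical. Messages are only
dropped when the rate is over the limit and the text matches the pattern.

```python
import re
import time


class RegexShedder:
    """Drop messages matching a pattern once a per-second limit is exceeded."""

    def __init__(self, max_per_second, drop_pattern, clock=time.monotonic):
        self.max_per_second = max_per_second
        self.drop_pattern = re.compile(drop_pattern)
        self.clock = clock              # injectable so the logic can be tested
        self.window_start = clock()
        self.seen_in_window = 0
        self.dropped = 0                # running count, usable for alerting

    def accept(self, message):
        """Return True to forward the message, False to drop it."""
        now = self.clock()
        if now - self.window_start >= 1.0:
            # Start a new one-second counting window.
            self.window_start = now
            self.seen_in_window = 0
        self.seen_in_window += 1
        over_limit = self.seen_in_window > self.max_per_second
        if over_limit and self.drop_pattern.search(message):
            self.dropped += 1
            return False
        return True


# Hypothetical usage: above 5000 messages per second, shed anything that
# mentions "logoff" and keep everything else.
shedder = RegexShedder(5000, r"logoff")
forward = shedder.accept("user logoff event")  # under the limit, so forwarded
```

This keeps working when every message has the same priority, since the
decision depends on message content rather than severity.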

>> B) allow me to alert someone that this is occurring (is written to
>> log file, etc)
>
> mmhhh... not really. That's another interesting idea, and it should be
> simple to enable. It conveys that to the debug log, but does not emit a
> user message.

I was thinking about this and I don't necessarily need the product to
emit something directly to a user, if that's what you mean. I plan to
buffer to disk. Can I create a process to monitor the queue files or
something? Warning: I have printed but at best skimmed many of the
docs you reference;-)
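
An external monitor along these lines could be as simple as summing the
size of the spool files. The following is a minimal Python sketch. It
assumes the disk queue writes its files into one directory under a common
file name prefix; the directory and prefix in the usage comment are
hypothetical, and the function names are invented for illustration.

```python
from pathlib import Path


def spool_usage_bytes(work_dir, file_prefix):
    """Total size in bytes of files in work_dir whose names start with file_prefix."""
    total = 0
    for path in Path(work_dir).iterdir():
        if path.is_file() and path.name.startswith(file_prefix):
            try:
                total += path.stat().st_size
            except FileNotFoundError:
                pass  # file was removed between listing and stat
    return total


def check_spool(work_dir, file_prefix, warn_bytes):
    """Return a warning string if the spool is at or over warn_bytes, else None."""
    used = spool_usage_bytes(work_dir, file_prefix)
    if used >= warn_bytes:
        return (f"WARNING: queue files '{file_prefix}*' in {work_dir} "
                f"use {used} bytes (threshold {warn_bytes})")
    return None


# Example (not run here), with a hypothetical spool directory and prefix:
# warning = check_spool("/var/spool/rsyslog", "fwdq", 500 * 1024 * 1024)
# if warning:
#     print(warning)
```

Run from cron or a monitoring agent, a growing total would indicate that
messages are being buffered faster than they are forwarded.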

> In any case, I think there are a couple of docs you need to read and
> *understand* for this scale of deployment. Ask if you do not understand
> them - I have written them and may have left too much out just out of
> habit ;)

re: doc links. Thanks. I was being lazy and trying to avoid having to
read them prematurely;-) I think I'm to the point where I believe
rsyslog can theoretically deliver on my requirements though. It's time
to dig in.