Mailing List Archive

tips for managing data
So, my current mysql rsyslog drops about 20 million rows of data per day.

Over time, this gets slow as tables grow.

I'm not a dba, so I was wondering if anyone had some suggestions for
keeping performance still on the order of seconds, and not minutes or hours.

thx

I did add a key for EventSource, as that is commonly searched. However,
using PhpLogCon, it seems that if I search using the web interface (i.e.
I click on a host entry and hit the available searches) it is relatively
quick. However, changing the text field that is generated and hitting
the "search" button is slow. Do these two methods use the same query, or
is something else going on?

thx
tips for managing data [ In reply to ]
OK, so it seems that doing a query from the query line does a LIKE,
which can take significantly longer (sample query 8 seconds vs. 50 msecs...)

So, replacing the LIKE % in logstreamdb.class.db with an = speeds things
up quite a but, but I lose some flexibility. Is there some kind of
search syntax where I can differentiate between LIKE and =?

If not, I'm thinking something like:

source:foo.bar.com # would be using =

~source:foo # would be using LIKE



Rory Toma wrote:
> So, my current mysql rsyslog drops about 20 million rows of data per day.
>
> Over time, this gets slow as tables grow.
>
> I'm not a dba, so I was wondering if anyone had some suggestions for
> keeping performance still on the order of seconds, and not minutes or hours.
>
> thx
>
> I did add a key for EventSource, as that is commonly searched. However,
> using PhpLogCon, it seems that if I search using the web interface (i.e.
> I click on a host entry and hit the available searches) it is relatively
> quick. However, changing the text field that is generated and hitting
> the "search" button is slow. Do these two methods use the same query, or
> is something else going on?
>
> thx
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
>
tips for managing data [ In reply to ]
Hi,

the like query can indeed have quiet an impact on performance when doing
queries on large databases.
But I think we can expand the syntax, so you can either search by part
of a string (LIKE '%search%') or the whole string (= 'search'). This
should be rather easy to implement. I will put this on my todolist, if
it is as easy as I think, the next minor update of the devel branch will
contain this new feature.

Best regards,
Andre Lorbach

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Rory Toma
> Sent: Thursday, July 31, 2008 4:10 AM
> To: rsyslog-users
> Subject: Re: [rsyslog] tips for managing data
>
> OK, so it seems that doing a query from the query line does a LIKE,
> which can take significantly longer (sample query 8 seconds vs. 50
msecs...)
>
> So, replacing the LIKE % in logstreamdb.class.db with an = speeds
things
> up quite a but, but I lose some flexibility. Is there some kind of
> search syntax where I can differentiate between LIKE and =?
>
> If not, I'm thinking something like:
>
> source:foo.bar.com # would be using =
>
> ~source:foo # would be using LIKE
>
>
>
> Rory Toma wrote:
> > So, my current mysql rsyslog drops about 20 million rows of data per
day.
> >
> > Over time, this gets slow as tables grow.
> >
> > I'm not a dba, so I was wondering if anyone had some suggestions for
> > keeping performance still on the order of seconds, and not minutes
or hours.
> >
> > thx
> >
> > I did add a key for EventSource, as that is commonly searched.
However,
> > using PhpLogCon, it seems that if I search using the web interface
(i.e.
> > I click on a host entry and hit the available searches) it is
relatively
> > quick. However, changing the text field that is generated and
hitting
> > the "search" button is slow. Do these two methods use the same
query, or
> > is something else going on?
> >
> > thx
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> >
>
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
tips for managing data [ In reply to ]
Cool. For me, it seems that using LIKE is most useful when searching the
message text. So, something like:

source:foo ~bar

would produce

where fromhost = 'foo' and message LIKE '%bar%'

thx

Andre Lorbach wrote:
> Hi,
>
> the like query can indeed have quiet an impact on performance when doing
> queries on large databases.
> But I think we can expand the syntax, so you can either search by part
> of a string (LIKE '%search%') or the whole string (= 'search'). This
> should be rather easy to implement. I will put this on my todolist, if
> it is as easy as I think, the next minor update of the devel branch will
> contain this new feature.
>
> Best regards,
> Andre Lorbach
>
>
>> -----Original Message-----
>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
>> bounces at lists.adiscon.com] On Behalf Of Rory Toma
>> Sent: Thursday, July 31, 2008 4:10 AM
>> To: rsyslog-users
>> Subject: Re: [rsyslog] tips for managing data
>>
>> OK, so it seems that doing a query from the query line does a LIKE,
>> which can take significantly longer (sample query 8 seconds vs. 50
>>
> msecs...)
>
>> So, replacing the LIKE % in logstreamdb.class.db with an = speeds
>>
> things
>
>> up quite a but, but I lose some flexibility. Is there some kind of
>> search syntax where I can differentiate between LIKE and =?
>>
>> If not, I'm thinking something like:
>>
>> source:foo.bar.com # would be using =
>>
>> ~source:foo # would be using LIKE
>>
>>
>>
>> Rory Toma wrote:
>>
>>> So, my current mysql rsyslog drops about 20 million rows of data per
>>>
> day.
>
>>> Over time, this gets slow as tables grow.
>>>
>>> I'm not a dba, so I was wondering if anyone had some suggestions for
>>> keeping performance still on the order of seconds, and not minutes
>>>
> or hours.
>
>>> thx
>>>
>>> I did add a key for EventSource, as that is commonly searched.
>>>
> However,
>
>>> using PhpLogCon, it seems that if I search using the web interface
>>>
> (i.e.
>
>>> I click on a host entry and hit the available searches) it is
>>>
> relatively
>
>>> quick. However, changing the text field that is generated and
>>>
> hitting
>
>>> the "search" button is slow. Do these two methods use the same
>>>
> query, or
>
>>> is something else going on?
>>>
>>> thx
>>> _______________________________________________
>>> rsyslog mailing list
>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>
>>>
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
>
tips for managing data [ In reply to ]
Hi again,

I want to inform you all that the search within phpLogCon has been
extended to support full and partial searches in Version 2.5.3,
available for download here: http://www.phplogcon.org/downloads

I first tests have shown, that filtering by hostname and other string
based fields performs much better on database sources now.
You can use the submenu buttons on each field within the messages view
to perform such a search. If you want to do manual searches, it is very
simple. Here is a sample:

Search for a full hostname: source:=SERVERNAME (Will only find if the
match is 100% = SERVERNAME)
Search for a partial hostname source:SERVER (Willmatch if the hostname
is SERVER or SERVERNAME).

I hope this helps,
Best regards,
Andre Lorbach

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Andre Lorbach
> Sent: Thursday, July 31, 2008 10:15 AM
> To: rsyslog-users
> Subject: Re: [rsyslog] tips for managing data
>
> Hi,
>
> the like query can indeed have quiet an impact on performance when
doing
> queries on large databases.
> But I think we can expand the syntax, so you can either search by part
> of a string (LIKE '%search%') or the whole string (= 'search'). This
> should be rather easy to implement. I will put this on my todolist, if
> it is as easy as I think, the next minor update of the devel branch
will
> contain this new feature.
>
> Best regards,
> Andre Lorbach
>
> > -----Original Message-----
> > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> > bounces at lists.adiscon.com] On Behalf Of Rory Toma
> > Sent: Thursday, July 31, 2008 4:10 AM
> > To: rsyslog-users
> > Subject: Re: [rsyslog] tips for managing data
> >
> > OK, so it seems that doing a query from the query line does a LIKE,
> > which can take significantly longer (sample query 8 seconds vs. 50
> msecs...)
> >
> > So, replacing the LIKE % in logstreamdb.class.db with an = speeds
> things
> > up quite a but, but I lose some flexibility. Is there some kind of
> > search syntax where I can differentiate between LIKE and =?
> >
> > If not, I'm thinking something like:
> >
> > source:foo.bar.com # would be using =
> >
> > ~source:foo # would be using LIKE
> >
> >
> >
> > Rory Toma wrote:
> > > So, my current mysql rsyslog drops about 20 million rows of data
per
> day.
> > >
> > > Over time, this gets slow as tables grow.
> > >
> > > I'm not a dba, so I was wondering if anyone had some suggestions
for
> > > keeping performance still on the order of seconds, and not minutes
> or hours.
> > >
> > > thx
> > >
> > > I did add a key for EventSource, as that is commonly searched.
> However,
> > > using PhpLogCon, it seems that if I search using the web interface
> (i.e.
> > > I click on a host entry and hit the available searches) it is
> relatively
> > > quick. However, changing the text field that is generated and
> hitting
> > > the "search" button is slow. Do these two methods use the same
> query, or
> > > is something else going on?
> > >
> > > thx
> > > _______________________________________________
> > > rsyslog mailing list
> > > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > >
> >
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
tips for managing data [ In reply to ]
Will searching for NAME match SERVERNAME?

-HKS

On Tue, Aug 5, 2008 at 10:25 AM, Andre Lorbach <alorbach at ro1.adiscon.com> wrote:
> Hi again,
>
> I want to inform you all that the search within phpLogCon has been
> extended to support full and partial searches in Version 2.5.3,
> available for download here: http://www.phplogcon.org/downloads
>
> I first tests have shown, that filtering by hostname and other string
> based fields performs much better on database sources now.
> You can use the submenu buttons on each field within the messages view
> to perform such a search. If you want to do manual searches, it is very
> simple. Here is a sample:
>
> Search for a full hostname: source:=SERVERNAME (Will only find if the
> match is 100% = SERVERNAME)
> Search for a partial hostname source:SERVER (Willmatch if the hostname
> is SERVER or SERVERNAME).
>
> I hope this helps,
> Best regards,
> Andre Lorbach
>
>> -----Original Message-----
>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
>> bounces at lists.adiscon.com] On Behalf Of Andre Lorbach
>> Sent: Thursday, July 31, 2008 10:15 AM
>> To: rsyslog-users
>> Subject: Re: [rsyslog] tips for managing data
>>
>> Hi,
>>
>> the like query can indeed have quiet an impact on performance when
> doing
>> queries on large databases.
>> But I think we can expand the syntax, so you can either search by part
>> of a string (LIKE '%search%') or the whole string (= 'search'). This
>> should be rather easy to implement. I will put this on my todolist, if
>> it is as easy as I think, the next minor update of the devel branch
> will
>> contain this new feature.
>>
>> Best regards,
>> Andre Lorbach
>>
>> > -----Original Message-----
>> > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
>> > bounces at lists.adiscon.com] On Behalf Of Rory Toma
>> > Sent: Thursday, July 31, 2008 4:10 AM
>> > To: rsyslog-users
>> > Subject: Re: [rsyslog] tips for managing data
>> >
>> > OK, so it seems that doing a query from the query line does a LIKE,
>> > which can take significantly longer (sample query 8 seconds vs. 50
>> msecs...)
>> >
>> > So, replacing the LIKE % in logstreamdb.class.db with an = speeds
>> things
>> > up quite a but, but I lose some flexibility. Is there some kind of
>> > search syntax where I can differentiate between LIKE and =?
>> >
>> > If not, I'm thinking something like:
>> >
>> > source:foo.bar.com # would be using =
>> >
>> > ~source:foo # would be using LIKE
>> >
>> >
>> >
>> > Rory Toma wrote:
>> > > So, my current mysql rsyslog drops about 20 million rows of data
> per
>> day.
>> > >
>> > > Over time, this gets slow as tables grow.
>> > >
>> > > I'm not a dba, so I was wondering if anyone had some suggestions
> for
>> > > keeping performance still on the order of seconds, and not minutes
>> or hours.
>> > >
>> > > thx
>> > >
>> > > I did add a key for EventSource, as that is commonly searched.
>> However,
>> > > using PhpLogCon, it seems that if I search using the web interface
>> (i.e.
>> > > I click on a host entry and hit the available searches) it is
>> relatively
>> > > quick. However, changing the text field that is generated and
>> hitting
>> > > the "search" button is slow. Do these two methods use the same
>> query, or
>> > > is something else going on?
>> > >
>> > > thx
>> > > _______________________________________________
>> > > rsyslog mailing list
>> > > http://lists.adiscon.net/mailman/listinfo/rsyslog
>> > >
>> >
>> > _______________________________________________
>> > rsyslog mailing list
>> > http://lists.adiscon.net/mailman/listinfo/rsyslog
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
>
tips for managing data [ In reply to ]
Yes if you use a partial search, it will match as well.

Best regards,
Andre Lorbach

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of (private) HKS
> Sent: Tuesday, August 05, 2008 5:09 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] tips for managing data
>
> Will searching for NAME match SERVERNAME?
>
> -HKS
>
> On Tue, Aug 5, 2008 at 10:25 AM, Andre Lorbach
<alorbach at ro1.adiscon.com>
> wrote:
> > Hi again,
> >
> > I want to inform you all that the search within phpLogCon has been
> > extended to support full and partial searches in Version 2.5.3,
> > available for download here: http://www.phplogcon.org/downloads
> >
> > I first tests have shown, that filtering by hostname and other
string
> > based fields performs much better on database sources now.
> > You can use the submenu buttons on each field within the messages
view
> > to perform such a search. If you want to do manual searches, it is
very
> > simple. Here is a sample:
> >
> > Search for a full hostname: source:=SERVERNAME (Will only find if
the
> > match is 100% = SERVERNAME)
> > Search for a partial hostname source:SERVER (Willmatch if the
hostname
> > is SERVER or SERVERNAME).
> >
> > I hope this helps,
> > Best regards,
> > Andre Lorbach
> >
> >> -----Original Message-----
> >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> >> bounces at lists.adiscon.com] On Behalf Of Andre Lorbach
> >> Sent: Thursday, July 31, 2008 10:15 AM
> >> To: rsyslog-users
> >> Subject: Re: [rsyslog] tips for managing data
> >>
> >> Hi,
> >>
> >> the like query can indeed have quiet an impact on performance when
> > doing
> >> queries on large databases.
> >> But I think we can expand the syntax, so you can either search by
part
> >> of a string (LIKE '%search%') or the whole string (= 'search').
This
> >> should be rather easy to implement. I will put this on my todolist,
if
> >> it is as easy as I think, the next minor update of the devel branch
> > will
> >> contain this new feature.
> >>
> >> Best regards,
> >> Andre Lorbach
> >>
> >> > -----Original Message-----
> >> > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> >> > bounces at lists.adiscon.com] On Behalf Of Rory Toma
> >> > Sent: Thursday, July 31, 2008 4:10 AM
> >> > To: rsyslog-users
> >> > Subject: Re: [rsyslog] tips for managing data
> >> >
> >> > OK, so it seems that doing a query from the query line does a
LIKE,
> >> > which can take significantly longer (sample query 8 seconds vs.
50
> >> msecs...)
> >> >
> >> > So, replacing the LIKE % in logstreamdb.class.db with an = speeds
> >> things
> >> > up quite a but, but I lose some flexibility. Is there some kind
of
> >> > search syntax where I can differentiate between LIKE and =?
> >> >
> >> > If not, I'm thinking something like:
> >> >
> >> > source:foo.bar.com # would be using =
> >> >
> >> > ~source:foo # would be using LIKE
> >> >
> >> >
> >> >
> >> > Rory Toma wrote:
> >> > > So, my current mysql rsyslog drops about 20 million rows of
data
> > per
> >> day.
> >> > >
> >> > > Over time, this gets slow as tables grow.
> >> > >
> >> > > I'm not a dba, so I was wondering if anyone had some
suggestions
> > for
> >> > > keeping performance still on the order of seconds, and not
minutes
> >> or hours.
> >> > >
> >> > > thx
> >> > >
> >> > > I did add a key for EventSource, as that is commonly searched.
> >> However,
> >> > > using PhpLogCon, it seems that if I search using the web
interface
> >> (i.e.
> >> > > I click on a host entry and hit the available searches) it is
> >> relatively
> >> > > quick. However, changing the text field that is generated and
> >> hitting
> >> > > the "search" button is slow. Do these two methods use the same
> >> query, or
> >> > > is something else going on?
> >> > >
> >> > > thx
> >> > > _______________________________________________
> >> > > rsyslog mailing list
> >> > > http://lists.adiscon.net/mailman/listinfo/rsyslog
> >> > >
> >> >
> >> > _______________________________________________
> >> > rsyslog mailing list
> >> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> >> _______________________________________________
> >> rsyslog mailing list
> >> http://lists.adiscon.net/mailman/listinfo/rsyslog
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> >
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
tips for managing data [ In reply to ]
Andre Lorbach wrote:
> Hi again,
>
> I want to inform you all that the search within phpLogCon has been
> extended to support full and partial searches in Version 2.5.3,
> available for download here: http://www.phplogcon.org/downloads
>
> I first tests have shown, that filtering by hostname and other string
> based fields performs much better on database sources now.
> You can use the submenu buttons on each field within the messages view
> to perform such a search. If you want to do manual searches, it is very
> simple. Here is a sample:
>
> Search for a full hostname: source:=SERVERNAME (Will only find if the
> match is 100% = SERVERNAME)
> Search for a partial hostname source:SERVER (Willmatch if the hostname
> is SERVER or SERVERNAME).
>
Thanks for the quick change.

Now I have another question, which is more sql but I thought I'd ask it
anyway.

What I'd like to do is keep 3 months worth of data, but rotate and merge
my tables after a set amount of time to keep search times low. Tables
more than 3 months old would be dropped.

If anyone has a nice, canned script for this, it would be appreciated.
Otherwise, I'll read the mysql manuals. 8-)

thx