[rsyslog] Common Performance Gotchas

david at lang.hm david at lang.hm
Tue Sep 21 08:43:38 CEST 2010


On Tue, 21 Sep 2010, Rainer Gerhards wrote:

> thanks for the long and insightful post. I hope you don't mind if I have it
> reproduced on the web site ;)

no problem, anything sent to the list is public, and if something I sent 
to the list can be reformatted (and/or corrected) to become documentation 
for others, it just saves me from having to answer again sometime in the 
future ;-)

> Some more info: you are right, there are no version specific counters.

it would be handy to have a condensed changelog that just showed what new 
features (as opposed to bugfixes) are in each version. it doesn't _need_ 
to show each release (for example, you may just show all the changes 
between 5.4 and 5.6 in one batch rather than showing 5.4.0, 5.4.1, etc)

This would give people who are trying to decide if they should stick with 
the vendor provided version or upgrade to something newer more information 
about what they gain with the newer version (not to mention prodding said 
vendors to test and ship the more recent versions)

> Especially performance is being much changed. It must be noted that
> configuration has a very, very big impact on performance. For example, it
> is very important to use script filters (if ... then) only if absolutely
> necessary. They were quickly hacked in because there was big demand, but the
> engine has not yet been optimized at all (and has a really bad performance).
> Also, the last round of v5 optimizations was done to perform well in usual
> configurations (default parameters!). So config changes can have big impact
> here (but in any case late v5 is much better than early v5).

you've mentioned before that there is a significant speed difference 
between the three methods of doing the same test. I would be interested in 
learning the relative costs of the various tests (including, for example, 
the difference between startswith (anchored regex) and contains 
(unanchored regex)).
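
for anyone following along, here is a rough sketch of what "the same test" 
looks like in each of the three filter styles (traditional selector, 
property-based filter, script filter), followed by the startswith/contains 
pair. The file paths and the "post" string are made up for illustration, 
and this says nothing about the relative speeds, which is the open 
question:

```
# 1) traditional selector (facility.priority)
mail.*                                     /var/log/mail.log

# 2) property-based filter
:syslogfacility-text, isequal, "mail"      /var/log/mail.log

# 3) script (expression-based) filter
if $syslogfacility-text == 'mail' then /var/log/mail.log

# startswith vs. contains on the same property
:programname, startswith, "post"           /var/log/post-prefix.log
:programname, contains, "post"             /var/log/post-anywhere.log
```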

with the new ruleset feature, things that once required the multipart 
script filters can (with difficulty) be implemented with more, simpler 
filters and multiple rulesets. At what point does this extra complexity 
become worthwhile?
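
as a sketch of the kind of rewrite I mean (assuming a v5 build that ships 
omruleset; the ruleset name and file path are hypothetical), a two-part 
script filter can be split into two simple property filters, with the 
first one handing matching messages to a second ruleset:

```
# script-filter form being replaced, shown for comparison:
# if $programname == 'sshd' and $msg contains 'Failed' then /var/log/ssh-fail.log

$ModLoad omruleset

# second half of the test lives in its own ruleset
$RuleSet sshd_msgs
:msg, contains, "Failed"           /var/log/ssh-fail.log

# back in the default ruleset, the first half hands matches over
$RuleSet RSYSLOG_DefaultRuleset
$ActionOmrulesetRulesetName sshd_msgs
:programname, isequal, "sshd"      :omruleset:
```

whether this actually beats the single script filter is exactly what I 
don't know.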

I don't know if there are any debug options that could be turned on to 
learn this info without turning them all on (which affects things so much 
that the test may not be valid anymore).

> I will give a presentation on the rsyslog tuning effort this Friday at Linux
> Congress over here in Nuremberg. I am permitted to post the paper after the
> conference, and I will do so early next week. It describes the initial effort
> (spring 2009), but still provides a lot of insight (though from a developer's
> PoV).

I'm definitely interested in seeing that.

David Lang

> Also keep in mind that I have scheduled a third tuning phase for the
> winter/spring 2010/11 timeframe. I guess I will be able to start with this in
> November. But keep in mind that I need to do some research and base testing
> first, so I don't expect anything of this to be visible until much later.
>
> Rainer
>
>
>> -----Original Message-----
>> From: rsyslog-bounces at lists.adiscon.com
>> [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of david at lang.hm
>> Sent: Monday, September 20, 2010 10:44 PM
>> To: rsyslog-users
>> Subject: Re: [rsyslog] Common Performance Gotchas
>>
>> On Mon, 20 Sep 2010, Joe Williams wrote:
>>
>>> Are there commonly used tuning options for rsyslog, networking stack
>>> and kernel? I have found a bit of information (links below) and have
>>> made a few changes but I am curious if there are any low hanging fruit
>>> to increase my message rate. To that end what are "normal" message
>>> rates for the various versions of rsyslog? I know a lot of this is
>>> subjective depending on various factors unique to each system but I am
>>> curious of commonly hit performance issues that can be tuned around.
>>
>> version 3 didn't have much in the way of performance tuning knobs (you
>> could adjust the message size and count limits)
>>
>> it's been a long time since I tested v3, but I think its performance
>> was below 10k messages/sec
>>
>> version 4 is where the performance changes really started, and a lot
>> of knobs were created
>>
>> version 4 could receive messages up to 1Gb wire speed with the correct
>> tuning, I could get it to forward or write messages at ~80K
>> messages/sec or do both at ~30K messages/sec
>>
>> version 5 gained a lot more performance and dropped some of the tuning
>> knobs. I haven't tested the most recent versions, but reports are that
>> it can do >250K messages/sec
>>
>> the key tuning knobs will vary a bit depending on your source of logs.
>> I deal mostly with UDP logs.
>>
>> one key thing is to reduce the frequency of gettimeofday() calls
>>
>> $UDPServerTimeRequery 10
>>
>> lets you say that if you get a continuous stream of messages (i.e.,
>> every time rsyslog finishes processing one message there is already
>> another waiting), instead of checking the time on every message it
>> checks it every 10 messages and uses the same local timestamp for the
>> messages in between. you get most of the benefit even with a small
>> value like 10
>>
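
for context, a minimal sketch of where that directive sits in a UDP 
receiver config (the port number is just an assumption, use whatever you 
listen on):

```
$ModLoad imudp
# re-check the system time only every 10th message while busy
$UDPServerTimeRequery 10
$UDPServerRun 514
```
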
>> another thing to look at (and I don't remember the config option at
>> the moment) is the batch size for processing messages (especially if
>> you are doing something like inserting them into a database, but even
>> for much simpler configs)
>>
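
I believe the option in question is the queue dequeue batch size (that is 
my assumption, since the message above doesn't name it). A sketch for the 
main message queue, with a purely illustrative value that you would want 
to test against your own load:

```
# how many messages a worker pulls from the main queue at once
$MainMsgQueueDequeueBatchSize 256
```
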
>> If you can disable DNS lookups, that will make a huge difference (note
>> that even with DNS lookups disabled you will still have hostnames if
>> the sending server puts its name in the message like it is supposed
>> to).
>>
>> however, for most hardware and uses, it really is going to be fast
>> enough out of the box to not need much, if any, tuning.
>>
>> remember 'premature optimization is the root of all evil'. get rsyslog
>> running, look at the CPU that it's taking (per-thread, not just total
>> CPU), and then look at what the busy threads are doing. In V3 and
>> early V4 the thread receiving messages was the bottleneck. by the end
>> of V4 the threads outputting the messages were the bottleneck (mostly
>> from thrashing the queue locks). in the very recent V5 versions most
>> of this locking went away and performance is _way_ up, but it means
>> that whatever bottlenecks are left are in different places.
>>
>>
>> one thing you should do is to tune your OS. make sure you have plenty
>> of network buffers (tcp or udp as appropriate for your system) and
>> check your disk I/O capabilities (especially if you are doing something
>> other than simple buffered writes to files)
>>
>>
>> the high database rate link you point to is obsolete now that inserts
>> can be done in batch mode. the key is to tune your batch sizes to be
>> fairly large (but you need to watch your database to make sure they
>> don't get so large that your database chokes on them)
>>
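
a sketch of what that could look like for a MySQL output, assuming the 
ommysql module and made-up host, database and credential names. The batch 
size of 1000 is only a starting point for testing, not a recommendation:

```
$ModLoad ommysql

# give the database action its own queue and a larger dequeue batch
$ActionQueueType LinkedList
$ActionQueueDequeueBatchSize 1000
*.* :ommysql:dbhost,Syslog,dbuser,dbpassword
```
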
>> rsyslog has changed so much and so rapidly that I don't think there are
>> really any good documents on tuning. The current version, with its
>> ability to have subsets of the rules, each with their own queue, can be
>> configured to spread itself across all the processor cores in your
>> system (although the configuration gets very messy), and with that you
>> can do a huge amount of processing on the log messages.
>>
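
the basic building block looks roughly like this (a sketch only; the 
ruleset name, port and file path are hypothetical). One ruleset gets its 
own main queue and is bound to one input, so its processing is decoupled 
from the default ruleset. You would repeat the pattern per input to 
spread the work:

```
$ModLoad imtcp

# a separate ruleset with its own queue (and so its own worker threads)
$RuleSet remote10514
$RulesetCreateMainQueue on
*.*     /var/log/remote10514.log

# switch back so later rules land in the default ruleset
$RuleSet RSYSLOG_DefaultRuleset

# bind the listener to the ruleset defined above
$InputTCPServerBindRuleset remote10514
$InputTCPServerRun 10514
```
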
>> what does your environment look like? how many messages of what size do
>> you expect to receive, and what do you want to do with them?
>>
>> David Lang
>>
>>
>>> Thanks.
>>> -Joe
>>>
>>>
>>> http://mperedim.wordpress.com/2010/01/21/rsyslog-evaluation/
>>> http://www.rsyslog.com/doc/queues.html
>>> http://www.rsyslog.com/doc/rsyslog_high_database_rate.html
>>> http://www.gossamer-threads.com/lists/rsyslog/users/4029
>>>
>>>
>>>
>>>
>>> Name: Joseph A. Williams
>>> Email: joe at joetify.com
>>> Blog: http://www.joeandmotorboat.com/
>>> Twitter: http://twitter.com/williamsjoe
>>>
>>> _______________________________________________
>>> rsyslog mailing list
>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>> http://www.rsyslog.com
>>>


