[rsyslog] rsyslog performance as receiver, heavily using regex in templates

Gary Foster gfoster at realgravity.com
Thu Jan 31 15:58:39 CET 2013


I am doing something similar, but I handle it by pushing the formatting
upstream.  I'm moving towards generating the log messages in pre-parsed
form (well-structured JSON along the lines of CEE).

For example, when a GET request comes in on an nginx server, it can carry
a large number of params: GET /foo?bar=1&baz=2 etc.  The bar and baz
params are what I'm really interested in (along with the timestamp, url,
etc. of course), and they vary from request to request rather than
following a fixed pattern.  So I'm pushing the parsing out to the clients,
which emit JSON like {"action": "GET", "url": "foo", "bar": "1", "baz":
"2"}.

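To illustrate the client-side step described above, here is a minimal
Python sketch, not the code I actually run.  The function name
request_to_json is hypothetical, and it assumes the input is just the
method and the request target; timestamps and other fields are left out.

```python
import json
from urllib.parse import urlsplit, parse_qsl


def request_to_json(request_line: str) -> str:
    """Turn 'GET /foo?bar=1&baz=2' into a flat JSON object (hypothetical helper)."""
    # Take the method and target; ignore a trailing protocol token if present.
    method, target = request_line.split(" ", 2)[:2]
    parts = urlsplit(target)
    event = {"action": method, "url": parts.path.lstrip("/")}
    # The query params are dynamic, so copy whatever keys happen to be there.
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        # Keep the first occurrence and never overwrite "action" or "url".
        event.setdefault(key, value)
    return json.dumps(event)


print(request_to_json("GET /foo?bar=1&baz=2"))
```

Because the keys are copied straight from the query string, a new param
shows up as a new JSON key without any change to a regex or a template.
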
I'm not even sure it is possible to do all of that within rsyslog right
now, since the key/value pairs are dynamic, so I simply do it pre-rsyslog
and then use rsyslog to route on the JSON keys.  I'm routing about 500
messages per sec without breaking a sweat, and have tested it upwards of
30k per sec.  It does mean more moving parts, though, which I am not
particularly a fan of.
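
For comparison with Radu's estimate quoted below (roughly 20M messages
spread over a 10 hour day), the arithmetic can be checked with a few
lines of Python.  This is only a back-of-the-envelope sketch; the helper
name msgs_per_sec is made up, and the 20M and 10 hour figures are his
guesses, not measurements.

```python
def msgs_per_sec(total_messages: int, hours: float) -> float:
    """Average message rate if the total is spread evenly over the given hours."""
    return total_messages / (hours * 3600)


# Radu's rough figures: ~20M messages arriving within ~10 hours.
rate = msgs_per_sec(20_000_000, 10)
print(round(rate))  # 556

# Ratio between the 30k per sec I have tested and that average rate.
print(round(30_000 / rate))  # 54
```

That is an average, so bursts will run higher, but it lands in the same
500-600 per sec range he mentions.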

-- Gary F.

On Thu, Jan 31, 2013 at 5:44 AM, Rainer Gerhards
<rgerhards at hq.adiscon.com> wrote:

> On Thu, 2013-01-31 at 14:51 +0200, Radu Gheorghe wrote:
> > Hi Ben,
> >
> > 2013/1/31 Ben Bradley <bbradleyuk at gmail.com>
> >
> > > Hi everyone
> > >
> > > I'm currently using logstash as the log collector from a few rsyslog
> > > sender clients. I'd like to use rsyslog to receive the remote logs
> > > instead of logstash. This means I'm keeping things simple and can
> > > possibly also use RELP.
> > >
> > > If the rsyslog receiver is doing a lot of regex parsing on each message
> > > received (i.e. parsing Apache logs into ElasticSearch fields) at what
> > > sort of volume of log messages would I start to notice performance
> > > problems?
> > >
> > > Eventually I'm expecting about 5-10GB per day to be received by our
> > > centralised rsyslog log server.
> > >
> >
> > I guess it all comes down to performance testing, but 10GB would probably
> > mean ~20M logs or something like that. If the majority of those will be
> > sent during the day (say 10 hours), my poor math says if you handle
> > 500-600 logs/sec you should be fine.
>
> Seeing that number, I'd say it would take quite a few regexes to make
> rsyslog sweat. HOWEVER... do we really need regexes? Can you post a
> couple of samples?
>
> Rainer
> >
> > I've never used regex with rsyslog in a performance situation, so I can't
> > say, but it seems to me like it should easily handle that amount.
> >
> >
> > >
> > > Should I actually get the rsyslog senders to parse the regex patterns
> > > of Apache logs into JSON then forward that JSON to the receiver? So the
> > > sender's got the regex overhead?
> > >
> > > Or will an rsyslog receiver easily be able to parse all the regex
> > > patterns with my volume of logging?
> > > Having the regex patterns parsed in one place would make for easier
> > > management. If necessary we can just throw more vCPUs and memory at
> > > the log server without needing to touch the web nodes.
> > >
> >
> > I suspect the load won't be too high, but making the clients do that will
> > scale a lot better and - especially since we don't expect the total load
> > to be high - nobody will feel that load if it's that distributed. And if
> > you add more web nodes, you won't have to touch anything. Not even adding
> > vCPUs and memory.
> >
> > Personally, I'd try the "centralized" method first, because it's easier
> > to get started. If all works smoothly, you can push the same(ish) config
> > to the web nodes. If you ever feel the need to do that :) By then,
> > configuring them might get easier because of natural evolutions of
> > packaging, testing and documentation.
> >
> > Best regards,
> > Radu
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > http://www.rsyslog.com/professional-services/
> > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> > DON'T LIKE THAT.
>