[rsyslog] rsyslog performance as receiver, heavily using regex in templates

Ben Bradley bbradleyuk at gmail.com
Thu Jan 31 17:07:54 CET 2013


On Thu, 31 Jan 2013 15:32:21 +0000
Rainer Gerhards <rgerhards at hq.adiscon.com> wrote:

> On Thu, 2013-01-31 at 15:26 +0000, Ben Bradley wrote:
> > On Thu, 31 Jan 2013 13:44:03 +0000
> > Rainer Gerhards <rgerhards at hq.adiscon.com> wrote:
> > 
> > > > I guess it all comes down to performance testing, but 10GB would probably
> > > > mean ~20M logs or something like that. If the majority of those will be
> > > > sent during the day (say 10 hours), my poor math says if you handle 500-600
> > > > logs/sec you should be fine.
> > > 
> > > seeing that number, I'd say it requires quite some regexpes to get
> > > rsyslog to sweat. HOWEVER... do we really need regexpes? Can you post a
> > > couple of samples?
> > > 
> > > Rainer
> > 
> > Great news. I'll be testing this over the next few days/weeks.
> > 
> > Here's a sample log line as it comes in to rsyslog from Apache logging to /bin/logger...
> > http://pastebin.com/649fbqQ7
> > 
> > <134>Jan 30 14:09:30 LWEB03 apache-access[www.apachevhostname.com]: 84.184.148.184 - - [30/Jan/2013:14:09:30 +0000] "GET /fileadmin/images/bg-footerBar.gif HTTP/1.1" 404 244 "http://www.website.com/latest-news/article/newsarticle/article-name-in-here/" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.1)" www.apachevhostname.com 16992
> > 
> > It's an Apache combined log line with vhost and request time in microseconds added to the end.
> > At the moment I'm building a regular expression to capture each of those fields from the log line.
> > 
> That sounds a bit like we should be able to grab this even with the
> current version and mmnormalize - maybe with a bit larger rulebase than
> actually would be needed. I'll see if we can give it a try and report
> back.
> 
> How would you ideally like to see this after conversion?
> 
> Rainer
> > Cheers, Ben

Wow, cool. In this case I'm building a template that formats each log line as JSON, so I can send it to ElasticSearch using omelasticsearch.

I also forgot to mention that I've changed the syslog tag to apache-access[www.apachevhostname.com]. Normally the part in square brackets would be the PID.
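
To make the target format concrete, here is a rough Python sketch of the field extraction I have in mind, run against the sample line above. It is only a way to check the regular expression offline, not rsyslog template syntax, and the JSON field names (clientip, tag_vhost, request_time_us and so on) are placeholders I made up for illustration.

```python
import json
import re

# Sample line from this thread (syslog header + Apache combined log
# + vhost + request time in microseconds).
SAMPLE = (
    '<134>Jan 30 14:09:30 LWEB03 apache-access[www.apachevhostname.com]: '
    '84.184.148.184 - - [30/Jan/2013:14:09:30 +0000] '
    '"GET /fileadmin/images/bg-footerBar.gif HTTP/1.1" 404 244 '
    '"http://www.website.com/latest-news/article/newsarticle/article-name-in-here/" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/5.0; SLCC2; '
    '.NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; '
    'Media Center PC 6.0; .NET4.0C; InfoPath.1)" '
    'www.apachevhostname.com 16992'
)

# Field names are hypothetical; pick whatever suits the ElasticSearch mapping.
LINE_RE = re.compile(
    r'^<(?P<pri>\d+)>'
    r'(?P<timestamp>\w{3} [ \d]\d \d{2}:\d{2}:\d{2}) '
    r'(?P<host>\S+) '
    # The tag carries the vhost in brackets instead of the usual PID.
    r'(?P<program>[^\[\s]+)\[(?P<tag_vhost>[^\]]+)\]: '
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" '
    r'(?P<vhost>\S+) (?P<request_time_us>\d+)$'
)


def parse_line(line):
    """Return a dict of captured fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert numeric fields so they are emitted as JSON numbers.
    record["pri"] = int(record["pri"])
    record["status"] = int(record["status"])
    # Apache logs "-" when no body bytes were sent.
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    record["request_time_us"] = int(record["request_time_us"])
    return record


if __name__ == "__main__":
    print(json.dumps(parse_line(SAMPLE), indent=2))
```

The idea is that each named group becomes one key in the JSON document. This pattern assumes the request, referer and user agent contain no embedded double quotes, so it would need loosening for real traffic.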

HTH


