[Lognorm] Libnormalize issue

Rainer Gerhards rgerhards at hq.adiscon.com
Thu Nov 3 18:24:50 CET 2011


> -----Original Message-----
> From: lognorm-bounces at lists.adiscon.com [mailto:lognorm-
> bounces at lists.adiscon.com] On Behalf Of david at lang.hm
> Sent: Thursday, November 03, 2011 6:20 PM
> To: lognorm
> Subject: Re: [Lognorm] Libnormalize issue
> 
> On Thu, 3 Nov 2011, Rainer Gerhards wrote:
> 
> > The problem here is that liblognorm primarily aims at semi-structured
> > data, that is, text data without an easily parsable structure. Iptables
> > actually provides structured data, and liblognorm is not great at
> > processing that kind of data. It becomes even worse if there are any
> > permutations in field order.
> > In that case, you need exponentially many rules in the worst case.
> >
> > I was thinking about adding a special name/value parsing capability to
> > support that type of data. But then it is vitally important that the
> > data has a header that clearly identifies the message, otherwise
> > normalization will result in a big mess of garbage. The chance
> > that such a very generic parser misinterprets things is very high,
> > especially in the iptables case, as a single word (like "df" above) is
> > a valid (binary) "name/value pair", so it is hard to detect during
> > parsing whether that really is iptables or not. Even if we assume it is,
> > the parser probably consumes a lot of data before it detects a
> > mismatch, so we need to backtrack over a lot of data. In essence, one
> > such rule could probably double the processing time of all rules. And
> > if you have 10 such rules, you could end up with 1024-times (2^10)
> > slower rule parsing in the worst case (that's the problem that plagues
> > the usual regex approach and severely limits extraction speed).
> 
> how about adding a couple of new tag types
> 
> 1. name=value pair
> 
> 2. one or more name=value pairs
> 
> then you could make a rule that would match the fixed part of a log and
> then let the log specify the rest of it

That's (especially 2) what I am thinking about.
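
As a rough illustration of tag type 2, here is a minimal Python sketch of
"match the fixed part, then let the log specify the rest". It is not
liblognorm code: the names normalize and parse_pairs, the "FW-DROP: " header,
and the choice to treat a bare word such as DF as a boolean flag are all
assumptions made for this example.

```python
import re

# Hypothetical sketch, not part of liblognorm: one token is either
# "name=value" (value may be empty) or a bare word treated as a flag.
PAIR = re.compile(r"(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?:=(?P<value>\S*))?")


def parse_pairs(text):
    """Parse whitespace-separated name=value tokens into a dict.

    Bare words (no '=') are stored as True. Returns None if any token is
    not a valid pair or flag, or if there are no tokens at all.
    """
    fields = {}
    for token in text.split():
        m = PAIR.fullmatch(token)
        if m is None:
            return None  # mismatch: give up instead of guessing
        value = m.group("value")
        fields[m.group("name")] = True if value is None else value
    return fields or None


def normalize(line, header):
    """Match the fixed header first, then let the message supply the rest."""
    if not line.startswith(header):
        return None
    return parse_pairs(line[len(header):])


if __name__ == "__main__":
    sample = "FW-DROP: IN=eth0 OUT= SRC=192.0.2.1 LEN=60 DF PROTO=TCP"
    print(normalize(sample, "FW-DROP: "))
```

Because the header is checked before any pair parsing starts, a line that
does not carry it is rejected without consuming the rest of the message,
which is the point of requiring a clearly identifying header.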

> 
> >
> > So iptables actually presents a pretty hard problem. I'd still like to
> > tackle it, but unfortunately I am short on time at the moment. In any
> > case, normalization is still on my agenda, so this is probably one of
> > the first things to look at when there is time left (CEE has a new draft
> > standard out and I'd like to make the necessary adaptations). Probably a
> > solution is to provide this "iptables" normalizer, maybe even as a
> > different API call, and the controlling application must first select
> > the normalizer to use based on other information.
> 
> one key thing is that iptables lets you specify a tag for the log messages.
> by default it is just 'kernel', but if you want to identify them for
> parsing, you really should set this (and then you can unambiguously match
> them)

Yup, but that doesn't help against something malicious (granted, that's also
the case for other things, but not to that extent, as a no-match would occur
in most cases).
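
A minimal Python sketch of selecting the normalizer from the tag before any
generic parsing runs. The function name select_normalizer and the prefix
table are hypothetical. The caveat above still applies: anyone who can write
to the log stream can forge the prefix, so a match only says which parser to
try, not that the message is genuine.

```python
def select_normalizer(line, normalizers):
    """Return (name, rest) for the first registered prefix that matches.

    `normalizers` is a list of (prefix, normalizer_name) pairs, checked in
    order. Returns None when no prefix matches, so the caller can fall back
    to the regular rule base.
    """
    for prefix, name in normalizers:
        if line.startswith(prefix):
            return name, line[len(prefix):]
    return None


if __name__ == "__main__":
    # Hypothetical table: the "FW-DROP: " tag is assumed to be configured
    # by the administrator on the iptables logging rule.
    table = [("FW-DROP: ", "iptables")]
    print(select_normalizer("FW-DROP: IN=eth0 DF", table))
```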

rainer
> 
> David Lang

