[Lognorm] Libnormalize issue

Thu Nov 3 09:04:51 CET 2011

> -----Original Message-----
> From: lognorm-bounces at lists.adiscon.com [mailto:lognorm-
> bounces at lists.adiscon.com] On Behalf Of James Lay
> Sent: Wednesday, November 02, 2011 8:12 PM
> To: lognorm
> Subject: Re: [Lognorm] Libnormalize issue
> 
> >
> > The rule does not match, because "(Unhandled..." is not matching the
> > sample.
> > So it did not extract any fields at all.
> >
> > I'll elaborate a bit later why we need to have perfect matches. Think
> > about false positives...
> >
> > rainer
> 
> Thanks for responding so quickly.  As I look at my test setup, I see that
> you are spot on...if it doesn't match the WHOLE thing, nothing gets
> parsed.  That leaves me with two options as it relates to the below
> examples:
> 
> IN=ppp0 OUT= MAC= SRC=121.11.80.101 DST=my_ext_ip LEN=40 TOS=0x00
> PREC=0x00 TTL=108 ID=256 PROTO=TCP SPT=6000 DPT=1433 WINDOW=16384
> RES=0x00 SYN URGP=0
> 
> IN=ppp0 OUT= MAC= SRC=121.11.80.101 DST=my_ext_ip LEN=40 TOS=0x00
> PREC=0x00 TTL=108 ID=256 DF PROTO=TCP SPT=6000 DPT=1433 WINDOW=16384
> RES=0x00 SYN URGP=0
> 
> Easy to miss, but the DF there is where I have an issue...some have it,
> and some don't.  Without a regex to ignore junk (LEN=.*DF), what are
> my options?  I can create 2 different rules, one to match the above with
> a %DF:word%, and one without, but now I have two separate entries for
> pretty much the same info...not optimal.

The problem here is that liblognorm primarily aims at semi-structured data,
that is, text data without an easily parsable structure. Iptables actually
provides structured data, and liblognorm is not great at processing that kind
of data. It becomes even worse if there are any permutations in field order.
In that case, you need exponentially many rules in the worst case.
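
To make the growth concrete, here is a minimal sketch (plain Python, not
liblognorm code) that enumerates the rule variants needed when some tokens
are optional. The `%name:word%` placeholders follow the style quoted above;
the field names and the `rule_variants` helper are hypothetical and only
serve as illustration.

```python
from itertools import product
from math import factorial


def rule_variants(template):
    """Return every rule string for a template of (text, is_optional) pairs.

    Each optional token is either present or absent, so k optional tokens
    yield 2**k variants.
    """
    choices = [(tok, None) if optional else (tok,) for tok, optional in template]
    return [
        " ".join(tok for tok in combo if tok is not None)
        for combo in product(*choices)
    ]


# Hypothetical fragment of an iptables rule; only DF is optional here.
template = [
    ("ID=%id:word%", False),
    ("DF", True),
    ("PROTO=%proto:word%", False),
]

variants = rule_variants(template)
for v in variants:
    print(v)

# Three optional flags already need 2**3 = 8 rules.
three_flags = [("A", True), ("B", True), ("C", True)]
print(len(rule_variants(three_flags)))

# If n fields may appear in any order, the count is n! instead.
print(factorial(5))
```

With only DF optional this yields the two rules James describes; every
additional optional flag doubles the count, and free field order grows even
faster.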

I was thinking about adding a special name/value parsing capability to
support that type of data. But then it is vitally important that the data has
a header that clearly identifies the message; otherwise normalization will
produce a lot of garbage. The chance that such a very generic parser
misinterprets things is very high, especially in the iptables case, where
a single word (like "DF" above) is a valid (binary) "name/value pair", so it
is hard to detect during parsing whether the input really is iptables or not.
Even if we assume it is, the parser probably consumes a lot of data before it
detects a mismatch, so we need to backtrack over a lot of data. In essence,
one such rule could probably double the processing time of all rules. And if
you have 10 such rules, you could end up with rule parsing that is 1024 times
(2^10) slower in the worst case (that's the problem that plagues the usual
regex approach and severely limits extraction speed).
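
As a rough illustration of why such a generic parser is risky, here is a
minimal sketch in plain Python. It is not part of liblognorm, and the
`parse_name_value` helper is hypothetical. It treats `NAME=VALUE` tokens as
string fields and bare words such as DF or SYN as boolean flags.

```python
def parse_name_value(text):
    """Parse whitespace-separated tokens into a dict.

    NAME=VALUE becomes a string field (possibly empty); a bare word
    becomes a flag set to True.
    """
    fields = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        fields[name] = value if sep else True
    return fields


sample = (
    "IN=ppp0 OUT= MAC= SRC=121.11.80.101 DST=my_ext_ip LEN=40 TOS=0x00 "
    "PREC=0x00 TTL=108 ID=256 DF PROTO=TCP SPT=6000 DPT=1433 WINDOW=16384 "
    "RES=0x00 SYN URGP=0"
)
fields = parse_name_value(sample)
print(fields["SRC"], fields["DPT"], fields.get("DF", False))

# The same parser also "succeeds" on text that is not iptables output:
not_iptables = parse_name_value("connection refused by peer")
print(not_iptables)
```

The second call returns four "flags" without any error, which is exactly the
false-positive problem: without a header that identifies the message, the
parser cannot tell iptables output from arbitrary text.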

So iptables actually presents a pretty hard problem. I'd still like to tackle
it, but unfortunately I am short on time at the moment. In any case,
normalization is still high on my agenda, so this is probably one of the
first things to look at when there is time left (CEE has a new draft standard
out and I'd like to make the necessary adaptations). A likely solution is to
provide this "iptables" normalizer, maybe even as a different API call, and
have the controlling application first select the normalizer to use based on
other information.
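
The selection step could look roughly like the following sketch. It is plain
Python and purely hypothetical: none of these names exist in liblognorm, and
the `source` argument stands for whatever out-of-band information the
controlling application has (for example, a syslog tag).

```python
def normalize_iptables(msg):
    """Hypothetical name/value normalizer for iptables-style messages."""
    fields = {}
    for token in msg.split():
        name, sep, value = token.partition("=")
        fields[name] = value if sep else True
    return fields


def normalize_generic(msg):
    """Hypothetical fallback: keep the message unparsed."""
    return {"msg": msg}


# The application maps a known source to a specialised normalizer.
NORMALIZERS = {"iptables": normalize_iptables}


def normalize(source, msg):
    """Pick the normalizer by source; never guess from the message text."""
    return NORMALIZERS.get(source, normalize_generic)(msg)


fw = normalize("iptables", "IN=ppp0 ID=256 DF PROTO=TCP")
other = normalize("sshd", "connection refused by peer")
print(fw)
print(other)
```

Because the choice is made before parsing, the generic name/value parser only
ever sees messages already known to be iptables output, so it adds no
backtracking cost to the other rules.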

> 
> I'm guessing that my questions and comments are from my ignorance of how
> this all works.  From my dealings so far with Sagan, it looks like my rule
> file should match first, then send to normalize, yes?  I would think that
> would reduce false positives since my rule has already done the job of
> matching, and liblognorm's job is to parse out the specific info...yes?
> Again, maybe I'm TOTALLY missing something.

I don't know how it is implemented in Sagan. But if you do that, you'll lose
all of liblognorm's performance benefits (which it gains from doing everything
in a *single* pass). Note: I am not saying that what you intend to do is bad
for your context; I am just saying how it relates to liblognorm.

rainer
> 
> I'll continue to test this out...my goal is to correlate Snort entries with
> firewall rules, but so far it's been an uphill battle.  Again, thanks for
> any light you can shed, and for taking your time to make liblognorm.
> 
> James