[Lognorm] Libnormalize issue

Thu Nov 3 03:05:19 CET 2011

On Wed, 2 Nov 2011, cclark wrote:

> On Wed, 2 Nov 2011 13:11:34 -0600, James Lay wrote:
>> I'm guessing that my questions and comments are from my ignorance of how
>> this all works.  From my dealings so far with Sagan, it looks like my rule
>> file should match first, then send to normalize, yes?  I would think that
>> would reduce false positives since my rule has already done the job of
>> matching, and liblognorm's job is to parse out the specific info..yes?
>> Again, maybe I'm TOTALLY missing something.
>
> If the first normalize rule doesn't match, it'll move on to the second 
> rule. That is, _right_ when liblognorm "sees" it's not going to match, 
> it's already moving on to the next rule. If you have pcre/regexp, it 
> would then have to pump that data via libpcre. That'd create more 
> overhead than you think.  That's why I encourage users (for 
> Sagan/Snort rules) to use "content:" over "pcre:": pcre adds 
> extra CPU overhead.
>
> I'm sure Rainer can explain better, and I know this has come up on the 
> list before, but adding regexp/libpcre to the mix will actually make it 
> more complex and less efficient. Efficiency is the name of the game 
> here.
>
> I actually think of liblognorm as more of a "mask" than a rule.  That 
> is, if my log is:

The thing to remember is that liblognorm is creating a parse tree, not a 
set of regex rules to match.

So it's not evaluating the rules one at a time as each line arrives.

Instead it's evaluating them all at the same time.

It's essentially creating a mini program where it looks at the first 
character of the input and says 'this character means that it could match 
this set of rules, but not this other set', then it looks at the next 
character and says 'of the rules that were possible after the last step, 
this set is still possible' and repeats this until there is only one rule 
left in the 'possible' set. Then it goes through that rule to assign 
values to variables.
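
To make the narrowing step concrete, here is a rough Python sketch. It is 
not liblognorm code: the rule names and the narrow() helper are made up for 
illustration, and the rules are plain literal strings, whereas real 
liblognorm rules also contain field parsers. It only shows how the set of 
'possible' rules shrinks as each character is read.

```python
def narrow(rules, line):
    """Shrink the set of candidate rules one input character at a time."""
    possible = set(rules)      # before any input, every rule is possible
    history = []               # size of the 'possible' set after each character
    for i, ch in enumerate(line):
        # keep only the rules that are long enough and agree on character i
        possible = {name for name in possible
                    if i < len(rules[name]) and rules[name][i] == ch}
        history.append(len(possible))
        if len(possible) <= 1:
            break              # one (or no) candidate left, stop narrowing
    return possible, history


# Hypothetical literal-only rules, purely for illustration.
RULES = {
    "ssh_fail": "sshd: Failed password",
    "ssh_ok": "sshd: Accepted password",
    "su_open": "su: session opened",
}

remaining, history = narrow(RULES, "sshd: Failed password")
print(remaining, history)
```

The sketch rescans the candidate set on every character to keep it short. 
A real parse tree gets the same narrowing by following a single branch per 
character, so it never has to look at the rules that were already ruled 
out.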

Because of this, evaluating a large number of rules takes very nearly the 
same amount of time as evaluating a small number of rules.
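
A small Python sketch of why the rule count barely matters, under the 
assumption that the rules are merged into a character tree (a trie). The 
Node class, build_tree() and match() are hypothetical names for this 
example, not liblognorm's API, and the rules are literal strings only. The 
point is that matching a line follows one branch per input character, so 
the number of steps depends on the length of the line and not on how many 
rules were loaded.

```python
class Node:
    """One node of the tree: outgoing branches plus an optional rule name."""
    def __init__(self):
        self.children = {}   # next character -> child Node
        self.rule = None     # name of the rule that ends here, if any


def build_tree(rules):
    """Merge all rules into a single tree; shared prefixes share nodes."""
    root = Node()
    for name, text in rules.items():
        node = root
        for ch in text:
            node = node.children.setdefault(ch, Node())
        node.rule = name
    return root


def match(root, line):
    """Follow one branch per character; return (rule name or None, steps)."""
    node = root
    steps = 0
    for ch in line:
        steps += 1
        node = node.children.get(ch)
        if node is None:
            return None, steps   # no rule can match any more
    return node.rule, steps


# Hypothetical rule sets: one rule versus a thousand rules.
small = {"r0": "event 0 happened"}
large = {f"r{i}": f"event {i} happened" for i in range(1000)}
line = "event 0 happened"

small_result = match(build_tree(small), line)
large_result = match(build_tree(large), line)
print(small_result, large_result)
```

Both calls take the same number of steps, one per character of the line. 
Building the tree costs more with more rules, but that happens once when 
the rules are loaded, not for every log line.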

David Lang