[Lognorm] Thought about %word%

Rainer Gerhards rgerhards at hq.adiscon.com
Thu Jan 20 16:24:37 CET 2011


Word is actually a catch-all parser, and a thing to avoid as better parsers
come in. I will soon make sure it is tried as the last one. An email
parser seems to make a lot of sense. And probably an "alpha" parser, which
takes a-zA-Z only. The bottom line is that the more precise, the better.
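
To illustrate the "most precise first, word last" ordering, here is a minimal
sketch in Python. It is not liblognorm code: the function names, the
simplified email pattern and the PARSERS list are all assumptions made up for
the example.

```python
import re

# Hypothetical field parsers (not liblognorm's API). Each one returns the
# prefix of `text` it accepts, or None if it does not match at all.
def parse_email(text):
    # Deliberately simplified address pattern, for illustration only.
    m = re.match(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}", text)
    return m.group(0) if m else None

def parse_alpha(text):
    # Letters a-z and A-Z only.
    m = re.match(r"[A-Za-z]+", text)
    return m.group(0) if m else None

def parse_word(text):
    # Catch-all: everything up to the next whitespace.
    m = re.match(r"\S+", text)
    return m.group(0) if m else None

# Most precise parser first, the catch-all word parser last.
PARSERS = [("email", parse_email), ("alpha", parse_alpha), ("word", parse_word)]

def first_match(text):
    """Return (parser name, matched text) for the first parser that matches."""
    for name, parser in PARSERS:
        value = parser(text)
        if value is not None:
            return name, value
    return None
```

With this ordering, first_match("bob@example.com> rest") is claimed by the
email parser and stops before the ">", whereas the word parser on its own
would have taken "bob@example.com>" as a whole.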

On the other hand, David's comments on regexps were very interesting.
Regexps are compiled to DFAs (deterministic finite automata). Executing
one DFA takes some time. Executing many of them takes *a lot* of time. In
liblognorm, I combine all of the rules into a single DFA, thus it is much
quicker. But it may be possible to use the "regular" regexp logic and compile
many regexps into a single DFA (liblognorm's parse tree) and gain the ease of
use of regexps plus the speed of liblognorm. But this for sure sounds like a
very challenging effort...
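
A rough sketch of the "many rules, one tree" idea, in Python. This is not how
liblognorm is implemented: Node, add_rule, match and the two sample rules
are hypothetical, and the sketch is a shared-prefix tree with simple
backtracking rather than a true DFA. It only shows that rules with a common
prefix share one walk over the message instead of each being run separately.

```python
import re

class Node:
    def __init__(self):
        self.chars = {}    # literal character -> child Node
        self.fields = []   # (field name, compiled regex, child Node)
        self.rule = None   # name of the rule that ends here, if any

def add_rule(root, name, parts):
    """Add a rule; parts are literal strings or (field_name, regex) tuples."""
    node = root
    for part in parts:
        if isinstance(part, str):
            # Literal text: one tree level per character, shared across rules.
            for ch in part:
                node = node.chars.setdefault(ch, Node())
        else:
            fname, pattern = part
            # Reuse an identical field edge if another rule already added it.
            for existing_name, rx, child in node.fields:
                if existing_name == fname and rx.pattern == pattern:
                    node = child
                    break
            else:
                child = Node()
                node.fields.append((fname, re.compile(pattern), child))
                node = child
    node.rule = name

def match(node, text, pos=0):
    """Return (rule name, extracted fields) for a full match, else None."""
    if pos == len(text) and node.rule is not None:
        return node.rule, {}
    # Literal characters are tried first, field parsers afterwards.
    if pos < len(text) and text[pos] in node.chars:
        result = match(node.chars[text[pos]], text, pos + 1)
        if result:
            return result
    for fname, rx, child in node.fields:
        m = rx.match(text, pos)
        if m and m.end() > pos:
            result = match(child, text, m.end())
            if result:
                rule, values = result
                return rule, {fname: m.group(0), **values}
    return None

root = Node()
add_rule(root, "mail_arg", ["arg1=<", ("email", r"[^>\s]+"), ">"])
add_rule(root, "num_arg", ["arg1=", ("count", r"[0-9]+")])
```

Both sample rules share the "arg1=" prefix, so match(root, "arg1=42") and
match(root, "arg1=<bob@example.com>") walk those five characters once and
only diverge afterwards.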

Rainer

> -----Original Message-----
> From: lognorm-bounces at lists.adiscon.com [mailto:lognorm-
> bounces at lists.adiscon.com] On Behalf Of Champ Clark III [Softwink]
> Sent: Thursday, January 20, 2011 4:15 PM
> To: lognorm at lists.adiscon.com
> Subject: [Lognorm] Thought about %word%
> 
> 
> 	Hello,
> 
> 	The other day I was working on some Sendmail rules (doing some
> testing) and I noticed something about the %field:word%.  Here's a
> quick example:
> 
> arg1=<bob at example.com>
> 
> 	What I'd planned on doing was:
> 
> arg1=<%email:word%>
> 
> 	But that obviously won't work.  In the end,  you end up grabbing
> the entire line (due to spaces).  Perhaps %word% should not _just_
> grab between spacing,  but any non-alpha/numeric.  This might lead
> into other problems.  For example,  %word% still wouldn't work due to the
> "@".  Or,  maybe it's a bad idea all the way around.  Perhaps a preset
> of defined delimiters ( < > [ ] (space) = , ) ?
> 
> --
>         Champ Clark III | Softwink, Inc | 800-538-9357 x 101
>                      http://www.softwink.com
> 
> GPG Key ID: 58A2A58F
> Key fingerprint = 7734 2A1C 007D 581E BDF7  6AD5 0F1F 655F 58A2 A58F
> If it wasn't for C, we'd be using BASI, PASAL and OBOL.

