<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Oct 31, 2014 at 12:05 PM, Pavel Levshin <span dir="ltr"><<a href="mailto:pavel@levshin.spb.ru" target="_blank">pavel@levshin.spb.ru</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  
  <div text="#000000" bgcolor="#FFFFFF">

    <div>Hi,<br>

      <br>

      I'll look at this a little later.<br>

      <br>

      Do you use it in production? Is this (JSON arrays) compatible with

      the lognormalizer tool? Can a %tokenized field contain other

      %tokenized fields (i.e., does it allow for recursion)? Would you write

      some docs on the feature?<br>

      <br>

      Why do you use the 'const' modifier for non-pointer arguments, for

      example, 'const unsigned char c'?<br>

      <br>

      <br>

      --<br>

      Pavel<br>

      <br>

      <br>

      <br>

      30.10.2014 14:03, singh.janmejay:<br>

    </div>

    <blockquote type="cite"><div><div class="h5">

      <div dir="ltr">

        <div>

          <div>Hi,<br>

            <br>

            This patch-set introduces a log-norm field-type called

            tokenized, which allows parsing of token-separated values.<br>

            <br>

            Many applications, such as nginx, write log fields whose

            values are separated by a delimiter such as comma+space. For

            instance, nginx's $upstream_addr variable writes

            comma-separated ip+port combinations to access logs.<br>

            <br>

            Parsing such logs takes a significant amount of regex and

            exec-template work and leads to a rather ugly solution for

            something as simple as a tokenized string.<br>

            <br>

            With this patch, parsing a list of IP addresses separated by

            ', ' (comma + space), for instance, would require a rule

            similar to:<br>

            <br>

            rule=ips:%my_ips:tokenized:, :ipv4%<br>

            <br>
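            To make the intended behaviour concrete, here is a minimal Python sketch of what such a rule is meant to do. It is an illustration only, not the liblognorm implementation: the names parse_tokenized and parse_ipv4 are hypothetical, and treating a single bad token as a no-match for the whole field is an assumption of this sketch.<br>
            <br>
<pre>
```python
import ipaddress
import json


def parse_ipv4(token):
    """Return the token if it is a dotted-quad IPv4 address, else None."""
    try:
        ipaddress.IPv4Address(token)
    except ValueError:
        return None
    return token


def parse_tokenized(text, separator, parse_token):
    """Split text on separator and parse every token with parse_token.

    Returns the list of parsed values, or None if any token fails.
    (Failing the whole field on one bad token is an assumption here.)
    """
    values = []
    for token in text.split(separator):
        value = parse_token(token)
        if value is None:
            return None
        values.append(value)
    return values


if __name__ == "__main__":
    # Hypothetical sample input for rule=ips:%my_ips:tokenized:, :ipv4%
    line = "10.20.30.40, 50.60.70.80, 90.100.110.120"
    print(json.dumps({"my_ips": parse_tokenized(line, ", ", parse_ipv4)}))
```
</pre>
            Run as a script, the sketch prints a JSON object whose my_ips key holds an array with one string per address.<br>
            <br>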

            This requires a small patch to libestr as well, so this mail

            has 3 patches attached.<br>

            <br>

            libestr patch: <br>

            <br>

0001-Changed-some-functions-that-don-t-modify-their-arg-t.patch<br>

            <br>

            liblognorm patches:<br>

            <br>

0001-Moved-from-parser-receving-data-as-escaped-string-to.patch<br>

0002-added-support-for-field_type-tokenized-which-parses-.patch<br>

            <br>

            Apply the patches in order of their numeric prefix.<br clear="all">

          </div>

        </div>

        <div>

          <div>

            <div>

              <div>

                <div>

                  <div><br>

                    -- <br>

                    Regards,<br>

                    Janmejay<br>

                    <a href="http://codehunk.wordpress.com" target="_blank">http://codehunk.wordpress.com</a><br>

                  </div>

                </div>

              </div>

            </div>

          </div>

        </div>

      </div>

      <br>

      <fieldset></fieldset>

      <br>

      </div></div><span class=""><pre>_______________________________________________

Lognorm mailing list

<a href="mailto:Lognorm@lists.adiscon.com" target="_blank">Lognorm@lists.adiscon.com</a>

<a href="http://lists.adiscon.net/mailman/listinfo/lognorm" target="_blank">http://lists.adiscon.net/mailman/listinfo/lognorm</a>

</pre>

    </span></blockquote>

    <br>

  </div>



<br></blockquote></div><br></div><div class="gmail_extra">The const modifier on non-pointer args is just habit; it's not intentional.<br><br></div><div class="gmail_extra">I have done a lot of testing locally (on my box), but it's not on my prod cluster yet.<br><br></div><div class="gmail_extra">Tokenizer followed by tokenizer is something that I have in mind too. But I promised myself that I'd write a test for that instead of testing it manually :-). I will add that patch to this thread once I get a chance to work on it.<br><br></div><div class="gmail_extra">However, since you are asking about those kinds of forms, let me discuss something else that I was thinking about.<br><br></div><div class="gmail_extra">The idea is to have another field type called recurse.<br><br></div><div class="gmail_extra">Similar to how tokenized uses a ctx to parse matching text, recurse will parse it using the current context. AFAIK, the context is stateless, so I don't see any problems with that. I also plan to support tag-based selection of the rules the text may match; if the text matches any other rule, it should be considered a no-match.<br><br></div><div class="gmail_extra">Instead of typing it out here, I'll attach a picture I took after thinking through it briefly (it will be attached to the next mail).<br></div><div class="gmail_extra"><br>-- <br><div class="gmail_signature">Regards,<br>Janmejay<br><a href="http://codehunk.wordpress.com">http://codehunk.wordpress.com</a><br></div>

</div></div>