From c.thiesset at newtech.fr Tue Jun 1 12:27:55 2010
From: c.thiesset at newtech.fr (Christophe THIESSET)
Date: Tue, 1 Jun 2010 12:27:55 +0200
Subject: [rsyslog] message size limitation with tcp stream
Message-ID: <201006011227.55267.c.thiesset@newtech.fr>

My rsyslog is 4.6.2-1 running on Debian lenny.
Messages larger than 2k received via TCP are consistently truncated. Changing
MaxMessageSize has no effect. On the other hand, the same messages via UDP
are processed properly (and follow the MaxMessageSize value).

Is this a bug, or am I missing something?

Best Regards

Christophe THIESSET

From rgerhards at hq.adiscon.com Tue Jun 1 12:29:30 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Tue, 1 Jun 2010 12:29:30 +0200
Subject: [rsyslog] message size limitation with tcp stream
References: <201006011227.55267.c.thiesset@newtech.fr>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E3D@GRFEXC.intern.adiscon.com>

$MaxMessageSize must be at the TOP of rsyslog.conf, at least before loading
imtcp.

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Christophe THIESSET
> Sent: Tuesday, June 01, 2010 12:28 PM
> To: rsyslog at lists.adiscon.com
> Subject: [rsyslog] message size limitation with tcp stream
>
> My rsyslog is 4.6.2-1 running on debian lenny.
> Messages larger than 2k received via tcp are constantly truncated. Playing
> with MaxMessageSize is useless. On the other hand the same messages via udp
> are properly processed (and follow the MaxMessageSize value).
>
> Bug or do I miss something?
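
The ordering advice above can be sketched as a configuration fragment. This
is only an illustration: the 6k limit and the port numbers are borrowed from
the configuration posted later in this thread and are placeholders, not
recommended values.

```
# Set the size limit first, before any input module is loaded
$MaxMessageSize 6k

# provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514

# provides TCP syslog reception
$ModLoad imtcp
$InputTCPServerRun 514
```

The only change relative to the later configuration is that $MaxMessageSize
now precedes both $ModLoad lines.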

From rgerhards at hq.adiscon.com Tue Jun 1 12:32:10 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Tue, 1 Jun 2010 12:32:10 +0200
Subject: [rsyslog] Where is the output module for the udptransportationtoremote syslog server
References: <2054128934449600685@unknownmsgid>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com>

John,

quick question:

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of John Li
> Sent: Monday, May 31, 2010 2:17 PM
> To: david at lang.hm; rsyslog-users
> Subject: Re: [rsyslog] Where is the output module for the
> udptransportationtoremote syslog server
>
> Thanks a lot.
> Currently i am stucked at the design that output module can not modify
> the msg to be seen by other output modules.

While I think I understand why you need this functionality, I would
appreciate it if you could elaborate on that need a bit. I am asking because I
want to understand the potential use cases (hopefully all) BEFORE I even
consider implementing a facility to support them.

Also, do you have a comment on the longer message on template modules I
posted yesterday?

Thanks,
Rainer

From jli at jlisbz.com Tue Jun 1 15:31:02 2010
From: jli at jlisbz.com (John Li)
Date: Tue, 1 Jun 2010 09:31:02 -0400
Subject: [rsyslog] Where is the output module for the udptransportationtoremote syslog server
In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com>
References: <2054128934449600685@unknownmsgid> <9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com>
Message-ID:

Hi Rainer,

Sorry, I haven't finished reading the long email yet, as I just dived into the
ruleset module and tried to rewrite the message with submitMsg, but with no
success yet.

In general, the use case is SEM (Security Event Management) systems. They
have their own recommended syslog format, and it will be much easier to
convert the event into their format before sending it over.

I promise I will read the long email, and I hope I can provide something
useful here.

Thanks a lot for your work.

--
John Jun Li
jli at jlisbz.com

From rgerhards at hq.adiscon.com Tue Jun 1 16:03:53 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Tue, 1 Jun 2010 16:03:53 +0200
Subject: [rsyslog] Where is the output module for theudptransportationtoremote syslog server
References: <2054128934449600685@unknownmsgid><9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E43@GRFEXC.intern.adiscon.com>

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of John Li
> Sent: Tuesday, June 01, 2010 3:31 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] Where is the output module for
> theudptransportationtoremote syslog server
>
> Hi Rainer,
>
> Sorry I didn't finish reading the long email yet as I just dived into the
> ruleset module and tried to rewrite the message with submitMsg but no
> success yet.

No problem, but keep in mind that I have something brewing right now. I
will blog about it soon, but am currently tied up in some other activity. But
have a look at this git commit:

http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=59227a861821b2e0e37357c0695f6b3d9f11dd9d

> In general, the use case is for those SEM (Security Event Management). They
> have their recommended syslog format and it will be much easier to convert
> the event in their format before send it over.

My question is why you need to persist the string you generate. Do you use it
multiple times, or just because you need to feed it ONE time into ONE other
action?

Rainer

> I promise will read the long email and hope I can provide some useful things
> here.
>
> Thanks a lot for your work.

From david at lang.hm Tue Jun 1 16:03:42 2010
From: david at lang.hm (david at lang.hm)
Date: Tue, 1 Jun 2010 07:03:42 -0700 (PDT)
Subject: [rsyslog] Where is the output module for the udp transportationtoremote syslog server
In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E37@GRFEXC.intern.adiscon.com>
References: <004501cafd09$9027db05$100013ac@intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E26@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E37@GRFEXC.intern.adiscon.com>
Message-ID:

On Mon, 31 May 2010, Rainer Gerhards wrote:

>
> You are looking in the same direction I am, and I think this is good news ;)
>
> The current engine supports functions coded in C, but not yet as real plugins
> nor in an easy to see way. It is done via a crude function interface library
> module, and only within the script engine. My original plan (over a year, or
> even two, ago) was to generalize these library plugins, so that it is easy to
> add new code and load them as plugins. Actually, making them available as
> plugins should not be too much work given the already existing
> infrastructure. There already exist a handful of "function modules", the
> control structure is just statically created during compile time, much as
> some of the output plugins are statically linked.
>
> Then the original plan was to enable templates to call scripts and enable
> scripts to define templates (kind of). Unfortunately, I got distracted by
> more important things before I could complete all of this.
>
> HOWEVER, at this time performance was not a major concern. With what has
> evolved in the mean time, I do not like the original approach that much any
> longer. At least the script engine must become much faster before I can take
> a real look at that capability. Right now, scripts generate an interim code
> that then is interpreted by a (kind of) virtual machine. A script invocation
> inside a template would mean that a VM must be instantiated, the script
> interpreted and the resulting string be used as template contents. Clearly,
> this is not for high-performance use. Still, however, it may be useful to
> have that capability for those cases, where performance is not the #1
> consideration. But given that everything would need to be implemented, it
> does make limited sense to look into something known to be too slow in the
> long run. BTW, this is one reason that I have not yet continued to work on
> the script engine, knowing that some larger redesign is due to fit it into
> the now much tighter runtime constraints.
>
> On the performance of the output system: I think the system in general is
> quite fast and efficient, with only ONE important exception: that is, if
> multiple replacements need to happen. Still, the algorithm is quite
> efficient, but it is generic and needs to run through a number of steps. Of
> course, it is definitely faster to permit a C plugin to look at the message
> and then format, in an "atomic" way the resulting custom string. Thus, you
> need to write multiple C codes instead of using a generic engine, but can do
> so in a much higher performance way. I would assume, however, that this
> approach cannot beat the simple templates we usually use (maybe by less than
> 5% and, of course, there may be cases where this matters).
>
> As you know, my current focus is speed, together with some functional
> enhancements. I was looking at queue operations improvements, but the
> potential output speed improvements may be more interesting than the queue
> mode improvements (and apply to more use cases). So it may make sense to look
> into these, first. My challenge here is to find something that is
>
> a) generic enough to be useful in various (usual) cases
> b) specific enough to be rather fast
>
> and it should also be possible to implement it within a few weeks at most,
> because I can probably not spend much more time on a single
> feature/refactoring.
>
> One solution may be to create "template modules". I could envision a template
> module to be something that generates the template string *as a whole* from
> the input message.
>
> That is, we would have
>
> $template current-style,"%msg%\n"
>
> but also (**)
>
> $modload tplcustom
> $template custom,tplcustom
>
> where tplcustom generates the template string.
>
> While this sounds promising, we have some issues. One immediately pops into my
> mind: we will probably be able to use the same template for file writing or
> forwarding, but for file writing we need a LF at the end, while for
> forwarding we do not need it.

this sounds very promising. I question if you really would use the same
format for writing to a local file as you do when forwarding, the local file
normally doesn't log the severity info (or at least not in the same format)

I believe you already have differing templates for standard vs forwarding in
many cases.

rather than trying to do it in the config, is there a way to let the C module
say "have that module do its thing, then I'll tweak the result" so that the
code doesn't need to get duplicated between modules for the most common
cases?

> So the most natural way would be to have the ability to embed a "custom
> template" into a regular template, like suggested by this syntax:
>
> $template both,"%=tplcustom%\n"
>
> however, this brings us down to the slippery slope of the original design. As
> a next thing to be requested, I could ask for using not the msg object (with
> its fixed unmodified properties), but rather of a transformation of the
> message object. So we would end up with something like this:
>
> $template cmplx,"%=tplcustom(syslogtag & msg)%"
>
> Which would require a much more complex logic working behind the scenes.
>
> Of course, depending on the format used, the engine could select different
> processing algorithms. Doing this on the fly seems possible, but requires
> more work than I can commit in one sequence.

this is definitely ugly

> Also, it would be useful to have the ability to persist already-generated
> properties with the message while it is continued to be processed in the rule
> engine. So far, we do not have this ability, and the reason is processing
> time (plus, as usual, implementation effort): for that, we would need to
> maintain a list (or hash, ...) of name/value pairs, store them to disk for
> disk queues and shuffle them through the rule engine as processing is carried
> out. As I said, quite doable, but another big addition.

I expect that this isn't worthwhile for a couple of reasons.

1. with something like this you need to worry about multi-thread protection
and locks, which will kill your performance

2. with modern CPUs you really want to only work in the cache if you can.
Any access to additional memory will stall the CPU long enough that you could
have done a significant amount of processing instead. It's to the point where
the kernel developers have measured and said that if the CPU needs to copy
the contents of a TCP packet, they can do the checksum calculation at the
same time and the overhead of the memory I/O will make the time taken for the
calculation to be a net zero additional time (the CPU internally processes
things in parallel)

> So I am somewhat stuck with things that sound interesting, but are a bit
> interdependent. Doing them all together is too big to be useful, and it will
> probably fail because I can probably not keep focus on all of them for the
> next, say, 9 to 12 months that it would require to complete everything.
>
> So I am again down to picking what is most useful. Out of this discussion, it
> looks like the idea I marked with (**), the plain C template generator could
> be a useful route to take. I am saying this under the assumption that it
> would be relatively easy to implement and cause at least some speedup in
> standard cases (contrary to what I expect, I have to admit...). But that
> approach is highly specialized, requiring a C module for each custom format.
> So does it really serve the rsyslog community well - or just some very
> isolated use cases?
>
> Thinking more about it, it would probably be useful if it is both
>
> a) relatively easy to implement and
> b) causes some speedup in standard cases
>
> But b) cannot be proven without actually implementing the interface. So, in
> practice, the question boils down to what we *expect* about the usefulness
> of this utility.

well, rather than creating an entire interface, what about creating a patch
to hard-code TraditionalFormat and TraditionalForwardFormat (or pick a
couple) and we can benchmark the system with the hard-coded C formats vs the
current process.

David Lang

> Having said that, I'd appreciate feedback, both on the concrete question of
> the usefulness of this feature as well as any and all comments on the
> situation at large. I am trying to put my development resources, which
> thankfully have been somewhat increased nowadays :) to the area where they
> provide greatest benefit.

From david at lang.hm Tue Jun 1 16:07:51 2010
From: david at lang.hm (david at lang.hm)
Date: Tue, 1 Jun 2010 07:07:51 -0700 (PDT)
Subject: [rsyslog] Where is the output module for theudptransportationtoremote syslog server
In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E43@GRFEXC.intern.adiscon.com>
References: <2054128934449600685@unknownmsgid><9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E43@GRFEXC.intern.adiscon.com>
Message-ID:

Ok, this looks like something I can test. I won't be working on it today
(I've been on a call since 4am local time so won't be doing much of anything
today:-)

David Lang

On Tue, 1 Jun 2010, Rainer Gerhards wrote:

>> -----Original Message-----
>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
>> bounces at lists.adiscon.com] On Behalf Of John Li
>>
>> Hi Rainer,
>>
>> Sorry I didn't finish reading the long email yet as I just dived into the
>> ruleset module and tried to rewrite the message with submitMsg but no
>> success yet.
>
> No problem, but keep on your mind that I have something boiling right now. I
> will blog about it soon, but am currently tied in some other activity. But
> have a look at this git commit:
>
> http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=59227a861821b2e0e37357c0695f6b3d9f11dd9d
>
>>
>> In general, the use case is for those SEM (Security Event Management). They
>> have their recommended syslog format and it will be much easier to convert
>> the event in their format before send it over.
>
> My question is why you need to persist the string you generate. Do you use it
> multiple times or just because you need to feed it ONE time into ONE other
> action?
>
> Rainer
>
>> I promise will read the long email and hope I can provide some useful things
>> here.

From rgerhards at hq.adiscon.com Tue Jun 1 17:03:25 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Tue, 1 Jun 2010 17:03:25 +0200
Subject: [rsyslog] Where is the output module for theudptransportationtoremote syslog server
References: <2054128934449600685@unknownmsgid><9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E43@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E45@GRFEXC.intern.adiscon.com>

OK, I finally managed to complete my blog post that has all the details:

http://blog.gerhards.net/2010/06/rsyslog-template-plugins.html

In short: the feature is most probably worth the (somewhat surprisingly small)
effort and I'll see that I get that functionality in soon.

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of david at lang.hm
> Sent: Tuesday, June 01, 2010 4:08 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] Where is the output module for
> theudptransportationtoremote syslog server
>
> Ok, this looks like something I can test.

From rgerhards at hq.adiscon.com Tue Jun 1 17:30:07 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Tue, 1 Jun 2010 17:30:07 +0200
Subject: [rsyslog] Where is the output module for the udp transportationtoremote syslog server
References: <004501cafd09$9027db05$100013ac@intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E26@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E37@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E46@GRFEXC.intern.adiscon.com>

Hi David,

thanks for the feedback, comments inline below

[template generators...]
>
> this sounds very promising. I question if you really would use the same
> format for writing to a local file as you do when forwarding, the local
> file normally doesn't log the severity info (or at least not in the same
> format)
>
> I believe you already have differing templates for standard vs forwarding
> in many cases.
>
> rather than trying to do it in the config, is there a way to let the C
> module say "have that module do its thing, then I'll tweak the result" so
> that the code doesn't need to get duplicated between modules for the most
> common cases?

Actually, this complicates things AND adds some overhead again. On the other
hand, I have now checked: code duplication is very small and really a one-time
job. So I think I will accept that duplication for the time being. In any
case, generators could load common code via the usual module loader (in just
the same way as e.g. imtcp loads the relevant network stream driver).

[...]
> > Also, it would be useful to have the ability to persist already-generated
> > properties with the message while it is continued to be processed in the
> > rule engine. So far, we do not have this ability, and the reason is
> > processing time (plus, as usual, implementation effort): for that, we
> > would need to maintain a list (or hash, ...) of name/value pairs, store
> > them to disk for disk queues and shuffle them through the rule engine as
> > processing is carried out. As I said, quite doable, but another big
> > addition.
>
> I expect that this isn't worthwhile for a couple of reasons.
>
> 1. with something like this you need to worry about multi-thread
> protection and locks, which will kill your performance

Not really, only if the locks actually need to be taken. Note that we
already have some locking inside message processing (e.g. when a date string
is generated), but in general a real lock is never taken, so this is not
causing problems. Even then, we could improve the situation by spinlocks or
other very light mechanisms.

>
> 2. with modern CPUs you really want to only work in the cache if you can.
> Any access to additional memory will stall the CPU long enough that you
> could have done a significant amount of processing instead. It's to the
> point where the kernel developers have measured and said that if the CPU
> needs to copy the contents of a TCP packet, they can do the checksum
> calculation at the same time and the overhead of the memory I/O will make
> the time taken for the calculation to be a net zero additional time (the
> CPU internally processes things in parallel)

yes, and this was actually one of the performance improvements done to the
output part. But it really depends on whether memory is reused or not. Think
about writing the same message to more than one file (a common scenario).
Without the ability to store anything inside the message object, we actually
need to generate the string twice. That means we need to write it twice to
memory, and this means that the cache cannot be re-used. If we now have a
capability to store the generated string together with the message, we could
re-use it on the second invocation. Most probably, it is still available
inside the cache. So not only do we not need to generate it, we can also
access it at highest speed directly from the CPU cache. That should be a big
saving. But it comes at some expense, and that is maintaining the property
inside the message, which involves at least some writes (and maybe
rebalancing operations, based on data structure used). So it would probably
be best to make persisting the string optional so that it will only be done
when there is value in it.

[...]
> well, rather than creating an entire interface, what about creating a
> patch to hard-code TraditionalFormat and TraditionalForwardFormat (or pick
> a couple) and we can benchmark the system with the hard-coded C formats vs
> the current process.

good idea -- and as you have seen, already done :)

Rainer

From c.thiesset at newtech.fr Tue Jun 1 18:01:51 2010
From: c.thiesset at newtech.fr (Christophe THIESSET)
Date: Tue, 1 Jun 2010 18:01:51 +0200
Subject: [rsyslog] message size limitation with tcp stream
In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E3D@GRFEXC.intern.adiscon.com>
References: <201006011227.55267.c.thiesset@newtech.fr> <9B6E2A8877C38245BFB15CC491A11DA7103E3D@GRFEXC.intern.adiscon.com>
Message-ID: <201006011801.52142.c.thiesset@newtech.fr>

Raaaaah!!
Thanks a lot. I've wasted my morning on that trick.

But have a look at my conf:

# provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514

# provides TCP syslog reception
$ModLoad imtcp
$InputTCPServerRun 514

$MaxMessageSize 6k

And please tell me why MaxMessageSize was applied to imudp despite being
defined after the modload?

On Tuesday 1 June 2010, Rainer Gerhards wrote:
> $MaxMessageSize must be at the TOP of rsyslog.conf, at least before
> loading imtcp.
>
> Rainer
>
> > -----Original Message-----
> > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> > bounces at lists.adiscon.com] On Behalf Of Christophe THIESSET
> > Sent: Tuesday, June 01, 2010 12:28 PM
> > To: rsyslog at lists.adiscon.com
> > Subject: [rsyslog] message size limitation with tcp stream
> >
> > My rsyslog is 4.6.2-1 running on debian lenny.
> > Messages larger than 2k received via tcp are constantly truncated. Playing
> > with MaxMessageSize is useless. On the other hand the same messages via udp
> > are properly processed (and follow the MaxMessageSize value).
> >
> > Bug or do I miss something?

From rgerhards at hq.adiscon.com Tue Jun 1 19:03:57 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Tue, 1 Jun 2010 19:03:57 +0200
Subject: [rsyslog] Where is the output module for theudptransportationtoremote syslog server
References: <2054128934449600685@unknownmsgid><9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E43@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E48@GRFEXC.intern.adiscon.com>

David, John,

I made excellent progress today (it looks like it pays to take a few days
off). I have just committed a new version that contains the actual plugin
interface. I will work more on it tomorrow. If all goes well, the full
functionality may become available tomorrow :) [but no promises].

John may want to have a look at ./tools/smtradfile.c to get an idea of what
the module does and check if it is useful for him.

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of david at lang.hm
> Sent: Tuesday, June 01, 2010 4:08 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] Where is the output module for
> theudptransportationtoremote syslog server
>
> Ok, this looks like something I can test.
I won't be working on it > today > (I've been on a call since 4am local time so won't be doing much of > anything today:-) > > David Lang > > On Tue, 1 Jun 2010, Rainer Gerhards wrote: > > >> -----Original Message----- > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >> bounces at lists.adiscon.com] On Behalf Of John Li > >> > >> Hi Rainer, > >> > >> Sorry I didn't finish reading the long email yet as I just dived > into > >> the > >> ruleset module and tried to rewrite the message with submitMsg but > no > >> success yet. > > > > No problem, but keep on your mind that I have something boiling right > now. I > > will blog about it soon, but am currently tied in some other > activity. But > > have a look at this git commit: > > > > > http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=59227a861821b2e0e3 > 7357c0 > > 695f6b3d9f11dd9d > > > >> > >> In general, the use case is for those SEM (Security Event > Management). > >> They > >> have their recommended syslog format and it will be much easier to > >> convert > >> the event in their format before send it over. > > > > My question is why you need to persist the string you generate. Do > you use it > > multiple times or just because you need to feed it ONE time into ONE > other > > action? > > > > Rainer > > > >> I promise will read the long email and hope I can provide some > useful > >> things > >> here. > >> > >> Thanks a lot for your work. 
> >> > >> -- > >> John Jun Li > >> jli at jlisbz.com > >> > >> > >> > >> On Tue, Jun 1, 2010 at 6:32 AM, Rainer Gerhards > >> wrote: > >> > >>> John, > >>> > >>> quick question: > >>> > >>>> -----Original Message----- > >>>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >>>> bounces at lists.adiscon.com] On Behalf Of John Li > >>>> Sent: Monday, May 31, 2010 2:17 PM > >>>> To: david at lang.hm; rsyslog-users > >>>> Subject: Re: [rsyslog] Where is the output module for the > >>>> udptransportationtoremote syslog server > >>>> > >>>> Thanks a lot. > >>>> Currently i am stucked at the design that output module can not > >> modify > >>>> the msg to be seen by other output modules. > >>> > >>> While I think I understand why you need this functionality, I would > >>> appreciate if you could elaborate on that need a bit. I am asking > >> because I > >>> want to understand the potential use cases (hopefully all) BEFORE I > >> even > >>> consider implementing a facility to support them. > >>> > >>> Also, do you have a comment to the longer message on template > modules > >> I > >>> posted yesterday? 
> >>> > >>> Thanks, > >>> Rainer > >>> _______________________________________________ > >>> rsyslog mailing list > >>> http://lists.adiscon.net/mailman/listinfo/rsyslog > >>> http://www.rsyslog.com > >>> > >> _______________________________________________ > >> rsyslog mailing list > >> http://lists.adiscon.net/mailman/listinfo/rsyslog > >> http://www.rsyslog.com > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From jli at jlisbz.com Tue Jun 1 19:34:37 2010 From: jli at jlisbz.com (John Li) Date: Tue, 1 Jun 2010 13:34:37 -0400 Subject: [rsyslog] Where is the output module for theudptransportationtoremote syslog server In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E48@GRFEXC.intern.adiscon.com> References: <2054128934449600685@unknownmsgid> <9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E43@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E48@GRFEXC.intern.adiscon.com> Message-ID: Great. Will check it out soon. Thanks a lot. -- John Jun Li jli at jlisbz.com On Tue, Jun 1, 2010 at 1:03 PM, Rainer Gerhards wrote: > David, John, > > I made excellent progress today (it looks like it pays to take a few days > off). I have just committed a new version that contains the actual plugin > interface. I will work more on it tomorrow. If all goes well, the full > functionality may become available tomorrow :) [but no promises]. > > John may want to have a look at ./tools/smtradfile.c to get an idea of what > the module does and check if it is useful for him. 

From jli at jlisbz.com Wed Jun 2 04:57:04 2010
From: jli at jlisbz.com (John Li)
Date: Tue, 1 Jun 2010 22:57:04 -0400
Subject: [rsyslog] Where is the output module for theudptransportationtoremote syslog server
In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E48@GRFEXC.intern.adiscon.com>
References: <2054128934449600685@unknownmsgid> <9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E43@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E48@GRFEXC.intern.adiscon.com>
Message-ID:

I cloned the rsyslog project and switched to master-templateFuncation but could not build it. Here is the output:

[jli at dev01 rsyslog]$ ./configure
configure: error: cannot find install-sh or install.sh in "." "./.." "./../.."
[jli at dev01 rsyslog]$ autoconf
configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works, ...): suspicious cache-id, must contain _cv_ to be cached
../../lib/autoconf/general.m4:1974: AC_CACHE_VAL is expanded from...
../../lib/autoconf/general.m4:1994: AC_CACHE_CHECK is expanded from...
aclocal.m4:621: AC_LIBTOOL_COMPILER_OPTION is expanded from...
aclocal.m4:4829: AC_LIBTOOL_PROG_COMPILER_PIC is expanded from...
aclocal.m4:2674: _LT_AC_LANG_C_CONFIG is expanded from...
aclocal.m4:2673: AC_LIBTOOL_LANG_C_CONFIG is expanded from...
aclocal.m4:86: AC_LIBTOOL_SETUP is expanded from...
aclocal.m4:66: _AC_PROG_LIBTOOL is expanded from...
aclocal.m4:31: AC_PROG_LIBTOOL is expanded from...
configure.ac:29: the top level
configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_static_works, ...): suspicious cache-id, must contain _cv_ to be cached
aclocal.m4:666: AC_LIBTOOL_LINKER_OPTION is expanded from...
configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works_CXX, ...): suspicious cache-id, must contain _cv_ to be cached
aclocal.m4:2751: _LT_AC_LANG_CXX_CONFIG is expanded from...
aclocal.m4:2750: AC_LIBTOOL_LANG_CXX_CONFIG is expanded from...
aclocal.m4:1810: _LT_AC_TAGCONFIG is expanded from...
configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_static_works_CXX, ...): suspicious cache-id, must contain _cv_ to be cached
configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works_F77, ...): suspicious cache-id, must contain _cv_ to be cached
aclocal.m4:3914: _LT_AC_LANG_F77_CONFIG is expanded from...
aclocal.m4:3913: AC_LIBTOOL_LANG_F77_CONFIG is expanded from...
configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_static_works_F77, ...): suspicious cache-id, must contain _cv_ to be cached
configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works_GCJ, ...): suspicious cache-id, must contain _cv_ to be cached
aclocal.m4:4016: _LT_AC_LANG_GCJ_CONFIG is expanded from...
aclocal.m4:4015: AC_LIBTOOL_LANG_GCJ_CONFIG is expanded from...
configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_static_works_GCJ, ...): suspicious cache-id, must contain _cv_ to be cached

On the same box, I am able to build from the 5.5.5 release download.

Any idea?

Thanks.
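The "cannot find install-sh" error above is what ./configure typically reports in a git checkout where the autotools helper files have not been generated yet, which would also explain why the release download builds on the same box. A reply later in this thread points to autoreconf -fvi. The following is only a minimal sketch of that sequence: the function name bootstrap_checkout and the configure.ac sanity check are illustrative assumptions, not part of rsyslog.

```shell
#!/bin/sh
# Sketch: bootstrap an autotools git checkout, then run ./configure.
# bootstrap_checkout is a hypothetical helper name used for illustration.

bootstrap_checkout() {
    src_dir="${1:-.}"

    # Assumed sanity check: refuse to run outside an autotools source tree.
    if [ ! -f "$src_dir/configure.ac" ]; then
        echo "no configure.ac in $src_dir" >&2
        return 1
    fi

    (
        cd "$src_dir" || exit 1
        # -f: regenerate files even if they look up to date
        # -v: verbose output
        # -i: install missing auxiliary files such as install-sh
        autoreconf -fvi && ./configure
    )
}
```

Calling bootstrap_checkout with the path of the checkout (for example the directory where the repository was cloned) and then running make there should correspond to the documented build-from-repository procedure, assuming the required autotools packages are installed.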

--
John Jun Li
jli at jlisbz.com

From rgerhards at hq.adiscon.com Wed Jun 2 14:42:10 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Wed, 2 Jun 2010 14:42:10 +0200
Subject: [rsyslog] Where is the output module fortheudptransportationtoremote syslog server
References: <2054128934449600685@unknownmsgid><9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E43@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E48@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E4C@GRFEXC.intern.adiscon.com>

I guess autoreconf -fvi (options!) is missing, please also see

http://www.rsyslog.com/doc-build_from_repo.html

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of John Li
> Sent: Wednesday, June 02, 2010 4:57 AM
> To: rsyslog-users
> Subject: Re: [rsyslog] Where is the output module
> fortheudptransportationtoremote syslog server
>
> I cloned the rsyslog project and switched to master-templateFuncation
> but
> could not build it.
Here is the output: > > [jli at dev01 rsyslog]$ ./configure > configure: error: cannot find install-sh or install.sh in "." "./.." > "./../.." > [jli at dev01 rsyslog]$ autoconf > configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works, > ...): > suspicious cache-id, must contain _cv_ to be cached > ../../lib/autoconf/general.m4:1974: AC_CACHE_VAL is expanded from... > ../../lib/autoconf/general.m4:1994: AC_CACHE_CHECK is expanded from... > aclocal.m4:621: AC_LIBTOOL_COMPILER_OPTION is expanded from... > aclocal.m4:4829: AC_LIBTOOL_PROG_COMPILER_PIC is expanded from... > aclocal.m4:2674: _LT_AC_LANG_C_CONFIG is expanded from... > aclocal.m4:2673: AC_LIBTOOL_LANG_C_CONFIG is expanded from... > aclocal.m4:86: AC_LIBTOOL_SETUP is expanded from... > aclocal.m4:66: _AC_PROG_LIBTOOL is expanded from... > aclocal.m4:31: AC_PROG_LIBTOOL is expanded from... > configure.ac:29: the top level > configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_static_works, > ...): > suspicious cache-id, must contain _cv_ to be cached > aclocal.m4:666: AC_LIBTOOL_LINKER_OPTION is expanded from... > configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works_CXX, > ...): > suspicious cache-id, must contain _cv_ to be cached > aclocal.m4:2751: _LT_AC_LANG_CXX_CONFIG is expanded from... > aclocal.m4:2750: AC_LIBTOOL_LANG_CXX_CONFIG is expanded from... > aclocal.m4:1810: _LT_AC_TAGCONFIG is expanded from... > configure.ac:29: warning: > AC_CACHE_VAL(lt_prog_compiler_static_works_CXX, > ...): suspicious cache-id, must contain _cv_ to be cached > configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works_F77, > ...): > suspicious cache-id, must contain _cv_ to be cached > aclocal.m4:3914: _LT_AC_LANG_F77_CONFIG is expanded from... > aclocal.m4:3913: AC_LIBTOOL_LANG_F77_CONFIG is expanded from... 
> configure.ac:29: warning: > AC_CACHE_VAL(lt_prog_compiler_static_works_F77, > ...): suspicious cache-id, must contain _cv_ to be cached > configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works_GCJ, > ...): > suspicious cache-id, must contain _cv_ to be cached > aclocal.m4:4016: _LT_AC_LANG_GCJ_CONFIG is expanded from... > aclocal.m4:4015: AC_LIBTOOL_LANG_GCJ_CONFIG is expanded from... > configure.ac:29: warning: > AC_CACHE_VAL(lt_prog_compiler_static_works_GCJ, > ...): suspicious cache-id, must contain _cv_ to be cached > > In the same box, I am able to build with the 5.5.5 release download. > > Any idea? > > Thanks. > > > -- > John Jun Li > jli at jlisbz.com > > > On Tue, Jun 1, 2010 at 1:03 PM, Rainer Gerhards > wrote: > > > David, John, > > > > I made excellent progress today (it looks like it pays to take a few > days > > off). I have just committed a new version that contains the actual > plugin > > interface. I will work more on it tomorrow. If all goes well, the > full > > functionality may become available tomorrow :) [but no promises]. > > > > John may want to have a look at ./tools/smtradfile.c to get an idea > of what > > the module does and check if it is useful for him. > > > > Rainer > > > > > -----Original Message----- > > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > > > Sent: Tuesday, June 01, 2010 4:08 PM > > > To: rsyslog-users > > > Subject: Re: [rsyslog] Where is the output module for > > > theudptransportationtoremote syslog server > > > > > > Ok, this looks like something I can test. 
I won't be working on it > > > today > > > (I've been on a call since 4am local time so won't be doing much of > > > anything today:-) > > > > > > David Lang > > > > > > On Tue, 1 Jun 2010, Rainer Gerhards wrote: > > > > > > >> -----Original Message----- > > > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > > >> bounces at lists.adiscon.com] On Behalf Of John Li > > > >> > > > >> Hi Rainer, > > > >> > > > >> Sorry I didn't finish reading the long email yet as I just dived > > > into > > > >> the > > > >> ruleset module and tried to rewrite the message with submitMsg > but > > > no > > > >> success yet. > > > > > > > > No problem, but keep on your mind that I have something boiling > right > > > now. I > > > > will blog about it soon, but am currently tied in some other > > > activity. But > > > > have a look at this git commit: > > > > > > > > > > > > http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=59227a861821b2e0e3 > > > 7357c0 > > > > 695f6b3d9f11dd9d > > > > > > > >> > > > >> In general, the use case is for those SEM (Security Event > > > Management). > > > >> They > > > >> have their recommended syslog format and it will be much easier > to > > > >> convert > > > >> the event in their format before send it over. > > > > > > > > My question is why you need to persist the string you generate. > Do > > > you use it > > > > multiple times or just because you need to feed it ONE time into > ONE > > > other > > > > action? > > > > > > > > Rainer > > > > > > > >> I promise will read the long email and hope I can provide some > > > useful > > > >> things > > > >> here. > > > >> > > > >> Thanks a lot for your work. 
> > > >> > > > >> -- > > > >> John Jun Li > > > >> jli at jlisbz.com > > > >> > > > >> > > > >> > > > >> On Tue, Jun 1, 2010 at 6:32 AM, Rainer Gerhards > > > >> wrote: > > > >> > > > >>> John, > > > >>> > > > >>> quick question: > > > >>> > > > >>>> -----Original Message----- > > > >>>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > > >>>> bounces at lists.adiscon.com] On Behalf Of John Li > > > >>>> Sent: Monday, May 31, 2010 2:17 PM > > > >>>> To: david at lang.hm; rsyslog-users > > > >>>> Subject: Re: [rsyslog] Where is the output module for the > > > >>>> udptransportationtoremote syslog server > > > >>>> > > > >>>> Thanks a lot. > > > >>>> Currently i am stucked at the design that output module can > not > > > >> modify > > > >>>> the msg to be seen by other output modules. > > > >>> > > > >>> While I think I understand why you need this functionality, I > would > > > >>> appreciate if you could elaborate on that need a bit. I am > asking > > > >> because I > > > >>> want to understand the potential use cases (hopefully all) > BEFORE I > > > >> even > > > >>> consider implementing a facility to support them. > > > >>> > > > >>> Also, do you have a comment to the longer message on template > > > modules > > > >> I > > > >>> posted yesterday? 
> > > >>> > > > >>> Thanks, > > > >>> Rainer > > > >>> _______________________________________________ > > > >>> rsyslog mailing list > > > >>> http://lists.adiscon.net/mailman/listinfo/rsyslog > > > >>> http://www.rsyslog.com > > > >>> > > > >> _______________________________________________ > > > >> rsyslog mailing list > > > >> http://lists.adiscon.net/mailman/listinfo/rsyslog > > > >> http://www.rsyslog.com > > > > _______________________________________________ > > > > rsyslog mailing list > > > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > > > http://www.rsyslog.com > > > > > > > _______________________________________________ > > > rsyslog mailing list > > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > > http://www.rsyslog.com > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From jli at jlisbz.com Wed Jun 2 14:56:40 2010 From: jli at jlisbz.com (John Li) Date: Wed, 2 Jun 2010 08:56:40 -0400 Subject: [rsyslog] Where is the output module fortheudptransportationtoremote syslog server In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E4C@GRFEXC.intern.adiscon.com> References: <2054128934449600685@unknownmsgid> <9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E43@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E48@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E4C@GRFEXC.intern.adiscon.com> Message-ID: It works. Sorry I missed the document. Thanks. -- John Jun Li jli at jlisbz.com On Wed, Jun 2, 2010 at 8:42 AM, Rainer Gerhards wrote: > I guess autoreconf -fvi (options!) 
is missing, please also see > > http://www.rsyslog.com/doc-build_from_repo.html > > Rainer > > > -----Original Message----- > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > bounces at lists.adiscon.com] On Behalf Of John Li > > Sent: Wednesday, June 02, 2010 4:57 AM > > To: rsyslog-users > > Subject: Re: [rsyslog] Where is the output module > > fortheudptransportationtoremote syslog server > > > > I cloned the rsyslog project and switched to master-templateFuncation > > but > > could not build it. Here is the output: > > > > [jli at dev01 rsyslog]$ ./configure > > configure: error: cannot find install-sh or install.sh in "." "./.." > > "./../.." > > [jli at dev01 rsyslog]$ autoconf > > configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works, > > ...): > > suspicious cache-id, must contain _cv_ to be cached > > ../../lib/autoconf/general.m4:1974: AC_CACHE_VAL is expanded from... > > ../../lib/autoconf/general.m4:1994: AC_CACHE_CHECK is expanded from... > > aclocal.m4:621: AC_LIBTOOL_COMPILER_OPTION is expanded from... > > aclocal.m4:4829: AC_LIBTOOL_PROG_COMPILER_PIC is expanded from... > > aclocal.m4:2674: _LT_AC_LANG_C_CONFIG is expanded from... > > aclocal.m4:2673: AC_LIBTOOL_LANG_C_CONFIG is expanded from... > > aclocal.m4:86: AC_LIBTOOL_SETUP is expanded from... > > aclocal.m4:66: _AC_PROG_LIBTOOL is expanded from... > > aclocal.m4:31: AC_PROG_LIBTOOL is expanded from... > > configure.ac:29: the top level > > configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_static_works, > > ...): > > suspicious cache-id, must contain _cv_ to be cached > > aclocal.m4:666: AC_LIBTOOL_LINKER_OPTION is expanded from... > > configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works_CXX, > > ...): > > suspicious cache-id, must contain _cv_ to be cached > > aclocal.m4:2751: _LT_AC_LANG_CXX_CONFIG is expanded from... > > aclocal.m4:2750: AC_LIBTOOL_LANG_CXX_CONFIG is expanded from... > > aclocal.m4:1810: _LT_AC_TAGCONFIG is expanded from... 
> > configure.ac:29: warning: > > AC_CACHE_VAL(lt_prog_compiler_static_works_CXX, > > ...): suspicious cache-id, must contain _cv_ to be cached > > configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works_F77, > > ...): > > suspicious cache-id, must contain _cv_ to be cached > > aclocal.m4:3914: _LT_AC_LANG_F77_CONFIG is expanded from... > > aclocal.m4:3913: AC_LIBTOOL_LANG_F77_CONFIG is expanded from... > > configure.ac:29: warning: > > AC_CACHE_VAL(lt_prog_compiler_static_works_F77, > > ...): suspicious cache-id, must contain _cv_ to be cached > > configure.ac:29: warning: AC_CACHE_VAL(lt_prog_compiler_pic_works_GCJ, > > ...): > > suspicious cache-id, must contain _cv_ to be cached > > aclocal.m4:4016: _LT_AC_LANG_GCJ_CONFIG is expanded from... > > aclocal.m4:4015: AC_LIBTOOL_LANG_GCJ_CONFIG is expanded from... > > configure.ac:29: warning: > > AC_CACHE_VAL(lt_prog_compiler_static_works_GCJ, > > ...): suspicious cache-id, must contain _cv_ to be cached > > > > In the same box, I am able to build with the 5.5.5 release download. > > > > Any idea? > > > > Thanks. > > > > > > -- > > John Jun Li > > jli at jlisbz.com > > > > > > On Tue, Jun 1, 2010 at 1:03 PM, Rainer Gerhards > > wrote: > > > > > David, John, > > > > > > I made excellent progress today (it looks like it pays to take a few > > days > > > off). I have just committed a new version that contains the actual > > plugin > > > interface. I will work more on it tomorrow. If all goes well, the > > full > > > functionality may become available tomorrow :) [but no promises]. > > > > > > John may want to have a look at ./tools/smtradfile.c to get an idea > > of what > > > the module does and check if it is useful for him. 
> > > > > > Rainer > > > > > > > -----Original Message----- > > > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > > > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > > > > Sent: Tuesday, June 01, 2010 4:08 PM > > > > To: rsyslog-users > > > > Subject: Re: [rsyslog] Where is the output module for > > > > theudptransportationtoremote syslog server > > > > > > > > Ok, this looks like something I can test. I won't be working on it > > > > today > > > > (I've been on a call since 4am local time so won't be doing much of > > > > anything today:-) > > > > > > > > David Lang > > > > > > > > On Tue, 1 Jun 2010, Rainer Gerhards wrote: > > > > > > > > >> -----Original Message----- > > > > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > > > >> bounces at lists.adiscon.com] On Behalf Of John Li > > > > >> > > > > >> Hi Rainer, > > > > >> > > > > >> Sorry I didn't finish reading the long email yet as I just dived > > > > into > > > > >> the > > > > >> ruleset module and tried to rewrite the message with submitMsg > > but > > > > no > > > > >> success yet. > > > > > > > > > > No problem, but keep on your mind that I have something boiling > > right > > > > now. I > > > > > will blog about it soon, but am currently tied in some other > > > > activity. But > > > > > have a look at this git commit: > > > > > > > > > > > > > > > > http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=59227a861821b2e0e3 > > > > 7357c0 > > > > > 695f6b3d9f11dd9d > > > > > > > > > >> > > > > >> In general, the use case is for those SEM (Security Event > > > > Management). > > > > >> They > > > > >> have their recommended syslog format and it will be much easier > > to > > > > >> convert > > > > >> the event in their format before send it over. > > > > > > > > > > My question is why you need to persist the string you generate. 
> > Do > > > > you use it > > > > > multiple times or just because you need to feed it ONE time into > > ONE > > > > other > > > > > action? > > > > > > > > > > Rainer > > > > > > > > > >> I promise will read the long email and hope I can provide some > > > > useful > > > > >> things > > > > >> here. > > > > >> > > > > >> Thanks a lot for your work. > > > > >> > > > > >> -- > > > > >> John Jun Li > > > > >> jli at jlisbz.com > > > > >> > > > > >> > > > > >> > > > > >> On Tue, Jun 1, 2010 at 6:32 AM, Rainer Gerhards > > > > >> wrote: > > > > >> > > > > >>> John, > > > > >>> > > > > >>> quick question: > > > > >>> > > > > >>>> -----Original Message----- > > > > >>>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > > > >>>> bounces at lists.adiscon.com] On Behalf Of John Li > > > > >>>> Sent: Monday, May 31, 2010 2:17 PM > > > > >>>> To: david at lang.hm; rsyslog-users > > > > >>>> Subject: Re: [rsyslog] Where is the output module for the > > > > >>>> udptransportationtoremote syslog server > > > > >>>> > > > > >>>> Thanks a lot. > > > > >>>> Currently i am stucked at the design that output module can > > not > > > > >> modify > > > > >>>> the msg to be seen by other output modules. > > > > >>> > > > > >>> While I think I understand why you need this functionality, I > > would > > > > >>> appreciate if you could elaborate on that need a bit. I am > > asking > > > > >> because I > > > > >>> want to understand the potential use cases (hopefully all) > > BEFORE I > > > > >> even > > > > >>> consider implementing a facility to support them. > > > > >>> > > > > >>> Also, do you have a comment to the longer message on template > > > > modules > > > > >> I > > > > >>> posted yesterday? 
From rgerhards at hq.adiscon.com Fri Jun 4 13:41:09 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Fri, 4 Jun 2010 13:41:09 +0200
Subject: [rsyslog] Where is the output modulefortheudptransportationtoremote syslog server
References: <2054128934449600685@unknownmsgid><9B6E2A8877C38245BFB15CC491A11DA7103E3E@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E43@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E48@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E4C@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E55@GRFEXC.intern.adiscon.com>

John, David,

I have just completed the implementation of this feature.
The new code has been moved to the master branch and will probably be released as a new version early next week.

Details in my blog post (be sure to follow the links):

http://blog.gerhards.net/2010/06/rsyslog-string-generators-done.html

Any experience reports are appreciated.

John, I think that you can use this for your project. The new modules are located in ./tools/sm*.[ch]. They are not in ./plugins because the current set consists of built-in modules only.

Rainer
From jli at jlisbz.com Fri Jun 4 14:31:54 2010
From: jli at jlisbz.com (John Li)
Date: Fri, 4 Jun 2010 05:31:54 -0700
Subject: [rsyslog] Where is the output modulefortheudptransportationtoremote syslog server
Message-ID: <-3986835649979497076@unknownmsgid>

Thanks Rainer. This is great.
I will create some custom string builders and put them back into your directory soon.

Sent from my HTC
From jli at jlisbz.com Mon Jun 7 07:53:36 2010
From: jli at jlisbz.com (John Li)
Date: Mon, 7 Jun 2010 01:53:36 -0400
Subject: [rsyslog] Where is the output modulefortheudptransportationtoremote syslog server
In-Reply-To: <-3986835649979497076@unknownmsgid>
References: <-3986835649979497076@unknownmsgid>
Message-ID:

Hi Rainer,

The system gives an error like

"rsyslogd-2159: Template 'TestTemplate': error -2159 defining template via strgen module [try http://www.rsyslog.com/e/2159 ]"

when I try to use my strgen module with the test template.
I traced the call to FindStrgen and found that pStrgenLstRoot contains only the four standard strgens. I guess that I need to find a place to add the new strgen module there, but I have exhausted my ideas on where that could be. Can you please give me a hint?

Thanks a lot.

--
John Jun Li
jli at jlisbz.com
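
For readers who hit the same error, here is a minimal sketch of how a custom string generator is usually bound to a template. It is an illustration under assumptions, not the answer given on the list. It assumes the generator was built as a loadable plugin rather than compiled into rsyslogd, and that it presumably has to be loaded before the template that refers to it. The module name `smcustom`, the generator name `Custom_Format`, and the log path are hypothetical. The `$template name,=generator` form is the string-generator template syntax described in the rsyslog documentation.

```
# Load the custom strgen plugin first (module name is hypothetical).
$ModLoad smcustom

# Bind the template to the generator. The name after "=" must match the
# name the module registers for itself (hypothetical here).
$template TestTemplate,=Custom_Format

# Use the template in an action (path chosen for illustration only).
*.* /var/log/strgen-test.log;TestTemplate
```

If the generator is instead meant to be built in, like the modules in ./tools, a `$ModLoad` line would not apply, and the registration would have to happen inside rsyslogd itself.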
From rgerhards at hq.adiscon.com Mon Jun 7 14:43:45 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Mon, 7 Jun 2010 14:43:45 +0200
Subject: [rsyslog] discussion request: performance enhancement for imtcp
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com>

Hi all,

I have now shifted my focus to enhancing the multi-core utilization with imtcp.
So far, we have a single epoll-loop (or select, if epoll is not supported), which obviously limits concurrency in some environments. I intend to remove that limit, or at least move the actual value much further. There are a couple of things to think about. There is one relatively simple approach, but if it works is pretty much depending on how things are deployed in practice. So I need your help. I would appreciate if you could read http://blog.gerhards.net/2010/06/further-improving-tcp-input-performance.html and share your comments. I have created a blogpost because that makes it easier for me to keep the text as reference. While the post is not exactly short, it also is not exhaustive, so please don't feel discouraged from reading it just because my thoughts are on a web site rather than included inline. Thanks to all, Rainer From dirk.schulz at kinzesberg.de Mon Jun 7 15:08:12 2010 From: dirk.schulz at kinzesberg.de (Dirk H. Schulz) Date: Mon, 07 Jun 2010 15:08:12 +0200 Subject: [rsyslog] Pulling syslog messages? Message-ID: <4C0CEF3C.8040005@kinzesberg.de> Hi folks, I have stumbled over a difficult question. In a developed security environment where you run several network zones with the most important data/servers in the inner zones and "outside contact servers" like web proxies in the outer zone - in such an environment a central syslog server should be positioned somewhere in the inner zones, but the servers in the outer zones must not be allowed to push messages into the inner zones - these messages have to be fetched by the central servers from the outside zones' servers. As far as I understand it, the syslogds implement a push model for remote logging, and I never heard of a syslogd pull model, but there clearly is the need for one. Has anyone out there already thought about this and what did you do? Any ideas from those who didn't? Thanks for any hint or help. 
Dirk From ktm at rice.edu Mon Jun 7 15:26:43 2010 From: ktm at rice.edu (Kenneth Marshall) Date: Mon, 7 Jun 2010 08:26:43 -0500 Subject: [rsyslog] Pulling syslog messages? In-Reply-To: <4C0CEF3C.8040005@kinzesberg.de> References: <4C0CEF3C.8040005@kinzesberg.de> Message-ID: <20100607132642.GJ3063@aart.is.rice.edu> On Mon, Jun 07, 2010 at 03:08:12PM +0200, Dirk H. Schulz wrote: > Hi folks, > > I have stumbled over a difficult question. > > In a developed security environment where you run several network zones > with the most important data/servers in the inner zones and "outside > contact servers" like web proxies in the outer zone - in such an > environment a central syslog server should be positioned somewhere in > the inner zones, but the servers in the outer zones must not be allowed > to push messages into the inner zones - these messages have to be > fetched by the central servers from the outside zones' servers. > > As far as I understand it, the syslogds implement a push model for > remote logging, and I never heard of a syslogd pull model, but there > clearly is the need for one. > > Has anyone out there already thought about this and what did you do? Any > ideas from those who didn't? > > Thanks for any hint or help. > > Dirk Hi Dirk, One approach would be to use ssh/scp to grab the log files from the outside systems. Then you could use the imfile module to inject them into the system. Obviously, you would need to provide a lot of data sanity checking before actually loading the data if the inside zone is really so locked down that even a logging connection to the syslog server is not acceptable. It sounds like you may want to put your log server outside and not inside. Regards, Ken From david at lang.hm Mon Jun 7 16:19:09 2010 From: david at lang.hm (david at lang.hm) Date: Mon, 7 Jun 2010 07:19:09 -0700 (PDT) Subject: [rsyslog] Pulling syslog messages?
In-Reply-To: <4C0CEF3C.8040005@kinzesberg.de> References: <4C0CEF3C.8040005@kinzesberg.de> Message-ID: On Mon, 7 Jun 2010, Dirk H. Schulz wrote: > Hi folks, > > I have stumbled over a difficult question. > > In a developed security environment where you run several network zones > with the most important data/servers in the inner zones and "outside > contact servers" like web proxies in the outer zone - in such an > environment a central syslog server should be positioned somewhere in > the inner zones, but the servers in the outer zones must not be allowed > to push messages into the inner zones - these messages have to be > fetched by the central servers from the outside zones' servers. Yes and No. The reason for not wanting to push data in is that the data originating from the less secure zones may be corrupt and/or an attack. Pulling the same data in doesn't help protect against this. On the other hand, one of the big reasons for getting the logs from the less secure zone is for forensic purposes. In this case you want the logs to not sit on the outside (where they can be tampered with) if you can avoid it. Pushing the data into the central server as each log is generated is far better than polling to get the accumulated messages would be. The syslog protocol is a fairly nice combination of some strictly defined elements and loosely defined content. As a result, things accepting and processing the messages have to accept a lot of strange stuff and deal with it sanely, but at the same time you can set up fairly simple systems to detect strange things being sent to you. What I do is to have dedicated relay boxes that are hardened and monitored to receive the messages from the untrusted zones, they then perform any fixups needed to the logs (including escaping binary characters in log messages), and then send the sanitized messages on to the central log server.
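A minimal sketch of the escaping step mentioned above, assuming the `#nnn` convention that comes up later in this thread (each control byte replaced by `#` plus its three-digit octal value). This is illustrative Python, not rsyslog's code: `escape_control_chars` and its `prefix` parameter are hypothetical names, and passing bytes above 0x7F through unchanged is an assumption of the sketch.

```python
def escape_control_chars(raw: bytes, prefix: str = "#") -> str:
    """Replace ASCII control bytes with prefix + three octal digits.

    Hypothetical helper for illustration only. Bytes 0x00-0x1F and 0x7F
    are escaped; everything else is passed through, decoded as Latin-1
    so that each byte maps to exactly one character.
    """
    out = []
    for b in raw:
        if b < 0x20 or b == 0x7F:
            out.append(f"{prefix}{b:03o}")  # e.g. tab (9) -> "#011"
        else:
            out.append(chr(b))
    return "".join(out)


if __name__ == "__main__":
    print(escape_control_chars(b"user\tlogin\x00failed"))
```

A relay could run something like this on each message body before forwarding, so that downstream parsers only ever see printable text.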
There are many schemes out there to have systems write logs locally and then poll to retrieve the files. rsyslog supports reading log messages from files, so you could easily insert the logs back into a normal syslog stream, but I always try to avoid such schemes. They tend to be fragile, and they leave the logs on untrusted systems where they can be tampered with for a window of time. David Lang From david at lang.hm Mon Jun 7 16:35:15 2010 From: david at lang.hm (david at lang.hm) Date: Mon, 7 Jun 2010 07:35:15 -0700 (PDT) Subject: [rsyslog] discussion request: performance enhancement for imtcp In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com> Message-ID: On Mon, 7 Jun 2010, Rainer Gerhards wrote: > I have now shifted my focus to enhancing the multi-core utilization with > imtcp. So far, we have a single epoll-loop (or select, if epoll is not > supported), which obviously limits concurrency in some environments. I intend > to remove that limit, or at least move the actual value much further. > > There are a couple of things to think about. There is one relatively simple > approach, but if it works is pretty much depending on how things are deployed > in practice. > > So I need your help. I would appreciate if you could read > > http://blog.gerhards.net/2010/06/further-improving-tcp-input-performance.html > > and share your comments. I have created a blogpost because that makes it > easier for me to keep the text as reference. While the post is not exactly > short, it also is not exhaustive, so please don't feel discouraged from > reading it just because my thoughts are on a web site rather than included > inline. for some reason I can't comment on the blogpost, so I'll do it here.
I'm surprised to see this as a problem (especially as my experience has been that the bottlenecks are on the output side, not the input side). The data is serialized as it arrives over the wire (at least if you have a single ethernet port in use), and with epoll I would expect a single thread to have no problem pulling the data from the network stack and putting it somewhere. I think that more research needs to be done on what is eating up the time in your test cases. If it's DNS lookups, they can be disabled (and/or a name cache can be created as we have discussed before). It may be that the parsing that's being done is what's taking the time here, so I would consider something like the following: one thread to pull the data from the wire and dispatch it to N worker threads that would parse the message and put the result into the main queue. Even late last year with UDP messages I was able to saturate a Gig-E network with packets and receive them with <25% of a single cpu. I would not expect that TCP would have noticeably more overhead. David Lang From friedl at hq.adiscon.com Mon Jun 7 16:38:49 2010 From: friedl at hq.adiscon.com (Florian Riedl) Date: Mon, 7 Jun 2010 16:38:49 +0200 Subject: [rsyslog] New rsyslog website - Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E63@GRFEXC.intern.adiscon.com> Dear members of the rsyslog mailing list. My name is Florian Riedl and I work for Adiscon, the creators of rsyslog. We are currently in the process of creating a new website for rsyslog. We found the old website to be not comprehensible enough and the design was pretty old and far from any other current web experience you can find out there in the web. And since rsyslog is an open source project, we want the community behind rsyslog to be involved in the process of developing the new website. We want you to think about some ideas we have and how other ideas could be achieved.
Currently, we need to discuss the following: - Design - Primarily the template with color schemes and graphics - How far back should we preserve the changelog, news entries and downloads? - Content additions that would be useful? We would be happy about any kind of ideas and opinions. We would also appreciate it a lot if someone is willing to aid us with regard to graphics or maintaining the site itself. Contributions to the website will be mentioned of course. The new website is currently available under http://new.rsyslog.com. So please take a look at the website and help us create a good web experience for rsyslog. In the end, this will hopefully result in the rsyslog community growing and thus making rsyslog even better. Best wishes and thank you in advance, Florian Riedl From dirk.schulz at kinzesberg.de Mon Jun 7 17:10:22 2010 From: dirk.schulz at kinzesberg.de (Dirk H. Schulz) Date: Mon, 07 Jun 2010 17:10:22 +0200 Subject: [rsyslog] Pulling syslog messages? In-Reply-To: References: <4C0CEF3C.8040005@kinzesberg.de> Message-ID: <4C0D0BDE.80907@kinzesberg.de> Hi David, On 07.06.10 16:19, david at lang.hm wrote: > On Mon, 7 Jun 2010, Dirk H. Schulz wrote: > > >> Hi folks, >> >> I have stumbled over a difficult question. >> >> In a developed security environment where you run several network zones >> with the most important data/servers in the inner zones and "outside >> contact servers" like web proxies in the outer zone - in such an >> environment a central syslog server should be positioned somewhere in >> the inner zones, but the servers in the outer zones must not be allowed >> to push messages into the inner zones - these messages have to be >> fetched by the central servers from the outside zones' servers. >> > Yes and No. > > The reason for not wanting to push data in is that the data originating > from the less secure zones may be corrupt and/or an attack. > > Pulling the same data in doesn't help protect against this.
> > On the other hand, one of the big reasons for getting the logs from the > less secure zone is for forensic purposes. In this case you want the logs > to not sit on the outside (where they can be tampered with) if you can > avoid it. Pushing the data into the central server as each log is > generated is far better than polling to get the accumulated messages would > be. > Thanks for your elaborate answer. There is a phenomenon you did not address: enterprise management. :-) We have to implement policies defining the rules I described, be they technically sensible or not. > The syslog protocol is a fairly nice combination of some strictly defined > elements and loosely defined content. As a result, things accepting and > processing the messages have to accept a lot of strange stuff and deal > with it sanely, but at the same time you can set up fairly simple systems > to detect strange things being sent to you. > > What I do is to have dedicated relay boxes that are hardened and monitored > to receive the messages from the untrusted zones, they then perform any > fixups needed to the logs (including escaping binary characters in log > messages), and then send the sanitized messages on to the central log > server. > That sounds like a promising compromise I could offer the policy guys. Could you please post an example of how you escape binary characters? > There are many schemes out there to have systems write logs locally and > then poll to retrieve the files. rsyslog supports reading log messages > from files, so you could easily insert the logs back into a normal syslog > stream, but I always try to avoid such schemes. They tend to be fragile, > and they leave the logs on untrusted systems where they can be tampered > with for a window of time. > That is why I instinctively did not want to explore that kind of solution. Thanks a lot, David!
Dirk From david at lang.hm Mon Jun 7 17:27:25 2010 From: david at lang.hm (david at lang.hm) Date: Mon, 7 Jun 2010 08:27:25 -0700 (PDT) Subject: [rsyslog] Pulling syslog messages? In-Reply-To: <4C0D0BDE.80907@kinzesberg.de> References: <4C0CEF3C.8040005@kinzesberg.de> <4C0D0BDE.80907@kinzesberg.de> Message-ID: On Mon, 7 Jun 2010, Dirk H. Schulz wrote: > Hi David, > > Am 07.06.10 16:19, schrieb david at lang.hm: >> On Mon, 7 Jun 2010, Dirk H. Schulz wrote: >> >> >>> Hi folks, >>> >>> I have stumbled over a difficult question. >>> >>> In a developed security environment where you run several network zones >>> with the most important data/servers in the inner zones and "outside >>> contact servers" like web proxies in the outer zone - in such an >>> environment a central syslog server should be positioned somewhere in >>> the inner zones, but the servers in the outer zones must not be allowed >>> to push messages into the inner zones - these messages have to be >>> fetched by the central servers from the outside zones' servers. >>> >> Yes and No. >> >> The reason for not wanting to push data in is that the data originating >> from the less secure zones may be corrupt and/or an attack. >> >> Pulling the same data in doesn't help protect against this. >> >> On the other hand, one of the big reasons for getting the logs from the >> less secure zone is for forensic purposes. In this case you want the logs >> to not sit on the outside (where they can be tampered with) if you can >> avoid it. Pushing the data into the central server as each log is >> generated is far better than polling to get the accumulated messages would >> be. >> > Thanks for your elaborated answer. There is a phenomenon you did not > address: enterprise management. :-) > We have to implement policies defining the rules I described, be they > technically sensible or not. as one of the paranoid security people setting such policies, I understand this. but it's something that can be addressed. 
>> The syslog protocol is a fairly nice combination of some strictly defined >> elements and loosely defined content. As a result, things accepting and >> processing the messages have to accept a lot of strange stuff and deal >> with it sanely, but at the same time you can set up fairly simple systems >> to detect strange things being sent to you. >> >> What I do is to have dedicated relay boxes that are hardened and monitored >> to receive the messages from the untrusted zones, they then perform any >> fixups needed to the logs (including escaping binary characters in log >> messages), and then send the sanitized messages on to the central log >> server. >> > That sounds like a promising compromise I could offer the policy guys. > Could you please post an example of how you escape binary characters? rsyslog does this by default, replacing any non-printable ASCII characters with #nnn where nnn is the octal value of the byte. This confuses some parsers (when there is a tab for example), but it means that the output is pretty safe to deal with. On these hardened servers I also have extra stuff to deal with 'problem' sources (things that don't send properly formatted messages, like things that put a message sequence number in the server position). David Lang >> There are many schemes out there to have systems write logs locally and >> then poll to retrieve the files. rsyslog supports reading log messages >> from files, so you could easily insert the logs back into a normal syslog >> stream, but I always try to avoid such schemes. They tend to be fragile, >> and they leave the logs on untrusted systems where they can be tampered >> with for a window of time. >> > That is why I instinctively did not want to explore that kind of solution. > > Thanks a lot, David!
> > Dirk > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From rgerhards at hq.adiscon.com Mon Jun 7 17:53:26 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 7 Jun 2010 17:53:26 +0200 Subject: [rsyslog] discussion request: performance enhancement for imtcp References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > Sent: Monday, June 07, 2010 4:35 PM > To: rsyslog-users > Subject: Re: [rsyslog] discussion request: performance enhancement for > imtcp > > On Mon, 7 Jun 2010, Rainer Gerhards wrote: > > > I have now shifted my focus to enhancing the multi-core utilization > with > > imtcp. So far, we have a single epoll-loop (or select, if epoll is > not > > supported), which obviously limits concurrency in some environments. > I intend > > to remove that limit, or at least move the actual value much further. > > > > There are a couple of things to think about. There is one relatively > simple > > approach, but if it works is pretty much depending on how things are > deployed > > in practice. > > > > So I need your help. I would appreciate if you could read > > > > http://blog.gerhards.net/2010/06/further-improving-tcp-input- > performance.html > > > > and share your comments. I have created a blogpost because that makes > it > > easier for me to keep the text as reference. While the post is not > exactly > > short, it also is not exhaustive, so please don't feel discouraged > from > > reading it just because my thoughts are on a web site rather than > included > > inline. > > for some reason I can't comment on the blogpost, so I'll do it here. 
sorry, I forgot to mention comments should go to the mailing list ;) Some Chinese spammer recently rendered the blogger comment function useless, especially as blogger is dumb enough to force you to delete each of the (vast number of) spam comments manually. I did one round of deletion, but gave up after an hour or so... > I'm surprised to see this as a problem (especially as my experience has > been that the bottlenecks are on the output side, not the input side) > > the data is serialized as it arrives over the wire (at least if you > have a > single ethernet port in use), and with epoll I would expect a single > thread to have no problem pulling the data from the network stack and > putting it somewhere. > At least this is a problem I got from some high performance sites. They had in common that the actual rule processing was very, very easy, like a *.* filter and just write to file actions. These are *extremely fast* (if you disagree, please do so on list, I would be very interested in that). BUT I need to mention that this was in v4, unfortunately not in v5. That meant that the event handling was done by select() and with select's bad performance for larger connection sets, that may be the culprit. HOWEVER, I got from the reports that the CPUs were NOT saturated (and the message rate lower) when listeners were run inside a single instance, but CPUs got saturated (and the message rate higher) when a couple of rsyslog instances ran. The only explanation I have for this is that the single instance actually did not manage to pull the data from the operating system buffers. I also failed to ask if the machines had multiple NICs, which would further explain the effect seen. I myself unfortunately seem to have an insufficient lab environment to see this effect, which makes it a bit hard for me to judge. > I think that more research needs to be done on what is eating up the > time > in your test cases.
> > If it's DNS lookups, they can be disabled (and/or a > name cache can be created as we have discussed before) That was hardly an issue for the cases I have seen -- many messages were sent over each connection and for tcp the DNS lookup is only done during connection setup. Still it is a good reminder to finish that part of the code (full dns cache). > It may be that the parsing that's being done is what's taking the time > here, so I would consider something like the following > > one thread to pull the data from the wire and dispatch it to N worker > threads that would parse the message and put the result into the main > queue. a) for v5 and some of v4, parsing is no longer done on the input side (and thus runs via a worker pool, main queue worker pool to be precise) b) this architecture requires more context switches, something I would really like to avoid. I guess it would even lead to far worse performance in the single listener case. > > > even late last year with UDP messages I was able to saturate a Gig-E > network with packets and receive them with <25% of a single cpu. I > would > not expect that TCP would have noticeably more overhead. I fully agree, definitely far less (just think that I do need to do one API call for each message with UDP, while I can receive hundreds of messages with a single API call in the case of TCP -- depending on receive buffer and message size).
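To illustrate the point about one TCP receive call carrying many messages, here is a minimal Python sketch. It assumes plain LF-delimited framing on the TCP stream, which is an assumption of this example rather than something stated in the thread; `split_frames` is a hypothetical helper and not an rsyslog function.

```python
from typing import List, Tuple


def split_frames(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split one receive buffer into complete LF-terminated messages.

    Returns (complete_messages, remainder). The remainder is the trailing
    partial message, which the caller should prepend to the next buffer
    it reads from the same connection.
    """
    *complete, remainder = buffer.split(b"\n")
    return complete, remainder


if __name__ == "__main__":
    # One buffer, as a single recv() on a TCP socket might return it.
    chunk = b"<13>first message\n<13>second message\n<13>partial"
    messages, rest = split_frames(chunk)
    print(len(messages), "complete messages; leftover:", rest)
```

With UDP, each datagram is one message and needs its own receive call, whereas here the number of messages per call is bounded only by the receive buffer and message size.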
Rainer From david at lang.hm Mon Jun 7 18:10:54 2010 From: david at lang.hm (david at lang.hm) Date: Mon, 7 Jun 2010 09:10:54 -0700 (PDT) Subject: [rsyslog] discussion request: performance enhancement for imtcp In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com> Message-ID: On Mon, 7 Jun 2010, Rainer Gerhards wrote: >> -----Original Message----- >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >> bounces at lists.adiscon.com] On Behalf Of david at lang.hm > >> I'm surprised to see this as a problem (especially as my experience has >> been that the bottlenecks are on the output side, not the input side) >> >> the data is serialized as it arrives over the wire (at least if you >> have a >> single ethernet port in use), and with epoll I would expect a single >> thread to have no problem pulling the data from the network stack and >> putting it somewhere. >> > > At least this is a problem I got from some high performance sites. They had > in common that the actual rule processing was very, very easy, like a *.* > filter and just write to file actions. These are *extremely fast* (if you > disagree, please do so on list, I would be very interested in that). in my experience (with v5 and UDP messages) the thread that receives the messages is ~20% of the cpu utilization of the thread that writes the messages, even with a simple ruleset (mine is typically *.* /var/log/messages on the central boxes as well) > BUT I need to mention that this was in v4, unfortunately not in v5. That > meant that the event handling was done by select() and with select's bad > performance for larger connection sets, that may be the culprit.
HOWEVER, I > got from the reports that the CPUs were NOT saturated (and the message rate > lower) when listeners were run inside a single instance, but CPUs got > saturated (and the message rate higher) when a couple of rsyslog instances > ran. The only explanation I have for this is that the single instance > actually did not manage to pull the data from the operating system buffers. hmm, this could be locking overhead as well. One thing that you did early in v5 (I don't think it made it into v4) was to allow the UDP receiver to insert multiple messages into the queue at once. That made a huge difference. > I also failed to ask if the machines had multiple NICs, what would some more > explain the effect seen. > > I myself unfortunately seem to have an insufficient lab environment to see > this effect, that makes it a bit hard for me to judge. see if they can do a strace of the various threads for a few seconds under high load. also, can they get you a tcpdump for a few seconds so you can see the number of sources, connections, etc? >> I think that more research needs to be done on what is eating up the >> time >> in your test cases. >> >> If it's DNS lookups, they can be disabled (and/or a >> name cache can be created as we have discussed before) > > That was for the cases I have seen hardly an issue -- many message were sent > over each connection and for tcp the DNS lookup is only done during > connection setup. Still it is a good reminder to finish that part of the code > (full dns cache). good point >> It may be that the parsing that's being done is what's taking the time >> here, so I would consider soemthing like the following >> >> one thread to pull the data from the wire and dispatch it to N worker >> threads that would parse the message and put the result into the main >> queue. 
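The structure proposed in the quoted paragraph (one receiver handing raw data to N parsing workers, which feed the main queue) could be sketched roughly as below. This is an illustration in Python, not rsyslog's architecture: `parse`, `run_pipeline`, and the queue names are hypothetical, and the "parser" only splits off the `<PRI>` prefix.

```python
import queue
import threading


def parse(raw: bytes) -> dict:
    """Hypothetical stand-in for real syslog parsing: split '<PRI>rest'."""
    pri, _, msg = raw.partition(b">")
    return {"pri": int(pri[1:]), "msg": msg.decode("ascii", "replace")}


def run_pipeline(raw_messages, n_workers=4):
    """One dispatcher feeds n_workers parser threads; results land in main_q."""
    work_q = queue.Queue()   # receiver -> parser workers
    main_q = queue.Queue()   # parser workers -> "main queue"

    def worker():
        while True:
            raw = work_q.get()
            if raw is None:          # sentinel: no more work
                break
            main_q.put(parse(raw))

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in workers:
        t.start()
    for raw in raw_messages:         # the single receiver/dispatcher role
        work_q.put(raw)
    for _ in workers:                # one sentinel per worker
        work_q.put(None)
    for t in workers:
        t.join()
    # All workers have exited, so qsize() is stable here.
    return [main_q.get() for _ in range(main_q.qsize())]


if __name__ == "__main__":
    for record in run_pipeline([b"<13>hello", b"<14>world"], n_workers=2):
        print(record)
```

Every message crosses two queues and at least one thread boundary in this layout, and results may come out in a different order than they arrived.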
> > a) for v5 and some of v4, parsing is no longer done on the input side (and > thus runs via a worker pool, main queue worker pool to be precise) > > b) this architecture requires more context switches, something I would really > like to avoid. I guess it would even lead to far worse performance in the > single listener case. >> >> >> even late last year with UDP messages I was able to saturate a Gig-E >> network with packets and receive them with <25% of a single cpu. I >> would >> not expect that TCP would have noticeably more overhead. > > I fully agree, definitely far less (just think that I do need to do one API > call for each message with UDP, while I can receive hundreds of messages > with a single API call in the case of TCP -- depending on receive buffer and > message size). so where is the time being spent? high performance http servers serving static content can do hundreds of thousands of connections in a single thread and saturate gig-E while doing so. This is more processing than rsyslog should have to do, so I am having trouble believing that you need to go to multiple threads to handle the input side of things. David Lang From dirk.schulz at kinzesberg.de Mon Jun 7 18:18:30 2010 From: dirk.schulz at kinzesberg.de (Dirk H. Schulz) Date: Mon, 07 Jun 2010 18:18:30 +0200 Subject: [rsyslog] New rsyslog website - In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E63@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E63@GRFEXC.intern.adiscon.com> Message-ID: <4C0D1BD6.7010403@kinzesberg.de> Hi Florian, On 07.06.10 16:38, Florian Riedl wrote: > Dear members of the rsyslog mailing list. > > > > My name is Florian Riedl and I work for Adiscon, the creators of rsyslog. > > > > We are currently in the process of creating a new website for rsyslog. We > found the old website to be not comprehensible enough and the design was > pretty old and far from any other current web experience you can find out > there in the web.
> That is generally a good idea. I tried to find out if professional support for rsyslog is offered (and did not succeed), stumbled upon at least one dead link, and found the structure hard to understand. And the look and feel is ... somewhat antique, to say the least. > > > And since rsyslog is an open source project, we want the community behind > rsyslog to be involved in the process of developing the new website. > Well, to some extent this sounds like a bad idea - if I may say so. Technicians (like most of us community guys are) are simply unable to develop understandable information structures and visualization strategies. The current website suffers exactly from this: it is structured from the perspective of the people offering something, not from the perspective of the people looking for something. I hope being so brusque is not too much of a problem. > We want you to think about some ideas we have and how other ideas could be > achieved. Currently, we need to discuss the following: > > > > - Design - Primarily the template with color schemes and graphics > > - How far back should we preserve the changelog, news entries and > downloads? > > - Content additions that would be useful? > For someone coming from rsyslog to adiscon an explanation of how rsyslog and monitorware are related would be useful. > > > We would be happy about any kind of ideas and opinions. We would also > appreciate it a lot if someone is willing to aid us with regard to graphics or > maintaining the site itself. Contributions to the website will be mentioned > of course. > > > > The new website is currently available under http://new.rsyslog.com. > Just one more attempt at making you dislike me: The adiscon website should be designed to offer professional services, which means targeting enterprise customers. I personally think you should desperately aim at a professional design. This kind of homemade "construction kit" look is not appealing to the target people: enterprise customers.
I am on the way to persuade my current customer (large national carrier) to deploy Rsyslog and buy professional support for it, but both websites (old and new) would be a real obstacle to that. I hope I can finish this via the feature track. :-) Okay, I shut up now. *cough* Dirk From rgerhards at hq.adiscon.com Mon Jun 7 18:23:22 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 7 Jun 2010 18:23:22 +0200 Subject: [rsyslog] discussion request: performance enhancement for imtcp References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > Sent: Monday, June 07, 2010 6:11 PM > To: rsyslog-users > Subject: Re: [rsyslog] discussion request: performance enhancement for > imtcp > > On Mon, 7 Jun 2010, Rainer Gerhards wrote: > > >> -----Original Message----- > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >> bounces at lists.adiscon.com] On Behalf Of david at lang.hm > > > >> I'm surprised to see this as a problem (especially as my experiance > has > >> been that the bottlenecks are on the output side, not the input > side) > >> > >> the data is serialized as it arrives over the wire (at least if you > >> have a > >> single ethernet port in use), and with epoll I would expect a single > >> thread to have no problem pulling the data from the network stack > and > >> putting it somewhere. > >> > > > > At least this is a problem I got from some high performance sites. > They had > > in common that the actual rule processing was very, very easy, like a > *.* > > filter and just write to file actions. These are *extremely fast* (if > you > > disagree, please do so on list, I would be very interested in that). 
> > in my experience (with v5 and UDP messages) the thread that receives > the > messages is ~20% of the cpu utilization of the thread that writes the > messages, even with a simple ruleset (mine is typically *.* > /var/log/messages on the central boxes as well) that's interesting. I'll try to see if I can reproduce similar behavior. Do you have a chance to do a quick test with TCP in your lab? The input should be even further down (far lower, I'd expect). > > BUT I need to mention that this was in v4, unfortunately not in v5. > That > > meant that the event handling was done by select() and with select's > bad > > performance for larger connection sets, that may be the culprit. > HOWEVER, I > > got from the reports that the CPUs were NOT saturated (and the > message rate > > lower) when listeners were run inside a single instance, but CPUs got > > saturated (and the message rate higher) when a couple of rsyslog > instances > > ran. The only explanation I have for this is that the single instance > > actually did not manage to pull the data from the operating system > buffers. > > hmm, this could be locking overhead as well. One thing that you did > early > in v5 (I don't think it made it into v4) was to allow the UDP receiver > to > insert multiple messages into the queue at once. That made a huge > difference. No, I think that was something I did to both versions. At some time, I did optimizations to both v4 and v5, things like reducing copies, reducing malloc calls and so on. I am pretty sure submission batching was among them. > > > I also failed to ask if the machines had multiple NICs, what would > some more > > explain the effect seen. > > > > I myself unfortunately seem to have an insufficient lab environment > to see > > this effect, that makes it a bit hard for me to judge. > > see if they can do a strace of the various threads for a few seconds > under > high load.
> > also, can they get you a tcpdump for a few seconds so you can see the > number of sources, connections, etc? I'll try to obtain more info. > > >> I think that more research needs to be done on what is eating up the > >> time > >> in your test cases. > >> > >> If it's DNS lookups, they can be disabled (and/or a > >> name cache can be created as we have discussed before) > > > > That was for the cases I have seen hardly an issue -- many message > were sent > > over each connection and for tcp the DNS lookup is only done during > > connection setup. Still it is a good reminder to finish that part of > the code > > (full dns cache). > > good point > > >> It may be that the parsing that's being done is what's taking the > time > >> here, so I would consider soemthing like the following > >> > >> one thread to pull the data from the wire and dispatch it to N > worker > >> threads that would parse the message and put the result into the > main > >> queue. > > > > a) for v5 and some of v4, parsing is done no longer on the input side > (and > > thus runs via a worker pool, main queue worker pool to be precise) > > > > b) this architecture requires more context switches, something I > would really > > like to avoid. I guess it would even lead to far worse performance in > the > > single listener case. > >> > >> > >> even late last year with UDP messages I was able to saturate a Gig-E > >> network with packets and receive them with <25% of a single cpu. I > >> would > >> not expect that TCP would have noticably more overhead. > > > > I fully agree, definitely far less (just think that I do need to do > one API > > call for each message with UDP, while a can receive a hundreds of > messages > > with a single API call in the case of TCP -- depending on receive > buffer and > > message size). > > so where is the time being spent? 
> > high performace http servers serving static content can do hundreds of > thousands of connections in a single thread and saturate gig-E while > doing > so, this is more processing than rsyslog should have to do, so I am > having > trouble believing that you need to go to multiple threads to handle the > input side of things. That's a very convincing argument. I'll go back in my cycle to find evidence of the problem and then see if I am addressing the proper culprit. So it looks like some other optimization is going to come first ;) Thanks, Rainer From epiphani at gmail.com Mon Jun 7 19:20:25 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Mon, 7 Jun 2010 13:20:25 -0400 Subject: [rsyslog] New rsyslog website - In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E63@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E63@GRFEXC.intern.adiscon.com> Message-ID: One thing that I would personally recommend is a complete revamp of the documentation - it tends to be a fairly large complaint that I've heard. Specifically, the docs tend to be fragmented and organized in such a way that don't necessarily flow. Simple things around adding a simple tree structure to the docs would help immensely. As it stands, it can be quite difficult to find any specific docs. -Aaron On Mon, Jun 7, 2010 at 10:38 AM, Florian Riedl wrote: > Dear members of the rsyslog mailing list. > > > > My name is Florian Riedl and I work for Adiscon, the creators of rsyslog. > > > > We are currently in the process of creating a new website for rsyslog. We > found the old website to be not comprehendible enough and the design was > pretty old and far from any other current web experience you can find out > there in the web. > > > > And since rsyslog is a open source project, we want the community behind > rsyslog to be involved in the process of developing the new website. > > > > We want you to think about some ideas we have how other ideas could be > achieved. 
Currently, we need to discuss the following: > > > > - ? ? ? ? ?Design - Primarily the template with color schemes and graphics > > - ? ? ? ? ?How far back should we preserve the changelog, news entries and > downloads? > > - ? ? ? ? ?Content additions that would be useful? > > > > We would be happy about just every kind of ideas and opinions. We would also > appreciate a lot if someone is willing to aid us in regards of graphics or > maintaining the site itself. Contributions to the website will be mentioned > of course. > > > > The new website is currently available under http://new.rsyslog.com. > > > > So please take a look at the website and helps us create a good web > experience for rsyslog. In the end, this will hopefully result in the rsyslog > community growing and thus making rsyslog even better. > > > > Best wishes and thank you in advance, > > > > Florian Riedl > > > > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From epiphani at gmail.com Mon Jun 7 19:16:40 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Mon, 7 Jun 2010 13:16:40 -0400 Subject: [rsyslog] discussion request: performance enhancement for imtcp In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> Message-ID: Greetings, >> hmm, this could be locking overhead as well. One thing that you did >> early >> in v5 (I don't think it made it into v4) was to allow the UDP receiver >> to >> insert multiple messages into the queue at once. That made a huge >> difference. > > No, I think that was something I did to both versions. At some time, I did > optimizations to both v4 and v5, things like reducing copies, reducing malloc > calls and so on. 
I am pretty sure submission batching was among them. I agree with David actually. While multiple tcp threads on the input side certainly would be helpful, I believe the locking overhead is likely the real culprit behind the inability to fully utilize a multi-core machine with a single instance of rsyslog. In my experience, while the input thread was certainly relatively busy, the thread itself wasn't hitting a cpu bottleneck. Reducing some of the latencies around queuing and context switching is probably the best place to spend time if the goal is improved performance. The earlier investigations into lockless queues combined with some batching may help to address these. As it stands, I don't regularly see specific threads hitting cpu bottlenecks (assuming top -H is accurate). Also, if that is the problem (queues and context switching), adding further division of work into imtcp may actually make the problem worse. That said, I'm not against reducing possible bottlenecks to get into the 1-10 gig input levels (at which this would probably become an issue) - but I think the queues should be more closely examined first. -Aaron From rgerhards at hq.adiscon.com Mon Jun 7 22:11:37 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 7 Jun 2010 22:11:37 +0200 Subject: [rsyslog] discussion request: performance enhancement for imtcp References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com> Aaron, David, thanks again for the feedback. I did some testing the past few hours, but all of them on v5, so the number may be quite different for v4. Based on what I have seen, the issue seems not to be lock contention, but rather locking quite long code sequences. 
In v5, we introduced batching, but we also (somewhat later) introduced transactions. A transaction, however, is atomic, and so rsyslog must make sure that no two transactions are mixed together. To do so, the engine applies the action lock when the batch is being processed and releases it when the batch is completely processed. During that processing, not only is the action called (which would be fine), but the to-be-passed strings are also generated. However, if there is a quick filter "in front of the action" (as *.* is!) this is the majority of work that is being carried out. So in essence, due to transaction support, there cannot be more than one worker thread in that large code sequence. I have tested on a quad core machine, and I can't get any more than 1.3 CPUs utilized when I have a single input and a single rule *.* /path/to/file I have then *completely* removed omfile processing (by commenting out everything in doAction), and I still can't get up to more than 1.5 CPUs being utilized, all of this with 4 worker threads defined (the imtcp input uses around 10 to 15% of a single CPU). So this seems to be the culprit, at least as far as v5 is concerned. For v4, I am pretty sure I will get a totally different picture, because there is no such coarse locking and the string generation can be run in parallel (but there we have lots of locking activity and contention). V5 seems to be so much faster that this effect did not really surface. So what is the cure? An easy way would be to generate the strings for the entire batch before beginning to process it. However, there is a reason this has not been done so far: especially with large batches, that would require a large amount of memory. For example, let's say we have a batch of 1,000 messages and each string is 200 bytes long. So we would need to use 200,000 bytes, or roughly 200k, to hold just the strings, where we currently use only 200 bytes (but repeatedly overwrite them).
That amount probably has a big impact on the cache hit ratio: at worst, the first messages will be evicted from cache by the later messages. And as a batch is being processed from beginning to end, each message will evict some strings that we will soon need. The end result could be really bad. For smaller batch sizes, of course, that would not be as much of a problem. The other (partial) solution that immediately came to mind was to check (or make configurable) whether we really *need* (or should use) transactions. If not, there is no problem in interleaving messages and so we would not need to make sure the batch is being processed atomically and thus we could release the lock while generating messages -- much like v4 does. HOWEVER, that also means we have much more locking activity, and so lock contention begins to become a concern again (somehow going in circles, here). Maybe it would be good to have both capabilities and let the operator decide which one suits best. For starters, I will probably just try to change the string generation phase to use a big buffer and place that outside of the atomic transaction code. While I have quite some concerns, outlined above, I would find it interesting to see the effect this has on the overall performance (when used with reasonable batch sizes). This may be useful as a guideline for further work. As usual, comments are greatly appreciated. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe > Sent: Monday, June 07, 2010 7:17 PM > To: rsyslog-users > Subject: Re: [rsyslog] discussion request: performance enhancement for > imtcp > > Greetings, > > >> hmm, this could be locking overhead as well. One thing that you did > >> early > >> in v5 (I don't think it made it into v4) was to allow the UDP > receiver > >> to > >> insert multiple messages into the queue at once. That made a huge > >> difference.
> > > > No, I think that was something I did to both versions. At some time, > I did > > optimizations to both v4 and v5, things like reducing copies, > reducing malloc > > calls and so on. I am pretty sure submission batching was among them. > > I agree with David actually. While multiple tcp threads on the input > side certainly would be helpful, I believe the locking overhead is > likely the real culprit behind the inability to fully utilize a > multi-core machine with a single instance of rsyslog. In my > experience, while the input thread was certainly relatively busy, the > thread itself wasn't hitting a cpu bottleneck. Reducing some of the > latencies around queuing and context switching is probably the best > place to spend time if the goal is improved performance. The earlier > investigations into lockless queues combined with some batching may > help to address these. As it stands, I don't regularly see specific > threads hitting cpu bottlenecks (assuming top -H is accurate). > > Also, if that is the problem (queues and context switching), adding > further division of work into imtcp may actually make the problem > worse. That said, I'm not against reducing possible bottlenecks to > get into the 1-10 gig input levels (at which this would probably > become an issue) - but I think the queues should be more closely > examined first. 
> > -Aaron > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Tue Jun 8 07:22:37 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 8 Jun 2010 07:22:37 +0200 Subject: [rsyslog] discussion request: performance enhancement for imtcp References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com> <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E6A@GRFEXC.intern.adiscon.com> David, thanks for the comments, they are quite useful. A real answer will follow, but I am currently running a test and would like to see that completed. I am creating a very special version (not for actual use) that will remove the action locking for omfile. As a test, that should be fairly easy (but there *are* some subtleties), and it would provide us some good numbers to work on. Just one thing I'd like to verify: to my understanding, once a write() request has been issued, the OS will make sure that this write is handled in an atomic manner, even though another thread may be issuing another write for the same fd in parallel and thread one's timeslice is exhausted. I am pretty sure this is the case, but as it becomes fundamental, I'd like to get some confirmation from others. Thanks, Rainer > -----Original Message----- > From: dlang at lang.hm [mailto:dlang at lang.hm] > Sent: Monday, June 07, 2010 11:49 PM > To: rsyslog-users > Cc: Rainer Gerhards > Subject: Re: [rsyslog] discussion request: performance enhancement for > imtcp > > First off a comment that this may not make it through to the list.
I'm > having to use webmail to access my home e-mail from work and I got > bounced > earlier today. > > On Mon, 7 Jun 2010 22:11:37 +0200, "Rainer Gerhards" > wrote: > > Aaron, David, > > > > thanks again for the feedback. I did some testing the past few hours, > but > > all > > of them on v5, so the number may be quite different for v4. > > > > Based on what I have seen, the issue seems not to be lock contention, > but > > rather locking quite long code sequences. In v5, we introduced > batching, > > but > > we also (somewhat later) introduced transactions. A transaction, > however, > > is > > atomic, and so rsyslog must make sure that no two transactions are > mixed > > together. To do so, the engine applies the action lock when the batch > is > > being processed and releases it when the batch is completely > processed. > > During that processing, not only the action is being called (which > would > be > > fine), but also the to-be-passed strings are generated. However, if > there > > is > > a quick filter "in front of the action" (as *.* is!) this is the > majority > > of > > work that is being carried out. So in essence, due to transaction > support, > > there can not be more than one worker thread in that large code > sequence. I > > have tested on a quad core machine, and I can't get any more than 1.3 > CPUs > > utilized when I have a single input and a single rule > > > > *.* /path/to/file > > > > I have then *completely* removed omfile processing (by commenting out > > everything in doAction), and I still can't get up to more than 1.5 > CPUs > > being > > utilized, all of this with 4 worker thread defined (the imtcp input > uses > > around 10 to 15% of a single CPU). 
> > could this be worked around by using multiple rulesets so that the main > queue is not locked as long (or does this just result in the secondary > queues getting blocked) > > I thought that the transaction support worked like this (which would > not > cause the problem you are describing) > > lock queue > mark N messages as being worked on > unlock queue > process messages > lock queue > if successful > mark the messages tagged above as being completed (and therefor > available to be removed, which may remove them) > else > mark the messages tagged above as not processed > unlock queue > > you don't need to hold the lock while processing the records, only when > tagging them (to make sure another thread doesn't tag them as well) > > from your description of the process it sounds like what is happening > is > > lock queue > mark N messages as being worked on > create output strings for N messages > unlock queue > send messages > lock queue > if successful > mark the messages tagged above as being completed (and therefor > available to be removed, which may remove them) > else > mark the messages tagged above as not processed > unlock queue > > the one thing you need to watch out for here is that you don't move > messages around while they are live (which will cause a little bit of > fragmentation) > > If this is a problem you can create a separate lock for modifying > existing > messages > > the locking rules would be something like > > Allow an unlimited number of worker threads to hold it as a 'read' lock > A process trying to modify (defragment) the queue would try to get it > as a > 'write' lock. > If there are any readers holding the lock the writer stalls waiting for > them > If there is a process that has indicated that it wants the write lock, > new > readers block waiting for the writer > > as long as the defragmentation runs are fairly rare compared to normal > operation this will be very efficient. 
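The read/write locking rules David lists above (any number of worker threads as readers, a defragmenting writer that waits for them, and new readers held back once a writer is waiting) can be sketched roughly as below. This is only a minimal Python illustration under those stated rules, not rsyslog code; the class name WriterPreferringRWLock and its methods are hypothetical.

```python
import threading


class WriterPreferringRWLock:
    """Hypothetical sketch: many readers, one writer, waiting writers block new readers."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0            # worker threads currently holding the read lock
        self._writer_active = False  # True while the defragmenter holds the write lock
        self._writers_waiting = 0    # writers that have announced they want the lock

    def acquire_read(self):
        with self._cond:
            # New readers stall as soon as a writer is active or waiting.
            while self._writer_active or self._writers_waiting > 0:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # a waiting writer may proceed now

    def acquire_write(self):
        with self._cond:
            self._writers_waiting += 1
            try:
                # The writer stalls until all current readers are gone.
                while self._writer_active or self._readers > 0:
                    self._cond.wait()
            finally:
                self._writers_waiting -= 1
            self._writer_active = True

    def release_write(self):
        with self._cond:
            self._writer_active = False
            self._cond.notify_all()  # wake blocked readers and writers
```

As David notes, this only pays off if write (defragmentation) requests are rare; with frequent writers, readers would spend much of their time blocked.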
> > But I suspect that there is not really a need to do the defragmentation > like this, just accept that some space may be wasted until the oldest > messages finish being processed. > > > So this seem to be the culprit as least as far as v5 is involved. For > v4, I > > am pretty sure I will get a totally different picture, because there > is > no > > such coarse locking and the string generation can be run in parallel > (but > > there we have lots of locking activity and contention). V5 seems to > be > so > > much faster, that this effect did not really surface. > > > > So what is the cure? An easy way would be to generate the strings for > the > > entire batch before beginning to process it. However, there is a > reason > > this > > so far is not done: especially with large batches, that would require > large > > amount of memory. For example, let's say we have a batch of 1,000 > messages > > and each string is 200 bytes long. So we would need to use 200,000 > bytes, > > or > > roughly 200k to hold just the strings, where we currently use only > 200 > (but > > repeatedly overwrite it). That amount probably has big impact on the > cache > > hit ratio: at worst, the first messages will be evicted from cache by > the > > later messages. And as a batch is being processed from begin to end, > each > > message will evict some strings that we will soon need. The end > result > > could > > be really bad. For smaller batch sizes, of course, that would not be > as > > much > > of a problem. > > I'm not sure how much of a problem this would be. it's not like you > will > be repeatedly accessing the same strings, it's just that by the time > you > get to the end of the string the beginning may no longer be in the > cache. > > this is inefficient from the cache point of view, but it may still be a > win overall if there is a high transaction overhead in the output. 
In > addition, depending on the output type you may not end up having to > read > the data back into the cache at all (for example, the system may be > able to > DMA the data directly to the output device, network card or disk > controller) > > the number of messages to try and batch is already tunable, if it's set > low you use small strings which fit in cache, if large you don't fit in > cache, but may save elsewhere. > > > The other (partial) solution that immediately came up my mind was to > check > > /be able to configure if we really *need* (or should use) > transactions. > If > > not, there is no problem in interleaving messages and so we would not > need > > to > > make sure the batch is being processed atomically and thus we could > release > > the lock while generating messages -- much like v4 does. HOWEVER, > that > also > > means we have much more locking activity, and so lock contention > begins > to > > become a concern again (somehow going in circles, here). > > > > Maybe it would be good to have both capabilities and let the operator > > decide > > which one suites best. For starters, I will probably just try to > change > the > > string generation phase to use a big buffer and place that outside of > the > > atomic transaction code. While I have quite some concerns, outlined > above, > > I > > find it interesting to see the effect this has on the overall > performance > > (when used with reasonable batch sizes). This may be useful as a > guideline > > for further work. > > we already allow the transaction size to be configured down to 1 > (effectively disabled) so there's no need for a tunable there. > > But if the string generation is being done while the lock is being held > it > should definitely be moved outside of the lock. > > David Lang > > > As usual, comments are greatly appreciated. 
> > > > Rainer > > > >> -----Original Message----- > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >> bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe > >> Sent: Monday, June 07, 2010 7:17 PM > >> To: rsyslog-users > >> Subject: Re: [rsyslog] discussion request: performance enhancement > for > >> imtcp > >> > >> Greetings, > >> > >> >> hmm, this could be locking overhead as well. One thing that you > did > >> >> early > >> >> in v5 (I don't think it made it into v4) was to allow the UDP > >> receiver > >> >> to > >> >> insert multiple messages into the queue at once. That made a huge > >> >> difference. > >> > > >> > No, I think that was something I did to both versions. At some > time, > >> I did > >> > optimizations to both v4 and v5, things like reducing copies, > >> reducing malloc > >> > calls and so on. I am pretty sure submission batching was among > them. > >> > >> I agree with David actually. While multiple tcp threads on the > input > >> side certainly would be helpful, I believe the locking overhead is > >> likely the real culprit behind the inability to fully utilize a > >> multi-core machine with a single instance of rsyslog. In my > >> experience, while the input thread was certainly relatively busy, > the > >> thread itself wasn't hitting a cpu bottleneck. Reducing some of the > >> latencies around queuing and context switching is probably the best > >> place to spend time if the goal is improved performance. The > earlier > >> investigations into lockless queues combined with some batching may > >> help to address these. As it stands, I don't regularly see specific > >> threads hitting cpu bottlenecks (assuming top -H is accurate). > >> > >> Also, if that is the problem (queues and context switching), adding > >> further division of work into imtcp may actually make the problem > >> worse. 
That said, I'm not against reducing possible bottlenecks to > >> get into the 1-10 gig input levels (at which this would probably > >> become an issue) - but I think the queues should be more closely > >> examined first. > >> > >> -Aaron > >> _______________________________________________ > >> rsyslog mailing list > >> http://lists.adiscon.net/mailman/listinfo/rsyslog > >> http://www.rsyslog.com > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com From rgerhards at hq.adiscon.com Tue Jun 8 07:52:37 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 8 Jun 2010 07:52:37 +0200 Subject: [rsyslog] discussion request: performance enhancement for imtcp References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com> <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm> <9B6E2A8877C38245BFB15CC491A11DA7103E69@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E6B@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E6D@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: Rainer Gerhards > Sent: Tuesday, June 08, 2010 7:52 AM > To: 'david at lang.hm' > Subject: RE: [rsyslog] discussion request: performance enhancement for > imtcp > > > > > -----Original Message----- > > From: david at lang.hm [mailto:david at lang.hm] > > Sent: Tuesday, June 08, 2010 7:50 AM > > To: Rainer Gerhards > > Subject: RE: [rsyslog] discussion request: performance enhancement > for > > imtcp > > > > On Tue, 8 Jun 2010, Rainer Gerhards wrote: > > > > >> -----Original Message----- > > >> From: david at lang.hm [mailto:david at lang.hm] > > >> Sent: Tuesday, June 08, 2010 7:45 AM > > >> > > >> On Tue, 8 Jun 2010, Rainer Gerhards 
wrote: > > >> > > >>> Just some thing I'd like to verify: as of my understanding, once > a > > >> write() > > >>> request has been issued, the OS will make sure that this write is > > >> handled in > > >>> an atomic manner, even though another thread may be issuing > another > > >> write for > > >>> the same fd in parallel and thread one's timeslice is exhausted. > I > > am > > >> pretty > > >>> sure this is the case, but as it becomes fundamental, I'd like to > > get > > >> some > > >>> confirmation from others. > > >> > > >> I don't believe that that is the case, write() isn't even > guaranteed > > to > > >> write out all the data that is is passed, it's return is the > number > > of > > >> bytes written and the programmer needs to check and see what is > > >> written. > > > > > > > > > Good reminder, and rsyslog actually has this loop. So that means > that > > the > > > write call itself must be guarded by a mutex, else we may get > > incomplete > > > lines into the file... :( > > > > from a experimental point of view, I think you didn't have this mutex > > in > > some of the v5 series and I was seeing horrible overlaps between > > different > > messages. The more recent versions have not has the problem to the > same > > extent. > > That was caused by an invalid default for the lazy writer. Remember > that the current interface always guarantees that an action is called > only once, so each action call is guarded by a mutex. This is part of > the interface description. So what I am talking about can not happen > with the current interface, it handles that. I am just thinking on how > to permit concurrency at the output module level as well (something > thought about a long time, but not yet implemented). 
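The two points made above, that write() may write fewer bytes than requested and that the retry loop therefore has to be guarded by a mutex if several threads share one fd, can be illustrated with a small sketch. It is written in Python rather than C and is not rsyslog's actual writer; write_all and _write_lock are hypothetical names.

```python
import os
import threading

# Hypothetical module-level mutex shared by every thread that writes to the same fd.
_write_lock = threading.Lock()


def write_all(fd, data):
    """Write all of `data` to `fd`, retrying after short writes.

    The lock is held across the whole retry loop, so another thread cannot
    slip its own bytes in between two partial writes of one record.
    """
    view = memoryview(data)
    with _write_lock:
        while len(view) > 0:
            written = os.write(fd, view)  # may be less than len(view)
            view = view[written:]
```

Without the lock, each individual os.write() call could still complete, but a record that needs two calls could be split by another thread's output, which is the "incomplete lines" case mentioned above.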
> > Rainer > > > > David Lang From rgerhards at hq.adiscon.com Tue Jun 8 09:16:19 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 8 Jun 2010 09:16:19 +0200 Subject: [rsyslog] discussion request: performance enhancement for imtcp References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com> <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: dlang at lang.hm [mailto:dlang at lang.hm] > Sent: Monday, June 07, 2010 11:49 PM > To: rsyslog-users > Cc: Rainer Gerhards > Subject: Re: [rsyslog] discussion request: performance enhancement for > imtcp > > First off a comment that this may not make it through to the list. I'm > having to use webmail to access my home e-mail from work and I got > bounced > earlier today. > > On Mon, 7 Jun 2010 22:11:37 +0200, "Rainer Gerhards" > wrote: > > Aaron, David, > > > > thanks again for the feedback. I did some testing the past few hours, > but > > all > > of them on v5, so the number may be quite different for v4. > > > > Based on what I have seen, the issue seems not to be lock contention, > but > > rather locking quite long code sequences. In v5, we introduced > batching, > > but > > we also (somewhat later) introduced transactions. A transaction, > however, > > is > > atomic, and so rsyslog must make sure that no two transactions are > mixed > > together. To do so, the engine applies the action lock when the batch > is > > being processed and releases it when the batch is completely > processed. > > During that processing, not only the action is being called (which > would > be > > fine), but also the to-be-passed strings are generated. 
However, if > there > > is > > a quick filter "in front of the action" (as *.* is!) this is the > majority > > of > > work that is being carried out. So in essence, due to transaction > support, > > there can not be more than one worker thread in that large code > sequence. I > > have tested on a quad core machine, and I can't get any more than 1.3 > CPUs > > utilized when I have a single input and a single rule > > > > *.* /path/to/file > > > > I have then *completely* removed omfile processing (by commenting out > > everything in doAction), and I still can't get up to more than 1.5 > CPUs > > being > > utilized, all of this with 4 worker thread defined (the imtcp input > uses > > around 10 to 15% of a single CPU). > > could this be worked around by using multiple rulesets so that the main > queue is not locked as long (or does this just result in the secondary > queues getting blocked) The latter is the case: queue processing/output is not fast enough. Let me differentiate output, that is the action, from (queue) processing here. This is important, because when I hear output, I always constrain my thinking to the actual output part minus other internal processing like filters and the like. A core question is whether the interim processing or the actual output takes most of the time. > I thought that the transaction support worked like this (which would > not > cause the problem you are describing) > > lock queue > mark N messages as being worked on > unlock queue > process messages > lock queue > if successful > mark the messages tagged above as being completed (and therefor > available to be removed, which may remove them) > else > mark the messages tagged above as not processed > unlock queue > > you don't need to hold the lock while processing the records, only when > tagging them (to make sure another thread doesn't tag them as well) From the POV of the queue this is correct and how things are done...
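The queue-side scheme David describes (lock, mark N messages as in flight, unlock, process, then lock again to mark them completed or not processed) can be sketched as follows. This is a toy Python model of that pseudocode only, not the rsyslog queue implementation; BatchQueue, dequeue_batch, finish_batch and worker are hypothetical names.

```python
import threading
from collections import deque


class BatchQueue:
    """Toy queue: the lock is held only while tagging messages, never while processing."""

    def __init__(self, messages):
        self._lock = threading.Lock()
        self._pending = deque(messages)
        self._in_flight = {}  # batch id -> messages currently being worked on
        self._next_id = 0
        self.done = []        # messages whose batch completed successfully

    def dequeue_batch(self, n):
        with self._lock:  # lock queue, mark N messages as being worked on
            count = min(n, len(self._pending))
            batch = [self._pending.popleft() for _ in range(count)]
            batch_id = self._next_id
            self._next_id += 1
            self._in_flight[batch_id] = batch
        return batch_id, batch  # queue is unlocked again here

    def finish_batch(self, batch_id, success):
        with self._lock:  # lock queue again only to record the outcome
            batch = self._in_flight.pop(batch_id)
            if success:
                self.done.extend(batch)
            else:
                # Put the messages back at the front in their original order.
                self._pending.extendleft(reversed(batch))


def worker(queue, process, batch_size):
    """Process batches until the queue is empty; `process` runs without the queue lock.

    Note: a batch that keeps failing would be retried forever in this sketch.
    """
    while True:
        batch_id, batch = queue.dequeue_batch(batch_size)
        if not batch:
            queue.finish_batch(batch_id, True)
            return
        try:
            for msg in batch:
                process(msg)
            ok = True
        except Exception:
            ok = False
        queue.finish_batch(batch_id, ok)
```

Several worker threads can run this loop in parallel because the lock is released while process() runs; what this sketch leaves out is the per-action lock, which is a separate matter from the queue lock.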
> > from your description of the process it sounds like what is happening > is > > lock queue > mark N messages as being worked on > create output strings for N messages > unlock queue > send messages > lock queue > if successful > mark the messages tagged above as being completed (and therefor > available to be removed, which may remove them) > else > mark the messages tagged above as not processed > unlock queue > > the one thing you need to watch out for here is that you don't move > messages around while they are live (which will cause a little bit of > fragmentation) This is not correct. But you miss the action part. Remember that the output plugin interface specifies that only one thread may be inside an action at any time. This was introduced to facilitate writing output plugins. In theory, an output plugin can request to be called concurrently, but this is not yet implemented. So we need to hold on to the action lock (NOT queue lock) whenever we call an action. Even more, transactions mean that we must not interleave two or more batches. Let's say we had two batches A and B, each with 4 messages. Then calling the output as follows: Abegin Bbegin A1 A2 B1 A3 B2 A4 Acommit B3 B4 Bcommit would mean that at Acommit, messages A1,..,A4,B1,B2 would be committed. This could be worked around by far more complex output plugins. These would then need to not only support concurrency but also keep separate objects/connections for the various threads. This, if at all, only makes sense for database plugins. I don't see that the added overhead would make any sense at all for things like the file writer. But as we have already discussed, it is not so easy to keep the file writer problem free in that case as well -- because it may get interrupted during writes (which means we need a lock, even if we manage to permit more concurrency inside the file writer).
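The interleaving problem with batches A and B can be reproduced with a toy model. The sketch below is Python and purely illustrative: SingleConnectionOutput is a hypothetical stand-in for an output plugin with one shared connection, where commit() commits everything submitted since the previous commit, and run_batch shows the effect of holding an action lock for the whole batch.

```python
import threading


class SingleConnectionOutput:
    """Toy output with one shared connection: commit() flushes everything buffered so far."""

    def __init__(self):
        self._buffer = []
        self.commits = []  # one list of messages per commit

    def submit(self, msg):
        self._buffer.append(msg)

    def commit(self):
        self.commits.append(list(self._buffer))
        self._buffer.clear()


# Replay the interleaved call order from the example above (begin calls are no-ops here).
interleaved = SingleConnectionOutput()
for call in ["Abegin", "Bbegin", "A1", "A2", "B1", "A3", "B2", "A4",
             "Acommit", "B3", "B4", "Bcommit"]:
    if call.endswith("begin"):
        continue
    if call.endswith("commit"):
        interleaved.commit()
    else:
        interleaved.submit(call)

# The first commit contains B1 and B2 although batch B was not yet complete.
print(interleaved.commits)


def run_batch(output, action_lock, batch):
    """Hold the (hypothetical) action lock for the entire batch so commits stay per batch."""
    with action_lock:
        for msg in batch:
            output.submit(msg)
        output.commit()
```

With run_batch, two worker threads can still prepare their batches in parallel, but the calls into the output are serialized per batch, which matches the coarse locking described for v5.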
So in essence, the area to look at is how we can restructure the output plugin interface in regard to its transaction support. I am currently looking at this area and have done some preliminary testing. My main concern at this time is to find those spots that actually are the primary bottlenecks (at this time, hopefully moving the border forward ;)). The past hours I thankfully was able to get some base results and match them with what I expect. At some other places, the results surprise me a bit. This is not unexpected -- I had no time to touch that code (from a performance point of view) for roughly a year, so I need to gain some new understanding. Also, the code has evolved, and it may be possible to refactor it into something simpler (which is good for both performance and maintainability).

As one of the next things, I will probably use the "big memory, off sync" string generation, just to see the effects (it is rather complicated to get that in cleanly, because there was so much optimization in v4 on cache hit efficiency, parts of which must be undone). Along that way, I will also analyze the calling structure and search for simplifications.

And, as usual, your feedback is very helpful and appreciated. Good questions often lead to good thinking and analysis and thus lead to things I had not thought about without them ;)

As a side-note, I already identified one regression that caused locks to be applied too often when messages were discarded. I already committed a fix for that to the master branch.

> > If this is a problem you can create a separate lock for modifying > existing > messages > > the locking rules would be something like > > Allow an unlimited number of worker threads to hold it as a 'read' lock > A process trying to modify (defragment) the queue would try to get it > as a > 'write' lock.
> If there are any readers holding the lock the writer stalls waiting for > them > If there is a process that has indicated that it wants the write lock, > new > readers block waiting for the writer > > as long as the defragmentation runs are fairly rare compared to normal > operation this will be very efficient. > > But I suspect that there is not really a need to do the defragmentation > like this, just accept that some space may be wasted until the oldest > messages finish being processed. > > > So this seem to be the culprit as least as far as v5 is involved. For > v4, I > > am pretty sure I will get a totally different picture, because there > is > no > > such coarse locking and the string generation can be run in parallel > (but > > there we have lots of locking activity and contention). V5 seems to > be > so > > much faster, that this effect did not really surface. > > > > So what is the cure? An easy way would be to generate the strings for > the > > entire batch before beginning to process it. However, there is a > reason > > this > > so far is not done: especially with large batches, that would require > large > > amount of memory. For example, let's say we have a batch of 1,000 > messages > > and each string is 200 bytes long. So we would need to use 200,000 > bytes, > > or > > roughly 200k to hold just the strings, where we currently use only > 200 > (but > > repeatedly overwrite it). That amount probably has big impact on the > cache > > hit ratio: at worst, the first messages will be evicted from cache by > the > > later messages. And as a batch is being processed from begin to end, > each > > message will evict some strings that we will soon need. The end > result > > could > > be really bad. For smaller batch sizes, of course, that would not be > as > > much > > of a problem. > > I'm not sure how much of a problem this would be. 
it's not like you > will > be repeatedly accessing the same strings, it's just that by the time > you > get to the end of the string the beginning may no longer be in the > cache. > > this is inefficient from the cache point of view, but it may still be a > win overall if there is a high transaction overhead in the output. In > addition, depending on the output type you may not end up having to > read > the data back into the cache at all (for example, the system may be > able to > DMA the data directly to the output device, network card or disk > controller) > > the number of messages to try and batch is already tunable, if it's set > low you use small strings which fit in cache, if large you don't fit in > cache, but may save elsewhere. jup, and that's probably the next thing I look at (in code not yet meant to run in practice, so I may hardcode some things, just to see their effect). > > > The other (partial) solution that immediately came up my mind was to > check > > /be able to configure if we really *need* (or should use) > transactions. > If > > not, there is no problem in interleaving messages and so we would not > need > > to > > make sure the batch is being processed atomically and thus we could > release > > the lock while generating messages -- much like v4 does. HOWEVER, > that > also > > means we have much more locking activity, and so lock contention > begins > to > > become a concern again (somehow going in circles, here). > > > > Maybe it would be good to have both capabilities and let the operator > > decide > > which one suites best. For starters, I will probably just try to > change > the > > string generation phase to use a big buffer and place that outside of > the > > atomic transaction code. While I have quite some concerns, outlined > above, > > I > > find it interesting to see the effect this has on the overall > performance > > (when used with reasonable batch sizes). This may be useful as a > guideline > > for further work. 
> > we already allow the transaction size to be configured down to 1 > (effectively disabled) so there's no need for a tunable there. > > But if the string generation is being done while the lock is being held > it > should definitely be moved outside of the lock.

I agree, but remember the problems I saw in my first post. Anyhow, we are now looking at them. I also think it makes a lot of sense to provide different code paths for plugins that actually need a transactional interface (like database outputs) and those that are not actually transactional (like file output). The latter code path can contain some optimizations over the former.

Rainer

> > David Lang > > > As usual, comments are greatly appreciated. > > > > Rainer > > > >> -----Original Message----- > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >> bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe > >> Sent: Monday, June 07, 2010 7:17 PM > >> To: rsyslog-users > >> Subject: Re: [rsyslog] discussion request: performance enhancement > for > >> imtcp > >> > >> Greetings, > >> > >> >> hmm, this could be locking overhead as well. One thing that you > did > >> >> early > >> >> in v5 (I don't think it made it into v4) was to allow the UDP > >> receiver > >> >> to > >> >> insert multiple messages into the queue at once. That made a huge > >> >> difference. > >> > > >> > No, I think that was something I did to both versions. At some > time, > >> I did > >> > optimizations to both v4 and v5, things like reducing copies, > >> reducing malloc > >> > calls and so on. I am pretty sure submission batching was among > them. > >> > >> I agree with David actually. While multiple tcp threads on the > input > >> side certainly would be helpful, I believe the locking overhead is > >> likely the real culprit behind the inability to fully utilize a > >> multi-core machine with a single instance of rsyslog.
In my > >> experience, while the input thread was certainly relatively busy, > the > >> thread itself wasn't hitting a cpu bottleneck. Reducing some of the > >> latencies around queuing and context switching is probably the best > >> place to spend time if the goal is improved performance. The > earlier > >> investigations into lockless queues combined with some batching may > >> help to address these. As it stands, I don't regularly see specific > >> threads hitting cpu bottlenecks (assuming top -H is accurate). > >> > >> Also, if that is the problem (queues and context switching), adding > >> further division of work into imtcp may actually make the problem > >> worse. That said, I'm not against reducing possible bottlenecks to > >> get into the 1-10 gig input levels (at which this would probably > >> become an issue) - but I think the queues should be more closely > >> examined first. > >> > >> -Aaron > >> _______________________________________________ > >> rsyslog mailing list > >> http://lists.adiscon.net/mailman/listinfo/rsyslog > >> http://www.rsyslog.com > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com From rgerhards at hq.adiscon.com Tue Jun 8 12:03:44 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 8 Jun 2010 12:03:44 +0200 Subject: [rsyslog] discussion request: performance enhancement for imtcp References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com><25a9276339df3684fcdeb6af67d30467@asgard.lang.hm> <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E75@GRFEXC.intern.adiscon.com> OK, *very* interesting: I have now partially analyzed control flow as 
it is after all the changes to v4 and v5. Over the years, more and more features went into rsyslog, most of them used only in a minority of cases. Plus, there is "last message repeated n times", which I have elaborated on quite often so far (and which I really dislike).

I have now verified that most of these features require serialization inside the first stage of action processing. For example, rate limiting features like "do not execute this action more than 10 times within 5 seconds" require a proper count of messages, and so do others. The end result is that this stage needs to be guarded by a lock, and it is. So during that processing, no concurrency is possible. Also, the code path has become rather long and complex.

Now, these features are there for a good reason. It doesn't make sense to remove them just to get more performance. However, I now plan to partition the code and serialize things only when selected features actually demand that.

I have created a version that hardcodes a "firehose" mode where none of these esoteric features are used (and "last message ..." is also disabled). It immediately more than doubled the throughput, without even a change to the action lock! I am not sure if this is a realistic speedup (as I created a very specific environment), but it clearly shows there is value in that approach.

So I will now see if I can create some code that enables a "firehose mode" if action parameters permit that (and the default action parameters, I think, should do so).
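The idea of picking the code path from the action parameters might look roughly like the sketch below. This is an illustration only, not the rsyslog implementation: ActionConfig, Action and all field names are hypothetical, and the rate limit is reduced to a plain counter (no time window) to keep the example short.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ActionConfig:
    # Hypothetical parameter names, not real rsyslog settings.
    rate_limit_count: int = 0            # 0 means "no execution limit"
    reduce_repeated_msgs: bool = False   # "last message repeated n times"

    def needs_serialization(self) -> bool:
        # Features that keep per-action state must run under the lock.
        return self.rate_limit_count > 0 or self.reduce_repeated_msgs


@dataclass
class Action:
    config: ActionConfig
    lock: threading.Lock = field(default_factory=threading.Lock)
    executed: int = 0

    def prepare(self, msg):
        if not self.config.needs_serialization():
            # Firehose path: no shared state is touched, so no lock is taken.
            return ("firehose", msg)
        with self.lock:
            # Serialized path: the counter must be exact across threads.
            self.executed += 1
            limit = self.config.rate_limit_count
            if limit and self.executed > limit:
                return ("dropped", msg)
            return ("serialized", msg)


default_action = Action(ActionConfig())
limited_action = Action(ActionConfig(rate_limit_count=2))

print(default_action.prepare("msg 1")[0])
print([limited_action.prepare(f"msg {i}")[0] for i in range(3)])
```

The point of the split is that the common case (default parameters) never touches the lock, while the rarely used features keep their exact semantics.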
Rainer

From david at lang.hm Tue Jun 8 13:10:12 2010 From: david at lang.hm (david at lang.hm) Date: Tue, 8 Jun 2010 04:10:12 -0700 (PDT) Subject: [rsyslog] discussion request: performance enhancement for imtcp In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com> <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm> <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com> Message-ID: On Tue, 8 Jun 2010, Rainer Gerhards wrote: >> -----Original Message----- >> From: dlang at lang.hm [mailto:dlang at lang.hm] >> On Mon, 7 Jun 2010 22:11:37 +0200, "Rainer Gerhards" >> wrote: > > this is not correct. But you miss the action part. > > Remember that the output plugin interface specifies that only one thread may > be inside an action at any time concurrently. This was introduced to > facilitate writing output plugins. In theory, an output plugin can request to > be called concurrently, but this is not yet implemented. So we need to hold > on to the action lock (NOT queue lock) whenever we call an action. > > Even more, transactions mean that we must not interleave two or more batches. > Let's say we had two batches A und B, each with 4 messages.
Then calling the > output as follows: > > Abegin > Bbegin > A1 > A2 > B1 > A3 > B2 > A4 > Acommit > B3 > B4 > Bcommit > > would mean that at Acommit, messages A1,..,A4,B1,B2 would be committed. This > could be worked around by far more complex ouput plugins. These would then > need to not only support concurrency but also keep separate > objects/connections for the various threads. This, if at all, makes only > sense for database plugins. I don't see if the added overhead would make any > sense at all to things like the file writer. > > But as we have already discussed, it is not so easy to keep the file writer > problem free in that case as well -- because it may get interrupted during > writes (which means we need a lock, even if we manage to permit more > concurrency inside the file writer). > > So in essence, the area to look at is that we can restructure the output > plugin interface in regard to its transaction support. I am currently looking > at this area and have done some preliminary testing. My main concern at this > time is to find those spots that actually are the primary bottlenecks (at > this time, hopefully moving the border forward ;)). The past hours I > thankfully was able to get same base results and match them with what I > expect. At some other places, the results surprise me a bit. This is not > unexpected -- I had no time to touch that code (under a performance poing of > few) for roughly a year, so I need to gain some new understanding. Also, the > code has evolved, and it may be possible to refactor it into something > simpler (which is good for both performance and maintability). > > As one of the next things, I will probably use the "big memory, off sync" > string generation, just to see the effects (it is rather complicated to get > that in cleanly, because there was so much optimization in v4 on cache hit > efficiency, parts of which must be undone). 
Along that way, I will also > analyze the calling structure and search for simplifications.

hmm, I was thinking something along the lines of the following (crafting details as I type, so there may be errors here)

queue: a1 a2 a3 a4 b1 b2 b3 b4 c1 .....

worker thread 1                  worker thread 2

lock queue
mark a1-a4 'in process'
unlock queue
start processing action          lock queue
by creating output               find a1-a4 'in process'
strings (one per action)         so mark b1-b4 'in process'
                                 unlock queue
                                 start processing action
                                 by creating output
                                 strings (one per action)

time passes                      time passes

for each action                  for each action
  lock output                      lock output
  send string                      send string
  unlock output                    unlock output

                                 lock queue
                                 mark b1-b4 complete
                                 find that b1 is not the
                                 beginning of the list and
                                 do nothing further
                                 unlock queue
lock queue
mark a1-a4 complete
find that a1-b4 are all
marked as complete so
move start-of-queue to c1

the locks on the output are a simple mutex for each output (very cheap if nothing else is holding the lock, which since only writing takes place within it should be the common case), which worker thread gets to a particular output first doesn't matter, as long as it flushes all its work before releasing the lock.

note that the output lock is only needed when the two threads really are accessing the same thing (probably only for files, as you can have two network connections to the same destination at the same time, in which case you can use the path name as the lock id).

For things like databases, network relays (including relp) it would probably be better if each worker thread opened its own connection. In these cases the destination is designed to accept messages in parallel on multiple connections anyway. The good news is that the more complex (and slower) sending methods also tend to be the ones that can have multiple outbound connections.
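The queue side of this scheme (hold the queue lock only while tagging, and move the start of the queue only once the oldest messages are complete) could be sketched as follows. This is a toy model under my own assumptions, not how the rsyslog queue is implemented: BatchQueue, claim and finish are made-up names, and the two workers are played out sequentially so the result is deterministic.

```python
import threading


class BatchQueue:
    """Toy queue: the lock is held only while tagging, never while processing."""

    def __init__(self, messages):
        self._lock = threading.Lock()
        self._msgs = list(messages)
        self._state = ["queued"] * len(self._msgs)
        self.head = 0  # index of the oldest message still in the queue

    def claim(self, n):
        """Mark up to n queued messages 'in process' and return their indexes."""
        with self._lock:
            idxs = [i for i in range(self.head, len(self._msgs))
                    if self._state[i] == "queued"][:n]
            for i in idxs:
                self._state[i] = "in process"
            return idxs

    def message(self, i):
        return self._msgs[i]

    def finish(self, idxs, ok=True):
        """Mark a batch complete (or give it back) and advance the head if possible."""
        with self._lock:
            for i in idxs:
                self._state[i] = "complete" if ok else "queued"
            # Only a fully completed prefix can be dropped from the queue.
            while self.head < len(self._msgs) and self._state[self.head] == "complete":
                self.head += 1


q = BatchQueue(["a1", "a2", "a3", "a4", "b1", "b2", "b3", "b4", "c1"])
batch_a = q.claim(4)      # worker thread 1 tags a1-a4
batch_b = q.claim(4)      # worker thread 2 finds a1-a4 taken and tags b1-b4

q.finish(batch_b)         # worker 2 finishes first: b1 is not the head
head_after_b = q.head
q.finish(batch_a)         # worker 1 finishes: a1-b4 complete, head moves to c1

print(head_after_b, q.head, q.message(q.head))
```

Processing (string generation and sending) would happen between claim and finish, outside the queue lock, which is what allows two workers to run in parallel.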
for writing to a file, you need some sort of lock to be able to have multiple threads without the threads stepping on top of each other with their writes anyway. this assumes that the two worker threads can do everything (except possibly output the data) for different messages in parallel. I seem to remember reading in the module explination that you do some trickery to take fairly normal code written in the module and make it thread-safe (by doing something with the variable access IIRC). A similar trick for the actual output could have a flag to toggle between 'single output with locking' and 'each worker thread gets a duplicate output with no locking' so that it's not a huge complexity in each output module ('just' a one-time complexity to setup the handling) if all you are doing is to have an action lock that single-threads all activity for that action, then this isn't possible. If you have this (and use the filename as the lock) you also gain protection against two different actions stepping on each other. I have a growing number of cases where I have things like :hostname, isequal, "foo" /var/log/messages;fixup_format & ~ *.* /var/log/messages this works today if I'm sending over the network instead of writing to a file, but on my relay boxes (which do both) I have a number of corrupted messages each day due to the different actions stepping on each other. note that if you do this output locking on files, it may be possible to do strange things like =*.info /var/log/messages =*.debug /var/log/messages etc and allow these to have multiple worker threads running so that each worker be processing messages with different severity as different actions in parallel (with just a write lock around the final output to the file). This is far uglier than being able to do the action processing in parallel, but may work. Having re-read your message and my thoughts, I think I end up arguing for changes to the output like you were speculating at above. 
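Using the file name as the lock id could be sketched like this. Again this is only an illustration with made-up names (lock_for, write_batch), not rsyslog code: two "actions" running in separate threads write batches to the same file, and the lock looked up by path keeps each batch together.

```python
import os
import tempfile
import threading
from collections import defaultdict

# One mutex per output file, looked up by absolute path (hypothetical helper).
_path_locks = defaultdict(threading.Lock)
_registry_lock = threading.Lock()


def lock_for(path):
    with _registry_lock:
        return _path_locks[os.path.abspath(path)]


def write_batch(path, lines):
    # Build the output string outside the lock, hold the lock only for the write.
    data = "".join(line + "\n" for line in lines)
    with lock_for(path):
        with open(path, "a") as f:
            f.write(data)   # the file is flushed and closed before the lock is released


log_path = os.path.join(tempfile.mkdtemp(), "messages")


def worker(tag):
    for batch in range(50):
        write_batch(log_path, [f"{tag} batch={batch} msg={i}" for i in range(4)])


threads = [threading.Thread(target=worker, args=(tag,)) for tag in ("action1", "action2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

with open(log_path) as f:
    written = f.read().splitlines()

intact = all(line.startswith("action") and "msg=" in line for line in written)
print(len(written), "lines written, all intact:", intact)
```

Because the lock is keyed by the path and not by the action, two different rules that point at the same file are protected against each other as well.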
I don't see much here where threads handling one message instead of multiple
messages could speed things up much. Since writes are not atomic, you still
need the output locks (or multiple outputs) even if only processing one
message at a time.

single thread, single message is a simpler case, but in that case the
locking will be very close to a no-op anyway (since there will never be
contention)

David Lang

From rgerhards at hq.adiscon.com Tue Jun 8 13:45:28 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Tue, 8 Jun 2010 13:45:28 +0200
Subject: [rsyslog] discussion request: performance enhancement for imtcp
References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com>
 <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm>
 <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E75@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E76@GRFEXC.intern.adiscon.com>

The good results seem to hold true in practice as well. I did a first,
somewhat solid implementation; the commit can be found here:

http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=3e49a1075ab6750135e1a38cf0c213579fa30b4a

If no special params are selected, firehose mode is used for stage 1
processing (up to enqueue in action queue), providing full concurrency. One
problem so far is the setting to emit mark messages only when no messages
have recently been written. Unfortunately, that mode is the default. It
enforces some serialization (keep last written timestamp for action) and thus
cannot be used in full firehose mode. For now, I have simply ignored that
setting. In the somewhat longer term, I will probably end up with three or
more slightly different algorithms, each dedicated to a specific set of
parameters.
The "mark issue", for example, I think can relatively easily be solved by
lock-free synchronization via a CAS loop. I have put considerable effort into
researching this since March, and I hope that work now pays off.

The test suite also tells me that something must not be 100% correct: at
least omruleset has a problem with the new code (the test segfaults). So
there is more work to do, but I am confident these are comparatively minor
issues.

Looking at the performance, on my "ad-hoc performance lab" (read: virtualized
development environment) I get around 70k mps both for the old and the new
code (with the new one being a very slight bit faster) when I run them on one
thread. This is a quad-core system, and three cores are almost idle. So when
I select three main message queue workers, the old code goes *down* to around
40k mps -- this is the lock contention we have so often seen and obviously
not yet solved. But the new code goes up to 120k mps, and three CPUs get
utilized at around 85%. So obviously, sync is still eating up more CPU than it
should, but we get a considerable speedup of about 1.7 (120k vs. 70k mps),
though that is low compared to almost three times as many CPUs. Anyhow, this
is the work of maybe 10 hours of actual coding and analysis, so it doesn't
look too bad ;) Also, it tells me that I am probably on the right track -- and
my previous investment in research pays off :).

If someone wants to test out the new version, please use the commit above; I
am not sure what future commits will break. I'll tell you when I think there
is another version suitable for testing. Please report back your findings on
performance, I'd be very interested to hear about them. Keep in mind that you
need to run the main message queue with more than one worker in order to see
any difference.
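A CAS loop for the "last written" timestamp behind the mark issue could be shaped roughly like the sketch below. It is not the code from action.c: the names `AtomicInt`, `compare_and_swap` and `advance_last_written` are hypothetical, and since Python has no user-level compare-and-swap, the primitive is emulated here with a tiny internal lock. In C it would be a single atomic instruction, and only then is the loop truly lock-free. What the sketch does show is the retry pattern: read, compute, try to publish, and start over if another thread got there first.

```python
import threading


class AtomicInt:
    """Integer cell offering compare-and-swap (emulated; see the note above)."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        with self._guard:
            return self._value

    def compare_and_swap(self, expected, new):
        """Store `new` only if the cell still holds `expected`; report success."""
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


def advance_last_written(cell, now):
    """Move the shared timestamp forward to `now`; return True if we stored it."""
    while True:
        seen = cell.load()
        if now <= seen:
            return False            # someone already recorded a later write
        if cell.compare_and_swap(seen, now):
            return True             # our value was published
        # Another thread changed the cell in between: discard and retry.


last_written = AtomicInt(0)
stamps = [[10, 40, 70], [20, 50, 80], [30, 60, 90]]


def worker(times):
    for t in times:
        advance_last_written(last_written, t)


threads = [threading.Thread(target=worker, args=(s,)) for s in stamps]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_stamp = last_written.load()
```

Whatever order the workers run in, the cell ends up holding the newest timestamp, and no worker ever waits on a mutex that spans the whole action.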
Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com
> [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
> Sent: Tuesday, June 08, 2010 12:04 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] discussion request: performance enhancement for
> imtcp
>
> OK, *very* interesting: I have now partially analyzed control flow as it is
> after all the changes to v4 and v5. Over the years, more and more features
> went into rsyslog, most of them used only in the minority of cases. Plus,
> there is "last message repeated n time", that I have elaborated quite often
> so far (and which I really dislike).
>
> I have now checked that most of these features require serialization inside
> the first stage of action processing. For example, rate limiting features
> like "do not execute this action more than 10 times within 5 seconds"
> require a proper count of message, and so do others. The end result is that
> this stage needs to be guarded by a lock and is so. So during that
> processing, no concurrency is possible. Also, the code path has become
> rather long and complex.
>
> Now, these feature are there for a good reason. It doesn't make sense to
> remove them, just to get more performance. However, I now plan to partition
> the code, and serialize things only when selected features actually demand
> that. I have created a version that hardcodes a "firehose" mode where none
> of these esoteric features are used (and also "last message ..." is also
> disabled). It immediately more than doubled the throughput, without even a
> change to the action lock! I am not sure if this is a realistic speedup (as
> I created a very specific environment), but it clearly shows there is value
> in that approach.
>
> So I will now see if I can create some code that enables a "firehose mode"
> if action parameters permit that (and the default action parameters, I
> think, should do so).
> > Rainer > > > -----Original Message----- > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > > Sent: Tuesday, June 08, 2010 9:16 AM > > To: dlang at lang.hm; rsyslog-users > > Subject: Re: [rsyslog] discussion request: performance enhancement > for > > imtcp > > > > > -----Original Message----- > > > From: dlang at lang.hm [mailto:dlang at lang.hm] > > > Sent: Monday, June 07, 2010 11:49 PM > > > To: rsyslog-users > > > Cc: Rainer Gerhards > > > Subject: Re: [rsyslog] discussion request: performance enhancement > > for > > > imtcp > > > > > > First off a comment that this may not make it through to the list. > > I'm > > > having to use webmail to access my home e-mail from work and I got > > > bounced > > > earlier today. > > > > > > On Mon, 7 Jun 2010 22:11:37 +0200, "Rainer Gerhards" > > > wrote: > > > > Aaron, David, > > > > > > > > thanks again for the feedback. I did some testing the past few > > hours, > > > but > > > > all > > > > of them on v5, so the number may be quite different for v4. > > > > > > > > Based on what I have seen, the issue seems not to be lock > > contention, > > > but > > > > rather locking quite long code sequences. In v5, we introduced > > > batching, > > > > but > > > > we also (somewhat later) introduced transactions. A transaction, > > > however, > > > > is > > > > atomic, and so rsyslog must make sure that no two transactions > are > > > mixed > > > > together. To do so, the engine applies the action lock when the > > batch > > > is > > > > being processed and releases it when the batch is completely > > > processed. > > > > During that processing, not only the action is being called > (which > > > would > > > be > > > > fine), but also the to-be-passed strings are generated. However, > if > > > there > > > > is > > > > a quick filter "in front of the action" (as *.* is!) this is the > > > majority > > > > of > > > > work that is being carried out. 
So in essence, due to transaction > > > support, > > > > there can not be more than one worker thread in that large code > > > sequence. I > > > > have tested on a quad core machine, and I can't get any more than > > 1.3 > > > CPUs > > > > utilized when I have a single input and a single rule > > > > > > > > *.* /path/to/file > > > > > > > > I have then *completely* removed omfile processing (by commenting > > out > > > > everything in doAction), and I still can't get up to more than > 1.5 > > > CPUs > > > > being > > > > utilized, all of this with 4 worker thread defined (the imtcp > input > > > uses > > > > around 10 to 15% of a single CPU). > > > > > > could this be worked around by using multiple rulesets so that the > > main > > > queue is not locked as long (or does this just result in the > > secondary > > > queues getting blocked) > > > > that later is the case: queue processing/output is not fast enough > > > > Let me differentiate output, that is the action, from (queue) > > processing > > here. This is important, because when I hear output, I always > constrain > > my > > thinking to the actual output part minus other internal processing > like > > filters and the like. A core question is if the interim processing or > > the > > actual output takes most of the time. 
> > > > > I thought that the transaction support worked like this (which > would > > > not > > > cause the problem you are describing) > > > > > > lock queue > > > mark N messages as being worked on > > > unlock queue > > > process messages > > > lock queue > > > if successful > > > mark the messages tagged above as being completed (and therefor > > > available to be removed, which may remove them) > > > else > > > mark the messages tagged above as not processed > > > unlock queue > > > > > > you don't need to hold the lock while processing the records, only > > when > > > tagging them (to make sure another thread doesn't tag them as well) > > > > >From the POV of the queue this is correct and how things are done... > > > > > > > > from your description of the process it sounds like what is > happening > > > is > > > > > > lock queue > > > mark N messages as being worked on > > > create output strings for N messages > > > unlock queue > > > send messages > > > lock queue > > > if successful > > > mark the messages tagged above as being completed (and therefor > > > available to be removed, which may remove them) > > > else > > > mark the messages tagged above as not processed > > > unlock queue > > > > > > the one thing you need to watch out for here is that you don't move > > > messages around while they are live (which will cause a little bit > of > > > fragmentation) > > > > this is not correct. But you miss the action part. > > > > Remember that the output plugin interface specifies that only one > > thread may > > be inside an action at any time concurrently. This was introduced to > > facilitate writing output plugins. In theory, an output plugin can > > request to > > be called concurrently, but this is not yet implemented. So we need > to > > hold > > on to the action lock (NOT queue lock) whenever we call an action. > > > > Even more, transactions mean that we must not interleave two or more > > batches. 
> > Let's say we had two batches A und B, each with 4 messages. Then > > calling the > > output as follows: > > > > Abegin > > Bbegin > > A1 > > A2 > > B1 > > A3 > > B2 > > A4 > > Acommit > > B3 > > B4 > > Bcommit > > > > would mean that at Acommit, messages A1,..,A4,B1,B2 would be > committed. > > This > > could be worked around by far more complex ouput plugins. These would > > then > > need to not only support concurrency but also keep separate > > objects/connections for the various threads. This, if at all, makes > > only > > sense for database plugins. I don't see if the added overhead would > > make any > > sense at all to things like the file writer. > > > > But as we have already discussed, it is not so easy to keep the file > > writer > > problem free in that case as well -- because it may get interrupted > > during > > writes (which means we need a lock, even if we manage to permit more > > concurrency inside the file writer). > > > > So in essence, the area to look at is that we can restructure the > > output > > plugin interface in regard to its transaction support. I am currently > > looking > > at this area and have done some preliminary testing. My main concern > at > > this > > time is to find those spots that actually are the primary bottlenecks > > (at > > this time, hopefully moving the border forward ;)). The past hours I > > thankfully was able to get same base results and match them with what > I > > expect. At some other places, the results surprise me a bit. This is > > not > > unexpected -- I had no time to touch that code (under a performance > > poing of > > few) for roughly a year, so I need to gain some new understanding. > > Also, the > > code has evolved, and it may be possible to refactor it into > something > > simpler (which is good for both performance and maintability). 
> > > > As one of the next things, I will probably use the "big memory, off > > sync" > > string generation, just to see the effects (it is rather complicated > to > > get > > that in cleanly, because there was so much optimization in v4 on > cache > > hit > > efficiency, parts of which must be undone). Along that way, I will > also > > analyze the calling structure and search for simplifications. > > > > And, as usal, your feedback is very helpful and appreciated. Good > > questions > > often lead to good thinking and analysis and thus lead to thinks I > had > > not > > thought about without them ;) > > > > As a side-note, I already identified one regression that caused locks > > to be > > applied to often when messages were discarded. I already committed a > > fix for > > that to the master branch. > > > > > > If this is a problem you can create a separate lock for modifying > > > existing > > > messages > > > > > > the locking rules would be something like > > > > > > Allow an unlimited number of worker threads to hold it as a 'read' > > lock > > > A process trying to modify (defragment) the queue would try to get > it > > > as a > > > 'write' lock. > > > If there are any readers holding the lock the writer stalls waiting > > for > > > them > > > If there is a process that has indicated that it wants the write > > lock, > > > new > > > readers block waiting for the writer > > > > > > as long as the defragmentation runs are fairly rare compared to > > normal > > > operation this will be very efficient. > > > > > > But I suspect that there is not really a need to do the > > defragmentation > > > like this, just accept that some space may be wasted until the > oldest > > > messages finish being processed. > > > > > > > So this seem to be the culprit as least as far as v5 is involved. 
> > For > > > v4, I > > > > am pretty sure I will get a totally different picture, because > > there > > > is > > > no > > > > such coarse locking and the string generation can be run in > > parallel > > > (but > > > > there we have lots of locking activity and contention). V5 seems > to > > > be > > > so > > > > much faster, that this effect did not really surface. > > > > > > > > So what is the cure? An easy way would be to generate the strings > > for > > > the > > > > entire batch before beginning to process it. However, there is a > > > reason > > > > this > > > > so far is not done: especially with large batches, that would > > require > > > large > > > > amount of memory. For example, let's say we have a batch of 1,000 > > > messages > > > > and each string is 200 bytes long. So we would need to use > 200,000 > > > bytes, > > > > or > > > > roughly 200k to hold just the strings, where we currently use > only > > > 200 > > > (but > > > > repeatedly overwrite it). That amount probably has big impact on > > the > > > cache > > > > hit ratio: at worst, the first messages will be evicted from > cache > > by > > > the > > > > later messages. And as a batch is being processed from begin to > > end, > > > each > > > > message will evict some strings that we will soon need. The end > > > result > > > > could > > > > be really bad. For smaller batch sizes, of course, that would not > > be > > > as > > > > much > > > > of a problem. > > > > > > I'm not sure how much of a problem this would be. it's not like you > > > will > > > be repeatedly accessing the same strings, it's just that by the > time > > > you > > > get to the end of the string the beginning may no longer be in the > > > cache. > > > > > > this is inefficient from the cache point of view, but it may still > be > > a > > > win overall if there is a high transaction overhead in the output. 
> In > > > addition, depending on the output type you may not end up having to > > > read > > > the data back into the cache at all (for example, the system may be > > > able to > > > DMA the data directly to the output device, network card or disk > > > controller) > > > > > > the number of messages to try and batch is already tunable, if it's > > set > > > low you use small strings which fit in cache, if large you don't > fit > > in > > > cache, but may save elsewhere. > > > > jup, and that's probably the next thing I look at (in code not yet > > meant to > > run in practice, so I may hardcode some things, just to see their > > effect). > > > > > > > > > The other (partial) solution that immediately came up my mind was > > to > > > check > > > > /be able to configure if we really *need* (or should use) > > > transactions. > > > If > > > > not, there is no problem in interleaving messages and so we would > > not > > > need > > > > to > > > > make sure the batch is being processed atomically and thus we > could > > > release > > > > the lock while generating messages -- much like v4 does. HOWEVER, > > > that > > > also > > > > means we have much more locking activity, and so lock contention > > > begins > > > to > > > > become a concern again (somehow going in circles, here). > > > > > > > > Maybe it would be good to have both capabilities and let the > > operator > > > > decide > > > > which one suites best. For starters, I will probably just try to > > > change > > > the > > > > string generation phase to use a big buffer and place that > outside > > of > > > the > > > > atomic transaction code. While I have quite some concerns, > outlined > > > above, > > > > I > > > > find it interesting to see the effect this has on the overall > > > performance > > > > (when used with reasonable batch sizes). This may be useful as a > > > guideline > > > > for further work. 
> > > > > > we already allow the transaction size to be configured down to 1 > > > (effectively disabled) so there's no need for a tunable there. > > > > > > But if the string generation is being done while the lock is being > > held > > > it > > > should definitely be moved outside of the lock. > > > > I agree, but remember the problems I saw in my first post. Anyhow, > we > > are > > now looking at them. I also think it makes a lot of sense to provide > > different code pathes for plugins that actually need a transactional > > interface (like database outputs) and those that are not actually > > transactional (like file output). The later code path can contain > some > > optimization over the first. > > > > Rainer > > > > > > David Lang > > > > > > > As usual, comments are greatly appreciated. > > > > > > > > Rainer > > > > > > > >> -----Original Message----- > > > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > > >> bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe > > > >> Sent: Monday, June 07, 2010 7:17 PM > > > >> To: rsyslog-users > > > >> Subject: Re: [rsyslog] discussion request: performance > enhancement > > > for > > > >> imtcp > > > >> > > > >> Greetings, > > > >> > > > >> >> hmm, this could be locking overhead as well. One thing that > you > > > did > > > >> >> early > > > >> >> in v5 (I don't think it made it into v4) was to allow the UDP > > > >> receiver > > > >> >> to > > > >> >> insert multiple messages into the queue at once. That made a > > huge > > > >> >> difference. > > > >> > > > > >> > No, I think that was something I did to both versions. At some > > > time, > > > >> I did > > > >> > optimizations to both v4 and v5, things like reducing copies, > > > >> reducing malloc > > > >> > calls and so on. I am pretty sure submission batching was > among > > > them. > > > >> > > > >> I agree with David actually. 
While multiple tcp threads on the > > > input > > > >> side certainly would be helpful, I believe the locking overhead > is > > > >> likely the real culprit behind the inability to fully utilize a > > > >> multi-core machine with a single instance of rsyslog. In my > > > >> experience, while the input thread was certainly relatively > busy, > > > the > > > >> thread itself wasn't hitting a cpu bottleneck. Reducing some of > > the > > > >> latencies around queuing and context switching is probably the > > best > > > >> place to spend time if the goal is improved performance. The > > > earlier > > > >> investigations into lockless queues combined with some batching > > may > > > >> help to address these. As it stands, I don't regularly see > > specific > > > >> threads hitting cpu bottlenecks (assuming top -H is accurate). > > > >> > > > >> Also, if that is the problem (queues and context switching), > > adding > > > >> further division of work into imtcp may actually make the > problem > > > >> worse. That said, I'm not against reducing possible bottlenecks > > to > > > >> get into the 1-10 gig input levels (at which this would probably > > > >> become an issue) - but I think the queues should be more closely > > > >> examined first. 
> > > >>
> > > >> -Aaron
> > > >> _______________________________________________
> > > >> rsyslog mailing list
> > > >> http://lists.adiscon.net/mailman/listinfo/rsyslog
> > > >> http://www.rsyslog.com

From rgerhards at hq.adiscon.com Tue Jun 8 15:35:42 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Tue, 8 Jun 2010 15:35:42 +0200
Subject: [rsyslog] discussion request: performance enhancement for imtcp
References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com>
 <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm>
 <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E75@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E76@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E7C@GRFEXC.intern.adiscon.com>

I could go even further. New commit:

http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=11bd517465360278b270ee7c18607b4d1d97e44e

Mark messages are now correctly handled again, and the performance is equally
good in that case because we do not need to do a full serialization.
For those interested, I'd like to draw your attention to the lock-free CAS
loop at line 1356 of action.c:

http://git.adiscon.com/?p=rsyslog.git;a=blob;f=action.c;h=b8751c636038e344f4f0c479be2e85cfce8ba6ff;hb=11bd517465360278b270ee7c18607b4d1d97e44e#l1344

It handles time updates in a lock-free manner. My educated guess is that we
can replace a couple of mutex calls by methods similar to this CAS loop and
by that increase performance and decrease complexity.

If you have not been in touch with lock-free methods, this is more or less
speculative execution, where the computation result is discarded if some
other thread was faster in persisting its result. The computation then is
retried. While this sounds a bit like a waste of CPU time, it actually is more
efficient, as we do not have all the overhead of locking methods (think
context switches et al). There is good scientific literature backing this
approach as being correct and efficient. Lock-free methods are often used
inside operating systems themselves, including very demanding real-time OSes.
Of course, there are restrictions on what can be done with them, but often
they provide an excellent alternative to mutexes and other blocking
synchronization mechanisms.

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com
> [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
> Sent: Tuesday, June 08, 2010 1:45 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] discussion request: performance enhancement for
> imtcp
>
> The good results seem to hold true in practice as well. I did a first
> somewhat solid implementation, commit can be found here:
>
> http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=3e49a1075ab6750135e1a38cf0c213579fa30b4a
>
> If no special params are selected, firehose mode is used for stage 1
> processing (up to enqueue in action queue), providing full concurrency.
> One > problem so far is the setting to emit mark messages only when no > messages > have recently been written. Unfortunately, that mode is the default. It > enforces some serialization (keep last written timestamp for action) > and thus > cannot be used in full firehose mode. For now, I have simply ignored > that > setting. In the somewhat longer term, I will probably end up with three > or > more slightly different algorithms, each dedicated to a specific set of > parameters. The "mark issue", for example, I think can be relatively > easy be > solved by lock-free synchronization via a CAS loop. I have put > considerably > effort into researching this since march, and I hope that work now pays > back. > > The test suite also tells me that something most not be 100% correct, > at > least omruleset has a problem with the new code (the test segfaults). > So > there is more work to do, but I am confident these are comparatively > minor > issues. > > Looking at the performance, on my "ad-hoc performance lab" (read: > virtualized > development environment) I get around ~ 70K mps both for the old and > the new > code (with the new one being a very slight bit faster) when I run them > on one > thread. This is a quad-core system, and three cores are almost idle. So > when > I select three main message queue workers, the old code goes *down* to > around > 40k mps -- this is the lock contention we have so often seen and > obviously > not yet solved. But the new code goes up to 120k mps, and three CPUs > get > utilized around 85%. So obviously, sync is still eating up more CPU > than it > should, but we get a considerable speedup of 1.7 (but low compared to > almost > three times as many CPUs). Anyhow, this is the work of maybe 10 hours > of > actual coding and analysis, so it doesn't look too bad ;) Also, it > tells that > I am probably on the right track -- and my previous investment in > research > pays back :). 
> > If someone want to test out the new version, please use the commit > above, I > am not sure what future commits will break. I'll tell you when I think > there > is another version suitable for testing. Please report back your > findings on > performance, I'd be very interested to hear about them. Keep in mind > that you > need to run the main message queue with more than one worker in order > to see > any difference. > > Rainer > > > -----Original Message----- > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > > Sent: Tuesday, June 08, 2010 12:04 PM > > To: rsyslog-users > > Subject: Re: [rsyslog] discussion request: performance enhancement > for > > imtcp > > > > OK, *very* interesting: I have now partially analyzed control flow as > > it is > > after all the changes to v4 and v5. Over the years, more and more > > features > > went into rsyslog, most of them used only in the minority of cases. > > Plus, > > there is "last message repeated n time", that I have elaborated quite > > often > > so far (and which I really dislike). > > > > I have now checked that most of these features require serialization > > inside > > the first stage of action processing. For example, rate limiting > > features > > like "do not execute this action more than 10 times within 5 seconds" > > require > > a proper count of message, and so do others. The end result is that > > this > > stage needs to be guarded by a lock and is so. So during that > > processing, no > > concurrency is possible. Also, the code path has become rather long > and > > complex. > > > > Now, these feature are there for a good reason. It doesn't make sense > > to > > remove them, just to get more performance. However, I now plan to > > partition > > the code, and serialize things only when selected features actually > > demand > > that. 
I have created a version that hardcodes a "firehose" mode where > > none of > > these esoteric features are used (and also "last message ..." is also > > disabled). It immediately more than doubled the throughput, without > > even a > > change to the action lock! I am not sure if this is a realistic > speedup > > (as I > > created a very specific environment), but it clearly shows there is > > value in > > that approach. > > > > So I will now see if I can create some code that enables a "firehose > > mode" if > > action parameters permit that (and the default action parameters, I > > think, > > should do so). > > > > Rainer > > > > > -----Original Message----- > > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > > > Sent: Tuesday, June 08, 2010 9:16 AM > > > To: dlang at lang.hm; rsyslog-users > > > Subject: Re: [rsyslog] discussion request: performance enhancement > > for > > > imtcp > > > > > > > -----Original Message----- > > > > From: dlang at lang.hm [mailto:dlang at lang.hm] > > > > Sent: Monday, June 07, 2010 11:49 PM > > > > To: rsyslog-users > > > > Cc: Rainer Gerhards > > > > Subject: Re: [rsyslog] discussion request: performance > enhancement > > > for > > > > imtcp > > > > > > > > First off a comment that this may not make it through to the > list. > > > I'm > > > > having to use webmail to access my home e-mail from work and I > got > > > > bounced > > > > earlier today. > > > > > > > > On Mon, 7 Jun 2010 22:11:37 +0200, "Rainer Gerhards" > > > > wrote: > > > > > Aaron, David, > > > > > > > > > > thanks again for the feedback. I did some testing the past few > > > hours, > > > > but > > > > > all > > > > > of them on v5, so the number may be quite different for v4. > > > > > > > > > > Based on what I have seen, the issue seems not to be lock > > > contention, > > > > but > > > > > rather locking quite long code sequences. 
In v5, we introduced > > > > batching, > > > > > but > > > > > we also (somewhat later) introduced transactions. A > transaction, > > > > however, > > > > > is > > > > > atomic, and so rsyslog must make sure that no two transactions > > are > > > > mixed > > > > > together. To do so, the engine applies the action lock when the > > > batch > > > > is > > > > > being processed and releases it when the batch is completely > > > > processed. > > > > > During that processing, not only the action is being called > > (which > > > > would > > > > be > > > > > fine), but also the to-be-passed strings are generated. > However, > > if > > > > there > > > > > is > > > > > a quick filter "in front of the action" (as *.* is!) this is > the > > > > majority > > > > > of > > > > > work that is being carried out. So in essence, due to > transaction > > > > support, > > > > > there can not be more than one worker thread in that large code > > > > sequence. I > > > > > have tested on a quad core machine, and I can't get any more > than > > > 1.3 > > > > CPUs > > > > > utilized when I have a single input and a single rule > > > > > > > > > > *.* /path/to/file > > > > > > > > > > I have then *completely* removed omfile processing (by > commenting > > > out > > > > > everything in doAction), and I still can't get up to more than > > 1.5 > > > > CPUs > > > > > being > > > > > utilized, all of this with 4 worker thread defined (the imtcp > > input > > > > uses > > > > > around 10 to 15% of a single CPU). > > > > > > > > could this be worked around by using multiple rulesets so that > the > > > main > > > > queue is not locked as long (or does this just result in the > > > secondary > > > > queues getting blocked) > > > > > > that later is the case: queue processing/output is not fast enough > > > > > > Let me differentiate output, that is the action, from (queue) > > > processing > > > here. 
> > > This is important, because when I hear output, I always constrain my
> > > thinking to the actual output part minus other internal processing
> > > like filters and the like. A core question is whether the interim
> > > processing or the actual output takes most of the time.
> > >
> > > > I thought that the transaction support worked like this (which
> > > > would not cause the problem you are describing)
> > > >
> > > > lock queue
> > > > mark N messages as being worked on
> > > > unlock queue
> > > > process messages
> > > > lock queue
> > > > if successful
> > > >   mark the messages tagged above as being completed (and therefore
> > > >   available to be removed, which may remove them)
> > > > else
> > > >   mark the messages tagged above as not processed
> > > > unlock queue
> > > >
> > > > you don't need to hold the lock while processing the records, only
> > > > when tagging them (to make sure another thread doesn't tag them as
> > > > well)
> > >
> > > From the POV of the queue this is correct and how things are done...
> > >
> > > > from your description of the process it sounds like what is
> > > > happening is
> > > >
> > > > lock queue
> > > > mark N messages as being worked on
> > > > create output strings for N messages
> > > > unlock queue
> > > > send messages
> > > > lock queue
> > > > if successful
> > > >   mark the messages tagged above as being completed (and therefore
> > > >   available to be removed, which may remove them)
> > > > else
> > > >   mark the messages tagged above as not processed
> > > > unlock queue
> > > >
> > > > the one thing you need to watch out for here is that you don't move
> > > > messages around while they are live (which will cause a little bit
> > > > of fragmentation)
> > >
> > > this is not correct. But you miss the action part.
> > >
> > > Remember that the output plugin interface specifies that only one
> > > thread may be inside an action at any time.
> > > This was introduced to facilitate writing output plugins. In theory,
> > > an output plugin can request to be called concurrently, but this is
> > > not yet implemented. So we need to hold on to the action lock (NOT
> > > the queue lock) whenever we call an action.
> > >
> > > Even more, transactions mean that we must not interleave two or more
> > > batches. Let's say we had two batches A and B, each with 4 messages.
> > > Then calling the output as follows:
> > >
> > > Abegin
> > > Bbegin
> > > A1
> > > A2
> > > B1
> > > A3
> > > B2
> > > A4
> > > Acommit
> > > B3
> > > B4
> > > Bcommit
> > >
> > > would mean that at Acommit, messages A1,..,A4,B1,B2 would be
> > > committed. This could be worked around by far more complex output
> > > plugins. These would then need to not only support concurrency but
> > > also keep separate objects/connections for the various threads. This,
> > > if at all, only makes sense for database plugins. I don't see that
> > > the added overhead would make any sense at all for things like the
> > > file writer.
> > >
> > > But as we have already discussed, it is not so easy to keep the file
> > > writer problem free in that case as well -- because it may get
> > > interrupted during writes (which means we need a lock, even if we
> > > manage to permit more concurrency inside the file writer).
> > >
> > > So in essence, the area to look at is whether we can restructure the
> > > output plugin interface in regard to its transaction support. I am
> > > currently looking at this area and have done some preliminary
> > > testing. My main concern at this time is to find those spots that
> > > actually are the primary bottlenecks (at this time, hopefully moving
> > > the border forward ;)). The past hours I thankfully was able to get
> > > some base results and match them with what I expect. At some other
> > > places, the results surprise me a bit.
> > > This is not unexpected -- I had no time to touch that code (from a
> > > performance point of view) for roughly a year, so I need to gain
> > > some new understanding. Also, the code has evolved, and it may be
> > > possible to refactor it into something simpler (which is good for
> > > both performance and maintainability).
> > >
> > > As one of the next things, I will probably use the "big memory, off
> > > sync" string generation, just to see the effects (it is rather
> > > complicated to get that in cleanly, because there was so much
> > > optimization in v4 on cache hit efficiency, parts of which must be
> > > undone). Along that way, I will also analyze the calling structure
> > > and search for simplifications.
> > >
> > > And, as usual, your feedback is very helpful and appreciated. Good
> > > questions often lead to good thinking and analysis and thus lead to
> > > things I had not thought about without them ;)
> > >
> > > As a side-note, I already identified one regression that caused locks
> > > to be applied too often when messages were discarded. I already
> > > committed a fix for that to the master branch.
> > >
> > > > If this is a problem you can create a separate lock for modifying
> > > > existing messages
> > > >
> > > > the locking rules would be something like
> > > >
> > > > Allow an unlimited number of worker threads to hold it as a 'read'
> > > > lock
> > > > A process trying to modify (defragment) the queue would try to get
> > > > it as a 'write' lock.
> > > > If there are any readers holding the lock the writer stalls waiting
> > > > for them
> > > > If there is a process that has indicated that it wants the write
> > > > lock, new readers block waiting for the writer
> > > >
> > > > as long as the defragmentation runs are fairly rare compared to
> > > > normal operation this will be very efficient.
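
The four locking rules quoted just above can be sketched in a few lines. This is only an illustration of the idea, not rsyslog code: rsyslog itself is written in C, Python is used here purely for brevity, and the names WriterPreferringRWLock and run_demo (and the defragment stand-in) are hypothetical.

```python
import threading


class WriterPreferringRWLock:
    """Many concurrent readers, one writer; a waiting writer blocks new readers."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0          # threads currently holding the read lock
        self._writer = False       # True while a writer holds the lock
        self._writers_waiting = 0  # writers that have announced interest

    def acquire_read(self):
        with self._cond:
            # New readers block while a writer is active or waiting.
            while self._writer or self._writers_waiting:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            self._writers_waiting += 1
            # The writer stalls until all current readers are done.
            while self._writer or self._readers:
                self._cond.wait()
            self._writers_waiting -= 1
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def run_demo(n_messages=8):
    """Workers read queue entries while one thread 'defragments' the queue."""
    queue = ["msg-%d" % i for i in range(n_messages)]
    lock = WriterPreferringRWLock()
    processed = []
    processed_guard = threading.Lock()

    def worker(index):
        lock.acquire_read()
        try:
            item = queue[index]      # read while holding the read lock
        finally:
            lock.release_read()
        with processed_guard:
            processed.append(item)

    def defragment():
        lock.acquire_write()
        try:
            queue[:] = list(queue)   # stand-in for moving messages around
        finally:
            lock.release_write()

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_messages)]
    threads.append(threading.Thread(target=defragment))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed


if __name__ == "__main__":
    print(sorted(run_demo()))
```

As the quoted text says, this only pays off if the write side (defragmentation) is rare; every write stalls all workers for its duration.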
> > > >
> > > > But I suspect that there is not really a need to do the
> > > > defragmentation like this, just accept that some space may be
> > > > wasted until the oldest messages finish being processed.
> > > >
> > > > > So this seems to be the culprit at least as far as v5 is
> > > > > concerned. For v4, I am pretty sure I will get a totally different
> > > > > picture, because there is no such coarse locking and the string
> > > > > generation can be run in parallel (but there we have lots of
> > > > > locking activity and contention). V5 seems to be so much faster
> > > > > that this effect did not really surface.
> > > > >
> > > > > So what is the cure? An easy way would be to generate the strings
> > > > > for the entire batch before beginning to process it. However,
> > > > > there is a reason this so far is not done: especially with large
> > > > > batches, that would require a large amount of memory. For example,
> > > > > let's say we have a batch of 1,000 messages and each string is 200
> > > > > bytes long. So we would need to use 200,000 bytes, or roughly
> > > > > 200k, to hold just the strings, where we currently use only 200
> > > > > (but repeatedly overwrite it). That amount probably has a big
> > > > > impact on the cache hit ratio: at worst, the first messages will
> > > > > be evicted from cache by the later messages. And as a batch is
> > > > > being processed from beginning to end, each message will evict
> > > > > some strings that we will soon need. The end result could be
> > > > > really bad. For smaller batch sizes, of course, that would not be
> > > > > as much of a problem.
> > > >
> > > > I'm not sure how much of a problem this would be.
> > > > it's not like you will be repeatedly accessing the same strings,
> > > > it's just that by the time you get to the end of the string the
> > > > beginning may no longer be in the cache.
> > > >
> > > > this is inefficient from the cache point of view, but it may still
> > > > be a win overall if there is a high transaction overhead in the
> > > > output. In addition, depending on the output type you may not end
> > > > up having to read the data back into the cache at all (for example,
> > > > the system may be able to DMA the data directly to the output
> > > > device, network card or disk controller)
> > > >
> > > > the number of messages to try and batch is already tunable, if it's
> > > > set low you use small strings which fit in cache, if large you
> > > > don't fit in cache, but may save elsewhere.
> > >
> > > yup, and that's probably the next thing I will look at (in code not
> > > yet meant to run in practice, so I may hardcode some things, just to
> > > see their effect).
> > >
> > > > > The other (partial) solution that immediately came to my mind was
> > > > > to check / be able to configure whether we really *need* (or
> > > > > should use) transactions. If not, there is no problem in
> > > > > interleaving messages and so we would not need to make sure the
> > > > > batch is being processed atomically and thus we could release the
> > > > > lock while generating messages -- much like v4 does. HOWEVER,
> > > > > that also means we have much more locking activity, and so lock
> > > > > contention begins to become a concern again (somehow going in
> > > > > circles, here).
> > > > >
> > > > > Maybe it would be good to have both capabilities and let the
> > > > > operator decide which one suits best.
> > > > > For starters, I will probably just try to change the string
> > > > > generation phase to use a big buffer and place that outside of
> > > > > the atomic transaction code. While I have quite some concerns,
> > > > > outlined above, I find it interesting to see the effect this has
> > > > > on the overall performance (when used with reasonable batch
> > > > > sizes). This may be useful as a guideline for further work.
> > > >
> > > > we already allow the transaction size to be configured down to 1
> > > > (effectively disabled) so there's no need for a tunable there.
> > > >
> > > > But if the string generation is being done while the lock is being
> > > > held it should definitely be moved outside of the lock.
> > >
> > > I agree, but remember the problems I saw in my first post. Anyhow, we
> > > are now looking at them. I also think it makes a lot of sense to
> > > provide different code paths for plugins that actually need a
> > > transactional interface (like database outputs) and those that are
> > > not actually transactional (like file output). The latter code path
> > > can contain some optimization over the first.
> > >
> > > Rainer
> > >
> > > >
> > > > David Lang
> > > >
> > > > > As usual, comments are greatly appreciated.
> > > > >
> > > > > Rainer
> > > > >
> > > > >> -----Original Message-----
> > > > >> From: rsyslog-bounces at lists.adiscon.com
> > > > >> [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe
> > > > >> Sent: Monday, June 07, 2010 7:17 PM
> > > > >> To: rsyslog-users
> > > > >> Subject: Re: [rsyslog] discussion request: performance
> > > > >> enhancement for imtcp
> > > > >>
> > > > >> Greetings,
> > > > >>
> > > > >> >> hmm, this could be locking overhead as well.
> > > > >> >> One thing that you did early in v5 (I don't think it made it
> > > > >> >> into v4) was to allow the UDP receiver to insert multiple
> > > > >> >> messages into the queue at once. That made a huge difference.
> > > > >> >
> > > > >> > No, I think that was something I did to both versions. At some
> > > > >> > time, I did optimizations to both v4 and v5, things like
> > > > >> > reducing copies, reducing malloc calls and so on. I am pretty
> > > > >> > sure submission batching was among them.
> > > > >>
> > > > >> I agree with David actually. While multiple tcp threads on the
> > > > >> input side certainly would be helpful, I believe the locking
> > > > >> overhead is likely the real culprit behind the inability to fully
> > > > >> utilize a multi-core machine with a single instance of rsyslog.
> > > > >> In my experience, while the input thread was certainly relatively
> > > > >> busy, the thread itself wasn't hitting a cpu bottleneck. Reducing
> > > > >> some of the latencies around queuing and context switching is
> > > > >> probably the best place to spend time if the goal is improved
> > > > >> performance. The earlier investigations into lockless queues
> > > > >> combined with some batching may help to address these. As it
> > > > >> stands, I don't regularly see specific threads hitting cpu
> > > > >> bottlenecks (assuming top -H is accurate).
> > > > >>
> > > > >> Also, if that is the problem (queues and context switching),
> > > > >> adding further division of work into imtcp may actually make the
> > > > >> problem worse. That said, I'm not against reducing possible
> > > > >> bottlenecks to get into the 1-10 gig input levels (at which this
> > > > >> would probably become an issue) - but I think the queues should
> > > > >> be more closely examined first.
> > > > >>
> > > > >> -Aaron
> > > > >> _______________________________________________
> > > > >> rsyslog mailing list
> > > > >> http://lists.adiscon.net/mailman/listinfo/rsyslog
> > > > >> http://www.rsyslog.com

From rgerhards at hq.adiscon.com Wed Jun 9 10:52:55 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Wed, 9 Jun 2010 10:52:55 +0200
Subject: [rsyslog] discussion request: performance enhancement for imtcp
References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com>
 <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm>
 <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com>

> > Also, strings are not generated before the actions, but while we are
> > processing them. Doing it up front would require even more memory and
> > processing time, because we would need to run over all actions twice
> > (once to create the string, once to call the actions, storing all
> > strings created).
> > This does not even make sense from a lock contention POV because each
> > action has a separate lock, so there can be no lock contention between
> > different actions. The question of whether to generate all strings for
> > ONE action upfront was the initial question, and I think we have
> > reached some consensus on it (meaning that it is at least worth trying
> > out the performance effects and then deciding).
>
> I'm not quite clear on the granularity here.
>
> if I have the config
>
> *.* file1
> *.* file2
> *.* @ip1
> *.* @ip2
> *.* @@ip3
> *.* @@ip4
>
> for purposes of the locking, how many separate things are there?

Sorry, I think I should have defined some terms first.

An *action* is a specific instance of some desired output. The actual
processing carried out is NOT termed "action", even though one could
easily do so. I have to admit I have not defined any term for that. So
let's call this processing. That actual processing is carried out by the
output module (and the really bad thing is that the entry point is named
"doAction", which somewhat implies that the output module is called the
action, which is not the case).

Each action can use the service of exactly one output module. Each output
module can provide services to many actions. So we have an N:1
relationship between actions and output modules.

> depending on how I read your explanation, sometimes it sounds like 6
> (one for each line) and sometimes it sounds like 3 (one for file output,
> one for UDP send, one for TCP send)

In the above samples, 3 output modules are involved, where each output
module is used by two actions. We have 6 actions, and so we have 6 action
locks.

So the output module interface does not serialize access to the output
module, but rather to the action instance. All action-specific data is
kept in a separate, per-action data structure and passed into the output
module at the time the doAction call is made.
The output module can modify all of this instance data as if it were
running on a single thread. HOWEVER, any global data items (in short:
everything not inside the action instance data) are *not* synchronized by
the rsyslog core. The output module must itself take care of
synchronization if it desires to have concurrent access to such data
items. All current output modules do NOT access global data other than
for config parsing (which is serial and single-threaded by nature).

I hope this clarifies things. If not, please keep asking. It is important
to get this right, and maybe I will finally end up expressing myself
precisely enough ;)

Rainer

>
> >> note that the output lock is only needed when the two threads really
> >> are accessing the same thing (probably only for files, as you can
> >> have two network connections to the same destination at the same
> >> time, in which case you can use the path name as the lock id). For
> >> things like databases, network relays (including relp) it would
> >> probably be better if each worker thread opened its own connection.
> >> In these cases the destination is designed to accept messages in
> >> parallel on multiple connections anyway. The good news is that the
> >> more complex (and slower) sending methods also tend to be the ones
> >> that can have multiple outbound connections.
> >
> > I agree, but that's another quite large effort. None of the current
> > outputs are designed in that way, and it introduces quite some
> > complexity in error and recovery cases. Right now, I'd consider this
> > the last thing that I'd address.
>
> Ok, we'll discuss this when dealing with thread-safe output modules
>
> >> I seem to remember reading in the module explanation that you do some
> >> trickery to take fairly normal code written in the module and make it
> >> thread-safe (by doing something with the variable access IIRC).
> >
> > That trick simply is the action lock -- so there is no concurrency at
> > that level. But I agree (and have begun to work on that idea) that it
> > would be useful to provide that capability, at least if the output
> > supports it. As it turned out today, there is still some other ground
> > to explore before going down that path.
>
> ahh, that makes sense (I was puzzled over what trickery you had done to
> make the variables be thread-safe)
>
> >> If you have this (and use the filename as the lock) you also gain
> >> protection against two different actions stepping on each other.
> >>
> >> I have a growing number of cases where I have things like
> >> :hostname, isequal, "foo" /var/log/messages;fixup_format
> >> & ~
> >> *.* /var/log/messages
> >>
> >> this works today if I'm sending over the network instead of writing
> >> to a file, but on my relay boxes (which do both) I have a number of
> >> corrupted messages each day due to the different actions stepping on
> >> each other.
> >
> > That is a bug I would be interested in finding. The threading model
> > does NOT allow for that possibility (I mean from a design point of
> > view; as you experience, it does happen, but by design it should not).
> > Still I will keep myself focused a bit on the performance optimization
> > for now; it doesn't make sense, now that I have regained momentum and
> > knowledge in that area, to start another bug hunt and lose that
> > momentum. But that's definitely something I am interested in; it shows
> > something is fundamentally flawed.
>
> Ok, one thing at a time.
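
The relationship between actions, action locks and output modules described earlier in this message can be made concrete with a small model. It is a sketch under assumptions, not rsyslog code (rsyslog is written in C): the classes OutputModule and Action, the module labels "file", "udp" and "tcp", and the helpers build_actions and run_demo are all hypothetical names chosen for illustration. The six targets mirror the six config lines in the question.

```python
import threading


class OutputModule:
    """Shared code that keeps no per-action state (hypothetical stand-in)."""

    def __init__(self, name):
        self.name = name

    def do_action(self, instance_data, message):
        # Only the per-action instance data is touched here, so the
        # module itself needs no locking in this model.
        instance_data["written"].append(message)


class Action:
    """One configured output; owns its own lock and instance data."""

    def __init__(self, module, target):
        self.module = module
        self.target = target
        self.lock = threading.Lock()          # the "action lock"
        self.instance_data = {"written": []}  # private to this action

    def process(self, message):
        with self.lock:  # at most one thread inside this action at a time
            self.module.do_action(self.instance_data, message)


def build_actions():
    """Six actions served by three output modules (N:1)."""
    file_out = OutputModule("file")
    udp_out = OutputModule("udp")
    tcp_out = OutputModule("tcp")
    return [
        Action(file_out, "file1"), Action(file_out, "file2"),
        Action(udp_out, "ip1"), Action(udp_out, "ip2"),
        Action(tcp_out, "ip3"), Action(tcp_out, "ip4"),
    ]


def run_demo(actions, workers=4, per_worker=100):
    """Let several worker threads push messages through every action."""
    def submit():
        for n in range(per_worker):
            for action in actions:
                action.process("message %d" % n)

    threads = [threading.Thread(target=submit) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [len(a.instance_data["written"]) for a in actions]


if __name__ == "__main__":
    print(run_demo(build_actions()))
```

In this model the same module code can run on several threads at once, but never twice for the same action. Note also that two actions naming the same file would still hold two different locks here, so nothing in the sketch serializes their writes against each other.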
>
> >> note that if you do this output locking on files, it may be possible
> >> to do strange things like
> >>
> >> =*.info /var/log/messages
> >> =*.debug /var/log/messages
> >> etc
> >>
> >> and allow these to have multiple worker threads running so that each
> >> worker would be processing messages with different severity as
> >> different actions in parallel (with just a write lock around the
> >> final output to the file). This is far uglier than being able to do
> >> the action processing in parallel, but may work.
> >
> > ah, OK, I guess I get the picture. You are writing to files with more
> > than one action. That does not work well. Ruleset inclusion is the
> > current solution to it. In the long term, it may be useful to have a
> > single object that represents the file being written, no matter which
> > rule is used to do it. I'd say that's something that would go together
> > with the new config format...
>
> I think that it wouldn't need any change to the configs. the more I
> think about it the more I think this is only really a significant
> problem for file output (and there it should be pretty trivial to
> implement); everything else can just have multiple sockets open (one per
> thread)
>
> >> I don't see much here where threads handling one message instead of
> >> multiple messages could speed things up much. Since writes are not
> >> atomic, you still need the output locks (or multiple outputs) even if
> >> only processing one message at a time.
> >>
> >> single thread, single message is a simpler case, but in that case the
> >> locking will be very close to a no-op anyway (since there will never
> >> be contention)
> >
> > One thing that I found out during my research and testing is that it
> > pays to look at a far more granular level, and today's change is the
> > first real-world approach to this.
> > Do not craft one method that does it all, but look at the different
> > config params and what they demand (same for transactions, etc.).
> > Then code "driver"-like functions for each specific case and call the
> > right one for the config params given. That way it is possible to
> > provide high speed where it is possible but provide some costly
> > features as well. Then, they do not affect the majority of cases that
> > do not need them (in other words: pay the performance penalty only if
> > you also get some benefits from them). The same holds true for some
> > other optimizations that can only be done when looking at a very
> > fine-granular level. I think that it will even be possible to get rid
> > of locks altogether in some important cases. I will most probably try
> > to introduce some lock-free alternative for the "mark" case, not only
> > to cover it, but also to see how it works in practice. Judging from my
> > testing and research, it should provide superb performance. If that
> > turns out to be true, I see much more potential for these methods.
>
> sounds good. lock free will almost always win.
>
> > I will try this at the moment, but at the expense of stability. Over
> > the next days, I'll try out at least some ideas and only after that I
> > will see what it takes to stabilize the engine in all cases again
> > (getting a too-large delta may make this stabilization too hard, doing
> > the stabilization too early distracts me from the real facts I intend
> > to look at - but who said life is easy ;)).
>
> I'm going to get my lab set up again to test this.
>
> David Lang

From rgerhards at hq.adiscon.com Wed Jun 9 10:55:42 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Wed, 9 Jun 2010 10:55:42 +0200
Subject: [rsyslog] discussion request: performance enhancement for imtcp
References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com>
 <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm>
 <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com>
 <9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E84@GRFEXC.intern.adiscon.com>

I just blogged the description, and added this hopefully useful sentence:

Note that the consistency of the action instance data is actually guarded
by the rsyslog core by actually running the output module processing on a
single thread *for that action*. But the output module code may be called
concurrently if more than one action uses the same output module. That is
a typical case. If so, each of the concurrently running instances receives
its private instance data pointer but shares everything else.

In context:
http://blog.gerhards.net/2010/06/what-are-actions-and-action-instance.html

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com
> [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
> Sent: Wednesday, June 09, 2010 10:53 AM
> To: rsyslog-users
> Subject: Re: [rsyslog] discussion request: performance enhancement for
> imtcp
>
> > > Also, strings are not generated before the actions, but while we are
> > > processing them.
> > > Doing it up front would require even more memory and processing
> > > time, because we would need to run over all actions twice (once to
> > > create the string, once to call the actions, storing all strings
> > > created). This does not even make sense from a lock contention POV
> > > because each action has a separate lock, so there can be no lock
> > > contention between different actions. The question of whether to
> > > generate all strings for ONE action upfront was the initial
> > > question, and I think we have reached some consensus on it (meaning
> > > that it is at least worth trying out the performance effects and
> > > then deciding).
> >
> > I'm not quite clear on the granularity here.
> >
> > if I have the config
> >
> > *.* file1
> > *.* file2
> > *.* @ip1
> > *.* @ip2
> > *.* @@ip3
> > *.* @@ip4
> >
> > for purposes of the locking, how many separate things are there?
>
> Sorry, I think I should have defined some terms first.
>
> An *action* is a specific instance of some desired output. The actual
> processing carried out is NOT termed "action", even though one could
> easily do so. I have to admit I have not defined any term for that. So
> let's call this processing. That actual processing is carried out by the
> output module (and the really bad thing is that the entry point is named
> "doAction", which somewhat implies that the output module is called the
> action, which is not the case).
>
> Each action can use the service of exactly one output module. Each
> output module can provide services to many actions. So we have an N:1
> relationship between actions and output modules.
>
> > depending on how I read your explanation, sometimes it sounds like 6
> > (one for each line) and sometimes it sounds like 3 (one for file
> > output, one for UDP send, one for TCP send)
>
> In the above samples, 3 output modules are involved, where each output
> module is used by two actions.
> We have 6 actions, and so we have 6 action locks.
>
> So the output module interface does not serialize access to the output
> module, but rather to the action instance. All action-specific data is
> kept in a separate, per-action data structure and passed into the output
> module at the time the doAction call is made. The output module can
> modify all of this instance data as if it were running on a single
> thread. HOWEVER, any global data items (in short: everything not inside
> the action instance data) are *not* synchronized by the rsyslog core.
> The output module must itself take care of synchronization if it desires
> to have concurrent access to such data items. All current output modules
> do NOT access global data other than for config parsing (which is serial
> and single-threaded by nature).
>
> I hope this clarifies things. If not, please keep asking. It is
> important to get this right, and maybe I will finally end up expressing
> myself precisely enough ;)
>
> Rainer
>
> >
> > >> note that the output lock is only needed when the two threads
> > >> really are accessing the same thing (probably only for files, as
> > >> you can have two network connections to the same destination at the
> > >> same time, in which case you can use the path name as the lock id).
> > >> For things like databases, network relays (including relp) it would
> > >> probably be better if each worker thread opened its own connection.
> > >> In these cases the destination is designed to accept messages in
> > >> parallel on multiple connections anyway. The good news is that the
> > >> more complex (and slower) sending methods also tend to be the ones
> > >> that can have multiple outbound connections.
> > >
> > > I agree, but that's another quite large effort. None of the current
> > > outputs are designed in that way, and it introduces quite some
> > > complexity in error and recovery cases.
> > > Right now, I'd consider this the last thing that I'd address.
> >
> > Ok, we'll discuss this when dealing with thread-safe output modules
> >
> > >> I seem to remember reading in the module explanation that you do
> > >> some trickery to take fairly normal code written in the module and
> > >> make it thread-safe (by doing something with the variable access
> > >> IIRC).
> > >
> > > That trick simply is the action lock -- so there is no concurrency
> > > at that level. But I agree (and have begun to work on that idea)
> > > that it would be useful to provide that capability, at least if the
> > > output supports it. As it turned out today, there is still some
> > > other ground to explore before going down that path.
> >
> > ahh, that makes sense (I was puzzled over what trickery you had done
> > to make the variables be thread-safe)
> >
> > >> If you have this (and use the filename as the lock) you also gain
> > >> protection against two different actions stepping on each other.
> > >>
> > >> I have a growing number of cases where I have things like
> > >> :hostname, isequal, "foo" /var/log/messages;fixup_format
> > >> & ~
> > >> *.* /var/log/messages
> > >>
> > >> this works today if I'm sending over the network instead of writing
> > >> to a file, but on my relay boxes (which do both) I have a number of
> > >> corrupted messages each day due to the different actions stepping
> > >> on each other.
> > >
> > > That is a bug I would be interested in finding. The threading model
> > > does NOT allow for that possibility (I mean from a design point of
> > > view; as you experience, it does happen, but by design it should
> > > not). Still I will keep myself focused a bit on the performance
> > > optimization for now; it doesn't make sense, now that I have
> > > regained momentum and knowledge in that area, to start another bug
> > > hunt and lose that momentum.
But that's > > definitely > > > something I am interested in, it shows something works > fundamentally > > flawed. > > > > Ok, one thing at a time. > > > > >> note that if you do this output locking on files, it may be > possible > > to > > >> do strange things like > > >> > > >> =*.info /var/log/messages > > >> =*.debug /var/log/messages > > >> etc > > >> > > >> and allow these to have multiple worker threads running so that > each > > >> worker be processing messages with different severity as different > > >> actions > > >> in parallel (with just a write lock around the final output to the > > >> file). > > >> This is far uglier than being able to do the action processing in > > >> parallel, but may work. > > > > > > ah, OK, I guess I get the picture. You are writing to files with > more > > than > > > one action. That does not work well. Ruleset inclusion is the > current > > > solution to it. In the long term, it may be useful to have a single > > object > > > that represents the file being written, no matter which rule is > used > > to do > > > it. I'd say that's something that would go together with the new > > config > > > format... > > > > I think that it wouldn't need any change to the configs. the more I > > think > > about it the more I think this is only really a significant problem > for > > file output and (there it should be pretty trivial to implement), > > everything else can just have multiple sockets open (one per > > thread) > > > > >> I don't see much here where threads handling one message instead > of > > >> multiple messages could speed things up much. Since writes are not > > >> atomic, > > >> you still need the output locks (or multiple outputs) even if only > > >> processing one message at a time. 
> > >> > > >> single thread, single message is a simpler case, but in that case > > the > > >> locking will be very close to a no-op anyway (since there will > never > > be > > >> contention) > > > > > > One thing that I found out during my research and testing is that > it > > pays to > > > look at a far more granular level, and todays change is the first > > real-world > > > approach to this. Not craft one method that does it all, but see > the > > > different config params and what they demand (same for > transactions, > > etc, > > > etc.). Then code "driver"-like functions for that specific case and > > call the > > > rigth one for the config params given. That way it is possible to > > provide > > > high speed where it is possible but provide some costly features as > > well. > > > Then, they do not affect the majority of cases that do not need > them > > (in > > > other words: pay the performance penalty only if you also get some > > benefits > > > from themn). The same holds true for some other optimizations that > > can only > > > be done when looking at a very fine-granular level. I think that it > > will be > > > possible to even get rid of locks at all in some important cases. I > > will most > > > probably try to introduce some lock-free alternative for the "mark" > > case, not > > > only to cover it, but also to see how it works in practice. Out of > my > > testing > > > and reasearch, it should provide superb performance. If that turns > > out to be > > > true, I see many more potential for these methods. > > > > sounds good. lock free will almost always win. > > > > > I will try this at the moment, but at the expense of stability. 
The > > next > > > days, I'll try out at least some ideas and only after that I will > see > > what it > > > takes to stabilize the engine in all cases again (getting a too- > large > > delta > > > may make this stabilization too hard, doing the stabilization too > > early > > > distracts me from the real facts I intend to look at - but who said > > life is > > > easy ;)). > > > > I'm going to get mylab setup again to test this. > > > > David Lang > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From david at lang.hm Wed Jun 9 17:48:47 2010 From: david at lang.hm (david at lang.hm) Date: Wed, 9 Jun 2010 08:48:47 -0700 (PDT) Subject: [rsyslog] discussion request: performance enhancement for imtcp In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com> <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm> <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com> Message-ID: On Wed, 9 Jun 2010, Rainer Gerhards wrote: > Date: Wed, 9 Jun 2010 10:52:55 +0200 > From: Rainer Gerhards >>> Also, strings are not generated before the actions, but while we are >>> processing them. Doing it up front would require even more memory and >>> processing time, because we would need to run over all actions twice >> (once to >>> create the string, once to call the actions, storing all strings >> created). 
>>> This does not even make sense from a lock contention POV because each >> action >>> has a separate lock, so there can be no lock contention between >> different >>> actions. The question of whether to generate all strings for ONE >> action >>> upfront was the initial question, and I think we have reached some >> consensus >>> on it (meaning that it is at least wroth trying out the performance >> effects >>> and then decide). >> >> I'm not quite clear on the granularity here. >> >> if I have the config >> >> *.* file1 >> *.* file2 >> *.* @ip1 >> *.* @ip2 >> *.* @@ip3 >> *.* @@ip4 >> >> for purposes of the locking, how many separate things are there? > > Sorry, I think should have defined some terms first. > > An *action* is a specific instance of some desired output. The actual > processing carried out is NOT termed "action", even though one could easily > do so. I have to admit I have not defined any term for that. So let's call > this processing. That actual processing is carried out by the output module > (and the really bad thing is that the entry point is named "doAction", which > somewhat implies that the output module is called the action, what is not the > case). > > Each action can use the service of exactly one output module. Each output > module can provide services to many actions. So we have a N:1 relationship > between actions and output modules. > >> depending on how I read your explination, sometimes it sounds like 6 >> (one >> for each line) and sometimes itsounds like 3 (one for file output, one >> for >> UDP send, one for TCP send) > > In the above samples, 3 output modules are involved, where each output module > is used by two actions. We have 6 actions, and so we have 6 action locks. > > So the output module interface does not serialize access to the output > module, but rather to the action instance. 
> All action-specific data is kept in a separate, per-action data structure
> and passed into the output module at the time the doAction call is made.
> The output module can modify all of this instance data as if it were
> running on a single thread. HOWEVER, any global data items (in short:
> everything not inside the action instance data) is *not* synchronized by
> the rsyslog core. The output module must take care itself of
> synchronization if it desires to have concurrent access to such data
> items. All current output modules do NOT access global data other than
> for config parsing (which is serial and single-threaded by nature).
>
> I hope this clarifies. If not, please keep asking. It is important to get
> this right, and maybe I finally end up expressing me precise enough ;)

this clarifies a lot, but not everything

if we can handle the following

*.* file1
*.* file2

with having two worker threads running, one working on file1 and one
working on file2 (with each having per-thread variables for everything
except the global data and the actual file descriptor being written to),

why couldn't we handle the case

*.* file1

with two threads working on different messages (if there was a lock
around the actual write to file1)?

I'm not understanding the difference between the two cases. I understand
that you have a lock to prevent this, but I don't understand what the lock
is protecting.

David Lang

From rgerhards at hq.adiscon.com Thu Jun 10 07:58:41 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Thu, 10 Jun 2010 07:58:41 +0200
Subject: [rsyslog] discussion request: performance enhancement for imtcp
References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com><25a9276339df3684fcdeb6af67d30467@asgard.lang.hm><9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com>

> this clarifies a lot, but not everything
>
> if we can handle the following
>
> *.* file1
> *.* file2
>
> with having two worker threads running, one working on file1 and one
> working on file2 (with each having per-thread variables for everything
> except the global data and the actual file descriptor being written to).
>
> why couldn't we handle the case
>
> *.* file1
>
> with two threads working on different messages. (if there was a lock
> around the actual write to file1)?
>
> I'm not understanding the difference between the two cases. I understand
> that you have a lock to prevent this, but I don't understand what the
> lock is protecting.

Well, the simple answer is "because the plugin interface specifies it in
this way". Remember that the interface was created around 2004 (mmmhhh,
maybe a bit later, but in any case quite some while ago...) and the
problems back then were quite different from the ones we have today. So
the main focus was on getting things done, and guaranteeing
single-threadedness within an action definitely helped getting things
done.

Looking at the omfile implementation as an example, there *are* lots of
things guarded by this lock. Just think about the dynafile cache, the ZIP
writer or the background write process. All of them use relatively simple
algorithms based on the assumption that the core guarantees exclusive use
of the instance data structures. Removing that guarantee is a non-trivial
task.

HOWEVER, as I wrote, I will head in the direction where an output module
can actually *request* the core to call it concurrently, even for a single
action instance. This capability is part of the interface "spec", but has
so far never been implemented. When I finally do this, I need to check
each plugin, and some will require large modifications to support that.

I will probably begin with omfile, where the first thing again is to
partition processing based on the configuration selected. If you just use
plain files, without zip, without async writing, and without other bells
and whistles I currently do not have in mind, the algorithm can be greatly
simplified and a single lock around the write loop would be sufficient.
But for the dynafile case, I then need to create a lot of new code to
ensure the cache structure is not damaged by concurrent access, and I also
need totally different ways to convey the fd to be used back to the actual
file writer. All of this heavily depends on the exclusivity guaranteed by
the interface spec. I am not even sure yet that this finer lock
granularity will be faster -- it may introduce more overhead than it saves
(but I don't want to argue about this today, as it is too far away and I
do not yet have a clear understanding. Being optimistic, it may be
possible to do it lock-free).

Note that the structure of this work is very closely related to what I
currently do in regard to stage 1 action processing. So once I have
tackled that, I think I have a quite good blueprint of the type of
modifications I need to make.

Please keep asking if there are still things that are not certain.

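
For readers following the thread, the locking model described above can be
sketched roughly as follows. This is only an illustration in Python, not
rsyslog's code (rsyslog is written in C); the class and function names
(OutputModule, Action, build_actions, run) are hypothetical. It models the
six-line example config: three output modules, six actions, and one lock
per action that guards that action's instance data.

```python
import threading


class OutputModule:
    """Hypothetical stand-in for an output module (file, UDP, TCP)."""

    def __init__(self, name):
        self.name = name

    def do_action(self, instance_data, msg):
        # Touches only the per-action instance data, so the module itself
        # needs no locking as long as the caller holds the action lock.
        instance_data["written"].append(msg)


class Action:
    """One configured output line; owns its instance data and its lock."""

    def __init__(self, module, target):
        self.module = module
        self.lock = threading.Lock()
        self.instance_data = {"target": target, "written": []}

    def process(self, msg):
        # The core serializes access per action, not per output module.
        with self.lock:
            self.module.do_action(self.instance_data, msg)


def build_actions():
    """Mirror the example config: 3 modules, each used by 2 actions."""
    omfile = OutputModule("file")
    omudp = OutputModule("udp")
    omtcp = OutputModule("tcp")
    return [
        Action(omfile, "file1"), Action(omfile, "file2"),
        Action(omudp, "ip1"), Action(omudp, "ip2"),
        Action(omtcp, "ip3"), Action(omtcp, "ip4"),
    ]


def run(actions, n_workers=4, n_msgs=100):
    """Let several worker threads push messages through every action."""
    def worker(wid):
        for i in range(n_msgs):
            for action in actions:
                action.process(f"w{wid}-m{i}")

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Two workers can be inside the file module at the same time as long as they
are working on different actions (file1 and file2); two workers on the same
action take turns. Anything outside instance_data would need its own
synchronization.
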
Rainer From david at lang.hm Thu Jun 10 08:51:11 2010 From: david at lang.hm (david at lang.hm) Date: Wed, 9 Jun 2010 23:51:11 -0700 (PDT) Subject: [rsyslog] discussion request: performance enhancement for imtcp In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com><25a9276339df3684fcdeb6af67d30467@asgard.lang.hm><9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com> Message-ID: On Thu, 10 Jun 2010, Rainer Gerhards wrote: >> this clarifies a lot, but not everything >> >> if we can handle the following >> >> *.* file1 >> *.* file2 >> >> with having two worker threads running, one working on file1 and one >> working on file2 (with each having per-thred variables for everything >> except the global data and the actual filedescripter being written to). >> >> why couldn't we handle the case >> >> *.* file1 >> >> with two threads working on different messages. (if there was a lock >> around the actual write to file1)? >> >> I'm not understanding the difference between the two cases. I >> understnad >> that you have a lock to prevent this, but I don't understand what the >> lock >> is protecting. > > Well, the simple answer is "because the plugin interface specifies it in this > way". Remember that the interface was created around 2004 (mmmhhh, maybe a > bit later, but in any case quite some while ago...) and there were far other > problems than there are today. 
So the main focus was on getting things done, > and guaranteeing single-threadedness within an action definitely helped > getting things done. > > Looking at omfile implementation as an example, there *are* lots of things > guarded by this look. Just think about the dynafile cache, the ZIP writer or > the background write process. All of them use relatively simple algorithms > based on the assumption that the core guarantees exclusive use of the > instance data structures. Removing that guarantee is a non-trivial task. > > HOWEVER, as I wrote, I will head into the direction where an output module > can actually *request* the core to call it concurrently, even for a single > action instance. This capability is part of the interface "spec", but was so > far never implemented. When I finally do this, I need to check each plugin, > and some more require large modifications to support that. > > I will probably begin with omfile, where the first thing again is to > partition processing based on configuration selected. If you just use plain > files, without zip, without async writing, and without other bells and > whistles I currently do not have on my mind, the algorithm can greatly be > simplified and a single lock around the write loop would be sufficient. But > for the dynafile case, I then need to create a lot of new code to ensure the > cache structure is not damaged by concurrent access and I also need totally > different ways to convey the fd to be used back to the actual file writer. > All of this heavily depends on the exclusivity guaranteed by the interface > spec. I am not even sure yet this finer lock granularity will be faster -- it > may introduce more overhead than it saves (but I don't want to argue about > this today, as it is too far away and I do not yet have a clear > understanding. Being optimistic, it may be possible to do it lock-free). 
>
> Note that the structure of this work is very closely related to what I
> currently do in regard to stage 1 action processing. So once I have tackled
> that, I think I have a quite good blueprint of the type of modifications I
> need to make.
>
> Please keep asking if there are still things that are not certain.

this makes lots of sense.

It may be worthwhile doing some testing on how much time is spent in the
output portion vs. time spent on the selection/string manipulation
(especially with the new high-speed fixed templates). I am speculating
that a significant amount of time is spent in the selection/template
stage, crafting the strings that are then written out (possibly by being
sent through zip, accessing a dynafile, etc). If so, there may be a
significant win in just moving the lock acquisition from the beginning of
the action to just before writing (by allowing one thread to be crafting
the string while another is writing it out).

I would expect the win to be less with more complex output
(dynafiles/compression), but I also suspect that high-speed sites will
tend not to use these features as much (especially if they make a
noticeable difference in the speed ;-)

something to try a quick test on sometime.

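
The idea of moving the lock acquisition can be sketched as below. This is a
hypothetical Python illustration, not rsyslog code: FileAction, render,
process_coarse and process_fine are made-up names, and an in-memory buffer
stands in for the output file. It only holds for the plain-file case where
string generation touches no shared instance data; features such as the
dynafile cache or the ZIP writer rely on the wider lock.

```python
import io
import threading


def render(template, msg):
    """Stand-in for template/string generation; uses no shared state."""
    return template.format(**msg) + "\n"


class FileAction:
    """Hypothetical plain-file action with a single lock."""

    def __init__(self, template):
        self.template = template
        self.out = io.StringIO()          # stands in for the output file
        self.write_lock = threading.Lock()

    def process_coarse(self, msg):
        # Lock taken at the beginning of the action: formatting and
        # writing are both serialized.
        with self.write_lock:
            line = render(self.template, msg)
            self.out.write(line)

    def process_fine(self, msg):
        # Lock taken just before writing: one thread can craft its string
        # while another thread is writing.
        line = render(self.template, msg)
        with self.write_lock:
            self.out.write(line)


def run(action, method_name, n_workers=4, n_msgs=200):
    """Drive one action from several threads; return the lines written."""
    method = getattr(action, method_name)

    def worker(wid):
        for i in range(n_msgs):
            method({"host": f"h{wid}", "text": f"msg {i}"})

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return action.out.getvalue().splitlines()
```

Both variants produce the same set of complete lines; whether the finer
variant is actually faster is exactly the open question and would need
measuring.
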
David Lang From rgerhards at hq.adiscon.com Thu Jun 10 09:00:17 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 10 Jun 2010 09:00:17 +0200 Subject: [rsyslog] discussion request: performance enhancement for imtcp References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com><25a9276339df3684fcdeb6af67d30467@asgard.lang.hm><9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E8E@GRFEXC.intern.adiscon.com> > It may be worthwhile doing some testing on how much time is spent in > the > output portion vs time spent on the selection/string manipulation > (especially with the new high-speed fixed templates). I am speculating > that there is a significant amount of time spent if the > selection/template > stage crafting the strings that are then written out (possibly by being > sent through zip, accessing a dynafile, etc). If so there may be a > significant win in just moving the lock aquisition from the beginning > of > the action to just before writing (by allowing one thread to be > crafting > the string while another is writing it out) I agree, and that is my plan. Hopefully for next week. As you have probably seen, I made some changes with good results, but there are correctness issues (e.g. I see some duplicated strings). I will now focus on getting the code correct again (which probably means some loss of performance) and then I'll dig further into that direction. I have lots of ideas, and I think there is lots of potential for speedup. 
It requires partial redesign, but that doesn't matter.

> I would expect the win to be less with more complex output
> (dynafiles/compression) but I also suspect that high speed sites will
> tend to not use these features as much (especially if they make a
> noticeable difference in the speed ;-)

My overall route will be to make rsyslog look at the parameters actually
being used for a given problem (output, queue, whatever) and select the
best algorithm for these parameters. So if someone needs complex features,
they will cost performance, but those who only use simpler features will
gain speed benefits.

That, however, doesn't mean I don't care about complex features. I still
will try to make them as fast as possible (and we seem not to be too bad
at this if I read third-party comparisons). But the "one algorithm fits
all needs" approach used so far doesn't give us excellent performance in
all cases. If it is possible to avoid the more costly features, that's a
good choice for a user. If not, one needs to pay a price. My goal,
however, is that rsyslog offers complex features at a far lower cost than
other syslogds.

Using multiple algorithms is obviously more development effort, and this
is the main reason I have avoided it so far. But looking at what we have
reached by now, I think it pays off and is the right thing to do (it would
not have been three years ago).

Rainer

> something to try a quick test on sometime.
>
> David Lang

From rgerhards at hq.adiscon.com Thu Jun 10 10:45:20 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Thu, 10 Jun 2010 10:45:20 +0200
Subject: [rsyslog] discussion request: performance enhancement for imtcp
References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com><25a9276339df3684fcdeb6af67d30467@asgard.lang.hm><9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E8E@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E92@GRFEXC.intern.adiscon.com>

> I agree, and that is my plan. Hopefully for next week. As you have
> probably seen, I made some changes with good results, but there are
> correctness issues (e.g. I see some duplicated strings). I will now focus
> on getting the code correct again (which probably means some loss of
> performance) and then I'll dig further into that direction.

I have identified and fixed the regression. It was exactly in the area of
transaction support/string generation, where I had included very poor PoC
code. Not unexpectedly, that means performance and scalability will drop
again considerably with this fix. But please do not be concerned about
that. I now know where to look and what to change, and will do that next.
It may just take a little more than a couple of hours ;) Rainer From epiphani at gmail.com Thu Jun 10 14:16:34 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Thu, 10 Jun 2010 08:16:34 -0400 Subject: [rsyslog] discussion request: performance enhancement for imtcp In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com> <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm> <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 10, 2010 at 1:58 AM, Rainer Gerhards wrote: > > Looking at omfile implementation as an example, there *are* lots of things > guarded by this look. Just think about the dynafile cache, the ZIP writer or > the background write process. All of them use relatively simple algorithms > based on the assumption that the core guarantees exclusive use of the > instance data structures. Removing that guarantee is a non-trivial task. I'd like to see the dynafile cache go away, personally. That section of code has always felt problematic, and it seems that there would be significantly more effective ways of doing the same task. I personally prefer the timeout method where files idle for x seconds are closed out. I am of the opinion that the current approach is just generally bad. In the case of compression, couldn't that simply sit inside the same lock used for writing, just prior to the write() itself? 
-Aaron From rgerhards at hq.adiscon.com Thu Jun 10 14:46:58 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 10 Jun 2010 14:46:58 +0200 Subject: [rsyslog] discussion request: performance enhancement for imtcp References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com><25a9276339df3684fcdeb6af67d30467@asgard.lang.hm><9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E93@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe > Sent: Thursday, June 10, 2010 2:17 PM > To: rsyslog-users > Subject: Re: [rsyslog] discussion request: performance enhancement for > imtcp > > On Thu, Jun 10, 2010 at 1:58 AM, Rainer Gerhards > wrote: > > > > Looking at omfile implementation as an example, there *are* lots of > things > > guarded by this look. Just think about the dynafile cache, the ZIP > writer or > > the background write process. All of them use relatively simple > algorithms > > based on the assumption that the core guarantees exclusive use of the > > instance data structures. Removing that guarantee is a non-trivial > task. > > I'd like to see the dynafile cache go away, personally. That section > of code has always felt problematic, and it seems that there would be > significantly more effective ways of doing the same task. I > personally prefer the timeout method where files idle for x seconds > are closed out. 
I am of the opinion that the current approach is just > generally bad. I think you misunderstand that cache. If there would be no cache, that would mean we would need to close the current dynafile and open a new one whenever the action needs to route a message to a different file. That would have enormous performance impact. Think about a dynafile action as one that potentially writes to 100 files at once. Then think about having only one file open at one time and then think about the message stream. Note that this is NOT related to the question if a file is closed after an inactivity timeout or only after cache eviction. Inactivity timeout could (and probably will) be implemented together with the dynafile cache. Anyhow, suggestions for handling dynafiles in a different way are very welcome. This is something I have under consideration. > In the case of compression, couldn't that simply sit inside the same > lock used for writing, just prior to the write() itself? There are a number of subtle issues, one being the dynafile cache (one action can write to more than one file), another one being the potential for async io (which is not really enabled in v4 but will be in v5). If we had just a single lock for (logical) writing, we could not fill up a buffer and physically write it out only if it were full. So, for ZIP, you would need to compress single messages. That means a sharp drop in compression ratio (I guess you can not achieve more than 10 to 20% on single messages if you would like to have recovery records). 
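
To make the cost argument concrete, here is a small hypothetical sketch of
a bounded cache of open files, again in Python and not taken from rsyslog.
DynafileCache and route are made-up names, in-memory buffers stand in for
real files, and the least-recently-used eviction policy is an assumption
made for the example; the thread does not say which policy omfile uses.

```python
import io
from collections import OrderedDict


class DynafileCache:
    """Hypothetical bounded cache of open output files, keyed by name."""

    def __init__(self, max_open, opener=None):
        self.max_open = max_open
        # The default opener returns an in-memory buffer instead of a file.
        self.opener = opener or (lambda name: io.StringIO())
        self.open_files = OrderedDict()
        self.opens = 0  # counts how often a file had to be (re)opened

    def get(self, name):
        handle = self.open_files.get(name)
        if handle is not None:
            self.open_files.move_to_end(name)  # mark as recently used
            return handle
        if len(self.open_files) >= self.max_open:
            # Cache full: close the least recently used file (assumed policy).
            _, victim = self.open_files.popitem(last=False)
            victim.close()
        handle = self.opener(name)
        self.opens += 1
        self.open_files[name] = handle
        return handle


def route(cache, messages):
    """Write each (filename, text) pair via the cache; return open count."""
    for name, text in messages:
        cache.get(name).write(text + "\n")
    return cache.opens
```

With a message stream that alternates between three file names, a cache of
size 1 (the "no cache" case) reopens a file for every message, while a
cache of size 3 opens each file once. An inactivity timeout would be an
additional way to close entries, not a replacement for the lookup.
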
Rainer From epiphani at gmail.com Thu Jun 10 15:11:19 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Thu, 10 Jun 2010 09:11:19 -0400 Subject: [rsyslog] discussion request: performance enhancement for imtcp In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E93@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com> <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm> <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E93@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 10, 2010 at 8:46 AM, Rainer Gerhards wrote: > > I think you misunderstand that cache. If there would be no cache, that would > mean we would need to close the current dynafile and open a new one whenever > the action needs to route a message to a different file. That would have > enormous performance impact. Think about a dynafile action as one that > potentially writes to 100 files at once. Then think about having only one > file open at one time and then think about the message stream. I should rephrase - I do understand the requirement for such a caching system, however I dislike the implementation. I would prefer a simpler lookup method without this idea of a cache size, forced expiration, or files open forever. There is no deterministic expectation on what happens when a file enters the cache, it is entirely dependent upon what takes place with _other_ files. I would prefer to have a simpler lookup cache with a negative result triggering an open, and that being the only expectation of the cache. 
The only limit to the number of files currently open should be the
open files limit provided by the OS. I am curious: how is an output
file identified at present?

-Aaron

> There are a number of subtle issues, one being the dynafile cache (one
> action can write to more than one file), another one being the potential
> for async io (which is not really enabled in v4 but will be in v5). If we
> had just a single lock for (logical) writing, we could not fill up a
> buffer and physically write it out only if it were full. So, for ZIP, you
> would need to compress single messages. That means a sharp drop in
> compression ratio (I guess you can not achieve more than 10 to 20% on
> single messages if you would like to have recovery records).

I don't fully understand how the async writer interacts with an action
thread yet... is it a one-action-many-writers relationship?

-Aaron

From rgerhards at hq.adiscon.com Thu Jun 10 15:17:49 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Thu, 10 Jun 2010 15:17:49 +0200
Subject: [rsyslog] discussion request: performance enhancement for imtcp
References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com><25a9276339df3684fcdeb6af67d30467@asgard.lang.hm><9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E93@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E94@GRFEXC.intern.adiscon.com>

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com
> [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe
> Sent: Thursday, June 10, 2010 3:11 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] discussion request: performance enhancement for
> imtcp
>
> On Thu, Jun 10, 2010 at 8:46 AM, Rainer Gerhards wrote:
> >
> > I think you misunderstand that cache. If there would be no cache, that
> > would mean we would need to close the current dynafile and open a new
> > one whenever the action needs to route a message to a different file.
> > That would have enormous performance impact. Think about a dynafile
> > action as one that potentially writes to 100 files at once. Then think
> > about having only one file open at one time and then think about the
> > message stream.
>
> I should rephrase - I do understand the requirement for such a caching
> system, however I dislike the implementation. I would prefer a
> simpler lookup method without this idea of a cache size, forced
> expiration, or files open forever. There is no deterministic
> expectation on what happens when a file enters the cache, it is
> entirely dependent upon what takes place with _other_ files.
>
> I would prefer to have a simpler lookup cache with a negative result
> triggering an open, and that being the only expectation of the cache.

To all of this, I simply agree ;)

> The only limit to the number of files currently open should be the
> open files limit provided by the OS.

ummm... I think that sockets draw from the same pool, so there must be a
limit. Also, even with a timeout, some files may be dormant for at least a
couple of minutes (or seconds -- user option).

> I am curious: how is an output file identified at present?

via its name, so search is relatively expensive.

So I simply misunderstood your earlier comment. The current implementation
is not very good. A new one should also solve the
different-actions-write-to-same-file problem. But all of that comes after
the current round of optimizations (you need to start at one spot ;)).

> > There are a number of subtle issues, one being the dynafile cache > (one action > > can write to more than one file), another one being the potential for > async > > io (which is not really enabled in v4 but will be in v5). If we had > just a > > single lock for (logical) writing, we could not fill up a buffer and > > physically write it out only if it were full. So, for ZIP, you would > need to > > compress single messages. That means a sharp drop in compression > ratio (I > > guess you can not achieve more than 10 to 20% on single messages if > you would > > like to have recovery records). > > I don't fully understand how the async writer interacts with an action > thread yet... is it a one-action-many-writers relationship? right now, it is one-on-one, as the action is guarded by the action lock (and remember that on my list is removing that lock for select actions, omfile being one of them!). The async writer is simply set to do async-io, it's essentially double-buffering in mainframe terms (and the current implementation isn't even very efficient). 
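To illustrate what double-buffering means here, a small sketch with POSIX threads: the producer fills one buffer while a writer thread drains the other, and the producer blocks when both are in use. This is not the rsyslog stream writer; `struct dbuf`, `dbuf_start`, `dbuf_put`, `dbuf_finish` and the sizes are assumptions made for the example, and the "file" is just an in-memory sink.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define BUF_SIZE  64      /* hypothetical size of each of the two buffers */
#define SINK_SIZE 4096    /* in-memory stand-in for the output file */

struct dbuf {
    char buf[2][BUF_SIZE];
    size_t len[2];        /* bytes currently in each buffer */
    int full[2];          /* 1 while a buffer belongs to the writer */
    int cur;              /* buffer the producer is filling */
    int done;             /* set once the producer has finished */
    char sink[SINK_SIZE]; /* everything the writer has "written" */
    size_t written;
    pthread_mutex_t mtx;
    pthread_cond_t cond;
    pthread_t writer;
};

/* Consumer: takes full buffers in hand-over order and writes them out. */
static void *writer_main(void *arg)
{
    struct dbuf *d = arg;
    int rd = 0;

    pthread_mutex_lock(&d->mtx);
    for (;;) {
        while (!d->full[rd] && !d->done)
            pthread_cond_wait(&d->cond, &d->mtx);
        if (!d->full[rd])
            break;                        /* finished, nothing left */
        pthread_mutex_unlock(&d->mtx);
        /* the slow part runs without the lock, so the producer can
         * keep filling the other buffer in the meantime */
        assert(d->written + d->len[rd] <= SINK_SIZE);
        memcpy(d->sink + d->written, d->buf[rd], d->len[rd]);
        d->written += d->len[rd];
        pthread_mutex_lock(&d->mtx);
        d->len[rd] = 0;
        d->full[rd] = 0;                  /* give the buffer back */
        pthread_cond_broadcast(&d->cond);
        rd ^= 1;
    }
    pthread_mutex_unlock(&d->mtx);
    return NULL;
}

void dbuf_start(struct dbuf *d)
{
    memset(d, 0, sizeof *d);
    pthread_mutex_init(&d->mtx, NULL);
    pthread_cond_init(&d->cond, NULL);
    pthread_create(&d->writer, NULL, writer_main, d);
}

/* Producer: append one message; blocks if both buffers are in use. */
void dbuf_put(struct dbuf *d, const char *msg, size_t n)
{
    assert(n <= BUF_SIZE);
    pthread_mutex_lock(&d->mtx);
    if (d->len[d->cur] + n > BUF_SIZE) {
        d->full[d->cur] = 1;              /* hand over to the writer */
        pthread_cond_broadcast(&d->cond);
        d->cur ^= 1;
        while (d->full[d->cur])           /* writer still busy: wait */
            pthread_cond_wait(&d->cond, &d->mtx);
    }
    memcpy(d->buf[d->cur] + d->len[d->cur], msg, n);
    d->len[d->cur] += n;
    pthread_mutex_unlock(&d->mtx);
}

/* Flush what is left, stop the writer, return total bytes written. */
size_t dbuf_finish(struct dbuf *d)
{
    pthread_mutex_lock(&d->mtx);
    if (d->len[d->cur] > 0)
        d->full[d->cur] = 1;
    d->done = 1;
    pthread_cond_broadcast(&d->cond);
    pthread_mutex_unlock(&d->mtx);
    pthread_join(d->writer, NULL);
    pthread_mutex_destroy(&d->mtx);
    pthread_cond_destroy(&d->cond);
    return d->written;
}
```

A real writer would compress and write() where the sketch copies into the sink; the point is only that the expensive step happens outside the lock.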
Rainer > > -Aaron > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From epiphani at gmail.com Thu Jun 10 15:31:18 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Thu, 10 Jun 2010 09:31:18 -0400 Subject: [rsyslog] discussion request: performance enhancement for imtcp In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E94@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com> <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm> <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E94@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 10, 2010 at 9:17 AM, Rainer Gerhards wrote: > >> The only limit to the number of files currently open should be the >> open files limit provided by the OS. > > ummm... think that sockets take form the same camp, so there must be a limit. > Also, even with a timeout, some files may be dormant for at least a couple of > minutes (or seconds -- user option). Well yes - but those pool sizes are defined through max listeners and max sessions, so we could have anything that was left over. >>I am curious: ?how is an output >> file identified at present? > > via it's name, so search is relatively expensive Sounds like the cache should be implemented as a hash table. :) >> I don't fully understand how the async writer interacts with an action >> thread yet... is it a one-action-many-writers relationship? 
> > right now, it is one-on-one, as the action is guarded by the action lock (and > remember that on my list is removing that lock for select actions, omfile > being one of them!). The async writer is simply set to do async-io, it's > essentially double-buffering in mainframe terms (and the current > implementation isn't even very efficient). So... just to clarify the separation of work, the async writer buffers and writes, and that's it? -Aaron From rgerhards at hq.adiscon.com Thu Jun 10 16:11:38 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 10 Jun 2010 16:11:38 +0200 Subject: [rsyslog] discussion request: performance enhancement for imtcp References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com><25a9276339df3684fcdeb6af67d30467@asgard.lang.hm><9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103E94@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103E98@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe > Sent: Thursday, June 10, 2010 3:31 PM > To: rsyslog-users > Subject: Re: [rsyslog] discussion request: performance enhancement for > imtcp > > On Thu, Jun 10, 2010 at 9:17 AM, Rainer Gerhards > wrote: > > > >> The only limit to the number of files currently open should be the > >> open files limit provided by the OS. > > > > ummm... think that sockets take form the same camp, so there must be > a limit. 
> > Also, even with a timeout, some files may be dormant for at least a > couple of > > minutes (or seconds -- user option). > > Well yes - but those pool sizes are defined through max listeners and > max sessions, so we could have anything that was left over. agreed -- but I will look into these details once I am there ;) > > >>I am curious: ?how is an output > >> file identified at present? > > > > via it's name, so search is relatively expensive > > Sounds like the cache should be implemented as a hash table. :) Yes, but I need to find good hash functions. If someone already has suggestions, they are welcome (while I will not tackle this immediately, I would record that information). > > >> I don't fully understand how the async writer interacts with an > action > >> thread yet... is it a one-action-many-writers relationship? > > > > right now, it is one-on-one, as the action is guarded by the action > lock (and > > remember that on my list is removing that lock for select actions, > omfile > > being one of them!). The async writer is simply set to do async-io, > it's > > essentially double-buffering in mainframe terms (and the current > > implementation isn't even very efficient). > > So... just to clarify the separation of work, the async writer buffers > and writes, and that's it? Let's tell in terms of producer and consumer. The producer is omfile, as far as it delivers data to the buffer. The consumer than uses the buffer filled by the producer, ZIPs it if required to do so and emits the write request. Right now, there are always two buffers, so while the consumer processes one, the producer fills the other one. If the consumer is too slow the producer will block. And if you look carefully at the code, you'll see that producer and consumer barely run in parallel due to a "design issue" that I did not want (for various reasons) sort out in v4. 
Once I do so, performance should be even better in v5 (this has the potential to double the speed of file writes -- but not total message processing time). But this is part of the overall omfile refactoring. Rainer > > -Aaron > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From epiphani at gmail.com Thu Jun 10 16:27:15 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Thu, 10 Jun 2010 10:27:15 -0400 Subject: [rsyslog] discussion request: performance enhancement for imtcp In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103E98@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103E61@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E65@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E66@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E67@GRFEXC.intern.adiscon.com> <25a9276339df3684fcdeb6af67d30467@asgard.lang.hm> <9B6E2A8877C38245BFB15CC491A11DA7103E6E@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E77@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E83@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E8C@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E94@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103E98@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 10, 2010 at 10:11 AM, Rainer Gerhards wrote: >> Sounds like the cache should be implemented as a hash table. ?:) > > Yes, but I need to find good hash functions. If someone already has > suggestions, they are welcome (while I will not tackle this immediately, I > would record that information). Perl5 has a fairly good and simple generic hashing algorithm that I've used in the past. 
As follows (simplified):

    #include <string.h>   /* strlen */

    /* MAX_KEYLEN and HASH_SIZE are assumed to be defined elsewhere. */
    unsigned int get_hashkey(void *key)
    {
        char *rkey = (char *)key;
        int len;
        unsigned int hash = 0;

        len = strlen(rkey);
        if (len > MAX_KEYLEN)
            len = MAX_KEYLEN;   /* hash at most MAX_KEYLEN characters */
        while (len--)
            hash = hash * 33 + *rkey++;
        return hash % HASH_SIZE;
    }

HASH_SIZE should be a suitably large prime number. For what we're doing, I would guess something like 3001 would be a good hash table size, and provide a pretty low chance of collisions. Mind you, this algorithm isn't optimized at all for the fairly limited input that is "name", and could hash just about anything. Probably not the fastest we could get, but it would certainly be worth using for the short term. -Aaron

From tom at ng23.net Fri Jun 11 00:13:01 2010 From: tom at ng23.net (Tom Brown) Date: Thu, 10 Jun 2010 23:13:01 +0100 Subject: [rsyslog] rsyslog-relp for RHEL5.5 ? Message-ID:

Hi We use rsyslog-relp and recent upgrades to RHEL5.5 have caused some issues as there is a later release of rsyslog in here than what we have used in the past. Does anyone know if there is an rsyslog-relp that will go with rsyslog that comes from RHEL5.5 which is rsyslog-3.22.1-3.el5 ?? many thanks

From me at gavitron.com Fri Jun 11 10:24:24 2010 From: me at gavitron.com (Gavin McDonald) Date: Fri, 11 Jun 2010 01:24:24 -0700 Subject: [rsyslog] rsyslog TLS question Message-ID:

Greetings list, I was hoping you could offer a small piece of advice, re: TLS certificates and rsyslog; I have a farm of ubuntu instances in the Amazon EC2 cloud, and am implementing encrypted remote syslogging. In the gssapi documentation, it states that it is a bad idea to "use these [host certificates] on more than one instance, [because] doing so would prevent [me] from distinguishing between the instances and thus would disable useful authentication."
This would mean that not only do I have to create over 50 client certs to start with, but because of the way in which we currently backup & provision EC2 cloud server instances, I would have to generate a new host cert on every instance reboot. Besides the obvious security concerns, what effects would there be from sharing a cert as it is explicitly stated to not do? How indistinguishable does the log traffic become? Don't remote syslog messages come with a hostname in plaintext anyway? (Besides, rsyslog has templated output too !) Would time-stamp collisions cause logging failures? If the issues are not unresolvable, my plan is to generate a unique client certificate per machine TYPE, (webserver, DB & slaves, load-balancer, api proxy, etc.) thus allowing me to continue with our current method of single-image instance provisioning, while gaining (mostly) secure centralized logging. I'd appreciate some experienced insight into the matter, of course, hence this email. Regards, -G Gavin McDonald From rgerhards at hq.adiscon.com Fri Jun 11 14:54:11 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 11 Jun 2010 14:54:11 +0200 Subject: [rsyslog] rsyslog TLS question References: Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EA3@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Gavin McDonald > Sent: Friday, June 11, 2010 10:24 AM > To: rsyslog at lists.adiscon.com > Subject: [rsyslog] rsyslog TLS question > > Greetings list, I was hoping you could offer a small piece of advice, > re: > TLS certificates and rsyslog; > > I have a farm of ubuntu instances in the Amazon EC2 cloud, and am > implementing encrypted remote syslogging. 
In the gssapi documentation, > it > states that it is a bad idea to "use these [host certificates] on more > than > one instance, [because] doing so would prevent [me] from distinguising > between the instances and thus would disable useful authentication." > > This would mean that not only do I have to create over 50 client certs > to > start with, but because of the way in which we currently backup & > provision > EC2 cloud server instances, I would have to generate a new host cert on > every instance reboot. Besides the obvious security concerns, what > effects > would there be from sharing a cert as it is explicitly stated to not > do? Well, the certificate is the machine's ID. So if you share certificates among different machines, you never know which machine you are talking to. It now depends on your security requirement if that is a problem or not. If it is sufficient to know that the peer you are talking to is one of those that you manage, everything is fine. If you need to set finer-grained permissions, you can probably not take this route. Most often, such finer grained permissions are NOT needed. > How indistinguishable does the log traffic become? Don't remote syslog > messages come with a hostname in plaintext anyway? (Besides, rsyslog > has > templated output too !) The certificat is solely used for access control. It does not affect the hostname as given inside the message. The reason, for example, is relay chains. The RFCs demand mutual authentication, and rsyslog implements it, but there is neither yet a RFC nor a rsyslog implementation that permits only select messages to be received based on certificate permissions. This would be possible, but as noone ever asked for such a feature, I'd suspect that it is not really needed. > Would time-stamp collisions cause logging > failures? 
>
> If the issues are not unresolvable, my plan is to generate a unique client
> certificate per machine TYPE, (webserver, DB & slaves, load-balancer, api
> proxy, etc.) thus allowing me to continue with our current method of
> single-image instance provisioning, while gaining (mostly) secure
> centralized logging. I'd appreciate some experienced insight into the
> matter, of course, hence this email.

I personally find this a very good approach. Note that many real-world applications use certificates and certificate chains quite differently from the one-size-fits-all approach that the core CAs want to tell us. I know, for example, that folks in healthcare use certificates in a way similar to what you suggest, BUT they do not use any public CAs and have their own web of trust (as a side note, this is the reason why rsyslog supports multiple certificates at the same time). Even in healthcare, this is not the only approach. I think it is important to make a good policy decision based on one's own needs.
This obviously includes knowing the details of all facts involved and I guess that exactly was the reason for your question ;) And as usual in security: YMMV ;) Rainer > > Regards, > -G > > Gavin McDonald > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From me at gavitron.com Fri Jun 11 20:24:43 2010 From: me at gavitron.com (Gavin McDonald) Date: Fri, 11 Jun 2010 11:24:43 -0700 Subject: [rsyslog] rsyslog TLS question In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EA3@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103EA3@GRFEXC.intern.adiscon.com> Message-ID: On Fri, Jun 11, 2010 at 5:54 AM, Rainer Gerhards wrote: > > -----Original Message----- > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > bounces at lists.adiscon.com] On Behalf Of Gavin McDonald > > Sent: Friday, June 11, 2010 10:24 AM > > To: rsyslog at lists.adiscon.com > > Subject: [rsyslog] rsyslog TLS question > Well, the certificate is the machine's ID. So if you share certificates > among > different machines, you never know which machine you are talking to. > > It now depends on your security requirement if that is a problem or not. If > it is sufficient to know that the peer you are talking to is one of those > that you manage, everything is fine. If you need to set finer-grained > permissions, you can probably not take this route. > >From a permissions standpoint, It is enough to know that I can 'trust' the log source to be one of my machines, and not spoofed data. My one concern is when you say that I will "never know which machine [I am] talking to." You mean this only from an authentication perspective, correct? I can handle that - but I need (would like) to know the identity of the source host for log analytics once they are collected. 
You do also state that " It does not affect the hostname as given inside the message," so I think the above assumption is correct, I just don't want to get caught out on an assumption. (you know what they say...) :) Thanks again for your time on this, -G Gavin McDonald. From rgerhards at hq.adiscon.com Fri Jun 11 22:03:27 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 11 Jun 2010 22:03:27 +0200 Subject: [rsyslog] rsyslog TLS question References: <9B6E2A8877C38245BFB15CC491A11DA7103EA3@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EA4@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Gavin McDonald > Sent: Friday, June 11, 2010 8:25 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog TLS question > > On Fri, Jun 11, 2010 at 5:54 AM, Rainer Gerhards > wrote: > > > > -----Original Message----- > > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > > bounces at lists.adiscon.com] On Behalf Of Gavin McDonald > > > Sent: Friday, June 11, 2010 10:24 AM > > > To: rsyslog at lists.adiscon.com > > > Subject: [rsyslog] rsyslog TLS question > > > Well, the certificate is the machine's ID. So if you share certificates > > among > > different machines, you never know which machine you are talking to. > > > > It now depends on your security requirement if that is a problem or > not. If > > it is sufficient to know that the peer you are talking to is one of > those > > that you manage, everything is fine. If you need to set finer-grained > > permissions, you can probably not take this route. > > > > >From a permissions standpoint, It is enough to know that I can 'trust' > the > log source to be one of my machines, and not spoofed data. My one > concern > is when you say that I will "never know which machine [I am] talking > to." > You mean this only from an authentication perspective, correct? 
Yes, that's right -- it has no effect on the actual message (and thus on the hostname inside it). :) Rainer >I can > handle that - but I need (would like) to know the identity of the > source > host for log analytics once they are collected. > > You do also state that " It does not affect the hostname as given > inside the > message," so I think the above assumption is correct, I just don't want > to > get caught out on an assumption. (you know what they say...) :) > > Thanks again for your time on this, > > -G > > Gavin McDonald. > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From tom at ng23.net Sat Jun 12 22:30:14 2010 From: tom at ng23.net (Tom Brown) Date: Sat, 12 Jun 2010 21:30:14 +0100 Subject: [rsyslog] rsyslog-relp for RHEL5.5 ? In-Reply-To: References: Message-ID: <7113527C-D6F9-493E-9A05-BF3DD95991B7@ng23.net> Built my own On 10 Jun 2010, at 23:13, Tom Brown wrote: > Hi > > We use rsyslog-relp and recent upgrades to RHEL5.5 have caused some > issues as there is a later release of rsyslog in here than what we > have used in the past. > > Does anyone know if there is an rsyslog-relp that will go with rsyslog > that comes from RHEL5.5 which is rsyslog-3.22.1-3.el5 ?? > > many thanks From jeff at atlassian.com Mon Jun 14 11:01:34 2010 From: jeff at atlassian.com (Jeff Turner) Date: Mon, 14 Jun 2010 19:01:34 +1000 Subject: [rsyslog] IncludeConfig breaking config ordering? Message-ID: Hi, Perhaps I'm missing something, but it appears that $IncludeConfig is messing up the evaluation order of expressions. 
Given this trimmed-down configuration file, /tmp/rsyslog.conf:

    $ModLoad imuxsock
    :msg, contains, "testy" /tmp/test.log
    & ~
    *.*;auth,authpriv.none -/tmp/syslog

I start an rsyslogd instance with it:

    sudo rsyslogd -f /tmp/rsyslog.conf -d

and then run:

    logger "_testy_ hello world"

As expected, I see 'hello world' logged in /tmp/test.log, not in /tmp/syslog.

Now if I modify the config to:

    $ModLoad imuxsock
    :msg, contains, "testy" /tmp/test.log
    & ~
    $IncludeConfig /tmp/more.conf

and create /tmp/more.conf containing:

    *.*;auth,authpriv.none -/tmp/syslog

Now I get 'hello world' logged to /tmp/test.log and /tmp/syslog, when I would still expect it to only appear in /tmp/test.log. According to http://www.rsyslog.com/doc-rsconf1_includeconfig.html, shouldn't these be logically identical? I've tested with rsyslog 4.2.0 (Ubuntu 10.4) and 5.4.0. Regards, Jeff

From rgerhards at hq.adiscon.com Mon Jun 14 11:17:47 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 14 Jun 2010 11:17:47 +0200 Subject: [rsyslog] IncludeConfig breaking config ordering? References: Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EAC@GRFEXC.intern.adiscon.com>

Hi Jeff, that should not make any difference and I do not see how it could with the statements you give (if you have an async action queue, that would be different). I'll try to reproduce it later today. Rainer

From rgerhards at hq.adiscon.com Mon Jun 14 15:25:19 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 14 Jun 2010 15:25:19 +0200 Subject: [rsyslog] tracing futex calls Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EB1@GRFEXC.intern.adiscon.com>

Hi all, another information request. I am tracing the origin of futex calls I see in strace. However, in my code I see only the pthreads mutex calls, and I am unable to identify the exact source of the futex calls. Does anyone have some advice on how to map what I see in strace to the source code? Any advice would be appreciated. Thanks, Rainer

From rgerhards at hq.adiscon.com Mon Jun 14 15:36:25 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 14 Jun 2010 15:36:25 +0200 Subject: [rsyslog] IncludeConfig breaking config ordering?
References: <9B6E2A8877C38245BFB15CC491A11DA7103EAC@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EB2@GRFEXC.intern.adiscon.com>

Jeff, could you send me off-list a debug log of a run with the nonfunctional config? I am right in the middle of some optimization and it will take me a while to prepare my dev env to use a "regular" version. So a debug log may speed things up. Rainer

From rsyslog at elyograg.org Mon Jun 14 18:34:08 2010 From: rsyslog at elyograg.org (Shawn Heisey) Date: Mon, 14 Jun 2010 10:34:08 -0600 Subject: [rsyslog] Replacing syslog-ng - send local sources to files, remote to mysql Message-ID: <4C165A00.6000300@elyograg.org>

I am replacing my central monitoring server, which is currently running syslog-ng, with another based on Debian lenny and its default rsyslog. The current system logs all remote syslog sources to a separate directory and splits it out into subdirectories by source host. All local sources are sent to /var/log like a normal syslog configuration. This is an extremely easy configuration with syslog-ng.
With rsyslog, I'm interested in logging all remote sources to only mysql, and all local sources to the standard files in /var/log. I got the mysql set up, but everything gets logged to both locations. I'm OK with local data being in both places, but I definitely do not want the remote data in /var/log. I have been searching the mailing list and trying to understand the documentation, but I cannot figure this out. Can someone help? Thanks, Shawn

From david at lang.hm Thu Jun 17 08:05:33 2010 From: david at lang.hm (david at lang.hm) Date: Wed, 16 Jun 2010 23:05:33 -0700 (PDT) Subject: [rsyslog] Replacing syslog-ng - send local sources to files, remote to mysql (fwd) Message-ID:

On Mon, 14 Jun 2010 10:34:08 -0600, Shawn Heisey wrote:
> I am replacing my central monitoring server, which is currently running
> syslog-ng, with another based on Debian lenny and its default rsyslog.
> The current system logs all remote syslog sources to a separate
> directory and splits it out into subdirectories by source host. All
> local sources are sent to /var/log like a normal syslog configuration.
> This is an extremely easy configuration with syslog-ng.
>
> With rsyslog, I'm interested in logging all remote sources to only
> mysql, and all local sources to the standard files in /var/log. I got
> the mysql set up, but everything gets logged to both locations. I'm OK
> with local data being in both places, but I definitely do not want the
> remote data in /var/log.
>
> I have been searching the mailing list and trying to understand the
> documentation, but I cannot figure this out. Can someone help?

I have a case where I need to do something similar. The first action in rsyslog.conf is

    :fromhost, isequal, "127.0.0.1" @192.168.1.8;TraditionalForwardFormat
    & ~

which says that if it's from localhost, write it to the network and do nothing else with the log entry. Anything that comes from any other server will not match this and will fall through to the rest of the actions.
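For Shawn's case (remote to MySQL only, local to the usual files), the same filter-then-discard pattern could be turned around roughly as follows. This is a sketch and not something posted in the thread: the server, database name, user and password are placeholders, the input modules are assumed to be loaded elsewhere, and it assumes that the installed rsyslog supports the negated compare operation (`!isequal`) and reports 127.0.0.1 as `fromhost-ip` for locally generated messages. Both assumptions are worth checking against the documentation for the version in use.

```
# Sketch only; connection parameters below are placeholders.
$ModLoad ommysql

# Anything that did not originate on this host: write to MySQL, then discard
:fromhost-ip, !isequal, "127.0.0.1" :ommysql:localhost,Syslog,rsyslog,secret
& ~

# Only local messages get this far
*.*;auth,authpriv.none    -/var/log/syslog
```

Because the discard comes before the file rules, remote messages never reach /var/log, while local messages skip the first filter and are handled by the normal rules.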
David Lang From james at linux-source.org Thu Jun 17 10:14:51 2010 From: james at linux-source.org (James Corteciano) Date: Thu, 17 Jun 2010 16:14:51 +0800 Subject: [rsyslog] rsyslog - creates wrong directory Message-ID: Hi All, I have setup centralized system logging using rsyslog-3.22.1-3.el5 and this is my /etc/rsyslog.conf {http://pastebin.com/D8eei00p}. In remote host clients, I use syslog package. When I restart syslog service, the centralized rsyslog server creates directory with hostname based. However there are two wrong directory created (exiting and syslogd) on /var/log/syslog. Can help to check if there is mistake lines on my rsyslog configuration? Thank you. Regards, James From rgerhards at hq.adiscon.com Thu Jun 17 13:02:33 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 17 Jun 2010 13:02:33 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Hi all, me again with some feedback request ;) As you all know, rsyslog.conf format is ugly (to phrase it politely). Adiscon has begun an initiative to unify config language across products. I'd like to join that initiative with rsyslog, thus gaining some extra help. So my plan is to actually begin looking at the config language once I am through with the current round of optimizations. To do so, I would like to receive your feedback on a config proposal that I have been able to discuss with an internal expert. We have now based the format on the familiar Apache config format, with some extra bells and whistles to make it more useful for syslog. Most importantly, my personal feeling is that pure Apache format is quite verbose, and thus it may take too much space (read: make unreadable) to describe a plain vanilla rsyslog.conf (all those "pri.severity /do/somewhat" rules). As such, I have asked for a way to describe more condensed rules. I think we have found a nice compromise. 
Also, I have included some things into the config format that would easily enable me to do a next round of performance enhancements once we have the new format in place. It allows expressing things like shared actions, which simply cannot be done with the current engine and would be a major pain to implement without a new config format. As a side-note, I will probably be able to obtain, from Adiscon's closed source, a higher speed expression evaluation engine, which would be something really nice. I have posted a hypothetical config in the new format to http://www.rsyslog.com/download/new_rsyslog.conf It should be relatively self-explanatory (at least to people who know Apache config) -- if not, that's already a bad sign ;) I would appreciate feedback on this config format. Is it useful? Is it readable? Does it look sufficiently simple for simple use cases and sufficiently expressive for complex cases? Would you like to work with such a format? Please raise your voice especially if you DO NOT LIKE the format -- the reason is that I will probably work towards such a format if I don't get feedback that convinces me it is a bad idea to do so ;) Thanks to all, Rainer From rgerhards at hq.adiscon.com Thu Jun 17 14:28:51 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 17 Jun 2010 14:28:51 +0200 Subject: [rsyslog] rsyslog - creates wrong directory References: Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103ED1@GRFEXC.intern.adiscon.com> Hi James, this looks like a problem inside the status message code. If you do not need these messages, a simple work-around is to use $LogRsyslogStatusMessages off on the remote machines. Also note that by using %hostname% plainly, some invalid sequences may be injected by remote systems. The secpath-drop and secpath-replace options may be used to handle slashes within hostname, e.g.
%hostname:::secpath-replace% (See http://www.rsyslog.com/doc-property_replacer.html ) Another alternative is to use fromhost-ip, which is NOT taken from the message, but for that reason only works correctly in non-relay environments. HTH Rainer From epiphani at gmail.com Thu Jun 17 14:26:15 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Thu, 17 Jun 2010 08:26:15 -0400 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: A few comments: execonlyonce 5sec ## this is a directive GLOBAL to ALL actions! You should never be able to declare a globally-impacting variable outside of the global scope. It should be up at the top. If you allow this, it will confuse people: when reading this, how would we know that it was globally impacting without the comment? It appears to be a scoped declaration.
type omfile Perhaps: That would make something like this more readable: listen 10515; ruleset remote10515 In other words, everything should be nameable, and everything defined in-scope should also be referenceable. I'm a little concerned by the reference to rulesets in the input section of the example - there you've got an example of something being referenced before it's been declared. That can also be confusing, even though I understand how that is being handled. I think it comes down to this (I'd bold it if this wasn't plain text): We should be able to easily understand what is being declared against what is being applied, and I think this initial mock-up can be confusing. Lastly - please.... HUP configuration reloads. Having written that type of code myself, I realize it's not easy, but it would be very much appreciated. -Aaron
From rgerhards at hq.adiscon.com Thu Jun 17 14:49:29 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 17 Jun 2010 14:49:29 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103ED2@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe > Sent: Thursday, June 17, 2010 2:26 PM > To: rsyslog-users > Subject: Re: [rsyslog] feedback
requested: NEW rsyslog.conf format > > A few comments: > > execonlyonce 5sec ## this is a directive GLOBAL to > ALL actions! > > You should never be able to declare a globally-impacting variable > outside of the global scope. It should be in up at the top. > If you allow this, it will confuse people: when reading this, how > would we know that it was globally impacting without the comment? It > appears to be a scoped declaration. Oops, sorry, my bad - the comment is wrong :( [I'll remove it from the online version, so do not be surprised if you no longer see it]. This was intended as a *scoped* directive. Only things in global (and these should be few) should actually be global. > > > type omfile > > Perhaps: > > > That would make something like this more readable: > > listen 10515; ruleset > remote10515 This was something we discussed, and kind of my favorite. I got convinced by the following arguments a) Apache uses a single parameter, and it always is the name of the object described b) this becomes ugly if there are more than a handful of modifiers (and there sometimes are) I'd appreciate feedback on these arguments. Note that the overall idea of staying close to Apache is that folks already know that format -- and it seems to be received well. > In other words, everything should be nameable, and everything defined > in-scope should also be reference-able. Yes, that's my idea and requirement as well. I intended to tackle this via ... OR > > I'm a little concerned by the reference to rulesets in the input > section of the example - there you've got an example of something > being referenced before its been declared. That can also be > confusing, even though I understand how that is being handled. One of the things that often causes problems with the current format is that everything needs to be defined before it is used. This has caused quite a few problems. So my idea was to remove that requirement, actually in the hopes that it makes the format easier to use.
If that is not the case, we can stay with the "define before use" paradigm, it is easier to implement from the parser level. Note that for some complicated configs (like multiple listeners, multiple rulesets, multiple queues and omruleset in one big config) it is rather complicated to get the "define before use" into the right sequence. I myself failed more often than once ;) > I think it comes down to this (i'd bold it if this wasn't plain text): > We should be able to easily understand what is being declared against > what is being applied, and I think this initial mock-up can be > confusing. Can you elaborate a little bit more -- I simply do not fully understand the sentence (probably a non-native speaker issue). > > Lastly - please.... HUP configuration reloads. Having written that > type of code myself, I realize its not easy, but it would be very much > appreciated. Yes, yes, yes, ... but: I'd like to have it myself, but it is extremely difficult with all the dynamic things that go on in rsyslog. I intend to tackle that in multiple steps. The new config system would run in two stages: 1) parse the config and create a new config tree 2) apply the new config tree That process facilitates dynamic config reloads, but it doesn't bring them for free. In fact, almost anything inside the core and in all plugins must be touched to make this happen. So my plan is to get the config system ready to support multiple in-core configs, and *at a later stage* and gradually move to dynamic config change. Thanks again for the feedback, it is very valuable. Rainer > > -Aaron > > > On Thu, Jun 17, 2010 at 7:02 AM, Rainer Gerhards > wrote: > > Hi all, > > > > me again with some feedback request ;) > > > > As you all know, rsyslog.conf format is ugly (to phrase it politely). > Adiscon > > has begun an initiative to unify config language across products. I'd > like to > > join that initiative with rsyslog, thus gaining some extra help. 
From aoz.syn at gmail.com Thu Jun 17 15:40:09 2010 From: aoz.syn at gmail.com (RB) Date: Thu, 17 Jun 2010 07:40:09 -0600 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 05:02, Rainer Gerhards wrote: > As you all know, rsyslog.conf format is ugly (to phrase it politely). Adiscon > has begun an initiative to unify config language across products. I'd like to > join that initiative with rsyslog, thus gaining some extra help. So my plan > is to actually begin looking at the config language once I am through with > the current round of optimizations. It's only ugly because the format it's stayed so true to was, IMO, never designed to express the complex configurations we find ourselves needing these days. In spite of being dissatisfied with the current format, I'm always impressed by how much you've managed to shoehorn into it. > To do so, I would like to receive your feedback on a config proposal that I > have been able to discuss with an internal expert. We have now based the > format on the familiar Apache config format, with some extra bells and > whistles to make it more useful for syslog. Regardless of what format you settle on, you're eventually going to have to address the legacy user population.
Providing a conversion routine or service that they can just plug their existing configuration into and validate the results would go a long way toward keeping them happy and moving forward. A web service to do so would be tempting, but we _are_ talking about users that run the same daemon for years - an optionally-built tool may be the way to go. From aoz.syn at gmail.com Thu Jun 17 15:27:52 2010 From: aoz.syn at gmail.com (RB) Date: Thu, 17 Jun 2010 07:27:52 -0600 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 06:26, Aaron Wiebe wrote: > I'm a little concerned by the reference to rulesets in the input > section of the example - there you've got an example of something > being referenced before its been declared. That can also be > confusing, even though I understand how that is being handled. Being a Perl bigot myself, I actually prefer this approach. It eventually becomes less confusing - you only have to worry about scope rather than both order and scope. Only those deeply attached to more strict languages will find it confusing, and exercising that feature would be entirely user-dependent.
From rgerhards at hq.adiscon.com Thu Jun 17 16:05:09 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 17 Jun 2010 16:05:09 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103ED4@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of RB > Sent: Thursday, June 17, 2010 3:40 PM > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format > > On Thu, Jun 17, 2010 at 05:02, Rainer Gerhards > wrote: > > As you all know, rsyslog.conf format is ugly (to phrase it politely). > Adiscon > > has begun an initiative to unify config language across products. I'd > like to > > join that initiative with rsyslog, thus gaining some extra help. So > my plan > > is to actually begin looking at the config language once I am through > with > > the current round of optimizations. > > It's only ugly because the format it's stayed so true to was, IMO, > never designed to express the complex configurations we find ourselves > needing these days. In spite of being dissatisfied with the current > format, I'm always impressed by how much you've managed to shoehorn > into it. > > > To do so, I would like to receive your feedback on a config proposal > that I > > have been able to discuss with an internal expert. We have now based > the > > format on the familiar Apache config format, with some extra bells > and > > whistles to make it more useful for syslog. > > Regardless of what format you settle on, you're eventually going to > have to address the legacy user population. Providing a conversion > routine or service that they can just plug their existing > configuration into and validate the results would go a long way toward > keeping them happy and moving forward. 
A web service to do so would > be tempting, but we _are_ talking about users that run the same daemon > for years - an optionally-built tool may be the way to go. I should have mentioned that. The old format will still be supported. In fact, I expect that many, many installations will continue to use it, as for the very simple use cases of workstations and laptops the old format is hard to beat. Over time, however, some of the advanced features will probably no longer work with the old format, but that remains to be seen. And the old format will definitely never receive updates for new features. In any case, with what I have in mind, the old format will probably be just another way to generate the same in-memory representation of the config. However, I will most probably NOT permit mixing old and new format inside a single file. Rainer From epiphani at gmail.com Thu Jun 17 16:58:42 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Thu, 17 Jun 2010 10:58:42 -0400 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 9:27 AM, RB wrote: > On Thu, Jun 17, 2010 at 06:26, Aaron Wiebe wrote: >> I'm a little concerned by the reference to rulesets in the input >> section of the example - there you've got an example of something >> being referenced before its been declared. That can also be >> confusing, even though I understand how that is being handled. > > Being a Perl bigot myself, I actually prefer this approach. It > eventually becomes less confusing - you only have to worry about scope > rather than both order and scope. Only those deeply attached to more > strict languages will find it confusing, and exercising that feature > would be entirely user-dependent. Being a C bigot...
;) I don't really have a problem with forward references, but I think my objection is more to the fact that it's an input that is declaring its ruleset, whereas I think it would be more logical to bind inputs to rulesets. i.e., declare your inputs, your templates, your filters, and your actions, and tie them all together using rulesets. -Aaron From epiphani at gmail.com Thu Jun 17 17:06:58 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Thu, 17 Jun 2010 11:06:58 -0400 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103ED2@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103ED2@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 8:49 AM, Rainer Gerhards wrote: >> listen 10515; ruleset >> remote10515 > > > This was something we discussed, and kind of my favorite. I got convinced by > the following arguments > > a) Apache uses a single parameter, and it always is the name of the object > described > b) this becomes ugly if there are more than a handful of modifiers (and there > sometimes are) > > I'd appreciate feedback on these arguments. Note that the overall idea of > staying close to Apache is that folks already know that format -- and it > seems to be received well. I don't think the Apache argument holds weight - it's being used in a generic look-and-feel sense as opposed to yanking its actual config parser. If the modifiers are static - i.e., "type, name, use" can be used everywhere - that would solve the concern about adding more and more modifiers. Personally I'd say having "name=" instead of name being implicit would be nice as a minimum. This is a personal style thing though. >> I think it comes down to this (i'd bold it if this wasn't plain text): >> We should be able to easily understand what is being declared against >> what is being applied, and I think this initial mock-up can be >> confusing.
> > Can you elaborate a little bit more -- I simply do not fully understand the > sentence (probably a non-native speaker issue). I've been convinced on the declare wherever argument by yourself and RB. But as a clarification on this point: I think it should be easy to tell what is a declaration and what is an application. For example, if I define an action queue in the global scope, it's a declaration because I MUST reference that declaration in order for it to do anything. That's a declaration. Defining an action queue in a ruleset appears to be an application - it does not require later reference - and thus isn't a declaration. I think that when looking at the config, it should be clear what is a declaration and what is an application. Does that explain it better? -Aaron From rgerhards at hq.adiscon.com Thu Jun 17 17:17:31 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 17 Jun 2010 17:17:31 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com> > I don't really have a problem with forward references, but I think my > objection is more to the fact that its an input that is declaring its > ruleset, whereas I think it would be more logical to bind inputs to > rulesets. ie, declare your inputs, your templates, your filters, and > your actions, and tie them all together using rulesets. mhhh... Good that you bring up this point. To me it looks much more natural that the input specifies its "sink". I actually hadn't thought of it the other way round. Now after reading your post, I realize that it can actually go either way. Let me think a little bit about this (BTW: I am out of office tomorrow, so silence on my part doesn't mean diminishing interest ;)). I'll be driving out shortly, but I hope to be able to look at and reply to mail later this evening.
In any case, now I have something to think about during the drive ;) Thanks, Rainer From alorbach at ro1.adiscon.com Thu Jun 17 17:18:03 2010 From: alorbach at ro1.adiscon.com (Andre Lorbach) Date: Thu, 17 Jun 2010 17:18:03 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe > Sent: Donnerstag, 17. Juni 2010 14:26 > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format > > > type omfile > > Perhaps: > > > That would make something like this more readable: > > listen 10515; ruleset > remote10515 > > In other words, everything should be nameable, and everything defined in- > scope should also be reference-able. During our first discussion, we had this format approach as well. But we stayed with the apache-like approach of having only one parameter in the and other blocks, and all general parameters below as name/value pairs. So everything can have a name/id besides the type. To get further verbose with the initial sample: name inp10515 type imtcp #params holds module-specific config parameters listen 10514 ruleset remote10514 Best regards, Andre Lorbach From rgerhards at hq.adiscon.com Thu Jun 17 17:25:12 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 17 Jun 2010 17:25:12 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103ED9@GRFEXC.intern.adiscon.com> > > -----Original Message----- > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe > > Sent: Donnerstag, 17. 
Juni 2010 14:26 > > To: rsyslog-users > > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format > > > > > > type omfile > > > > Perhaps: > > > > > > That would make something like this more readable: > > > > listen 10515; ruleset > > remote10515 > > > > In other words, everything should be nameable, and everything defined > in- > > scope should also be reference-able. > > During our first discussion, we had this format approach as well. But > we > stayed with the apache-like approach of having only one parameter in > the > and other blocks, and all general parameters below as > name/value > pairs. So everything can have a name/id besides the type. To get > further > verbose with the initial sample: > > name inp10515 > type imtcp > #params holds module-specific config parameters > listen 10514 > ruleset remote10514 > > Following Aaron's route, that would also make quite some sense: listen 10514 ruleset remote10514 Note that I kept the formatting inline with the previous sample. More compact, one could note it as: listen 10514 ruleset remote10514 which I have to admit I like. As we have single name/value pairs, it actually is not a big deal if they are inside the type specifier or within the bracketing tags. Note that by this format we could also get rid of the tag, which is a nice saving in any case. Note to list members: I should have mentioned the difference between the ""-parameters and the others. Those in are module specific, whereas the others are generic and apply to all modules. So would actually be passed down to the module for verification. It is of advantage to separate these two different parameter sets, even though they look pretty much the same from an end user perspective. 
Now let me make one final modification to the code sample: listen=10514 ruleset=remote10514 or equivalent but more compact listen=10514 ruleset=remote10514 (the last one I do *not* prefer, but that would be a user choice) I don't see any notable disadvantage in that, but I see the advantages outlined above. Comments? Rainer From aoz.syn at gmail.com Thu Jun 17 17:20:23 2010 From: aoz.syn at gmail.com (RB) Date: Thu, 17 Jun 2010 09:20:23 -0600 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 08:58, Aaron Wiebe wrote: > Being a C bigot... ;) > > I don't really have a problem with forward references, but I think my > objection is more to the fact that its an input that is declaring its > ruleset, whereas I think it would be more logical to bind inputs to > rulesets. ie, declare your inputs, your templates, your filters, and > your actions, and tie them all together using rulesets. And to be fair, a stricter ruleset is usually easier for both humans and machines to understand and implement. Put this way, it would seem that you're either familiar with the syslog-ng format (which takes this very approach) or have independently come up with it. From rgerhards at hq.adiscon.com Thu Jun 17 17:32:38 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 17 Jun 2010 17:32:38 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103ED2@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EDA@GRFEXC.intern.adiscon.com> > I've been convinced on the declare wherever argument by yourself and > RB. But as a clarification on this point: I think it should be easy > to tell what is a declaration and what is an application.
> > For example, if I define an action queue in the global scope, its a > declaration because i MUST reference that declaration in order for it > to do anything. That's a declaration. > > Defining an action queue in a ruleset appears to be an application - > it does not require later reference - and thus isn't a declaration. > > I think that when looking at the config, it should be clear what is a > declaration and what is an application. > > Does that explain it better? Definitely, and it permits me to provide a new argument :) My goal in designing the new config language is that things need not necessarily be defined upfront. Thus I insisted on the capability to declare actions *right* inside the rule. The idea is that we often have situations (very often indeed) in which an action is used only once. I find it somewhat unintuitive (and error-prone) when one needs to declare and name the action first, just to use it one time. In that sense, I'd like to have the capability to declare such objects in scope. But on the other hand, if you use the same action in 5 rules, it is more natural (and even important from a code POV) to declare it just once and then re-use it. Thus declare it globally and reuse it inside the rules. Of course, this system can be abused (declare & name action inside a scope and reuse it in another scope), but with great power always comes great ability to screw up ;) That would definitely not be a recommended config (but I would not forcefully try to forbid it, as that would complicate the parser). So, in short, I'd like to have the ability to do ... ... # a one-timer as well as ... ... ... ...
Rainer From epiphani at gmail.com Thu Jun 17 17:41:51 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Thu, 17 Jun 2010 11:41:51 -0400 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EDA@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103ED2@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDA@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 11:32 AM, Rainer Gerhards wrote: > > My goal in designing the new config language is to have the ability that > things do not necessarily be defined upfront. Thus I insisted on the > capability to declare actions *right* inside the rule. The idea is that we > often have situation (very often indeed) that an action is used only once. I > find it somewhat unintuitive (and error-prone) when one needs to declare and > name the action first, just to use it one time. In that sense, I'd like to > have the capability to declare such objects in scope. I see where you're coming from - and I do agree. That said, I think getting the difference visible is important. For example, if you want to use an action one-off, you shouldn't be permitted to name it. If you want to declare it for multiple reuse, it must be at a specific scope level (ie, global scope only), and it must be named. That said, no objections. 
:) -Aaron From rgerhards at hq.adiscon.com Thu Jun 17 22:42:58 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 17 Jun 2010 22:42:58 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com> > > I don't really have a problem with forward references, but I think my > > objection is more to the fact that its an input that is declaring its > > ruleset, whereas I think it would be more logical to bind inputs to > > rulesets. ie, declare your inputs, your templates, your filters, and > > your actions, and tie them all together using rulesets. > > mhhh... Good that you bring up this point. To me it looks much more > natural > that the input specifies its "sink". I actually hadn't thought the > other way > round. Now after reading your post, I realize that it is actually > either way. > > Let me think a little bit about this (BTW: I am out of office tomorrow, > so > silence on my part doesn't mean diminishing interest ;)). I'll driving > out > shortly, but I hope to be able to look and reply to mail later this > evening. > In any case, now I have something to think about during the drive ;) After my long drive, I now know why I find it natural to specify the ruleset to bind to inside the input. From my point of view, there is an n:1 relationship between inputs and rulesets. After a message is received, the input must place the message somewhere. This somewhere is a ruleset. Hypothetically, one can think about an n:m relationship, so that an input can have more than one ruleset that the message is being submitted to. However, I am a bit skeptical whether that is really useful, because what happens is that the input must then duplicate the message and submit it to multiple rulesets/queues. 
That is another interface where messages may be multiplexed to various next destinations. It also means that we need some more object/code that handles the n:m relationship, where the input hands over the message to that object and that then multiplexes it to the receiving rulesets. While this is not necessarily a big performance hit, I have learned that small hits also accumulate. Also, it increases code complexity. So rather than doing it this way, I prefer to think about the input submitting a message to a single ruleset. Then, the ruleset can decide what to do with the message. Most importantly, it may decide to push all messages to other (sub-) rulesets. With the current system, this goes only via omruleset, which is kind of a quick hack. With the new system, I could simply do an include/copy automatically during creation of the config tree. Is there any good argument that backs a n:m relationship? Rainer From aoz.syn at gmail.com Thu Jun 17 22:56:51 2010 From: aoz.syn at gmail.com (RB) Date: Thu, 17 Jun 2010 14:56:51 -0600 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 14:42, Rainer Gerhards wrote: > Is there any good argument that backs a n:m relationship? A syslogd that accepts UDP, TCP, and RELP feeds and dumps them both to a database for immediate analysis/queries and to a WORM filesystem for long-term archival. 
RB From rgerhards at hq.adiscon.com Thu Jun 17 23:02:29 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 17 Jun 2010 23:02:29 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com> > > Is there any good argument that backs a n:m relationship? > > A syslogd that accepts UDP, TCP, and RELP feeds and dumps them both to > a database for immediate analysis/queries and to a WORM filesystem for > long-term archival. Mmmh... I guess we are misunderstanding each other here. I talk about rule*sets*, not rules (within rulesets). Or I got the sample wrong. I would create this scenario as follows: Note that each input has exactly one ruleset whereas the (single) ruleset is used by three inputs. Thus we have 1:n rather than m:n. Misunderstanding or did I overlook something? Rainer From aoz.syn at gmail.com Thu Jun 17 23:25:16 2010 From: aoz.syn at gmail.com (RB) Date: Thu, 17 Jun 2010 15:25:16 -0600 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 15:02, Rainer Gerhards wrote: > Mmmh... I guess we are misunderstanding each other here. I talk about rule*sets*, not > rules (within rulesets). Or I got the sample wrong. I would create this > scenario as follows: > > > ? > ? 
> > > > > > > Note that each input has exactly one ruleset whereas the (single) ruleset is > used by three inputs. Thus we have 1:n rather than m:n. > > Misunderstanding or did I overlook something? Probably not, but assuming each individual rule could have independent selectors (only send high-urgency messages to the database), I'm most likely just not grasping what you intend by m:n. Then again, my uses are relatively simple and at a cursory glance the current proposal more than fits them. RB From rgerhards at hq.adiscon.com Thu Jun 17 23:47:36 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 17 Jun 2010 23:47:36 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EDD@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of RB > Sent: Thursday, June 17, 2010 11:25 PM > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format > > On Thu, Jun 17, 2010 at 15:02, Rainer Gerhards > wrote: > > Mmmh... I guess we are misunderstandig here. I talk about rule*sets*, > not > > rules (within rulesets). Or I get the sample wrong. I would creates > this > > scenario as follows: > > > > > > ? > > ? > > > > > > > > > > > > > > Note that each input has exactly one ruleset whereas the (single) > ruleset is > > used by three inputs. Thus we have 1:n rather than m:n. > > > > Misunderstanding or did I overlook something? 
> > Probably not, but assuming each individual rule could have independent > selectors (only send high-urgency messages to the database), I'm most > likely just not grasping what you intend by m:n. Then again, my uses > are relatively simple and at a cursory glance the current proposal > more than fits them. I left the selectors out for brevity. I think you actually missed one level. At the top there are rulesets. These are composed of multiple rules. Each rule is then composed of a filter (selector) and multiple actions. This hierarchy already exists in v5, but the top level is seldom used. I am talking about the relationship between inputs and rulesets (not rules). What I intended to say is that I don't see a need for a single input to be bound to more than one ruleset. On the other hand, it is definitely necessary to have the ability to bind a ruleset to more than one input. I hope that clarifies. Rainer From aoz.syn at gmail.com Fri Jun 18 01:01:45 2010 From: aoz.syn at gmail.com (RB) Date: Thu, 17 Jun 2010 17:01:45 -0600 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EDD@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDD@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 15:47, Rainer Gerhards wrote: > I think you actually missed one level. At the top there are rulesets. These > are composed of multiple rules. Each rule is then composed of a filter > (selector) and multiple actions. This hierarchy already exists in v5, but the > top level is seldom used. > > I am talking about the relationship between inputs and rulesets (not rules). 
> What I intended to say is that I don't see a need that a single input can be > bound to more than one ruleset. On the other hand, it is definitely necessary > to have the ability to bind a ruleset to more than one input. > > I hope that clarifies. It does clarify; I did indeed miss the rule->ruleset relationship. To justify an m:n relationship, then, I think you'd have to evaluate what features other than sub-rules a ruleset provides. So far as I can tell from reading the docs in the source tree, there are currently two: independent queues and message parsers. It is conceivable (but perhaps not practical) that a user may want to parse a given message multiple ways, or to queue differently based on the speed of a given rule+action. If rules themselves had independent queues and message parsers, then that would return to the 1:n. That is to say, as long as there are features unique to rulesets, it is conceivable that a user would desire to map a single input to several of them. Please do note, I've not had cause to dig through all of v5's features, so what I speak of may already be addressed. RB From rsyslog at elyograg.org Fri Jun 18 01:51:22 2010 From: rsyslog at elyograg.org (Shawn Heisey) Date: Thu, 17 Jun 2010 17:51:22 -0600 Subject: [rsyslog] Replacing syslog-ng - send local sources to files, remote to mysql In-Reply-To: <4C165A00.6000300@elyograg.org> References: <4C165A00.6000300@elyograg.org> Message-ID: <4C1AB4FA.1090100@elyograg.org> After looking at the actual contents of the database for a clue, what I ended up doing was changing /etc/rsyslog.d/mysql.conf to this: :fromhost,!regex,"^serverhostname$" :ommysql:localhost,rsyslogmysql,dbuser,dbpassword & ~ Now it does exactly what I want. On 6/14/2010 10:34 AM, Shawn Heisey wrote: > I am replacing my central monitoring server, which is currently running > syslog-ng, with another based on Debian lenny and its default rsyslog. 
> The current system logs all remote syslog sources to a separate > directory and splits it out into subdirectories by source host. All > local sources are sent to /var/log like a normal syslog configuration. > This is an extremely easy configuration with syslog-ng. > > With rsyslog, I'm interested in logging all remote sources to only > mysql, and all local sources to the standard files in /var/log. I got > the mysql set up, but everything gets logged to both locations. I'm OK > with local data being in both places, but I definitely do not want the > remote data in /var/log. > > I have been searching the mailing list and trying to understand the > documentation, but I cannot figure this out. Can someone help? From epiphani at gmail.com Fri Jun 18 02:09:20 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Thu, 17 Jun 2010 20:09:20 -0400 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDD@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 7:01 PM, RB wrote: > > That is to say, as long as there are features unique to rulesets, it > is conceivable that a user would desire to map a single input to > several of them. I think most of the concerns that seem to be going around (of which a lot of them seem to be terminology) would be resolved if we have the ability to chain rulesets into each other - which I believe omruleset presently is capable of - a ruleset being an output target. Is that correct? 
-Aaron From aoz.syn at gmail.com Fri Jun 18 05:08:08 2010 From: aoz.syn at gmail.com (RB) Date: Thu, 17 Jun 2010 21:08:08 -0600 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDD@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 18:09, Aaron Wiebe wrote: > I think most of the concerns that seem to be going around (of which a > lot of them seem to be terminology) would be resolved if we have the > ability to chain rulesets into each other - which I believe omruleset > presently is capable of - a ruleset being an output target. Is that > correct? I believe you are correct. I'd consider it a rather complex configuration, but then again - the user would be requesting rather complex behavior from rsyslog. 
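[Editorial sketch, not part of the original thread.] The hierarchy discussed above (several inputs bound to one ruleset, a ruleset made of rules, each rule a filter plus actions, and one ruleset chained into another as an output target) can be illustrated with a small toy model. This is not rsyslog code and does not reflect how rsyslog or omruleset is implemented; every class, function, and variable name below is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# An action is anything that consumes a message (hypothetical type alias).
Action = Callable[[str], None]


@dataclass
class Rule:
    """A filter (selector) plus the actions to run when it matches."""
    matches: Callable[[str], bool]
    actions: List[Action]


@dataclass
class Ruleset:
    """A named collection of rules; the top level of the hierarchy."""
    name: str
    rules: List[Rule] = field(default_factory=list)

    def process(self, msg: str) -> None:
        for rule in self.rules:
            if rule.matches(msg):
                for action in rule.actions:
                    action(msg)

    def as_action(self) -> Action:
        """Expose this ruleset as an action so another ruleset can chain into it."""
        return self.process


@dataclass
class Input:
    """An input is bound to exactly one ruleset (n:1 from inputs to rulesets)."""
    name: str
    ruleset: Ruleset

    def receive(self, msg: str) -> None:
        self.ruleset.process(msg)


# Stand-ins for the database and the long-term archive from RB's use case.
database: List[str] = []
archive: List[str] = []

# Sub-ruleset: only messages containing "error" reach the database.
urgent = Ruleset("urgent", [Rule(lambda m: "error" in m, [database.append])])

# Main ruleset: archive everything, then chain into the sub-ruleset.
main = Ruleset("main", [Rule(lambda m: True, [archive.append, urgent.as_action()])])

# Three inputs share the single main ruleset.
inputs = [Input(n, main) for n in ("udp", "tcp", "relp")]
for inp in inputs:
    inp.receive(f"{inp.name}: error: disk full")
inputs[0].receive("udp: info: all good")
```

In this model no input ever needs a second ruleset: fan-out to several destinations happens inside the ruleset, either through multiple actions on one rule or by chaining into a sub-ruleset.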
From rgerhards at hq.adiscon.com Fri Jun 18 07:48:34 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 18 Jun 2010 07:48:34 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDD@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EDE@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of RB > Sent: Friday, June 18, 2010 5:08 AM > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format > > On Thu, Jun 17, 2010 at 18:09, Aaron Wiebe wrote: > > I think most of the concerns that seem to be going around (of which a > > lot of them seem to be terminology) would be resolved if we have the > > ability to chain rulesets into each other - which I believe omruleset > > presently is capable of - a ruleset being an output target. Is that > > correct? > > I believe you are correct. Yes, this is correct. Even though omruleset is a bit hackish, it works quite well. With the new config system, a much cleaner and easier-to-grasp implementation is possible. > I'd consider it a rather complex > configuration, but then again - the user would be requesting rather > complex behavior from rsyslog. One thing that you (RB) brought up is very interesting: the ability to parse a message multiple times. Would that actually be useful? So far, I have worked on the assumption that a message will be parsed exactly once, thinking that the parser is bound to a device-specific format (and all messages from the same device have the same format). 
Note that even today it is possible to MODIFY messages after they are parsed. Message modification modules do that. However, there currently does not exist any such module. I had no need to create one and as it looks nobody else had ;) Rainer From epiphani at gmail.com Fri Jun 18 14:57:11 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Fri, 18 Jun 2010 08:57:11 -0400 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EDE@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDD@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDE@GRFEXC.intern.adiscon.com> Message-ID: On Fri, Jun 18, 2010 at 1:48 AM, Rainer Gerhards wrote: > > One thing that you (RB) brought up is very interesting: the ability to parse > a message multiple times. Would that actually be useful? So far, I have > worked on the assumption that a message will be parsed exactly once, thinking > that the parser is bound to a device-specific format (and all messages from > the same device have the same format). Note that even today it is possible to > MODIFY messages after they are parsed. Message modification modules do that. > However, there currently does not exist any such module. I had no need to > create one and as it looks nobody else had ;) I think multiple parsings would make sense if the method to do the parser passes worked something like this: 1. First ruleset, multiple source inputs, extremely simple pattern match 2. Second with very complex rules for rare cases where only 10% of traffic inbound to first ruleset makes it. Would this be a good application of omruleset, or would there be a more efficient method? 
Secondly, rsyslog already modifies the stream in sometimes difficult to understand ways. You'd be surprised how many syslog sources completely ignore the expected format. That said, I would LOVE to have something that could rewrite a log line based on some variation of tokens or regex (a la awk). Full regex would probably be required, but it would be nice to also have a simple interface as well. -Aaron From aoz.syn at gmail.com Fri Jun 18 15:40:55 2010 From: aoz.syn at gmail.com (RB) Date: Fri, 18 Jun 2010 07:40:55 -0600 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EDE@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDD@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDE@GRFEXC.intern.adiscon.com> Message-ID: On Thu, Jun 17, 2010 at 23:48, Rainer Gerhards wrote: > One thing that you (RB) brought up is very interesting: the ability to parse > a message multiple times. Would that actually be useful? So far, I have > worked on the assumption that a message will be parsed exactly once, thinking > that the parser is bound to a device-specific format (and all messages from > the same device have the same format). Note that even today it is possible to > MODIFY messages after they are parsed. Message modification modules do that. > However, there currently does not exist any such module. I had no need to > create one and as it looks nobody else had ;) At that level, perhaps not. I was thinking of the SQL v. long-term storage use case again. 
For long-term storage you'd want as raw a copy of the message as possible, but for SQL you would want to make sure it didn't have any injection and was properly escaped, perhaps only even having a subset of the information. Those uses, however, don't seem to belong in front-end parsing. I could see someone wanting to do it, but also shrugging once they figure out it can't be done and doing it the right way. From david at lang.hm Fri Jun 18 18:29:07 2010 From: david at lang.hm (david at lang.hm) Date: Fri, 18 Jun 2010 09:29:07 -0700 (PDT) Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: On Thu, 17 Jun 2010, Rainer Gerhards wrote: > Please raise your voice especially if you DO NOT LIKE the format -- the > reasons is that I will probably work towards such a format if I don't get > feedback that convinces me it is a bad idea to do so ;) I have not yet had a chance to dig through this, but there is one thing I want to point out. I think that one reason rsyslog has become standard instead of syslog-ng is that the rsyslog config file is understandable to people used to traditional syslog. If you are doing the same thing you would do with traditional syslog it's pretty trivial, and doing simple filtering is simple. syslog-ng has a more powerful config language than rsyslog at the moment, but to do anything with it requires learning a bunch of new stuff. if the new config language completely replaces the old one I think it will be a problem, however if the new config language is optional (available for power users, but not required for simple things) then I think it could work. 
David Lang From aoz.syn at gmail.com Fri Jun 18 18:51:06 2010 From: aoz.syn at gmail.com (RB) Date: Fri, 18 Jun 2010 10:51:06 -0600 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format In-Reply-To: References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: On Fri, Jun 18, 2010 at 10:29, wrote: > On Thu, 17 Jun 2010, Rainer Gerhards wrote: > if the new config language completely replaces the old one I think it will > be a problem, however if the new config language is optional (available > for power users, but not required for simple things) then I think it could > work. Rainer did point out earlier in the thread that the original configuration syntax will remain, but that it would no longer be actively developed and that it would eventually be missing new features. From peter_macko at msn.com Sat Jun 19 07:28:54 2010 From: peter_macko at msn.com (Peter Macko) Date: Sat, 19 Jun 2010 05:28:54 +0000 Subject: [rsyslog] Windows-LogParser-TCP-RSyslog problem Message-ID: I am trying to configure a central loghost, RSyslog version 3.22.1 on CentOS 5.5 i386, to collect Event Logs from Windows workstations. On the Windows side I use LogParser 2.2. Everything works fine with UDP. When I switch to TCP, the first message is OK, but subsequent messages start with <14> and are not separated, each onto a new line. Let me show you some examples: UDP: Jun 18 13:50:51 OIT03 Windows: Security 1102 8 Success Audit event The audit log was cleared. Subject: Security ID: S-1-1-11-1111111111-1111111111-1111111111-1111 Account Name: Macko Domain Name: OIT03 Logon ID: 0x465b0 Jun 18 13:53:08 OIT03 Windows: Security 4776 16 Failure Audit event The domain controller attempted to validate the credentials for an account. 
Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon Account: Adminuser Source Workstation: OIT03 Error Code: 0xc000006a Jun 18 13:53:10 OIT03 Windows: Security 4776 16 Failure Audit event The domain controller attempted to validate the credentials for an account. Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon Account: Adminuser Source Workstation: OIT03 Error Code: 0xc000006a TCP: Jun 18 13:50:51 OIT03 Windows: Security 1102 8 Success Audit event The audit log was cleared. Subject: Security ID: S-1-1-11-1111111111-1111111111-1111111111-1111 Account Name: Macko Domain Name: OIT03 Logon ID: 0x465b0 <14>Jun 18 13:53:08 OIT03 Windows:Security 4776 16 Failure Audit event The domain controller attempted to validate the credentials for an account. Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon Account: Adminuser Source Workstation: OIT03 Error Code: 0xc000006a <14>Jun 18 13:53:10 OIT03 Windows:Security 4776 16 Failure Audit event The domain controller attempted to validate the credentials for an account. Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon Account: Adminuser Source Workstation: OIT03 Error Code: 0xc000006a <14>Jun 18 13:53:13 OIT03 Windows:Security 4776 16 Failure Audit event The domain controller attempted to validate the credentials for an account. Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Log on Account: Adminuser Source Workstation: OIT03 Error Code: 0xc000006a <14>Jun 18 13:53:28 OIT03 Windows:Security ... TCP IN DEBBUG MODE: 3037.255100375:imtcp.c: error: message received is larger than max msg size, we split it 3037.255470975:imtcp.c: logmsg: flags 0, from '10.10.1.51', msg Jun 18 13:50:51 OIT03 Windows:Security 1102 8 Success Audit event The audit log was cleared. 
Subject: Security ID: S-1-1-11-1111111111-1111111111-1111111111-1111 Account Name: Macko Domain Name: OIT03 Logon ID: 0x465b0 <14>Jun 18 13:53:08 OIT03 Windows:Security 4776 16 Failure Audit event The domain controller attempted to validate the credentials for an account. Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon Account: Adminuser Source Workstation: OIT03 Error Code: 0xc000006a <14>Jun 18 13:53:10 OIT03 Windows:Security 4776 16 Failure Audit event The domain controller attempted to validate the credentials for an account. Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon Account: Adminuser Source Workstation: OIT03 Error Code: 0xc000006a <14>Jun 18 13:53:13 OIT03 Windows:Security 4776 16 Failure Audit event The domain controller attempted to validate the credentials for an account. Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGeE?????T?`?(???: `?(????W8?? `?a8x???f x??P?P?X?8?????? |??c???????x??8 ???|????????????????????????????????<14>Jun 18 13:50:51 OIT03 Windows:Security ... _________________________________________________________________ Hotmail: Free, trusted and rich email service. https://signup.live.com/signup.aspx?id=60969 From rgerhards at hq.adiscon.com Sat Jun 19 12:22:55 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sat, 19 Jun 2010 12:22:55 +0200 Subject: [rsyslog] Windows-LogParser-TCP-RSyslog problem References: Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EE4@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Peter Macko > Sent: Saturday, June 19, 2010 7:29 AM > To: rsyslog at lists.adiscon.com > Subject: [rsyslog] Windows-LogParser-TCP-RSyslog problem > > > I am trying to configure central loghost RSyslog version 3.22.1 on > CentOS 5.5 i386 to log windows workstations Event Logs. > On the windows side I use LogParser 2.2. > Everything works fine with UDP. 
When I swap to TCP, the first message > is Ok, but next messages start with <14> and they do not separate, > each message on new line. If I understand you correctly (and the samples seem to back up that view), LogParser is broken. They need to fix their TCP framing. They can use either NL after each message (industry standard) or octet-count based framing as described in RFC5425. Rainer > > Let me show you some examples: > > UDP: > > Jun 18 13:50:51 OIT03 Windows: Security 1102 8 Success Audit event The > audit log was cleared. Subject: Security ID: S-1-1-11-1111111111- > 1111111111-1111111111-1111 Account Name: Macko Domain Name: OIT03 Logon > ID: 0x465b0 > Jun 18 13:53:08 OIT03 Windows: Security 4776 16 Failure Audit event The > domain controller attempted to validate the credentials for an account. > Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon > Account: Adminuser Source Workstation: OIT03 Error Code: 0xc000006a > Jun 18 13:53:10 OIT03 Windows: Security 4776 16 Failure Audit event The > domain controller attempted to validate the credentials for an account. > Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon > Account: Adminuser Source Workstation: OIT03 Error Code: 0xc000006a > > TCP: > > Jun 18 13:50:51 OIT03 Windows: Security 1102 8 Success Audit event The > audit log was cleared. Subject: Security ID: S-1-1-11-1111111111- > 1111111111-1111111111-1111 Account Name: Macko Domain Name: OIT03 Logon > ID: 0x465b0 <14>Jun 18 13:53:08 OIT03 Windows:Security 4776 16 Failure > Audit event The domain controller attempted to validate the > credentials for an account. Authentication Package: > MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon Account: Adminuser Source > Workstation: OIT03 Error Code: 0xc000006a <14>Jun 18 13:53:10 OIT03 > Windows:Security 4776 16 Failure Audit event The domain controller > attempted to validate the credentials for an account. 
Authentication > Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon Account: Adminuser > Source Workstation: OIT03 Error Code: 0xc000006a <14>Jun 18 13:53:13 > OIT03 Windows:Security 4776 16 Failure Audit event The domain > controller attempted to validate the credentials for an account. > Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Log on > Account: Adminuser Source Workstation: OIT03 Error Code: 0xc000006a > <14>Jun 18 13:53:28 OIT03 Windows:Security ... > > TCP IN DEBBUG MODE: > > 3037.255100375:imtcp.c: error: message received is larger than max msg > size, we split it > 3037.255470975:imtcp.c: logmsg: flags 0, from '10.10.1.51', msg Jun 18 > 13:50:51 OIT03 Windows:Security 1102 8 Success Audit event The audit > log was cleared. Subject: Security ID: S-1-1-11-1111111111-1111111111- > 1111111111-1111 Account Name: Macko Domain Name: OIT03 Logon ID: > 0x465b0 <14>Jun 18 13:53:08 OIT03 Windows:Security 4776 16 Failure > Audit event The domain controller attempted to validate the credentials > for an account. Authentication Package: > MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon Account: Adminuser Source > Workstation: OIT03 Error Code: 0xc000006a <14>Jun 18 13:53:10 OIT03 > Windows:Security 4776 16 Failure Audit event The domain controller > attempted to validate the credentials for an account. Authentication > Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon Account: Adminuser > Source Workstation: OIT03 Error Code: 0xc000006a <14>Jun 18 13:53:13 > OIT03 Windows:Security 4776 16 Failure Audit event The domain > controller attempted to validate the credentials for an account. > Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGeE?????T?`?(???: > `?(????W8?? > `?a8x???f > x??P?P?X?8?????? |??c???????x??8 > ???|????????????????????????????????<14>Jun 18 13:50:51 OIT03 > Windows:Security ... > _________________________________________________________________ > Hotmail: Free, trusted and rich email service. 
> https://signup.live.com/signup.aspx?id=60969 > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From mrdemeanour at jackpot.uk.net Sun Jun 20 10:56:13 2010 From: mrdemeanour at jackpot.uk.net (Mr. Demeanour) Date: Sun, 20 Jun 2010 09:56:13 +0100 Subject: [rsyslog] Windows-LogParser-TCP-RSyslog problem In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EE4@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103EE4@GRFEXC.intern.adiscon.com> Message-ID: <4C1DD7AD.7070200@jackpot.uk.net> Rainer Gerhards wrote: >> -----Original Message----- >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >> bounces at lists.adiscon.com] On Behalf Of Peter Macko >> Sent: Saturday, June 19, 2010 7:29 AM >> To: rsyslog at lists.adiscon.com >> Subject: [rsyslog] Windows-LogParser-TCP-RSyslog problem >> >> >> I am trying to configure central loghost RSyslog version 3.22.1 on >> CentOS 5.5 i386 to log windows workstations Event Logs. >> On the windows side I use LogParser 2.2. >> Everything works fine with UDP. When I swap to TCP, the first message >> is Ok, but next messages start with <14> and they do not separate, >> each message on new line. > > If I understand you correctly (and the samples seem to backup that view), > LogParser is broken. They need to fix their TCP framing. The can use either > NL after each message (industry standard) or octet-count based framing as > described in RFC5425. Try NTSyslog. It mostly works well here. And it's free. http://ntsyslog.sourceforge.net/ -- MrD. 
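[Editorial sketch, not part of the original thread.] The two TCP framing options mentioned above, an LF after each message or an octet count in front of each message, can be illustrated with a few lines of Python. This is a simplified illustration under stated assumptions, not rsyslog's or LogParser's actual code: the function names are hypothetical, the sample messages are made up, and real receivers must also handle partial reads and malformed input.

```python
from typing import List


def frame_newline(messages: List[bytes]) -> bytes:
    """Sender side, LF framing: terminate every message with a newline."""
    return b"".join(m + b"\n" for m in messages)


def frame_octet_count(messages: List[bytes]) -> bytes:
    """Sender side, octet counting: prefix each message with 'LEN SP'."""
    return b"".join(str(len(m)).encode("ascii") + b" " + m for m in messages)


def split_newline(stream: bytes) -> List[bytes]:
    """Receiver side, LF framing: every LF-terminated chunk is one message."""
    return [chunk for chunk in stream.split(b"\n") if chunk]


def split_octet_count(stream: bytes) -> List[bytes]:
    """Receiver side, octet counting: read the length, then exactly that many bytes."""
    messages = []
    pos = 0
    while pos < len(stream):
        space = stream.index(b" ", pos)
        length = int(stream[pos:space].decode("ascii"))
        start = space + 1
        messages.append(stream[start:start + length])
        pos = start + length
    return messages


# Made-up sample messages, used only to exercise the functions.
msgs = [b"<14>first event", b"<14>second event", b"<14>third event"]

# A sender that writes messages back to back with no delimiter at all.
unframed = b"".join(msgs)

# A receiver expecting LF framing cannot tell where one message ends.
merged = split_newline(unframed)

# With either framing method, the receiver recovers the individual messages.
recovered_lf = split_newline(frame_newline(msgs))
recovered_oc = split_octet_count(frame_octet_count(msgs))
```

With no delimiter the receiver sees a single ever-growing message, which is consistent with the "message received is larger than max msg size, we split it" debug line in the report above.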
From rgerhards at hq.adiscon.com Sun Jun 20 12:49:28 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 20 Jun 2010 12:49:28 +0200 Subject: [rsyslog] UDPSpoof Module References: <593B71C383F94847BCBD5D5807D93842C9ECF2@TUS1XCHCLUPIN23.enterprise.veritas.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EE6@GRFEXC.intern.adiscon.com> Hi Johnathan, the rsyslog project does not create RPMs itself. We solely rely on package maintainers for that. So as far as I am concerned, I can only offer advice in regard on how to compile, but please be warned that I do not have Centos available to try it out myself. But maybe it helps if you could point out / show a make log the problems that you see. Also, others on this list have more experience with Centos and may be able to provide some information (anyone?). Rainer PS: it may be useful to subscribe to the mailing list, if you have not already done so. > -----Original Message----- > From: Jon Franz [mailto:Jon_Franz at symantec.com] > Sent: Saturday, June 19, 2010 3:22 PM > To: Rainer Gerhards > Subject: Re: UDPSpoof Module > > Absolutely, I would appreciate anyones assistance to get this working. > > Thanks, > > Jonathan > > ----- Original Message ----- > From: Rainer Gerhards > To: Jon Franz > Sent: Sat Jun 19 03:11:02 2010 > Subject: RE: UDPSpoof Module > > Hi Jonathan, > > is it OK if I reply to the mailing list? First of all, it may be > interesting > for others as well, and secondly others may have a better answer than I > ;) > > Rainer > > > -----Original Message----- > > From: Jon Franz [mailto:Jon_Franz at symantec.com] > > Sent: Saturday, June 19, 2010 4:17 AM > > To: Rainer Gerhards > > Subject: UDPSpoof Module > > > > Rainer. > > > > > > > > I am having all types of issues trying to get the module to compile > and > > work on Centos 5.4. Is there or do you have an RPM for the module > that > > will install on Centos 5.4 x86_64. I would greatly appreciate any > help > > you can provide me. 
I have found RPM for SUSE but they are all > invalid > > or corrupt. I am new to this and love the product but really need to > > get the working or I am going to be forced to switch to syslog-ng and > I > > do not want to. > > > > > > > > Thanks, > > > > > > > > Jonathan Franz From Jon_Franz at symantec.com Sun Jun 20 18:18:11 2010 From: Jon_Franz at symantec.com (Jon Franz) Date: Sun, 20 Jun 2010 09:18:11 -0700 Subject: [rsyslog] UDPSpoof RPM for Centos 5.4 x86_64 Message-ID: <593B71C383F94847BCBD5D5807D9384203347EEB@TUS1XCHCLUPIN23.enterprise.veritas.com> Does anyone have an RPM for the udpspoof module 5.4.0 that will work on Centos 5.4 x86_64. I am having issues getting it to make using --enable-omudpspoof option. Any help is greatly appreciated. From rgerhards at hq.adiscon.com Mon Jun 21 12:53:07 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 21 Jun 2010 12:53:07 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103ED2@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDA@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EF2@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe > Sent: Thursday, June 17, 2010 5:42 PM > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format > > On Thu, Jun 17, 2010 at 11:32 AM, Rainer Gerhards > wrote: > > > > My goal in designing the new config language is to have the ability > that > > things do not necessarily be defined upfront. Thus I insisted on the > > capability to declare actions *right* inside the rule. The idea is > that we > > often have situation (very often indeed) that an action is used only > once. 
I > > find it somewhat unintuitive (and error-prone) when one needs to > declare and > > name the action first, just to use it one time. In that sense, I'd > like to > > have the capability to declare such objects in scope. > > I see where you're coming from - and I do agree. That said, I think > getting the difference visible is important. For example, if you want > to use an action one-off, you shouldn't be permitted to name it. If > you want to declare it for multiple reuse, it must be at a specific > scope level (ie, global scope only), and it must be named. makes sense ;) Rainer > > That said, no objections. :) > > -Aaron > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Mon Jun 21 12:58:38 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 21 Jun 2010 12:58:38 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDD@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDE@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EF3@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Aaron Wiebe > Sent: Friday, June 18, 2010 2:57 PM > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format > > On Fri, Jun 18, 2010 at 1:48 AM, Rainer Gerhards > wrote: > > > > One thing that you (RB) brought up is very interesting: the ability > to parse > > a message multiple times. Would that actually be useful? 
So far, I > have > > worked on the assumption that a message will be parsed exactly once, > thinking > > that the parser is bound to a device-specific format (and all > messages from > > the same device have the same format). Note that even today it is > possible to > > MODIFY messages after they are parsed. Message modification modules > do that. > > However, there currently does not exist any such module. I had no > need to > > create one and as it looks nobody else had ;) > > I think multiple parsings would make sense if the method to do the > parser passes worked something like this: > > 1. First ruleset, multiple source inputs, extremely simple pattern > match > 2. Second with very complex rules for rare cases where only 10% of > traffic inbound to first ruleset makes it. > > Would this be a good application of omruleset, or would there be a > more efficient method? The fundamental design (currently) is that a message is parsed exactly once. So that would probably not work as you expect. But note that the rawmessage is always kept in %rawmsg%, so whatever an output wants to do, it can do based on the original message content. > Secondly, rsyslog already modifies the stream in sometimes difficult > to understand ways. You'd be surprised how many syslog sources > completely ignore the expected format. Not the least ;) This is what parser modules are actually for: they should match the format of a given source, assuming that a single source emits a malformed, but consistent format. > That said, I would LOVE to > have something that could rewrite a log line based on some variation > of tokens or regex (a la awk). Full regex would probably be required, > but it would be nice to also have a simple interface as well. While not easy for the average user, the new strgen moduls can be used to do that, and do it in a high-performance way (because they are C programs). They can work on %rawmsg%, so you can do whatever format mangling you would like to do. 
Rainer > > -Aaron > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Mon Jun 21 13:00:20 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 21 Jun 2010 13:00:20 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDD@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EDE@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EF4@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of RB > Sent: Friday, June 18, 2010 3:41 PM > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format > > On Thu, Jun 17, 2010 at 23:48, Rainer Gerhards > wrote: > > One thing that you (RB) brought up is very interesting: the ability > to parse > > a message multiple times. Would that actually be useful? So far, I > have > > worked on the assumption that a message will be parsed exactly once, > thinking > > that the parser is bound to a device-specific format (and all > messages from > > the same device have the same format). Note that even today it is > possible to > > MODIFY messages after they are parsed. Message modification modules > do that. > > However, there currently does not exist any such module. I had no > need to > > create one and as it looks nobody else had ;) > > At that level, perhaps not. I was thinking of the SQL v. long-term > storage use case again. 
For long-term storage you'd want as raw a > copy of the message as possible, but for SQL you would want to make > sure it didn't have any injection and was properly escaped, perhaps > only even having a subset of the information. Those uses, however, > don't seem to belong in front-end parsing. I could see someone > wanting to do it, but also shrugging once they figure out it can't be > done and doing it the right way. If I am not missing a point, you can already do this with the output templates. At least this was the reason why templates were introduced. E.g. $template rawArch, "%rawmsg%\n" for the long-term storage action and the default database template for the db access. Any concerns with that? Rainer From rgerhards at hq.adiscon.com Mon Jun 21 13:01:01 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 21 Jun 2010 13:01:01 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of RB > Sent: Friday, June 18, 2010 6:51 PM > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format > > On Fri, Jun 18, 2010 at 10:29, wrote: > > On Thu, 17 Jun 2010, Rainer Gerhards wrote: > > if the new config language completely replaces the old one I think it > will > > be a problem, however if the new config language is optional > (available > > for power users, but not required for simple things) then I think it > could > > work. > > Rainer did point out earlier in the thread that the original > configuration syntax will remain, but that it would no longer be > actively developed and that it would eventually be missing new > features. Just let me strengthen that point: The original format will always be supported.
Rainer From rgerhards at hq.adiscon.com Mon Jun 21 14:40:45 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 21 Jun 2010 14:40:45 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML? References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com> Hi all, thanks again for the good feedback. Please keep it flowing, I guess we will hit a first milestone very soon. Then, I think, we have a format that we could work towards and begin to look at some details. I have now condensed the comments and thought about them. Then, I have looked at some ways config files could actually look. What then hit me was that we are very close to how XML looks. For example we had: listen 10514 ruleset remote10514 this almost looks like XML (which is no surprise given the fact that Apache's config also looks somewhat like XML and we talked about modifications that are a bit more in the spirit of XML). The step towards full XML is not a big one (intentionally formatted close to non-XML example): While I am not a big fan of XML config files, I have to admit that the difference between what we discussed, at least in a later stage, and XML is slim. Seeing this, I begin to think that using an XML-based config language has a number of advantages. Probably the most important being that I do not need to write and maintain a dedicated parser but could use an XML parser instead. Plus a validating editor could probably be a good aid in writing config files (assuming that I get the DTD right, something I have no experience in ;)). So I have converted my original proposal (NOT the last discussion state) into XML format. Again, I think the XML version is quite readable.
Please have a look Original: http://www.rsyslog.com/download/new_rsyslog.conf XML: http://www.rsyslog.com/download/xml_rsyslog.conf I wonder if there is any good argument AGAINST using XML as "described" in the sample. If nobody brings up a good argument, I'll very possibly will try to take the XML road and begin to look what that takes in detail. Of course, it would be helpful as well if you could make yourself heard if you like XML format ;). I am looking forward to your feedback! Thanks again, Rainer From alorbach at ro1.adiscon.com Mon Jun 21 14:57:03 2010 From: alorbach at ro1.adiscon.com (Andre Lorbach) Date: Mon, 21 Jun 2010 14:57:03 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML? In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com> Message-ID: Hi, the only argument against XML I can think of is, that syntax error's might happen more often. But if you see XML as an advanced configuration language, this would be fine. Besides that I would allow and support multiple methods to express the parameters like in this sample: remote10514 For having only a few parameters, it is fine to have the parameters as XML-Node properties, but if you have more than a few parameters, the view is more readable if each parameter has its own XML-Node. Regards, Andre > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > Sent: Montag, 21. Juni 2010 14:41 > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML? > > Hi all, > > thanks again for the good feedback. Please keep it flowing, I guess we will hit > a first milestone very soon. 
Then, I think, we have a format that we could > work towards and begin to look at some details. > > I have now condensed the comments and thought about them. Then, I have > looked at some ways how config files could actually look like. What then hit > me was that we are very close to how XML looks. For example we had: > > > listen 10514 > ruleset remote10514 > > > this almost looks like XML (which is no surprise given the fact that Apache's > config also looks somewhat like XML and we talked about modifications that > are a bit more in the spirit of XML). The step towards full XML is not a big one > (intentionally formated close to non-XML example): > > > ruleset="remote10514" > /> > > > While I am not a big fan of XML config files, I have to admit that the > difference between what we discussed, at least in a later stage, and XML is > slim. Seeing this, I begin to think that using an XML-based config language has > a number of advantages. Probably the most important being that I do not > need to write an maintain a dedicated parser but could use a XML-Parser > instead. Plus a validating editor could probably be a good aid in writing config > files (assuming that I get the DTD right, something I have no experience in ;)). > > So I have converted my original proposal (NOT the last discussion state) into > XML format. Again, I think the XML version is quite readable. Please have a > look > > Original: http://www.rsyslog.com/download/new_rsyslog.conf > XML: http://www.rsyslog.com/download/xml_rsyslog.conf > > I wonder if there is any good argument AGAINST using XML as "described" in > the sample. If nobody brings up a good argument, I'll very possibly will try to > take the XML road and begin to look what that takes in detail. Of course, it > would be helpful as well if you could make yourself heard if you like XML > format ;). > > I am looking forward to your feedback! 
> > Thanks again, > Rainer > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Mon Jun 21 15:09:47 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 21 Jun 2010 15:09:47 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML? References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EFB@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Andre Lorbach > Sent: Monday, June 21, 2010 2:57 PM > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format -- > XML? > > Hi, > > the only argument against XML I can think of is, that syntax error's > might > happen more often. > But if you see XML as an advanced configuration language, this would be > fine. > > > Besides that I would allow and support multiple methods to express the > parameters like in this sample: > > > remote10514 > > > > For having only a few parameters, it is fine to have the parameters as > XML-Node properties, but if you have more than a few parameters, the > view is > more readable if each parameter has its own XML-Node. I think you mean this: 10514 remote10514 But what's the advantage of this over I have to admit that I do not see an advantage, just more text to be written (and IMHO harder to read due to more noise). So I personally prefer the paramter approach. Also I don't see why it should become less readable if there are many parameters. Isn't that just a matter of how you format the source text? Maybe I am overlooking something obvious. I don't have much experience with XML... 
Rainer From epiphani at gmail.com Mon Jun 21 15:39:18 2010 From: epiphani at gmail.com (Aaron Wiebe) Date: Mon, 21 Jun 2010 09:39:18 -0400 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML? In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EFB@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EFB@GRFEXC.intern.adiscon.com> Message-ID: On Mon, Jun 21, 2010 at 9:09 AM, Rainer Gerhards wrote: > > I think you mean this: > > > > 10514 > remote10514 > > > > But what's the advantage of this over > > > listen="10514" > ruleset="remote10514" > /> > > > I have to admit that I do not see an advantage, just more text to be written > (and IMHO harder to read due to more noise). So I personally prefer the > parameter approach. Also I don't see why it should become less readable if > there are many parameters. Isn't that just a matter of how you format the > source text? > > Maybe I am overlooking something obvious. I don't have much experience with > XML... I also have nearly zero experience with XML - but the one-parameter-per-node approach "looks" cleaner to me. Either way though, XML is a fairly common thing - there has to be a best practices approach. If going the XML route (which I also have to admit makes a fair bit of sense), we should do everything to stick to standards and best practices. I know that this will make it significantly easier to write configuration frontends. -Aaron From rgerhards at hq.adiscon.com Mon Jun 21 15:53:43 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 21 Jun 2010 15:53:43 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML?
References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EFB@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EFC@GRFEXC.intern.adiscon.com> > I also have nearly zero experience with XML - but the > one-parameter-per-node approach "looks" cleaner to me. I am concerned about readability when using a text editor as front-end. The primary reason that I mislike XML configs is that it often is extremely hard to read them without a dedicated tool. But... > Either way > though, XML is a fairly common thing - there has to be a best > practices approach. If going the XML route (which I also have to > admit makes a fair bit of sense), we should do everything to stick to > standards and best practices. I know that this will make it > significantly easier to write configuration frontends. ... well spoken! That's one of the obvious things I overlooked. I'll try to dig out best practices and if someone knows where to look, any help is appreciated ;) Rainer From alorbach at ro1.adiscon.com Mon Jun 21 16:46:17 2010 From: alorbach at ro1.adiscon.com (Andre Lorbach) Date: Mon, 21 Jun 2010 16:46:17 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML? 
In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EFB@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EFB@GRFEXC.intern.adiscon.com> Message-ID: I meant this: 10514 remote10514 Looks more readable to me as Also another advantage is if you have parameters that contain linefeeds like message templates: 10514 $foo $bar Regards, Andre Lorbach > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > Sent: Montag, 21. Juni 2010 15:10 > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML? > > > -----Original Message----- > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > bounces at lists.adiscon.com] On Behalf Of Andre Lorbach > > Sent: Monday, June 21, 2010 2:57 PM > > To: rsyslog-users > > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format -- > > XML? > > > > Hi, > > > > the only argument against XML I can think of is, that syntax error's > > might happen more often. > > But if you see XML as an advanced configuration language, this would > > be fine. > > > > > > Besides that I would allow and support multiple methods to express the > > parameters like in this sample: > > > > > > remote10514 > > > > > > > > For having only a few parameters, it is fine to have the parameters as > > XML-Node properties, but if you have more than a few parameters, the > > view is more readable if each parameter has its own XML-Node. 
> > I think you mean this: > > > > 10514 > remote10514 > > > > But what's the advantage of this over > > > listen="10514" > ruleset="remote10514" > /> > > > I have to admit that I do not see an advantage, just more text to be written > (and IMHO harder to read due to more noise). So I personally prefer the > paramter approach. Also I don't see why it should become less readable if > there are many parameters. Isn't that just a matter of how you format the > source text? > > Maybe I am overlooking something obvious. I don't have much experience > with XML... > > Rainer > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From jeff at atlassian.com Mon Jun 21 17:48:47 2010 From: jeff at atlassian.com (Jeff Turner) Date: Tue, 22 Jun 2010 01:48:47 +1000 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML? In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com> References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com> Message-ID: On Mon, Jun 21, 2010 at 10:40 PM, Rainer Gerhards wrote: > Hi all, > > thanks again for the good feedback. Please keep it flowing, I guess we will > hit a first milestone very soon. Then, I think, we have a format that we > could work towards and begin to look at some details. > > I have now condensed the comments and thought about them. Then, I have > looked > at some ways how config files could actually look like. What then hit me > was > that we are very close to how XML looks. For example we had: > > > listen 10514 > ruleset remote10514 > > > this almost looks like XML (which is no surprise given the fact that > Apache's > config also looks somewhat like XML and we talked about modifications that > are a bit more in the spirit of XML). 
The step towards full XML is not a > big > one (intentionally formated close to non-XML example): > > > ruleset="remote10514" > /> > > > While I am not a big fan of XML config files, I have to admit that the > difference between what we discussed, at least in a later stage, and XML is > slim. Seeing this, I begin to think that using an XML-based config language > has a number of advantages. Probably the most important being that I do not > need to write an maintain a dedicated parser but could use a XML-Parser > instead. Plus a validating editor could probably be a good aid in writing > config files (assuming that I get the DTD right, something I have no > experience in ;)). > > So I have converted my original proposal (NOT the last discussion state) > into > XML format. Again, I think the XML version is quite readable. Please have a > look > > Original: http://www.rsyslog.com/download/new_rsyslog.conf > XML: http://www.rsyslog.com/download/xml_rsyslog.conf > > I wonder if there is any good argument AGAINST using XML as "described" in > the sample. If nobody brings up a good argument, I'll very possibly will > try > to take the XML road and begin to look what that takes in detail. Of > course, > it would be helpful as well if you could make yourself heard if you like > XML > format ;). XML might be better than some apache-like format because: - editors will automatically do syntax-highlighting, which greatly improves readability. If you add as the first line of xml_rsyslog.conf then editors will notice it's XML, despite the unusual extension (I've tested vim, emacs and gedit). Editor support also gives you nice things like auto-intending (=G in vim) and folding. - there are some fantastic XML validation languages - see http://www.relaxng.org and http://www.dsdl.org. Fancy editors will autocomplete based on the contents of these validation files. - XML is natively unicode with a well-defined means of setting the encoding, and parsers will handle all that for you. 
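Editorial sketch for the points in the list above, and for the attribute-versus-element question discussed earlier in the thread. It uses Python's standard xml.etree.ElementTree parser. The element names (rsyslog, input, listen, ruleset) are assumptions made up for the illustration, since the XML samples did not survive in this archive; they are not the format Rainer proposed. The sketch reads both styles into the same dictionary, so a config loader could accept either, and lets the parser handle the declared encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical config snippets; the element and attribute names are assumptions.
ATTR_STYLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<rsyslog>
  <input listen="10514" ruleset="remote10514"/>
</rsyslog>
"""

ELEM_STYLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<rsyslog>
  <input>
    <listen>10514</listen>
    <ruleset>remote10514</ruleset>
  </input>
</rsyslog>
"""


def read_inputs(xml_bytes):
    """Return one dict of parameters per <input>, whichever style was used."""
    root = ET.fromstring(xml_bytes)          # the parser honours the declared encoding
    inputs = []
    for node in root.iter("input"):
        params = dict(node.attrib)           # parameters given as attributes
        for child in node:                   # parameters given as child elements
            params[child.tag] = (child.text or "").strip()
        inputs.append(params)
    return inputs


if __name__ == "__main__":
    print(read_inputs(ATTR_STYLE))
    print(read_inputs(ELEM_STYLE))
    try:
        ET.fromstring(b"<rsyslog><input></rsyslog>")   # unclosed <input>
    except ET.ParseError as err:
        print("syntax error reported by the parser:", err)
```

Syntax errors, the concern Andre raised, surface as a ParseError from the generic parser rather than from hand-written parsing code.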
Just wondering, is it necessary to have all those 'params' elements? Could this: just become: Overall though, after the initial curve I've come to quite like the old format :) A good format is one that is optimized for common cases. For syslog that is simple statements like: mail.* -/var/log/mail.log user.* -/var/log/user.log which becomes very verbose under XML: Anyway, if you do go the XML route I don't think the work would ever be wasted. An XML DOM is generic enough to act as an AST for any future formats. Anyone wanting a different format (eg. YAML or Sieve-like) can simply generate the XML DOM and pass it in. Jeff > I am looking forward to your feedback! > > Thanks again, > Rainer > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From s.coletta at unidata.it Mon Jun 21 17:59:10 2010 From: s.coletta at unidata.it (Stefano Coletta) Date: Mon, 21 Jun 2010 17:59:10 +0200 Subject: [rsyslog] FreeBSD 6.4 and rsyslog 4.3.0 remote logging (as client) doesn't work Message-ID: <4C1F8C4E.4010403@unidata.it> Hello, I've installed rsyslog 4.3.0 on a freebsd 6.4-RELEASE-p6 i386 and unfortunately I cannot send any syslog data to a remote server by using the standard configuration directive: *.* @192.168.0.1 or *.* @@192.168.0.1 By using tcpdump, as far as I can see, *no packets* go out from the freebsd system (neither tcp nor udp) to reach the remote syslog server 192.168.0.1. The remote server 192.168.0.1 is obviously reachable: I can make a telnet session to the port 514/tcp from the freebsd server. Enabling debug info did not help, it seems there is no error anywhere.
This is how it has been compiled: rsyslog will be compiled with the following settings: Multithreading support enabled: yes Klog functionality enabled: yes (bsd) Regular expressions support enabled: yes Zlib compression support enabled: yes MySql support enabled: no libdbi support enabled: no PostgreSQL support enabled: no Oracle (OCI) support enabled: no SNMP support enabled: no Mail support enabled: yes RELP support enabled: no imdiag enabled: no file input module enabled: yes input template module will be compiled: yes output template module will be compiled: no omprog module will be compiled: no omstdout module will be compiled: no Large file support enabled: yes Networking support enabled: yes GnuTLS network stream driver enabled: no Enable GSSAPI Kerberos 5 support: no Debug mode enabled: no Runtime Instrumentation enabled: no Diagnostic tools enabled: no valgrind support settings enabled: no rsyslog runtime will be built: yes rsyslogd will be built: yes From the -v switch: [root at server]# rsyslogd -v rsyslogd 4.3.0, compiled with: FEATURE_REGEXP: Yes FEATURE_LARGEFILE: Yes FEATURE_NETZIP (message compression): Yes GSSAPI Kerberos 5 support: No FEATURE_DEBUG (debug build, slow code): No Atomic operations supported: No Runtime Instrumentation (slow code): No This is the first time I have used rsyslog on a freebsd system. Any ideas? Thank you in advance. -- Greetings, Stefano Coletta Unidata S.p.a.
Via Portuense, 1555 - 00143 Roma Commercity - Modulo M25 Tel +39 06404041 Fax +39 0640404002 E-mail s.coletta at unidata.it From Jon_Franz at symantec.com Tue Jun 22 02:10:46 2010 From: Jon_Franz at symantec.com (Jon Franz) Date: Mon, 21 Jun 2010 17:10:46 -0700 Subject: [rsyslog] FW: UDPSpoof RPM for Centos 5.4 x86_64 Message-ID: <593B71C383F94847BCBD5D5807D9384203461935@TUS1XCHCLUPIN23.enterprise.veritas.com> Here is the error I am getting: [root at mgtsyslog1 rsyslog-5.4.0]# make make all-recursive make[1]: Entering directory `/tmp/rsyslog-5.4.0' Making all in doc make[2]: Entering directory `/tmp/rsyslog-5.4.0/doc' make[2]: Nothing to be done for `all'. make[2]: Leaving directory `/tmp/rsyslog-5.4.0/doc' Making all in runtime make[2]: Entering directory `/tmp/rsyslog-5.4.0/runtime' CC librsyslog_la-conf.lo In file included from /usr/include/netinet/in.h:23, from net.h:27, from conf.c:68: /usr/include/stdint.h:13: error: conflicting types for 'int64_t' /usr/include/sys/types.h:198: error: previous declaration of 'int64_t' was here make[2]: *** [librsyslog_la-conf.lo] Error 1 make[2]: Leaving directory `/tmp/rsyslog-5.4.0/runtime' make[1]: *** [all-recursive] Error 1 make[1]: Leaving directory `/tmp/rsyslog-5.4.0' make: *** [all] Error 2 From: Jon Franz Sent: Sunday, June 20, 2010 12:18 PM To: rsyslog at lists.adiscon.com Subject: UDPSpoof RPM for Centos 5.4 x86_64 Does anyone have an RPM for the udpspoof module 5.4.0 that will work on Centos 5.4 x86_64. I am having issues getting it to make using --enable-omudpspoof option. Any help is greatly appreciated. From rgerhards at hq.adiscon.com Tue Jun 22 10:29:45 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 22 Jun 2010 10:29:45 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML?
References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EFB@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EFC@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EFD@GRFEXC.intern.adiscon.com>

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
> Sent: Monday, June 21, 2010 3:54 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML?
>
> > I also have nearly zero experience with XML - but the
> > one-parameter-per-node approach "looks" cleaner to me.
>
> I am concerned about readability when using a text editor as front-end. The
> primary reason that I dislike XML configs is that it is often extremely hard
> to read them without a dedicated tool. But...
>
> > Either way
> > though, XML is a fairly common thing - there has to be a best
> > practices approach. If going the XML route (which I also have to
> > admit makes a fair bit of sense), we should do everything to stick to
> > standards and best practices. I know that this will make it
> > significantly easier to write configuration frontends.
>
> ... well spoken! That's one of the obvious things I overlooked. I'll try to
> dig out best practices and if someone knows where to look, any help is
> appreciated ;)

Interestingly, I find it quite hard to find best practices for XML config files on the web. However, I asked a question about DTDs on comp.text.xml yesterday, and the answers point in the direction that using separate parameter elements (instead of attributes) seems to be the usual form.
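For readers comparing the two styles under discussion (one element per parameter versus attributes), here is a minimal Python sketch that reads both into the same dictionary. The tag name `input` and the exact layout are assumptions made for illustration only; just the parameter names `listen` and `ruleset` and their values appear in the thread, and no schema had been agreed on at this point.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the enclosing <input> tag is an assumption,
# not taken from any posted sample file.
ATTRIBUTE_STYLE = '<input listen="10514" ruleset="remote10514"/>'

ELEMENT_STYLE = """
<input>
  <listen>10514</listen>
  <ruleset>remote10514</ruleset>
</input>
"""


def read_params(xml_text):
    """Collect the parameters of one node, whichever style was used."""
    node = ET.fromstring(xml_text.strip())
    params = dict(node.attrib)           # parameters given as attributes
    for child in node:                   # parameters given as child elements
        params[child.tag] = (child.text or "").strip()
    return params


print(read_params(ATTRIBUTE_STYLE))
print(read_params(ELEMENT_STYLE))
```

Both calls yield the same two parameters, which suggests a parser could accept either form; whether it should is exactly the open question here.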
I suggest you have a look at

http://groups.google.com/group/comp.text.xml/browse_thread/thread/f1b96d132e3fdd8e#

Any links to best practices would be appreciated.

Thanks,
Rainer

From rgerhards at hq.adiscon.com Tue Jun 22 11:00:57 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Tue, 22 Jun 2010 11:00:57 +0200
Subject: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML?
References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EFB@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EFE@GRFEXC.intern.adiscon.com>

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Andre Lorbach
> Sent: Monday, June 21, 2010 4:46 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML?
>
> I meant this:
>
>
> 10514
> remote10514
>
>
> Looks more readable to me as
> listen="10514"
> ruleset="remote10514"
> />

really? Good to hear this, my personal perception is just the opposite. Of course, that doesn't imply anything about what is best... Just let me elaborate that *I* find the first sample less readable because there is so much "clutter" around the actually important text.

> Also another advantage is if you have parameters that contain linefeeds like
> message templates:
>
>
> 10514
> $foo
>
> $bar
>

That's a very good argument!

Rainer
>
> Regards,
> Andre Lorbach
>
> > -----Original Message-----
> > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
> > Sent: Monday, June 21, 2010 15:10
> > To: rsyslog-users
> > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML?
> > > > > -----Original Message----- > > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > > bounces at lists.adiscon.com] On Behalf Of Andre Lorbach > > > Sent: Monday, June 21, 2010 2:57 PM > > > To: rsyslog-users > > > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format > -- > > > XML? > > > > > > Hi, > > > > > > the only argument against XML I can think of is, that syntax > error's > > > might happen more often. > > > But if you see XML as an advanced configuration language, this > would > > > be fine. > > > > > > > > > Besides that I would allow and support multiple methods to express > the > > > parameters like in this sample: > > > > > > > > > remote10514 > > > > > > > > > > > > For having only a few parameters, it is fine to have the parameters > as > > > XML-Node properties, but if you have more than a few parameters, > the > > > view is more readable if each parameter has its own XML-Node. > > > > I think you mean this: > > > > > > > > 10514 > > remote10514 > > > > > > > > But what's the advantage of this over > > > > > > > listen="10514" > > ruleset="remote10514" > > /> > > > > > > I have to admit that I do not see an advantage, just more text to be > written > > (and IMHO harder to read due to more noise). So I personally prefer > the > > paramter approach. Also I don't see why it should become less > readable if > > there are many parameters. Isn't that just a matter of how you format > the > > source text? > > > > Maybe I am overlooking something obvious. I don't have much > experience > > with XML... 
> > > > Rainer > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Tue Jun 22 11:08:45 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 22 Jun 2010 11:08:45 +0200 Subject: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML? References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103EFF@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Jeff Turner > Sent: Monday, June 21, 2010 5:49 PM > To: rsyslog-users > Subject: Re: [rsyslog] feedback requested: NEW rsyslog.conf format -- > XML? > > XML might be better than some apache-like format because: > - editors will automatically do syntax-highlighting, which greatly > improves > readability. If you add as the first line of > xml_rsyslog.conf then editors will notice it's XML, despite the unusual > extension (I've tested vim, emacs and gedit). Editor support also gives > you > nice things like auto-intending (=G in vim) and folding. > - there are some fantastic XML validation languages - see > http://www.relaxng.org and http://www.dsdl.org. Fancy editors will > autocomplete based on the contents of these validation files. > - XML is natively unicode with a well-defined means of setting the > encoding, and parsers will handle all that for you. > > Just wondering, is it necessary to have all those 'params' elements? > Could > this: > > > > > just become: > > It's a problem of namespace pollution. 
In the sample I gave, attributes were used for parameters which are supported by the rsyslog core, so they always exist. Those in params were module-specific. If all of them are either parameters or all are attributes, I have the problem that rsyslog may define a new parameter that someone somewhere has used as parameter inside a module, so I would potentially break compatibility. On the other hand, there are probably ways to work around this. For example, we could permit renaming of module parameter names as part of the module load definition. This is probably extra work, but may be worth the effort. Unification might be handy. > Overall though, after the initial curve I've come to quite like the old > format :) A good format is one that is optimized for common cases. For > syslog that is simple statements like: > > mail.* -/var/log/mail.log > user.* -/var/log/user.log > > which becomes very verbose under XML: > > > > > > > > > > > > > > > > > Anyway, if you do go the XML route I don't think the work would ever be > wasted. An XML DOM is generic enough to act as an AST for any future > formats. Anyone wanting a different format (eg. YAML or Sieve-like) can > simply generate the XML DOM and pass it in. Good point, and actually something I am thinking about. I would also love to use the current format to form the AST. This would have great benefit. Unfortunately, my first thinking about how to do this shows that it is very, very hard because we do not have proper scoping in the current format. 
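The idea quoted above, that an XML DOM could act as a generic AST which other front-end formats generate, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names `rsyslog`, `rule`, `filter` and `file` and the `sync` attribute are invented here and are not taken from either posted sample file. It handles only plain "selector, file" lines, and it sidesteps the hard part mentioned above, namely directives without proper scoping.

```python
import xml.etree.ElementTree as ET


def selectors_to_dom(conf_text):
    """Turn classic 'selector  target' lines into a small XML tree.

    All element and attribute names are hypothetical.
    """
    root = ET.Element("rsyslog")
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        selector, target = line.split(None, 1)
        rule = ET.SubElement(root, "rule")
        ET.SubElement(rule, "filter").text = selector
        action = ET.SubElement(rule, "file")
        if target.startswith("-"):
            # In the classic format a leading '-' means: do not sync
            # the file after every write.
            action.set("sync", "no")
            target = target[1:]
        else:
            action.set("sync", "yes")
        action.text = target
    return root


legacy = """
mail.*  -/var/log/mail.log
user.*  -/var/log/user.log
"""

dom = selectors_to_dom(legacy)
print(ET.tostring(dom, encoding="unicode"))
```

Any other front-end (YAML, Sieve-like, or the current format) would only need to produce the same tree; the consumer of the tree would not have to know which syntax the administrator typed.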
Rainer From rgerhards at hq.adiscon.com Tue Jun 22 11:09:56 2010 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 22 Jun 2010 11:09:56 +0200 Subject: [rsyslog] FreeBSD 6.4 and rsyslog 4.3.0 remote logging (as client) doesn't work References: <4C1F8C4E.4010403@unidata.it> Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103F00@GRFEXC.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Stefano Coletta > Sent: Monday, June 21, 2010 5:59 PM > To: rsyslog at lists.adiscon.com > Subject: [rsyslog] FreeBSD 6.4 and rsyslog 4.3.0 remote logging (as > client) doesn't work > > Hello, > I've installed rsyslog 4.3.0 on a freebsd 6.4-RELEASE-p6 i386 and > unfortunately I cannot send any syslog data to a remote server by using > the standard configuration directive: > > *.* @192.168.0.1 > > or > > *.* @@192.168.0.1 > > By using tcpdump, as far I can see, *no packets* go out from the > freebsd > system (nor tcp or udp) to reach the remote syslog server 192.168.0.1. > > The remote server 192.168.0.1 is obviously reachable: I can make a > telnet session to the port 514/tcp from the freebsd server. > > Enabling debug info not helped, it seems there is no error anywhere. You probably were not able to fully interpret it (it's not easy). If you mail me a copy privately (the list will probably reject it due to size and/or format), I'll have a look and check if I see something. 
Rainer

From rgerhards at hq.adiscon.com Tue Jun 22 12:23:04 2010
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Tue, 22 Jun 2010 12:23:04 +0200
Subject: [rsyslog] Next Try ;) NEW rsyslog.conf format -- XML?
References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com><9B6E2A8877C38245BFB15CC491A11DA7103EFB@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EFE@GRFEXC.intern.adiscon.com>
Message-ID: <9B6E2A8877C38245BFB15CC491A11DA7103F02@GRFEXC.intern.adiscon.com>

Hi all,

I have now condensed all points brought up and crafted a sample with

a) unified parameter names (accepting namespace pollution as a minor problem)
b) almost everything expressed by its own param elements

The sample is available at

http://www.rsyslog.com/download/xml_params_rsyslog.conf

I have to admit that it doesn't look as bad as I feared (at least when looking at it with at least simple syntax highlighting). All in all, I think this format could work well enough. I myself do not have any objections any longer against it. Does somebody else have concerns?

Please let me know your feedback,
Rainer
From david at lang.hm Tue Jun 22 13:15:34 2010
From: david at lang.hm (david at lang.hm)
Date: Tue, 22 Jun 2010 04:15:34 -0700 (PDT)
Subject: [rsyslog] feedback requested: NEW rsyslog.conf format
In-Reply-To:
References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103ED8@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDB@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDC@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EDD@GRFEXC.intern.adiscon.com>
Message-ID:

On Thu, 17 Jun 2010, RB wrote:

> On Thu, Jun 17, 2010 at 15:47, Rainer Gerhards wrote:
>> I think you actually missed one level. At the top there are rulesets. These
>> are composed out of multiple rules. Each rule then is composed of a filter
>> (selector) and multiple actions. This hierarchy already exists in v5, but the
>> top level is seldom used.
>>
>> I am talking about the relationship between inputs and rulesets (not rules).
>> What I intended to say is that I don't see a need that a single input can be
>> bound to more than one ruleset. On the other hand, it is definitely necessary
>> to have the ability to bind a ruleset to more than one input.
>>
>> I hope that clarifies.
>
> It does clarify; I did indeed miss the rule->ruleset relationship.
>
> To justify an m:n relationship, then, I think you'd have to evaluate
> what features other than sub-rules a ruleset provides. So far as I
> can tell from reading the docs in the source tree, there are currently
> two: independent queues and message parsers. It is conceivable (but
> perhaps not practical) that a user may want to parse a given message
> multiple ways, or to queue differently based on the speed of a given
> rule+action. If rules themselves had independent queues and message
> parsers, then that would return to the 1:n.

I don't see a big need to be able to have multiple parsers for a single input message. The reason for having multiple parsers is to get the message from whatever garbage is going over the wire to a standard format that can then get processed by everything else.

Output formatting is where you would take that standard representation and customize it for whatever output channel you are using.

I see a _lot_ of value in keeping the middle standard and making it so that input doesn't care what output will be used (and the output doesn't care what input was used), and neither one should care what sort of queue was used.

> That is to say, as long as there are features unique to rulesets, it
> is conceivable that a user would desire to map a single input to
> several of them.

It's possible to have the output of one ruleset go to other rulesets. I think this can cover this case.

David Lang

> Please do note, I've not had cause to dig through all of v5's
> features, so what I speak of may already be addressed.
>
>
> RB

From david at lang.hm Tue Jun 22 13:52:19 2010
From: david at lang.hm (david at lang.hm)
Date: Tue, 22 Jun 2010 04:52:19 -0700 (PDT)
Subject: [rsyslog] feedback requested: NEW rsyslog.conf format -- XML?
In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com>
References: <9B6E2A8877C38245BFB15CC491A11DA7103ECE@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EF5@GRFEXC.intern.adiscon.com> <9B6E2A8877C38245BFB15CC491A11DA7103EF9@GRFEXC.intern.adiscon.com>
Message-ID:

On Mon, 21 Jun 2010, Rainer Gerhards wrote:

> Original: http://www.rsyslog.com/download/new_rsyslog.conf
> XML: http://www.rsyslog.com/download/xml_rsyslog.conf
>
> I wonder if there is any good argument AGAINST using XML as "described" in
> the sample. If nobody brings up a good argument, I'll very possibly try
> to take the XML road and begin to look at what that takes in detail. Of course,
> it would be helpful as well if you could make yourself heard if you like XML
> format ;).

XML can be a good format or a horrible format depending on how it's used. Poorly used, it is horribly verbose with lots of strange characters and rules.

My policy towards XML files is to do the following:

Don't use generic tags with a 'type' attribute; use more specific tags for system-defined elements.

If something can only take an option once, that should be an attribute to a tag.

If something can happen multiple times, it is a separate element.

example: instead of do instead of