[rsyslog] output plugin calling interface

Rainer Gerhards rgerhards at hq.adiscon.com
Wed May 6 18:59:14 CEST 2009


> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of david at lang.hm
> Sent: Wednesday, May 06, 2009 6:51 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] output plugin calling interface
> 
> On Wed, 6 May 2009, Rainer Gerhards wrote:
> 
> >>>>>> in the case of rsyslog
> >>>>>> (where we are commiting a set of unrelated messages) it is not
> >>> nessasarily
> >>>>>> a fatal problem, but it just seems to complicate things with little
> >>>>>> benifit.
> >>>>>
> >>>>> It actually simplifies things - because we need not take different
> >>>> approaches
> >>>>> to different type of output plugins.
> >>>>
> >>>> I'm seeing it as the other way around, this complicates things by
making
> >>>> there be different ways for the transaction to be committed.
> >>>
> >>> OK, but what is complicated by that? Also think about third-party
> > (already
> >>> existing) output modules, which I need to define either a totally
> > different
> >>> output interface for or use the the extensible one I described. Even if
I
> >>> force a totally new interface, I still need to support
non-transactional
> >>> outputs in the upper layers. But then I need different code pathes to
do
> >>> that.
> >>
> >> I was thinking about this, and am wondering if it's really a bad thing
to
> >> seperate the code paths.
> >>
> >> I'm thinking along the lines that if the module doesn't support the
> >> doTransaction method you set batch_size=1 and use the old doAction()
> >> interface.
> >
> > Oh, that's a *very serious* performance hit. Because batches not only
> support
> > transactions but are the defining factor for the number of queue lock
> > operations on a busy system. So with a batch size of 32, you do 1 lock.
With
> > a size of 1, you do 32 locks - both to process 32 messages.
> 
> exactly as I expected. note that the performance hit is no worse than how
> things work today.

Yes, I know. But why deliberately do worse than you can?

> 
> > And now consider the usual config, where almost everything runs in direct
> > mode. So if at least one output does not support batches, the main
message
> > queue must run at batches of one.
> >
> > Plus, I need complex logic to detect all that. A unified interface is
simple
> > and elegant and does not show any of these problems.
> 
> yes, you need to either limit your batch size to the minumum of the batch
> sizes defined for all the outputs that you are dealing with, or you need
> very complicated logic

... but I need complicated logic to find that minimum batch size...
> 
> for example:
> 
> does a message that is filtered out count against the batch size?
> 
> if you get 10,000 logs/sec, set a batch size of 100, and 1% of the log
> messages need to go to your database, does this mean that you will end up
> doing 100 database transactions/sec, or 1 database transaction/sec?
> 
> there are a lot of interesting and nasty interactions that you can get in
> these situations.

... and all oft hem I try to avoid by using a unified approach.

> 
> 
> I was thinking that anyone wanting to get top performance would basicly
> be required to define seperate action queues (not nessasarily large ones,
> I would say probably ~batch size * (# worker threads +1) would be a good
> default) and the main queue walker would just filter the messages into the
> action queues. then the action queue walkers would do the batching. no
> need to create a new way to buffer things, we already have a buffer
> mechanism, it's called a seperate action queue.
> 
> >> if it does support the doTransaction method you use that exclusivly (the
> >> module internally may share code between doTransaction and doAction, and
> >> most will)
> >
> > What would that simplify?
> 
> I am seeing that this would complicate the setup code a bit, but simplify
> the logic for processing messages from the queue as that wouldn't have to
> deal with batch and non-batch modes, it would only have the batch mode.

That's exactly the point: I do NOT have to deal with non-batch mode. The
current experimental code *always* dequeues batches. So the whole engine does
no longer support anything that is not a batch (but, of course, a batch can
be a single message).

> 
> >> as it is you already will have code paths that you only follow in the
case
> >> where the module supports doTransaction, is it really simpler to have
the
> >> two partially combined instead of keeping them seperate?
> >
> > These separate code pathes will probably be a single "if" statement that
> does
> > the endTransaction() call at the end of the batch. So it is actually a
> single
> > statement and three lines of code. The rest, I think, is absolutely the
> same.
> 
> I'm not seeing that. it's not just the code at the end of the batch, it's
> also pulling the appropriate number of messages off the queue in the first
> place. so I would see the minimum as two 'if' blocks

Nope, we always pull batches off the queue. One if ;)

> 
> >>>>>> I suspect that the overhead in manipulating the lock is high enough
> > that
> >>>>>> the second approach will be a win (similar to the efficiancies that
> > were
> >>>>>> gained in the UDP input module by letting it add multiple messages
to
> >>> the
> >>>>>> queue inside one lock).
> >>>>>>
> >>>>>> As such I am seeing significant value in making the doAction() call
be
> >>>>>> lightweight under all conditions, which is an argument against
having
> > it
> >>>>>> do any more than nessasary.
> >>>>>
> >>>>> We do not know what this may be. Again, don't think "database only".
> >>>>
> >>>> I am not. even if the action is writing to disk (with fsync), to a
pipe,
> >>>> or calling an external program it can take a significant amount of
time.
> >>>> you say above that doAction could block for hours, so I am confused a
> > bit
> >>>> here. (the new version of the document may clear this up)
> >>>
> >>> My sentence should better be phrased "can not be called where you
> > described
> >>> it". See my chart and you'll see that the queue mutex is never looked
for
> > an
> >>> extended period of time - not even in ultra-reliable mode (part of why
it
> > is
> >>> so hard to do it).
> >>>
> >>> My understanding is that you work under the assumption that doAction()
is
> >>> called during dequeueing. As we don't know what doAction() will do - it
> > may
> >>> be very lengthy - it is strictly de-coupled from anything that holds
the
> >>> queue mutex (this also applies to v3 and is the base design). You may
> > take
> >>> the word "base design" as a hint that probably the whole engine would
> > need
> >> to
> >>> be rewritten if that design is to be changed. I do not see any reasons
> > for
> >>> such a change ;)
> >>>
> >>> I guess our discussion circles around the point that I did not yet
convey
> >> the
> >>> full picture on how things work. At least I have the "feeling" that you
> > have
> >>> a different architecture on your mind.
> >>
> >> yes, from prior discussions I was thinking that you were planning to
have
> >> the queue worker iterate through the messaes on the queue, calling
> >> doAction() for each one as it got to it, and then when it hits the limit
> >> doing endTransaction()
> >
> > I am not sure if you overlooked that message, the new queue worker
already
> > *exists*. So you can look at it in actual code. Same for parts of the
action
> > interface (without the "real" transaction support).
> >
> >
>
http://git.adiscon.com/?p=rsyslog.git;a=blob;f=runtime/queue.c;h=c2df928b6303
> > 34449d117aa191ec1d502525346b;hb=multi-dequeue#l1396
> >
> > All of this, of course, is experimental and will not work if only the
> > slightest error occurs somewhere in the system (and it may crash even if
> > everything else runs fine ;))
> 
> I'll try to look at this
> 
> >>
> >> if you are instead going to do what you just described (pull the entire
> >> list of messages, then unlock the queue and process them) then it
doesn't
> >> matter if the work is done in doAction() or endTransaction()
> >>
> >> the question of if it makes sense to have the batch mode be a completely
> >> seperate codepath, or have the two logical paths combined is still a
good
> >> one.
> >>
> >> thinking out loud here, how hard would it be to detect that there is no
> >> processBatch() method and create one that is a wrapper around the
> >> doAction() interface when you load the module, and then let all the code
> >> after that use the new interface?
> >
> > That sounds pretty much like what I am proposing above ;)
> >
> > You can consider this to be the wrapper:
> >
> >
>
http://git.adiscon.com/?p=rsyslog.git;a=blob;f=action.c;h=928b30dc3c5cbb2f6af
> > 41cdf2ea02a3f1c14b06a;hb=multi-dequeue#l544
> >
> > This is the too-simplistic version, but I hope it conveys the idea.
> >
> > The bottom line is that it is more complicated to create a real wrapper
than
> > to implement the interface in the way I described. I fail to see the
> > complexity that you are so concerned about...
> 
> Ok, I could be wrong here.

Maybe me - but I am arguing so hard because I do not at all see any
complexity. But this may be a sign that I overlooked something. Better notice
now than later...


Raienr 
> 
> David Lang
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com



More information about the rsyslog mailing list