[rsyslog] output plugin calling interface

david at lang.hm david at lang.hm
Wed May 6 13:57:39 CEST 2009


On Wed, 6 May 2009, Rainer Gerhards wrote:

>
> I have put on interim version of my model online:
>
> http://www.rsyslog.com/download/design.pdf
>
> Note that it is far from being ready, and some things are not as clear (or
> even correct) as I would like to see them. Also, it currently contains
> definitions for the various objects, but not any processing information. My
> aim is to get a very clear (and dense!) picture of the objects and the
> relationships between them from where I am convinced we can find some simple
> processing functions (which may or may not be easily enough translated in
> code).
>
> Feedback is appreciate, but please keep in mind that the document is a moving
> target. I'll be working on it full day today so I hope to have a better
> version this afternoon.

thanks for sending this.

unfortuntntly one question right away (and one that doesn't translate into 
text well :-(

at the end of the first paragraph you say (cut-n-paste into text, symbols 
mangled)

'Often, objects Oi; i 2 N; i  partition O,
but this is not necessarily the case.'

I'm not sure what is meant by this

I read that as
often
the set of objects identified by i, with i being a subset of N such that i <= |gothic O|
partition gothic O, but this is not nessasarily the case
I am lost from the point N is introduced

I think that you are saying that a set of messages to be processed  are 
frequently contiguous, but not always.



in section 2 you say

be Sm={} the totally ordered set of message states

  I think a better thing to say would be either

define Sm={} as the totally ordered set of messge states 
or
let Sm={} be the totally orderd set of message states


it may just be that I have been out of school for a long time and am 
forgetting defintitions, but it seems to me that you ae skipping the 
defintitions of a bunch of stuff in section 2. I think I'm puzzling it 
out, but eventually it's something to clean up.


you use the term 'tuple' a lot in this section, and i'm not sure it's 
always correct. to me a tuple is a set of two items, like coordinates 
(X,Y), but you use it in at least one case where you can have an arbatrary 
number of items in the list. I think that in many cases where you say 
tuple you really would be better off saying list.



in section 3.11 you say
"However, the action itself may also end the transaction and notify the 
caller." by this do you mean that the action may abort the transaction? or 
that the action could decide to commit (complete) the transaction?

if you mean abort the transaction, this makes sense (essentially on any 
doAction() call the return code could be 'fatal error, transaction 
aborted' and the queue walker code would have to fail the entire batch and 
retry) if you mean allow it to decide to commit the transaction early if 
it chooses, this strikes me as a wrong thing to do. in the case of rsyslog 
(where we are commiting a set of unrelated messages) it is not nessasarily 
a fatal problem, but it just seems to complicate things with little 
benifit.

also the fact that you use the term 'notify the caller' could be read to 
mean that it executes a callback function from the main system in this 
case, I think it's simpler to just return an error code (which is the 
other way to read this)

a bit further down you clarify this to mean that it may commit early. my 
initial reaction is that this is premature optimization. I see why you 
want to do this (the case of buffers filling up is a good example), but I 
think the complications that this produces (which you then touch on) are 
ugly enough that it may be worth the inefficiancy of failing the 
transaction (and having to resubmit a partial batch) rather than dealing 
with the need to say things like 'the transaction suceeded, but this 
message failed' as part of a doAction() call.




in defining what happens when a transaction is aborted you say 'However, 
not all outputs work on actually transactional destination.', I think the 
better way to say this (borrowing from the database terminology) is that 
not all outputs provide atomic transactions.


you say

   An output transaction is started by calling beginTransaction() either 
ex-plicitely or implicitely by a call to doAction() without calling 
beginTransaction() before

I think it's a mistake to do both. pick one or the other and standarzie on 
it. I see the value in letting the first doAction() call trigger the 
beginTransaction() in simplifying the calling code, but if the output 
module must handle this case, make that the standard way of doing things 
and eliminate the seperate call of beginTransaction entirely



this says that either the entire transaction is submitted, or it all 
fails, I think that the optimization of allowing the endTransaction() call 
to return 'the first X suceeded, the rest failed' may be worth supporting. 
it's FAR simpler than the 'doAction() may trigger an endTransaction() 
transparently' that you were exploring earlier.


in your transaction diagram, a sucessful retry should move to to sucess, 
not ready (ready is only a state that the action will be in immediatly 
after it starts, before it has processed anything. as such it's fairly 
meaningless and could probably be combined with success as 'all prior work 
is done and ready to start a new transaction'.

you may also need a state 'not initialized' for startup, or you may just 
say that the initialization needs to be done before the queues are setup, 
so it's not relavent to this discussion.


if you really do want doAction() to be able to finish a transaction and 
start another one, the state diagram will be far more complicated.

one final note on locking, I expect that the process of processing objects 
in the queue (marking them as pending, formatting them, and calling 
doAction() on them), is going to require some locking in the face of 
multiple worker threads (to prevent two threads from processing the same 
message). I see two ways of doing this.

processbatch(
   foreach message (up to limit or number in queue){
     lock queue
     mark message pending
     unlock queue
     formatMessage()
     doAction()
   }
   endTransaction()
   lock queue
   foreach message{
     mark completed
   }
   unlock queue
)

processbatch(
   lock queue
   foreach message (up to limit or number in queue){
     mark message pending
     formatMessage()
     doAction()
   }
   unlock queue
   endTransaction()
   lock queue
   foreach message{
     mark completed
   }
   unlock queue
)

I suspect that the overhead in manipulating the lock is high enough that 
the second approach will be a win (similar to the efficiancies that were 
gained in the UDP input module by letting it add multiple messages to the 
queue inside one lock).

As such I am seeing significant value in making the doAction() call be 
lightweight under all conditions, which is an argument against having it 
do any more than nessasary. I think that the beginTransaction() 
functionality is probably fast enough that adding that into doAction() is 
acceptable, but the endTransaction() functionality definantly has the 
potential to block for a substantial amount of time (and therefor must not 
happen while locks are held that could cause other threads to be waiting)

except for your isolation of the output modules (which I have questioned 
elsewhere), it could even be a win to move the formatMessage() step out of 
the inner loop and have that called from the output module (doAction() 
would just pass the pointer to the queue object, the output module would 
remember them and do the formatting inside endTransaction()), this would 
still be thread-safe as the output module would only be reading the 
message queue.

hmm, the more I think about this, the cleaner it seems to be.

It would also delay the need to do any buffer allocation until the 
endTransaction() step. this doesn't nessasarily eliminate the peak memory 
usage (all threads could be in endTransaction at the same time), but it 
will significantly reduce the average memory useage (normally they _won't_ 
all be in endTransaction() with maximum size messages at the same time)

there would be a fixed size array (based on the max batch size), to 
track what messages are in the batch. I think this is already needed for 
the worker thread to know which messages to mark as completed when the 
transaction completes

doAction() would just put a pointer (logical pointer, not C memory 
pointer) to the message contents into the next slot in the array.

there would be no need for a startTransaction.

there would be a helper function, something like
formatMessage(char *format, msg *message, char *output, int bufsize)
that would format the message and write it to the output buffer, output 
would then point at the null at the end of the string

endTransaction() would do all the work. it would allocate a buffer 
needed (note that it would know the actual message sizes, so could 
allocate based on the actual amount of data involved), do the work 
that was planned for startTransaction() (probably putting 
boilerplate in the buffer), call formatMessage() for each message in the 
list, output the messages, free the buffers, and return




note that this assumes that when you have action queues, the action queues 
contain the same data that was in the main queue, and the formatting is 
done by the queue walker thread that called doAction(), not by the thread 
that walks the main queue and puts the messages in the action queue. If 
that isn't the case (if instead, the action queues contain the formatted 
strings, not the raw messages), then this won't work, but we should have 
another discussion around this (including the potential advantages of 
single-instance-store if you have lots of action queues)

David Lang



More information about the rsyslog mailing list