[rsyslog] output plugin calling interface

david at lang.hm david at lang.hm
Thu Apr 30 01:31:45 CEST 2009


On Wed, 29 Apr 2009, Rainer Gerhards wrote:

>>
>> the key to the locking is to try to both minimize the work done inside
>> the lock, and minimize the time the lock is held.
>>
>> in general I think the locking should be something along the lines of
>>
>> doStartbatch()
>>    lock action mutex
>>      multiple doAction() calls
>>    unlock action mutex
>> doEndbatch()
>
> I overlooked a very important point, and it now appears to me. I *must* lock
> the complete batch, including doEndBatch(). The reason is that a single
> app-level transaction otherwise has no definite start and end point. So we
> would not know what is committed and what not (if another thread puts
> messages into the transaction queue). That breaks the whole model. The only
> solution is to hold the lock during the whole transaction. As you outline,
> this should not be too long. Even if it took considerable time, there would
> be limited usefulness in interleaving other doAction calls, as this would
> simply cause additional time. At this point, I think there is no real benefit
> in running multiple threads concurrently.

you really don't want to hold the lock through the doEndBatch() call, that 
can potentially take a _long_ time, and that is the time when you most 
want the ability to have other things accessing the queue (note that I may 
be misunderstanding the definition of the lock here)

the output module will be tied up this entire time, but you need other 
things to be able to access the queue (definitely adding things to the 
queue, but there is a win in having the ability for another reader thread 
to be pushing things to a different copy of the output module at the same 
time)

so this means that when you are going through and doing the doAction() 
call, you are marking that you are working on that queue entry. then you 
release the lock and the next reader that comes along will skip over the 
entries that you have claimed and work to deliver the next N messages.

then when you get the results of doEndBatch() you go back and mark some or 
all of those messages as completed (removing them from the queue). note 
that with multiple worker threads you have the potential to have items 
that aren't the oldest ones in the queue being completed before the oldest 
ones, batching may make bigger holes, but the potential for holes was 
there all along.
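
The claim-then-release idea above can be sketched roughly as follows. This is only an illustration, not rsyslog code: ClaimQueue, worker() and the end_batch callback are hypothetical names, and end_batch stands in for the slow doEndBatch() step. The queue lock is held only while entries are claimed or marked, never during the commit.

```python
import threading

class ClaimQueue:
    """Hypothetical queue: entries are claimed under a lock, delivered outside it."""

    def __init__(self, messages):
        self._lock = threading.Lock()
        # each entry is [message, state]; state is "new", "claimed" or "done"
        self._entries = [[m, "new"] for m in messages]

    def claim(self, n):
        """Mark up to n unclaimed entries as claimed and return their indexes."""
        with self._lock:
            picked = []
            for i, entry in enumerate(self._entries):
                if entry[1] == "new":
                    entry[1] = "claimed"
                    picked.append(i)
                    if len(picked) == n:
                        break
            return picked

    def message(self, i):
        return self._entries[i][0]

    def finish(self, indexes, delivered):
        """Mark delivered entries done; release the rest so they can be retried."""
        with self._lock:
            for i in indexes:
                self._entries[i][1] = "done" if i in delivered else "new"

    def pending(self):
        with self._lock:
            return [e[0] for e in self._entries if e[1] != "done"]


def worker(queue, batch_size, end_batch):
    """Claim a batch, commit it with no queue lock held, then record the result.

    end_batch(msgs) returns the positions within msgs that were delivered.
    Note: this sketch loops forever if a message can never be delivered.
    """
    while True:
        idx = queue.claim(batch_size)
        if not idx:
            return
        msgs = [queue.message(i) for i in idx]
        ok = end_batch(msgs)                      # slow part, outside the lock
        queue.finish(idx, {idx[k] for k in ok})
```

A second worker calling claim() while the first is inside end_batch() simply skips the claimed entries, which is also where the out-of-order "holes" come from.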

>> and the output module should be written to defer as much processing as
>> possible to the doEndBatch() call to make the doAction() call as fast as
>> possible.
>
> Sounds reasonable.

note that the reason for doAction deferring as much work as possible is to 
allow that work to be done outside of any locking

>> since most errors will not be detected during doAction (in fact, the
>> only errors I can think of that will happen at this point are rsyslog
>> resource constraints), the error handling will need to be done after
>> doEndBatch() returns
>>
>> at that point the output module may not know which of the messages
>> caused the error (if the module sends the messages as a transaction it
>> may just know that the transaction failed, and have to do retries with
>> subsets to narrow down which message caused the failure)
>>
>> as long as at least one message is successful, things are not blocked
>> and should continue. it's only when doEndBatch() reports that no messages
>> could be delivered that you have a possible reason to drop the message
>> (and even then, only the first message. all others must be retried)
>
> Well, I wouldn't conclude that it is the first message, but "one message"
> inside the batch. So there may be some benefit in retrying the batch with
> less records (as you suggested). Under the assumption that usually only one
> record causes the problem, I tend to think that it may be useful to
> commit the batch one-by-one in this case - this may be more efficient than a
> binary search for the failing record.

note that it's not _quite_ a binary search (same basic concept though), as 
you submit a subset of them they either go through or you need to try a 
smaller batch

with the individual submissions you are O(N) (on average you will have to 
commit 1/2 the batch individually before you hit the bad one, ~6 
transactions for a batch size of 10, 51 for a batch size of 100)

with the 'binary search' approach you are O(log(N)) ( ~4 transactions for 
a batch size of 10, ~7 for a batch size of 100)
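
The figures above can be reproduced with a little arithmetic. The sketch below is one reading of them, not a formal analysis: it counts the initial failed full batch plus N/2 single commits for the one-by-one case, and one failed commit per halving step (integer division) for the other. Both function names are made up for illustration.

```python
def individual_average(n):
    """One failed full batch, then on average n/2 single commits to reach the bad one."""
    return 1 + n // 2


def halving_steps(n):
    """Failed commits of size n, n/2, n/4, ... down to 1, using integer halving."""
    steps = 0
    while n >= 1:
        steps += 1
        n //= 2
    return steps


for n in (10, 100):
    print(n, individual_average(n), halving_steps(n))
```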

the worst case is probably where the last message is the one that has the 
problem.

for the individual processing that is simple math (batch size of 100, you 
will fail the first one, then submit 99 successfully, then fail on the last 
one)

for the binary search it's more complicated (this is assuming the batch 
size gets bumped up when it succeeds)

assuming the batch completely fails (i.e. a database where the output 
module doesn't know which one caused it to fail)

bad message is message 100 and there are >>100 messages in the queue
fail 100
succeed 50 (bad message is now message 50)
fail 100
fail 50
succeed 25 (bad message is now message 25)
fail 100
fail 50
fail 25
succeed 12 (bad message is now message 13)
fail 100
fail 50
fail 25
succeed 12 (bad message is now message 1)
fail 100
fail 50
fail 25
fail 12
fail 6
fail 3
fail 1
retry 1
.
.
message 1 is bad (20 transactions + retries)

best case would be

bad message is message 1 and there are >>100 messages in the queue
fail 100
fail 50
fail 25
fail 12
fail 6
fail 3
fail 1
retry 1
.
.
message 1 is bad (7 transactions + retries)
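
The two traces above can be checked with a small simulation. It is a sketch under the same assumptions as the traces: commits are all-or-nothing, the queue always has enough messages to fill a batch, the batch size is halved (integer division) after a failure and reset to the maximum after a success. count_transactions() is a hypothetical helper, and retries of the final single message are not counted.

```python
def count_transactions(bad_pos, max_batch=100):
    """Count commits until the bad message is isolated in a batch of one.

    bad_pos is the 1-based queue position of the single bad message.
    """
    size = max_batch
    count = 0
    while True:
        count += 1
        if bad_pos <= size:          # batch contains the bad message: it fails
            if size == 1:
                return count         # isolated; further retries are not counted
            size //= 2
        else:                        # batch commits; bad message moves up the queue
            bad_pos -= size
            size = max_batch


print(count_transactions(100))  # bad message last in the first batch
print(count_transactions(1))    # bad message first in the queue
```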


if the output module is able to commit a partial transaction, then the 
logic devolves to

bad message is message 100 and there are >>100 messages in the queue
submit 100, succeed 99 (bad message is now message 1)


> All in all, it looks like the algorithm needs to get a bit more complicated
> ;)

unfortunately, but not very much more complicated.

the algorithm I posted last week cut the batch size in half for each loop 
and restored it when a commit succeeded. you didn't want that to be in the 
core (and it doesn't have to be, but that means that the output module may 
need to do the retries)

the question is which side the retry logic needs to be in.

it can be in the queue walker that calls the output module

it can be in doEndBatch() in the output module

some output modules don't need partial retries (as they can output partial 
batches)

some output modules do need partial retries.

the more complicated retry logic will work for both situations, or it can 
be implemented in each of however many output modules need it.

it can go either way, I tend to lean towards only having the logic in one 
place (even if it's more complicated logic than some modules need)
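
As a rough illustration of keeping the logic in one place, here is a sketch of a single retry loop on the queue-walker side that copes with both kinds of module. Everything in it is an assumption made for the example: drain() is a made-up name, and commit(batch) is a hypothetical callback that returns how many leading messages were committed (the full length on success, 0 for an all-or-nothing failure, something in between for a module that can commit partially). It drops a lone failing message right away; a real version would presumably retry it a few times first.

```python
def drain(queue, commit, max_batch=100):
    """Deliver everything in queue; return the messages dropped as undeliverable."""
    dropped = []
    size = max_batch
    while queue:
        batch = queue[:size]
        done = commit(batch)
        if done > 0:
            del queue[:done]                 # progress was made: restore the batch size
            size = max_batch
        elif len(batch) == 1:
            dropped.append(queue.pop(0))     # a lone message that fails is the bad one
            size = max_batch
        else:
            size = max(1, len(batch) // 2)   # nothing went through: try a smaller batch
    return dropped
```

A module that commits partial batches never drives the size below the maximum for long, so the extra halving logic costs it very little.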

David Lang


