[rsyslog] multi-message handling and databases

david at lang.hm david at lang.hm
Sat Apr 18 00:28:40 CEST 2009


the company that I work for has decided to sponser multi-message queue 
output capability, they have chosen to remain anonomous (I am posting from 
my personal account)

there are two parts to this.

1. the interaction between the output module and the queue

2. the configuration of the output module for it's interaction with the 
database

for the first part (how the output module interacts with the queue), the 
criteria are that

1. it needs to be able to maintain guarenteed delivery (even in the face 
of crashes, assuming rsyslog is configured appropriately)

2. at low-volume times it must not wait for 'enough' messages to 
accumulate, messages should be processed with as little latency as 
possible



to meet these criteria, what is being proposed is the following

a configuration option to define the max number of messages to be 
processed at once.

the output module goes through the following loop

X=max_messages

if (messages in queue)
   mark that it is going to process the next X messages
   grab the messages
   format them for output
   attempt to deliver the messages
   if (message delived sucessfully)
     mark messages in the queue as delivered
     X=max_messages (reset X in case it was reduced due to delivery errors)
   else (delivering this batch failed, reset and try to deliver the first half)
     unmark the messages that it tried to deliver (putting them back into the status where no delivery has been attempted)
     X=int(# messages attempted / 2)
     if (X=0)
       unable to deliver a single message, do existing message error 
process



this approach is more complex than a simple 'wait for X messages, then 
insert them all', but it has some significant advantages

1. no waiting for 'enough' things to happen before something gets written

2. if you have one bad message, it will transmit all the good messages 
before the bad one, then error out only on the bad one before picking up 
with the ones after the bad one.

3. nothing is marked as delivered before delivery is confirmed.



an example of how this would work

max_messages=15

messages arrive 1/sec

it takes 2+(# messages/2) seconds to process each message (in reality the 
time to insert things into a database is more like 10 + (# messages / 100) 
or even more drastic)

with the traditional rsyslog output, this would require multiple output 
threads to keep up (processing a single message takes 1.5 seconds with 
messages arriving 1/sec)

with the new approach and a cold start you would see

message arrives (Q=1) at T=0
om starts processing message a T=0 (expected to take 2.5)
message arrives (Q=2) at T=1
message arrives (Q=3) at T=2
om finishes processing message (Q=2) at T=2.5
om starts processing 2 messages at T=2.5 (expected to take 3)
message arrives (Q=4) at T=3
message arrives (Q=5) at T=4
message arrives (Q=6) at T=5
om finishes processing 2 messages  (Q=4) at T=5.5
om starts processing 4 messages at T=5.5 (expected to take 4)
message arrives (Q=5) at T=6
message arrives (Q=6) at T=7
message arrives (Q=7) at T=8
message arrives (Q=8) at T=9
om finishes processing 4 messages  (Q=4) at T=9.5
om starts processing 4 messages at T=9.5 (expected to take 4)

the system is now in a steady state

message arrives (Q=5) at T=10
message arrives (Q=6) at T=11
message arrives (Q=7) at T=12
message arrives (Q=8) at T=13
om finishes processing 4 messages  (Q=4) at T=13.5
om starts processing 4 messages at T=13.5 (expected to take 4)

if a burst of 10 extra messages arrived at time 13.5 this last item would 
become

11 messages arrive at (Q=14) at T=13.5
om starts processing 14 messages at T=13.5 (expected to take 9)
message arrives (Q=15) at T=14
message arrives (Q=16) at T=15
message arrives (Q=17) at T=16
message arrives (Q=18) at T=17
message arrives (Q=19) at T=18
message arrives (Q=20) at T=19
message arrives (Q=21) at T=20
message arrives (Q=22) at T=21
message arrives (Q=23) at T=22
om finishes processing 14 messages (Q=9) at T=22.5
om starts processing 9 messages at T=22.5 (expected to take 6.5)



thoughts?

David Lang



More information about the rsyslog mailing list