[rsyslog] output plugin calling interface
david at lang.hm
Sun May 3 02:42:18 CEST 2009
On Sat, 2 May 2009, Rainer Gerhards wrote:
> After a lot of thinking today, we can have a "kind of" transactional queue,
> but we need to accept potential message *duplication* in the event of
> failures (but no loss).
this is the approach that you have taken for other things (relp for
example), and when we were discussing reliability for direct mode vs disk
queues you mentioned that rsyslog could duplicate messages in case of
failures, but would not lose messages.
> This would work without a two-phase commit. However,
> there still is considerable effort to implement it.
as I understand things, the current process is:

thread A receives the message and puts it in the queue

worker thread B pulls the message from the queue, formats it and puts it in
the action queue (if there is no action queue, this triggers the output
module code as part of thread B.)

if there is an action queue, thread C is running, and does basically the
same thing that thread B would do if there was no action queue

what I am envisioning is that the worker thread would touch the queue one
additional time.

instead of removing the message from the queue to perform the action it
would mark the message as being 'in process', then after the message is
delivered it would delete it from the queue (touching the queue three times
instead of two)
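
The three-touch scheme described above can be sketched roughly as follows. This is a toy in-memory model, not rsyslog code: the class `InProcessQueue`, its methods `claim`, `delete` and `recover`, and the helper `run_worker` are all hypothetical names chosen for illustration. It shows why a failure between delivery and deletion produces a duplicate rather than a loss.

```python
from collections import OrderedDict


class InProcessQueue:
    """Toy queue (hypothetical): entries stay queued until explicitly deleted."""

    def __init__(self):
        self._entries = OrderedDict()  # msg_id -> [message, state]
        self._next_id = 0

    def enqueue(self, message):
        # Touch 1: the receiving thread stores the message.
        msg_id = self._next_id
        self._next_id += 1
        self._entries[msg_id] = [message, "queued"]
        return msg_id

    def claim(self):
        # Touch 2: mark the oldest queued message 'in process', but keep it.
        for msg_id, entry in self._entries.items():
            if entry[1] == "queued":
                entry[1] = "in process"
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        # Touch 3: remove the message only after it has been delivered.
        del self._entries[msg_id]

    def recover(self):
        # After a restart, anything still 'in process' is handed out again.
        for entry in self._entries.values():
            entry[1] = "queued"


def run_worker(queue, deliver):
    """Deliver every claimable message; delete each one only on success."""
    while (claimed := queue.claim()) is not None:
        msg_id, message = claimed
        deliver(message)      # may raise: the message then stays in the queue
        queue.delete(msg_id)
```

If `deliver` succeeds on the far side but the worker dies before `delete`, the message is still in the queue after `recover()` and is delivered a second time, which matches the "duplication but no loss" trade-off discussed in this thread.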
> I wonder if the use case
> actually justifies it. Please also consider what I wrote below on the
> performance of any ultra-reliable version. And, yes, I know we have fast and
> reliable controllers today, but even then the disk path is much, much slower
> than any memory based queue. I fail to believe you can build a very
> high-performance syslog server on a disk queue, even with the best hardware
> money can buy today.
I'm going to be testing this shortly ;-) I have a Fusion-io drive to try
and will be getting some boxes with the Intel X25-E SSD drives in a couple
of weeks. the only thing I can't try is the RAM-based drive.
David Lang
> Rainer
>
>> -----Original Message-----
>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
>> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
>> Sent: Saturday, May 02, 2009 10:33 AM
>> To: rsyslog-users
>> Subject: Re: [rsyslog] output plugin calling interface
>>
>>> -----Original Message-----
>>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
>>> bounces at lists.adiscon.com] On Behalf Of david at lang.hm
>>> Sent: Saturday, May 02, 2009 10:21 AM
>>> To: rsyslog-users
>>> Subject: Re: [rsyslog] output plugin calling interface
>>>
>>> On Sat, 2 May 2009, Rainer Gerhards wrote:
>>>
>>>>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
>>>>> bounces at lists.adiscon.com] On Behalf Of david at lang.hm
>>>>>
>>>>> On Fri, 1 May 2009, Rainer Gerhards wrote:
>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: rsyslog-bounces at lists.adiscon.com
>>>>>>> [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of
>>>>>>> david at lang.hm
>>>>>>>
>>>>>>> On Fri, 1 May 2009, Rainer Gerhards wrote:
>>>>>>>
>>>>>>>> Please let me know if you also find a math model useful (but I'll
>>>>>>>> probably need to do it in any case, because it helps me clean up my
>>>>>>>> mind...).
>>>>>>>
>>>>>>> I think it will help clarify things a lot. with a good model
>>>>>>> we won't have
>>>>>>> misunderstandings about what we are talking about.
>>>>>>
>>>>>> Yes - and I also think that with the model some complexities disappear. I
>>>>>> think (hope I am right) the solution will become obvious. I know I am
>>>>>> investing a lot of time in a tiny portion of the code, but this is one of
>>>>>> the core elements involving many complexities.
>>>>>>
>>>>>>> with my 'binary search' approach, handling permanently bad messages could
>>>>>>> be as simple as 'too many retries once we hit a batch size of 1' (with a
>>>>>>> possible option of the output module reporting back that it detected
>>>>>>> something that makes retries useless, but this is just an optimization)
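
A minimal sketch of the 'binary search' idea quoted above, assuming a hypothetical all-or-nothing callback `send_batch(list) -> bool` (so a failed batch leaves nothing delivered). The function name `deliver_with_bisect` and the `max_retries` parameter are illustrative, not part of rsyslog.

```python
def deliver_with_bisect(batch, send_batch, max_retries=3):
    """Return (delivered, bad) for a batch, halving it whenever a send fails.

    send_batch is assumed to be all-or-nothing and to return True on success.
    """
    if not batch:
        return [], []
    if len(batch) == 1:
        # Batch size of 1: retry a bounded number of times, then give up
        # and report the message as permanently bad.
        for _ in range(max_retries):
            if send_batch(batch):
                return list(batch), []
        return [], list(batch)
    if send_batch(batch):
        return list(batch), []
    # The batch failed as a whole: split it and try each half separately.
    mid = len(batch) // 2
    ok_left, bad_left = deliver_with_bisect(batch[:mid], send_batch, max_retries)
    ok_right, bad_right = deliver_with_bisect(batch[mid:], send_batch, max_retries)
    return ok_left + ok_right, bad_left + bad_right
```

With a batch of n messages and one bad record, this isolates the bad record after roughly log2(n) failed sends, plus the retries at batch size 1.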
>>>>>>
>>>>>> Yes, indeed. One quick thought: I see a batch as a set of (msg, state)
>>>>>> ordered pairs. Once we have processed it in one action (all of them have
>>>>>> entered one permanent state), we can then build a subset that we use as the
>>>>>> new (remaining) batch in the backup actions. So the "bad record search" is
>>>>>> "just" one facet of many that we need to handle with little and hopefully
>>>>>> simple code (doing it with 2000 LoC would be rather easy ;)).
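
The batch-as-(msg, state)-pairs view quoted above might look like this in a small sketch. The state names in `State` and the helpers `run_action` and `remaining_batch` are assumptions made for illustration; the thread does not enumerate the actual states.

```python
from enum import Enum


class State(Enum):
    # Hypothetical state names; the thread does not list the real ones.
    PENDING = "pending"
    DELIVERED = "delivered"
    FAILED = "failed"


def run_action(batch, action):
    """Apply action to every message; each pair ends in a permanent state."""
    return [
        (msg, State.DELIVERED if action(msg) else State.FAILED)
        for msg, _ in batch
    ]


def remaining_batch(batch):
    """Build the subset that becomes the new batch for a backup action."""
    return [(msg, State.PENDING) for msg, state in batch if state is State.FAILED]
```

A primary action would be run with `run_action`, and whatever `remaining_batch` returns would then be handed to the backup action in the same way.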
>>>>>
>>>>> I agree with the definition of a batch. Let's see what different states
>>>>> you are thinking of.
>>>>>
>>>>> I am currently assuming that the messages stay in the queue (with the
>>>>> state attached) so that if rsyslog restarts (assuming disk queues), it
>>>>> will realize that the message hasn't been delivered and try again.
>>>>
>>>> No, it is different: the batch is actually dequeued. So if at that point we
>>>> have a system power failure (for whatever reason), the messages are lost.
>>>> While the rsyslog engine intends to be very reliable, it is not a complete
>>>> transactional system. A slight risk remains. For this, you need to understand
>>>> what happens when the batch is processed. I assume that we have no sudden,
>>>> untrappable process termination. Then, if a batch cannot be processed, it is
>>>> returned back to the top of queue. This is not yet implemented, but is how
>>>> single messages (which you can think of as an abstraction of a batch in the
>>>> current code) are handled. If, for example, the engine shuts down, but an
>>>> action takes longer than the configured shutdown timeout, the action is
>>>> cancelled and the queue engine reclaims the unprocessed messages. They go
>>>> into a special area inside the .qi file and are placed on top of the queue
>>>> once the engine restarts.
>>>>
>>>> The only case where this does not work is sudden process termination. I see
>>>> two cases:
>>>>
>>>> a) a fatal software bug
>>>> We cannot really address this. Even if the messages were remaining in the
>>>> queue until finally processed, a software bug (maybe an invalid pointer) may
>>>> affect the queue structures at large, possibly even at the risk of total loss
>>>> of all data inside that queue. So this is an inevitable risk.
>>>>
>>>> b) sudden power fail
>>>> ... which can and should be mitigated at another level
>>>>
>>>> One may argue that there also is
>>>>
>>>> c) admin error
>>>> e.g., kill -9 rsyslogd
>>>> Here a fully transactional queue will probably help.
>>>>
>>>> However, I do not think that the risk involved justifies a far more complex
>>>> fully transactional implementation of the queue object. Some risk always
>>>> remains (what in the disaster case, even with a fully transactional queue?).
>>>>
>>>> And it is so complex to let the messages stay in queue because it is complex
>>>> to work with such messages and disk queues. It would also cost a lot of
>>>> performance, especially when done reliably (need to sync). We would then need
>>>> to touch each element at least four times, twice as much as currently. Also,
>>>> the hybrid disk/memory queues become very, very complex. There are more
>>>> complexities around this, I just wanted to tell the most obvious.
>>>>
>>>> So, all in all, the idea is that messages are dequeued, processed and put
>>>> back to the queue (think: ungetc()) when something goes wrong. Reasonable
>>>> (but not more) effort is made to prevent message loss while the messages are
>>>> in unprocessed state outside of the queue.
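
The dequeue-then-put-back behaviour quoted above can be sketched with a plain `collections.deque`. This is an illustration only: `process_batch` is a hypothetical helper, and the `action` callback is assumed to raise an exception on failure.

```python
from collections import deque


def process_batch(queue, batch_size, action):
    """Dequeue up to batch_size messages and 'unget' them if action fails."""
    # The messages leave the queue here; until this function returns they
    # exist only in the local list below.
    batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
    try:
        action(batch)
    except Exception:
        # extendleft() pushes items one by one, so reversing the batch first
        # restores the original order at the head of the queue.
        queue.extendleft(reversed(batch))
        return False
    return True
```

A sudden process termination between `popleft()` and the return loses the batch, because nothing on the queue side remembers it. That is the window of risk described in the quoted text.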
>>>>
>>>> Hope that clarifies and I am glad you brought this up. Made me think again,
>>>> but I came to the conclusion I've written above ;)
>>>
>>> this is definitely different from the way I thought things worked from our
>>> prior discussions about reliability. from those I understood that rsyslog
>>> could be used to make a fully reliable system, if you are willing to take
>>> the performance hit to do so.
>>
>> You can, but then you need to use batch sizes of 1.
>>
>>> as batch size increases (to gain efficiency) the number of log messages
>>> that can be lost also increases.
>>>
>>> unfortunately I have the belief that power outages cannot be avoided (I've
>>> seen cases where millions have been spent on the power systems and still
>>> ended up with a datacenter-wide blackout).
>>
>> Let me think about this, but I think to protect against this problem, you
>> really need to have two-phase commit, which I am not sure belongs in a
>> syslogd.
>>
>>> when you get the model of things together we will be in a much better
>>> position to discuss this.
>>
>> Well, we'd probably restart discussing reliability requirements. If it turns
>> out that you need 100% reliability, no matter what happens at all, I am not
>> sure if we can implement this without adding considerable database-ish
>> processing. "Under all circumstances" reliability is very hard to achieve,
>> especially if you also would like to have high performance. Think about it:
>> to guard against the data center full power loss scenario, you need to have a
>> disk-only queue, being synced to disk for every single en- and dequeue
>> operation. This is extremely costly. Does it then really matter if we have
>> large batches or not? The system, I think, will be so slow, that you cannot
>> use it for any demanding real-life application, so some compromise between
>> speed and reliability, I think, must be made in any case.
>>
>>> it's 1:20am here and I'm ready to collapse.
>>
>> I hadn't even expected this response at this time ;)
>>
>> Rainer
>>>
>>> David Lang
>>> _______________________________________________
>>> rsyslog mailing list
>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>> http://www.rsyslog.com