[rsyslog] output plugin calling interface

david at lang.hm
Fri May 8 09:06:37 CEST 2009


On Thu, 7 May 2009, Rainer Gerhards wrote:

>>> So, I'd appreciate if you could have a look at sections 3.2 and 3.3 of
>>>
>>> http://www.rsyslog.com/download/design.pdf
>>
>> overall it looks good.
>>
>> one suggestion I would make is that since message based failures cannot be
>> reliably detected, I would consider using the same failure process for all
>> failures, and declare a message as bad if it fails the max retry number of
>> times by itself (once you hit n=1)
>
> But then you either
>
> A) do not need the batch logic at all (because the action is configured for
> infinite retries)
>
> Or
>
> B) you lose many messages if the action is not configured for infinite
> retries and you have a longer-duration outage e.g. on a database server.
> Let's say it is offline for a couple of hours, then you lose almost
> everything in that period
>
> To prevent this, you need two different retry methods.

good point.
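
To make the trade-off concrete, here is a minimal Python sketch of the single-retry-mode idea: a failing batch is split until n=1, and a lone message is declared bad once it has failed the maximum number of retries by itself. The names `deliver`, `submit` and `max_retries` are assumptions made for illustration only; nothing here is rsyslog's actual interface or the algorithm from the design document.

```python
def deliver(batch, submit, max_retries=3):
    """Single retry mode (illustrative): returns (delivered, declared_bad).

    `submit` is a hypothetical callable that takes a list of messages and
    returns True on success, False on any kind of failure.
    """
    if not batch:
        return [], []
    if submit(batch):
        return list(batch), []
    if len(batch) > 1:
        # The batch failed as a whole: narrow it down by halving.
        mid = len(batch) // 2
        ok_left, bad_left = deliver(batch[:mid], submit, max_retries)
        ok_right, bad_right = deliver(batch[mid:], submit, max_retries)
        return ok_left + ok_right, bad_left + bad_right
    # n == 1: retry the lone message, then give up on it.
    for _ in range(max_retries):
        if submit(batch):
            return list(batch), []
    return [], list(batch)


msgs = ["m1", "m2", "BAD", "m4"]

# Data-driven failure: only the batch containing "BAD" is rejected.
print(deliver(msgs, lambda b: "BAD" not in b))
# (['m1', 'm2', 'm4'], ['BAD'])

# Outage: every submit fails, so every message ends up declared bad.
print(deliver(msgs, lambda b: False))
# ([], ['m1', 'm2', 'BAD', 'm4'])
```

The second call is case B from the quoted reply: with a finite retry count and a target that stays down, a single retry mode discards everything submitted during the outage.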

the problem is trying to figure out which type of failure you have.

some failures can be identified by the output module as being data driven
or infrastructure driven, but there are cases where it just can't tell
(especially when talking to remote servers, databases, RELP, etc.)

how should these be handled?
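
As a way of framing the question, the three outcomes can be written down as a small Python sketch. Everything in it is an assumption for illustration: the `Failure` names, the mapping from exception types to categories, and the retry-mode labels are hypothetical and are not part of rsyslog or of the design document. The "can't tell" case is deliberately left without an answer.

```python
from enum import Enum


class Failure(Enum):
    DATA = "data"                      # the message itself is the problem
    INFRASTRUCTURE = "infrastructure"  # the target is unavailable
    UNKNOWN = "unknown"                # the output module cannot tell


# Which retry method each category would map to (labels are hypothetical).
RETRY_MODE = {
    Failure.DATA: "message-level retry",
    Failure.INFRASTRUCTURE: "action-level retry",
    Failure.UNKNOWN: None,  # the open question in this mail
}


def classify(exc):
    """Map an error seen by a hypothetical output module to a category."""
    if isinstance(exc, ValueError):
        # e.g. the message could not be encoded for the target
        return Failure.DATA
    if isinstance(exc, ConnectionRefusedError):
        # nothing is listening on the other side
        return Failure.INFRASTRUCTURE
    # e.g. a timeout talking to a remote database: either cause is possible
    return Failure.UNKNOWN


for exc in (ValueError("bad field"), ConnectionRefusedError(), TimeoutError()):
    category = classify(exc)
    print(type(exc).__name__, "->", category.value, "->", RETRY_MODE[category])
```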

David Lang

>> otherwise you end up resubmitting the entire batch a number of times
>> before you try to narrow it down to the particular message. since the
>> process of finding the bad message will take a number of retries, and then
>> you will want to retry the suspect message several times (to make sure
>> that it's really a message error, not an action error) this could result in
>> a lot of retries.
>>
>> also, the algorithm that you posted has a subtle difference from what I
>> had listed.
>
> It must, because it has two different levels of retries.
>
>> yours is more straightforward and easier to understand (and
>> requires no global knowledge), I think that mine is more efficient in the
>> rare failure case. there is a potential (very subtle) race condition in
>> this area that will need attention when we get down to lower level
>> discussion (no matter which algorithm is used)
>>
>> at this point I don't see this as critical (not even very important) as we
>> are talking high-level concepts at this point, but I wanted to note this
>> for a future conversation.
>
> I agree that it is not critical at this point. I also have not even tried
> to optimize it. The critical point is the discussion above on the two
> different retry modes. It took me a lot of thinking to see the subtle issues,
> but trying to do it all with just one mode was the root cause of the problems,
> at least of those I faced.
>
> I am not sure how you could solve the dilemma above with just a single retry
> mode.
>
>>
>>
>> two notes on the reliability section
>
> That's why I did not mention this section - so far, it is just a copy of a
> mailing list post (and all the comments it raised apply to it)
>
>>
>> 1. I think we had figured out that reliability required touching each item
>> 3 times instead of 2 (not 4 times as you note in the text)
>>
>> 2. I disagree with you on the idea that power issues should be handled at
>> a different level. I'll try to track down some discussions on
>> sysadmin/security mailing lists about this.
>
> Keep in mind that my key point is that you cannot currently protect a busy
> system against message loss. The issue is not whether a power failure may
> happen. I agree it can. I just think that you cannot build a busy system
> without using at least partial in-memory queuing, which by definition is not
> safe from power failures. So it doesn't make sense to protect a handful of
> messages when we lose many more of them anyway.
>
>>
>> David Lang
>>
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com
>>


