[rsyslog] Development of failsafe disk based queue

david at lang.hm david at lang.hm
Wed Oct 1 14:02:37 CEST 2008


On Wed, 1 Oct 2008, David Ecker wrote:

> Rainer Gerhards wrote:
>>> I am looking for a failsafe solution to store syslog messages locally
>>> until they can be sent later. I already looked at the disk based
>>> memory queue and the disk based queue. Neither queue works if you
>>> just power down the system immediately; you actually lose the whole
>>> queue.
>>>
>>> I already looked at queue.c and it seemed to me that both queues were
>>> not designed for that kind of failure, but I could be wrong there.
>>> Since an immediate power down of the system is the major failure, and
>>> one which will occur pretty often, I need to create a solution there.
>>>
>>
>> I doubt there is a software solution against this (one that does not
>> depend on a transactional file system, of course). What prevents you
>> from using a UPS? I'd say that a sudden power loss is by far the least
>> probable cause of failure for a system that is configured to do any
>> serious work.
>>
>> Please elaborate why you (or others ;)) consider this case important.
>>
>>
> The client systems (about 200 of them planned) are stationed in public
> places around the world, connected to centralized servers through VPN
> connections over an unreliable network. Since space and appearance
> requirements are important, a UPS won't fit there. There is actually no
> space for a UPS. The main problem is that customers are actually
> pulling the plug to restart the system, to charge their laptops or
> mobile phones, or just for the fun of it.

you can get UPS systems that are PCI cards, completely internal. they still
may not fit, but you at least have a chance.

> The client base image is a read-only system (Knoppix-like) with an extra
> hard disk for swap and other information such as syslog messages. Since
> there are no administrators close to the client system, the client itself
> needs to be able to send all the log information recorded between a
> network failure and an immediate power down to the central server for
> error analysis, since those messages are usually the most important ones
> for pinpointing the cause of the initial error.
>
> My approach would be to use a block device directly, since a file system
> is fault-prone if you shut down the system immediately. Each entry,
> including the header information, would be guarded by a checksum value. It
> would actually be something like a fixed array based queue, just that it
> would store the information in a block device. But this is just an initial
> thought.
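
To make the proposal above concrete, here is a minimal sketch of a
checksum-guarded, fixed-size slot format. It is not rsyslog code and not
David Ecker's design: the slot size, the header layout, the RSQ1 marker
and the function names are all assumptions made for illustration. It works
on in-memory bytes rather than a real block device, and shows only how a
torn or never-written slot can be detected and skipped on recovery.

```python
import struct
import zlib

SLOT_SIZE = 512                      # assumption: one slot per 512-byte sector
HEADER = struct.Struct("<4sQHI")     # magic, sequence no., payload length, CRC32
MAGIC = b"RSQ1"                      # hypothetical marker for a written slot
MAX_PAYLOAD = SLOT_SIZE - HEADER.size


def _checksum(seq: int, payload: bytes) -> int:
    # The CRC covers the sequence number and length as well as the payload,
    # so a damaged header is caught too.
    return zlib.crc32(struct.pack("<QH", seq, len(payload)) + payload)


def encode_slot(seq: int, payload: bytes) -> bytes:
    """Build one fixed-size slot holding a single queue entry."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for one slot")
    record = HEADER.pack(MAGIC, seq, len(payload), _checksum(seq, payload))
    return (record + payload).ljust(SLOT_SIZE, b"\0")


def decode_slot(slot: bytes):
    """Return (seq, payload), or None if the slot is empty or damaged."""
    if len(slot) != SLOT_SIZE:
        return None
    magic, seq, length, crc = HEADER.unpack_from(slot)
    if magic != MAGIC or length > MAX_PAYLOAD:
        return None
    payload = slot[HEADER.size:HEADER.size + length]
    if _checksum(seq, payload) != crc:
        return None
    return seq, payload


def recover(image: bytes):
    """Scan every slot and return the valid entries in sequence order."""
    entries = []
    for off in range(0, len(image) - SLOT_SIZE + 1, SLOT_SIZE):
        decoded = decode_slot(image[off:off + SLOT_SIZE])
        if decoded is not None:
            entries.append(decoded)
    return sorted(entries)
```

After a crash, recover() would be run over the whole device image; slots
that fail the checksum are simply dropped. A checksum only detects damage.
It does not by itself guarantee that a slot reached the platter.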

you are inventing a new filesystem here. it's not that easy to be reliable
because the disk can lie to you. unless you are doing interesting things
at the ATA/SCSI command level the disk may re-order your writes and may
cache them in memory on the drive for an unknown time before actually
writing them.
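
As an illustration of where that limit sits, here is a minimal sketch of
the most an ordinary application can do from user space: write the record
and call fsync. The function name and file layout are assumptions for the
example, not anything from rsyslog. fsync asks the operating system to
push its buffers to the device; whether the drive then commits the data
from its own cache before power is lost is outside the program's control,
which is exactly the problem described above.

```python
import os


def append_durably(path: str, record: bytes) -> None:
    """Append record to path, then ask the OS to flush it to the device."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, 0o600)
    try:
        view = memoryview(record)
        while view:                      # os.write may write only part of the buffer
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                     # flushes OS buffers; the drive's own
                                         # write cache may still hold the data
    finally:
        os.close(fd)
```

Calling fsync after every message is also slow on rotating disks, which is
why the speed concern matters as soon as every message has to be flushed
individually.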

if you need reliable writes at anything close to a reasonable speed you 
need to have a battery backed cache or solid state drive in your machine 
(and the solid state drives are not all fast to write to)

David Lang

>>> Did you already start to develop something addressing that problem?
>>> Could you help me extend rsyslog (3.18.4) so that I can develop a new
>>> queue myself? I would contribute the code to the rsyslog project if
>>> you would like afterwards.
>>>
>>> bye
>>> David Ecker
>>>
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com
>>
>
>