[rsyslog] Development of failsafe disk based queue

David Ecker david at ecker-software.de
Wed Oct 1 14:05:42 CEST 2008


Hi,

should I use 3.18.4 (latest stable, I would prefer that one) or do I
need the latest development version? I would alter queue.c directly,
changing the fcntl flags in the disk based queue
(O_DIRECT, O_SYNC, O_NOATIME).
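
A minimal sketch of what such a flag change amounts to, assuming a hypothetical helper `append_record_sync()` that is not taken from queue.c. It only uses O_SYNC; O_DIRECT is left out here because it additionally needs aligned buffers, and O_NOATIME is Linux-specific.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, NOT rsyslog code: append one record to a queue
 * file and return only after the synchronous write has completed. */
int append_record_sync(const char *path, const char *msg)
{
    assert(path != NULL && msg != NULL);

    /* With O_SYNC each write() blocks until the kernel reports the data
     * as written to the underlying storage. */
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND | O_SYNC,
                  S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    size_t left = strlen(msg);
    while (left > 0) {
        ssize_t n = write(fd, msg, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            close(fd);
            return -1;
        }
        msg += n;               /* handle short writes */
        left -= (size_t)n;
    }
    return close(fd);           /* 0 on success, -1 on error */
}
```

The function returns 0 once the record has been written and the file closed, and -1 on any error. Whether the data really survives a power cut still depends on the disk and controller honouring the flush.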

Performance is not really an issue. There will be only 1000 to 2000
messages per hour at peak times.

bye
David Ecker

Rainer Gerhards schrieb:
> David,
>
> the file syncing mentioned in the compatibility doc applies to the
> output action, only. 
>
> The queue never does synchronous writes - I always assumed that a
> critical system would have a UPS, and so far I could not think of a
> valid reason for not having one. So the queue would need an extra
> option to do sync writes. Obviously, that's not a big deal.
>
> Performance, of course, will be extremely terrible with such a setup...
>
> Rainer
>
> On Wed, 2008-10-01 at 04:55 -0700, david at lang.hm wrote:
>   
>> On Wed, 1 Oct 2008, David Ecker wrote:
>>
>>     
>>> Hi,
>>>
>>> I am looking for a failsafe solution to store syslog messages locally
>>> until they can be sent later. I already looked at the disk based
>>> memory queue and the disk based queue. Neither queue survives an
>>> immediate power-down of the system; the whole queue is lost.
>>>       
>> are you sure about the disk based queue?
>>
>> per file:///usr/src/rsyslog-3.21.4/doc/queues.html the disk based queue 
>> can be set to do a commit of the metadata after each message.
>>
>> Disk Queues
>>
>> Disk queues use disk drives for buffering. The important fact is that they 
>> always use the disk and do not buffer anything in memory. Thus, the queue 
>> is ultra-reliable, but by far the slowest mode. For regular use cases, 
>> this queue mode is not recommended. It is useful if log data is so 
>> important that it must not be lost, even in extreme cases.
>>
>> When a disk queue is written, it is done in chunks. Each chunk receives 
>> its individual file. Files are named with a prefix (set via the 
>> "$<object>QueueFilename" config directive) and followed by a 7-digit 
>> number (starting at one and incremented for each file). Chunks are 10mb by 
>> default, a different size can be set via the "$<object>QueueMaxFileSize" 
>> config directive. Note that the size limit is not a sharp one: rsyslog 
>> always writes one complete queue entry, even if it violates the size 
>> limit. So chunks are actually a little bit (usually less than 1k) larger 
>> than the configured size. Each chunk also has a different size for the 
>> same reason. If you observe different chunk sizes, you can relax: this is 
>> not a problem.
>>
>> Writing in chunks is used so that processed data can quickly be deleted 
>> and is free for other uses - while at the same time keeping no artificial 
>> upper limit on disk space used. If a disk quota is set (instructions 
>> further below), be sure that the quota/chunk size allows at least two 
>> chunks to be written. Rsyslog currently does not check that and will fail 
>> miserably if a single chunk is over the quota.
>>
>> Creating new chunks costs performance but provides quicker ability to free 
>> disk space. The 10mb default is considered a good compromise between these 
>> two. However, it may make sense to adapt these settings to local policies. 
>> For example, if a disk queue is written on a dedicated 200gb disk, it may 
>> make sense to use a 2gb (or even larger) chunk size.
>>
>> Please note, however, that the disk queue by default does not update its 
>> housekeeping structures every time it writes to disk. This is for 
>> performance reasons. In the event of failure, data will still be lost 
>> (unless the file structures are manually repaired). However, disk 
>> queues can be set to write bookkeeping information on checkpoints (every n 
>> records), so that this can be made ultra-reliable, too. If the checkpoint 
>> interval is set to one, no data can be lost, but the queue is 
>> exceptionally slow.
>>
>> Each queue can be placed on a different disk for best performance and/or 
>> isolation. This is currently selected by specifying different 
>> $WorkDirectory config directives before the queue creation statement.
>>
>> To create a disk queue, use the "$<object>QueueType Disk" config 
>> directive. Checkpoint intervals can be specified via 
>> "$<object>QueueCheckpointInterval", with 0 meaning no checkpoints.
>>
>>
>>
>>
>>
>> you also need to specifically enable syncing (from 
>> http://www.rsyslog.com/doc-v3compatibility.html )
>>
>> Output File Syncing
>> Rsyslogd tries to keep as compatible to stock syslogd as possible. As 
>> such, it retained stock syslogd's default of syncing every file write if 
>> not specified otherwise (by placing a dash in front of the output file 
>> name). While this was a useful feature in past days where hardware was 
>> much less reliable and UPSes were rare, this no longer is useful in today's 
>> world. Instead, the syncing is a high performance hit. With it, rsyslogd 
>> writes files around 50 *times* slower than without it. It also affects 
>> overall system performance due to the high IO activity. In rsyslog v3, 
>> syncing has been turned off by default. This is done via a specific 
>> configuration directive "$ActionFileEnableSync on/off" which is off by 
>> default. So even if rsyslogd finds sync selector lines, it ignores them by 
>> default. In order to enable file syncing, the administrator must specify 
>> "$ActionFileEnableSync on" at the top of rsyslog.conf. This ensures that 
>> syncing only happens in some installations where the administrator 
>> actually wanted that (performance-intense) feature. In the vast majority 
>> of cases (if not all), this dramatically increases rsyslogd performance 
>> without any negative effects.
>>
>>
>>
>>     
>>> I already looked at queue.c and it seemed to me that both queues were
>>> not designed for that kind of failure, but I could be wrong there. Since
>>> an immediate power down of the system is the major failure which will
>>> occur pretty often, I need to create a solution there.
>>>       
>> with checkpoint interval set to 1 and syncing enabled the data should be 
>> safely on the disk (assuming you have hardware that supports this) and 
>> a power-off won't affect it.
>>
>> David Lang
>>
>>
>>
>>     
>>> Did you already start to develop something addressing that problem?
>>> Could you help me extend rsyslog (3.18.4) so that I can develop a new
>>> queue myself? I would contribute the code to the rsyslog project if you
>>> would like afterwards.
>>>
>>> bye
>>> David Ecker
>>>
>>>       
>> _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com
>>     


