[rsyslog] Development of failsafe disk based queue

david at lang.hm
Wed Oct 1 13:55:42 CEST 2008


On Wed, 1 Oct 2008, David Ecker wrote:

> Hi,
>
> I am looking for a failsafe solution to store syslog messages locally
> until they can be sent later. I already looked at the disk based
> memory queue and the disk based queue. Neither queue works if you
> just power down the system immediately; the whole queue is lost.

Are you sure about the disk based queue?

Per file:///usr/src/rsyslog-3.21.4/doc/queues.html the disk based queue 
can be set to commit its metadata after each message.

Disk Queues

Disk queues use disk drives for buffering. The important fact is that they 
always use the disk and do not buffer anything in memory. Thus, the queue 
is ultra-reliable, but by far the slowest mode. For regular use cases, 
this queue mode is not recommended. It is useful if log data is so 
important that it must not be lost, even in extreme cases.

When a disk queue is written, it is done in chunks. Each chunk receives 
its individual file. Files are named with a prefix (set via the 
"$<object>QueueFilename" config directive) and followed by a 7-digit 
number (starting at one and incremented for each file). Chunks are 10mb by 
default; a different size can be set via the "$<object>QueueMaxFileSize" 
config directive. Note that the size limit is not a sharp one: rsyslog 
always writes one complete queue entry, even if it violates the size 
limit. So chunks are actually a little bit (usually less than 1k) larger 
than the configured size. Each chunk also has a different size for the 
same reason. If you observe different chunk sizes, you can relax: this is 
not a problem.

Writing in chunks is used so that processed data can quickly be deleted 
and is free for other uses - while at the same time keeping no artificial 
upper limit on disk space used. If a disk quota is set (instructions 
further below), be sure that the quota/chunk size allows at least two 
chunks to be written. Rsyslog currently does not check that and will fail 
miserably if a single chunk is over the quota.

Creating new chunks costs performance but provides quicker ability to free 
disk space. The 10mb default is considered a good compromise between these 
two. However, it may make sense to adapt these settings to local policies. 
For example, if a disk queue is written on a dedicated 200gb disk, it may 
make sense to use a 2gb (or even larger) chunk size.

Please note, however, that the disk queue by default does not update its 
housekeeping structures every time it writes to disk. This is for 
performance reasons. In the event of failure, data will still be lost 
(unless the file structures are repaired manually). However, disk 
queues can be set to write bookkeeping information on checkpoints (every n 
records), so that this can be made ultra-reliable, too. If the checkpoint 
interval is set to one, no data can be lost, but the queue is 
exceptionally slow.

Each queue can be placed on a different disk for best performance and/or 
isolation. This is currently selected by specifying different 
$WorkDirectory config directives before the queue creation statement.

To create a disk queue, use the "$<object>QueueType Disk" config 
directive. Checkpoint intervals can be specified via 
"$<object>QueueCheckpointInterval", with 0 meaning no checkpoints.

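To make the directives above concrete, here is a rough sketch of a disk 
queue on a dedicated disk, using "Action" as the <object>. The directory, 
the file prefix, the checkpoint value of 100 and the target host are 
placeholders I made up for illustration, not values taken from the docs.

```
# Assumption: the dedicated queue disk is mounted here (hypothetical path)
$WorkDirectory /srv/rsyslog-queue

# Queue directives apply to the next action defined after them
$ActionQueueType Disk
# Prefix for the chunk files (hypothetical name)
$ActionQueueFileName fwdq
# Larger chunks, as suggested for a big dedicated disk
$ActionQueueMaxFileSize 2g
# Write bookkeeping information every 100 records (illustrative value)
$ActionQueueCheckpointInterval 100

# Hypothetical action using the queue: forward everything via TCP
*.* @@loghost.example.com:514
```

With these settings the chunk files should show up in the work directory 
under the "fwdq" prefix, followed by the 7-digit number described above.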




You also need to specifically enable syncing (from 
http://www.rsyslog.com/doc-v3compatibility.html )

Output File Syncing
Rsyslogd tries to keep as compatible to stock syslogd as possible. As 
such, it retained stock syslogd's default of syncing every file write if 
not specified otherwise (by placing a dash in front of the output file 
name). While this was a useful feature in past days, when hardware was 
much less reliable and UPSes were rare, it is no longer useful in today's 
world. Instead, the syncing is a heavy performance hit. With it, rsyslogd 
writes files around 50 *times* slower than without it. It also affects 
overall system performance due to the high IO activity. In rsyslog v3, 
syncing has been turned off by default. This is done via a specific 
configuration directive "$ActionFileEnableSync on/off" which is off by 
default. So even if rsyslogd finds sync selector lines, it ignores them by 
default. In order to enable file syncing, the administrator must specify 
"$ActionFileEnableSync on" at the top of rsyslog.conf. This ensures that 
syncing only happens in some installations where the administrator 
actually wanted that (performance-intense) feature. In the vast majority 
of cases (if not all), this dramatically increases rsyslogd performance 
without any negative effects.



> I already looked at queue.c and it seemed to me that both queues were
> not designed for that kind of failure, but I could be wrong there. Since
> an immediate power down of the system is the major failure, which will
> occur pretty often, I need to create a solution there.

With the checkpoint interval set to 1 and syncing enabled, the data should 
be safely on the disk (assuming your disks and controllers really flush to 
stable storage when asked) and a power-off won't affect it.
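
Put together, that would look roughly like the sketch below. The work 
directory, queue file prefix and target host are made-up placeholders, and 
whether "$ActionFileEnableSync" also covers the queue's own files is an 
assumption on my part that is worth verifying against your version.

```
# Per the v3 compatibility notes: enable syncing at the top of rsyslog.conf
$ActionFileEnableSync on

# Hypothetical spool directory for the queue files
$WorkDirectory /var/spool/rsyslog

$ActionQueueType Disk
# Hypothetical prefix for the chunk files
$ActionQueueFileName failsafeq
# Checkpoint after every record: slowest setting, smallest loss window
$ActionQueueCheckpointInterval 1

# Hypothetical action using the queue: forward everything via TCP
*.* @@loghost.example.com:514
```

Expect this to be very slow, as the docs warn, so test it with your real 
message rate before relying on it.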

David Lang



> Did you already start to develop something addressing that problem?
> Could you help me extend rsyslog (3.18.4) so that I can develop a new
> queue myself? I would contribute the code to the rsyslog project if you
> would like afterwards.
>
> bye
> David Ecker
>