[rsyslog] Development of failsafe disk based queue

david at lang.hm david at lang.hm
Wed Oct 1 14:25:18 CEST 2008


On Wed, 1 Oct 2008, Rainer Gerhards wrote:

> David,
>
> the file syncing mentioned in the compatibility doc applies to the
> output action, only.

ouch.

> The queue never does synchronous writes - I always assumed that a
> critical system would have a UPS and could never think (so far) of a
> valid reason for not having one. So the queue would need to have an extra
> option to do sync writes. Obviously, that's not a big deal.

good

> Performance, of course, will be extremely terrible with such a setup...

only if you have to wait for a spinning disk to do the write.

this is the same problem that databases have. they need to guarantee that
once the database tells the writing program that the data is written it
will be there even if the system loses power immediately.

if you run a database on standard desktop hardware (and it doesn't have 
this safety disabled) you cannot do more than about 80 writes/second. If
you upgrade to the super speedy 15K rpm drives you can do ~160 
writes/second.

given that you need to write the data + metadata it gets even uglier, so 
what the databases do (and some journaling filesystems) is to write a log 
that says what they are going to do, sync that, and then later write the 
data to the actual files (updating the journal when they complete the 
write)
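
the journal-then-data ordering described above can be sketched roughly as follows. this is only an illustration, not code from rsyslog or from any database: the file layout, the JSON entry format and the names `fsync_append`, `journaled_write` and `pending_records` are all assumptions made up for the example. for simplicity it syncs the data file immediately, where a real system would defer and batch that step.

```python
import json
import os


def fsync_append(path, line):
    """Append one line and block until the OS reports it has been synced."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()             # move Python's buffer into the OS
        os.fsync(f.fileno())  # ask the OS to push it to the device


def journaled_write(journal_path, data_path, rec_id, record):
    # 1. log what we are going to do and sync it (hypothetical entry format)
    intent = {"op": "intent", "id": rec_id, "record": record}
    fsync_append(journal_path, json.dumps(intent))
    # 2. write the data to the actual file (synced right away in this sketch)
    fsync_append(data_path, record)
    # 3. update the journal to say the write completed
    fsync_append(journal_path, json.dumps({"op": "done", "id": rec_id}))


def pending_records(journal_path):
    """Return {id: record} for intents that were never marked done."""
    pending = {}
    with open(journal_path, encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            if entry["op"] == "intent":
                pending[entry["id"]] = entry["record"]
            else:
                pending.pop(entry["id"], None)
    return pending
```

after a crash, whatever `pending_records` returns is what would have to be replayed into the data file. whether the sync really reaches stable storage still depends on the hardware honouring it.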

it sounds like you order your writes correctly for a disk-based queue, but
you would need the option of issuing the syncs (probably when you do the 
checkpoints)
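
to make the "sync when you do the checkpoints" idea concrete, here is a minimal sketch of a writer that syncs once every N entries. it is not rsyslog's queue code; the class name `CheckpointedWriter`, its `interval` parameter and the `sync_count` counter are hypothetical and exist only for this example.

```python
import os


class CheckpointedWriter:
    """Append entries to a file and fsync once every `interval` entries."""

    def __init__(self, path, interval):
        if interval < 1:
            raise ValueError("interval must be >= 1")
        self._f = open(path, "ab")
        self._interval = interval
        self._since_sync = 0
        self.sync_count = 0  # exposed only so the behaviour can be observed

    def append(self, entry: bytes):
        self._f.write(entry + b"\n")
        self._since_sync += 1
        if self._since_sync >= self._interval:
            self.checkpoint()

    def checkpoint(self):
        self._f.flush()             # Python buffer -> OS
        os.fsync(self._f.fileno())  # OS -> device
        self._since_sync = 0
        self.sync_count += 1

    def close(self):
        if self._since_sync:        # sync any entries since the last checkpoint
            self.checkpoint()
        self._f.close()
```

with `interval=1` every entry is synced before `append` returns, which is the safe but slow case. a larger interval trades up to `interval - 1` entries at risk for fewer syncs.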

if you do this on the wrong hardware (say a laptop 5400 rpm drive or the
wrong flash drive), the fact that you need to do four writes per log entry
(data to queue, metadata to queue, data to output, update metadata for
queue) could drop you to below 15 logs/sec (about 60 writes/sec divided by
4, but then you lose time to seeking as well)

however, with the correct drive to write to (say a $2,400 80G fusion-io 
flash card that can do ~100k IO ops/sec) you should be able to sustain 
20,000 logs/sec.
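
the arithmetic behind these estimates is just synced writes per second divided by writes per log entry. a small sketch, where the function name `logs_per_second` is made up for the example and the 60 writes/sec figure for the slow drive is taken from the "60/4" above:

```python
def logs_per_second(synced_writes_per_second, writes_per_log=4):
    """Upper bound on log entries/sec when each entry needs several synced writes."""
    return synced_writes_per_second / writes_per_log


print(logs_per_second(60))       # slow laptop drive: 15.0
print(logs_per_second(100_000))  # ~100k IO ops/sec flash card: 25000.0
```

these are ceilings that ignore seek time and other overhead, which is why the sustained figure quoted for the flash card (20,000 logs/sec) sits below the 25,000 bound.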

realistically very few people need the sustained write capacity that you
would get from such a setup. but if you go with a $500-$700 raid card with 
a battery-backed cache you get very similar performance, but with some 
possibility that you can't sustain it forever.

David Lang

> Rainer
>
> On Wed, 2008-10-01 at 04:55 -0700, david at lang.hm wrote:
>> On Wed, 1 Oct 2008, David Ecker wrote:
>>
>>> Hi,
>>>
>>> I am looking for a failsafe solution to store syslog messages locally
>>> until they can be sent later. I already looked at the disk based
>>> memory queue and the disk based queue. Neither queue works if you
>>> just power down the system immediately; the whole queue is lost.
>>
>> are you sure about the disk based queue?
>>
>> per file:///usr/src/rsyslog-3.21.4/doc/queues.html the disk based queue
>> can be set to do a commit of the metadata after each message.
>>
>> Disk Queues
>>
>> Disk queues use disk drives for buffering. The important fact is that they
>> always use the disk and do not buffer anything in memory. Thus, the queue
>> is ultra-reliable, but by far the slowest mode. For regular use cases,
>> this queue mode is not recommended. It is useful if log data is so
>> important that it must not be lost, even in extreme cases.
>>
>> When a disk queue is written, it is done in chunks. Each chunk receives
>> its individual file. Files are named with a prefix (set via the
>> "$<object>QueueFilename" config directive) and followed by a 7-digit
>> number (starting at one and incremented for each file). Chunks are 10mb by
>> default, a different size can be set via the "$<object>QueueMaxFileSize"
>> config directive. Note that the size limit is not a sharp one: rsyslog
>> always writes one complete queue entry, even if it violates the size
>> limit. So chunks are actually a little bit (usually less than 1k) larger
>> than the configured size. Each chunk also has a different size for the
>> same reason. If you observe different chunk sizes, you can relax: this is
>> not a problem.
>>
>> Writing in chunks is used so that processed data can quickly be deleted
>> and is free for other uses - while at the same time keeping no artificial
>> upper limit on disk space used. If a disk quota is set (instructions
>> further below), be sure that the quota/chunk size allows at least two
>> chunks to be written. Rsyslog currently does not check that and will fail
>> miserably if a single chunk is over the quota.
>>
>> Creating new chunks costs performance but provides quicker ability to free
>> disk space. The 10mb default is considered a good compromise between these
>> two. However, it may make sense to adapt these settings to local policies.
>> For example, if a disk queue is written on a dedicated 200gb disk, it may
>> make sense to use a 2gb (or even larger) chunk size.
>>
>> Please note, however, that the disk queue by default does not update its
>> housekeeping structures every time it writes to disk. This is for
>> performance reasons. In the event of failure, data will still be lost
>> (unless the file structures are fixed up manually). However, disk
>> queues can be set to write bookkeeping information on checkpoints (every n
>> records), so that this can be made ultra-reliable, too. If the checkpoint
>> interval is set to one, no data can be lost, but the queue is
>> exceptionally slow.
>>
>> Each queue can be placed on a different disk for best performance and/or
>> isolation. This is currently selected by specifying different
>> $WorkDirectory config directives before the queue creation statement.
>>
>> To create a disk queue, use the "$<object>QueueType Disk" config
>> directive. Checkpoint intervals can be specified via
>> "$<object>QueueCheckpointInterval", with 0 meaning no checkpoints.
>>
>>
>>
>>
>>
>> you also need to specifically enable syncing (from
>> http://www.rsyslog.com/doc-v3compatibility.html )
>>
>> Output File Syncing
>> Rsyslogd tries to keep as compatible to stock syslogd as possible. As
>> such, it retained stock syslogd's default of syncing every file write if
>> not specified otherwise (by placing a dash in front of the output file
>> name). While this was a useful feature in past days where hardware was
>> much less reliable and UPSes were rare, this no longer is useful in today's
>> world. Instead, the syncing is a high performance hit. With it, rsyslogd
>> writes files around 50 *times* slower than without it. It also affects
>> overall system performance due to the high IO activity. In rsyslog v3,
>> syncing has been turned off by default. This is done via a specific
>> configuration directive "$ActionFileEnableSync on/off" which is off by
>> default. So even if rsyslogd finds sync selector lines, it ignores them by
>> default. In order to enable file syncing, the administrator must specify
>> "$ActionFileEnableSync on" at the top of rsyslog.conf. This ensures that
>> syncing only happens in some installations where the administrator
>> actually wanted that (performance-intense) feature. In the vast majority
>> of cases (if not all), this dramatically increases rsyslogd performance
>> without any negative effects.
>>
>>
>>
>>> I already looked at queue.c and it seemed to me that both queues were
>>> not designed for that kind of failure, but I could be wrong there. Since
>>> an immediate power down of the system is the major failure which will
>>> occur pretty often I need to create a solution there.
>>
>> with checkpoint interval set to 1 and syncing enabled the data should be
>> safely on the disk (assuming you have hardware that supports this) and
>> a power-off won't affect it.
>>
>> David Lang
>>
>>
>>
>>> Did you already start to develop something addressing that problem?
>>> Could you help me extend rsyslog (3.18.4) so that I can develop a new
>>> queue myself? I would contribute the code to the rsyslog project if you
>>> would like afterwards.
>>>
>>> bye
>>> David Ecker
>>>
>> _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com


