[rsyslog] Development of failsafe disk based queue
Rainer Gerhards
rgerhards at hq.adiscon.com
Wed Oct 1 15:17:42 CEST 2008
On Wed, 2008-10-01 at 15:09 +0200, David Ecker wrote:
> Hi,
>
> [quote]
> Depending on the media and the block device driver design, individual
> sector writes may not be atomic. A physical sector in typical devices
> is 512 bytes. In most cases, physical sector writes are atomic (either
> completely written, or not modified at all). A truly reliable file
> system, however, cannot count on this.
> [/quote]
>
> In most cases it works
Exactly - in most cases. That means it does not always work.
> but some way of validating the data
How can you validate if there is no power and the machine is off?
> is needed if
> you want ultra reliability, which I don't need. If the last few messages
> a few seconds before an immediate shutdown are lost but all other
> messages are sent correctly afterwards then that would be OK in my case.
But we cannot guarantee that, at least not in all cases. Let's assume
the disk died in the middle of the write access. Chances are good you'll
never be able to read that sector again. Using a journaling file system
will help, but without it, you may just have destroyed the sector that
contained the .qi file. So on next startup the .qi is either not
readable at all or not pointing at the correct information. The end
result can be total loss of information.
This scenario is probably acceptable in your case, because it is really,
really highly unlikely. But it still exists.
> I'll just test version 2.21.5 with the altered open behavior. The disk
> based queue-array developed by myself is just a fallback solution if the
> disk-based queue doesn't work with an immediate shutdown.
If it does not work under the constraints described here, this would
point to a problem in the queue implementation (I have to admit the
reason to provide a capability to write periodic qi file updates was
related to a scenario like this, though I had not thought of one this
extreme ;)).
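
To illustrate the idea of periodic index updates (this is a sketch only,
not rsyslog's actual .qi format or code): the index is written to a
temporary file with a checksum, flushed, and then renamed over the old
one, so a reader finds either the previous or the new version, and a
damaged file is detected instead of being trusted. The function names,
the JSON layout and the "file"/"offset" fields are assumptions made for
this example. As discussed above, this narrows the window for loss but
cannot close it if the disk itself dies mid-write.

```python
import json
import os
import zlib


def save_index(path, state):
    """Write state to a temp file, flush it, then rename it into place."""
    body = json.dumps(state).encode("utf-8")
    # First line: crc32 of the body as 8 hex digits; the body follows.
    record = b"%08x\n" % zlib.crc32(body) + body
    tmp = path + ".tmp"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, record)
        os.fsync(fd)          # ask the OS to push the data to the device
    finally:
        os.close(fd)
    # Atomic on POSIX file systems. A stricter version would also fsync
    # the containing directory so the rename itself is persisted.
    os.replace(tmp, path)


def load_index(path):
    """Return the saved state, or None if it is missing or fails the check."""
    try:
        with open(path, "rb") as f:
            header, _, body = f.read().partition(b"\n")
    except OSError:
        return None
    try:
        if int(header, 16) != zlib.crc32(body):
            return None
        return json.loads(body)
    except ValueError:
        return None
```

A caller that gets None back would have to fall back to some recovery
strategy, for example scanning the queue files themselves.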
Rainer
>
> David
>
> Rainer Gerhards schrieb:
> > On Wed, 2008-10-01 at 14:45 +0200, David Ecker wrote:
> >
> > [snip]
> >
> >
> >> as long as you do sector based writes (512 byte per sector, usual) you
> >> can be sure that the write wasn't partial. Writing more than one sector
> >> or not starting at a correct offset (n*512, n=0,1,2,...x) might result in
> >> a partial write. I've already tested that with my devel client here. So
> >> fencing each sector with a crc32 value would help detect errors
> >> during a write operation. This is actually only a problem if you are
> >> writing directly to a block device like any filesystem does and yes,
> >> reordering is definitely a problem. So validating the content written to
> >> the disk afterwards is important.
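
The crc32 fencing described above could look roughly like the following
sketch. It assumes a 512-byte sector whose last 4 bytes hold the crc32
of the first 508; the layout and the names pack_sector and unpack_sector
are made up for this example and are not taken from rsyslog or from
David's client. It can only detect a torn sector after the fact, it
cannot prevent one.

```python
import struct
import zlib

SECTOR_SIZE = 512                       # assumed sector size
CRC_SIZE = 4                            # crc32 stored big-endian at the end
PAYLOAD_SIZE = SECTOR_SIZE - CRC_SIZE   # 508 bytes of data per sector


def pack_sector(payload):
    """Zero-pad payload to 508 bytes and append the crc32 of the result."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload does not fit into one sector")
    body = payload.ljust(PAYLOAD_SIZE, b"\x00")
    return body + struct.pack(">I", zlib.crc32(body))


def unpack_sector(sector):
    """Return the padded 508-byte body, or None if validation fails."""
    if len(sector) != SECTOR_SIZE:
        return None
    body, stored = sector[:PAYLOAD_SIZE], sector[PAYLOAD_SIZE:]
    if struct.pack(">I", zlib.crc32(body)) != stored:
        return None
    return body


# Simulate the "first 5 bytes written, then power fails" case: the start
# of the new sector lands on top of the old sector's remaining bytes.
old = pack_sector(b"old message")
new = pack_sector(b"new message")
torn = new[:5] + old[5:]
```

Here unpack_sector(new) returns the padded body, while
unpack_sector(torn) returns None because the stored checksum no longer
matches the data in front of it.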
> >>
> >> If writing through a filesystem, reserving space in the destination file
> >> beforehand actually minimizes errors since the file system table doesn't
> >> have to be updated (you should also use the flag O_NOATIME for that
> >> case). See for example VMWare ESX VMDK file handling.
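
A minimal sketch of the preallocation idea, assuming a Linux-like system:
the file is created once at its full size and later writes go to fixed
offsets inside it, so the file size does not change on each write.
O_NOATIME is Linux-specific and posix_fallocate is not available on
every platform, so both are treated as optional here. The name
open_preallocated and the 4096-byte size in the usage note are
assumptions for this example, not anything rsyslog does.

```python
import os


def open_preallocated(path, size):
    """Create path, reserve size bytes up front, and return an open fd."""
    base = os.O_RDWR | os.O_CREAT
    noatime = getattr(os, "O_NOATIME", 0)   # 0 where the flag is missing
    try:
        fd = os.open(path, base | noatime, 0o600)
    except PermissionError:
        # O_NOATIME is refused if we do not own the file; retry without it.
        fd = os.open(path, base, 0o600)
    try:
        if hasattr(os, "posix_fallocate"):
            os.posix_fallocate(fd, 0, size)   # really allocate the blocks
        else:
            os.ftruncate(fd, size)            # fallback; may be sparse
        os.fsync(fd)
    except OSError:
        os.close(fd)
        raise
    return fd
```

After fd = open_preallocated("queue.dat", 4096), a call such as
os.pwrite(fd, data, 512) overwrites bytes in place at offset 512 without
growing the file.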
> >>
> >
> > Well, first of all let me re-iterate that I do not intend to do a block
> > device driver for rsyslog (but I definitely do not object getting one
> > contributed ;)).
> >
> > Still thinking about the case and thinking about non-solid-state,
> > non-internal-battery-backed-up disk, I can't see how you can be sure the
> > data will be written. David just told me there are no capacitors. So if
> > power fails, it fails rather quickly. So how can you be sure the disk
> > will be able to finish writing that sector? Let's say the drive has
> > begun to write the sector and been able to write the first 5 bytes. Now
> > power fails. No capacitors, no battery-backup, so why should there be
> > enough power to drive the disk write head for another 507 bytes? If the
> > drive assures it can do that, it needs capacitors - doesn't it?
> >
> > Am I overlooking something obvious?
> >
> > Rainer
> >
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > http://www.rsyslog.com
> >
>