[rsyslog] Development of failsafe disk based queue

Rainer Gerhards rgerhards at hq.adiscon.com
Wed Oct 1 15:11:09 CEST 2008


Sorry, I overlooked this mail in the big bunch of messages. That's good
reasoning.

To cover these scenarios, we need to sync every write to disk. This
also means that you cannot use any of the disk-assisted modes, because
in these modes we always try to keep things in memory in order to save
writes.

So while you have convinced me that things can go wrong, I'd still say
that it is very unusual (or at least very costly) to care for all these
things. But, of course, there are situations where it is needed. I'll
probably provide a facility to open files in "always sync" mode, but
that for sure will not be the default setting ;)
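
As a rough illustration of what an "always sync" mode could look like at
the OS level, here is a minimal sketch in Python. It assumes a POSIX
system where the O_SYNC open flag is available. The names
open_always_sync and append_record are hypothetical and are not part of
rsyslog.

```python
import os
import tempfile


def open_always_sync(path):
    """Open `path` for appending with O_SYNC (POSIX only).

    With O_SYNC, each write() is requested to return only after the
    data has been handed to stable storage, instead of just landing
    in the OS page cache.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | os.O_SYNC
    return os.open(path, flags, 0o600)


def append_record(fd, record):
    """Append one newline-terminated record; returns bytes written."""
    return os.write(fd, record + b"\n")


if __name__ == "__main__":
    # Hypothetical queue file in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        fd = open_always_sync(os.path.join(tmp, "queue.dat"))
        try:
            append_record(fd, b"test message")
        finally:
            os.close(fd)
```

Whether the data really survives a power loss still depends on the
hardware honoring the flush, for example on drive write caches.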

But even with the fast solid state disks (and similar methods) you
mention, I think there will be a severe impact on performance, because
everything now needs to go through two write calls (data+metadata) and
two read calls (again, data+metadata) to the OS where we currently
simply update an in-memory structure.
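
To make the extra I/O concrete, here is a small sketch of a queue in
which every operation goes through the disk. It is only an illustration
under assumptions: the class SyncedDiskQueue, the file names, and the
record layout are all hypothetical and do not reflect how rsyslog
stores its queues. The exact number of OS calls also differs from the
rough count above.

```python
import os
import tempfile


class SyncedDiskQueue:
    """Hypothetical FIFO that keeps data and state (metadata) on disk."""

    def __init__(self, directory):
        self.data_path = os.path.join(directory, "queue.data")
        self.state_path = os.path.join(directory, "queue.state")
        open(self.data_path, "ab").close()  # create data file if missing
        if not os.path.exists(self.state_path):
            self._write_state(0, 0)

    def _read_state(self):
        # Metadata read: "read_offset write_offset" as text.
        with open(self.state_path, "r") as f:
            read_off, write_off = f.read().split()
        return int(read_off), int(write_off)

    def _write_state(self, read_off, write_off):
        # Metadata write, forced to disk. Note: rewriting the file in
        # place is not crash-atomic; this only shows the extra I/O.
        with open(self.state_path, "w") as f:
            f.write(f"{read_off} {write_off}")
            f.flush()
            os.fsync(f.fileno())

    def enqueue(self, msg):
        read_off, write_off = self._read_state()
        record = len(msg).to_bytes(4, "big") + msg  # length-prefixed
        with open(self.data_path, "r+b") as f:
            f.seek(write_off)
            f.write(record)          # data write
            f.flush()
            os.fsync(f.fileno())     # forced to disk
        self._write_state(read_off, write_off + len(record))

    def dequeue(self):
        read_off, write_off = self._read_state()    # metadata read
        if read_off == write_off:
            return None                             # queue is empty
        with open(self.data_path, "rb") as f:       # data read
            f.seek(read_off)
            length = int.from_bytes(f.read(4), "big")
            msg = f.read(length)
        self._write_state(read_off + 4 + length, write_off)
        return msg


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        q = SyncedDiskQueue(tmp)
        q.enqueue(b"first")
        print(q.dequeue())
```

A purely in-memory queue would do the same work by appending to and
popping from a list, with no system calls at all.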

Just out of curiosity: do you expect the majority of your rollouts to
be using such methods?

Rainer

On Wed, 2008-10-01 at 05:35 -0700, david at lang.hm wrote:
> > ... And I have never heard of anybody doing serious datacenter work
> > without a proper UPS. Is this *really* an issue?
> 
> Yes.
> 
> UPSs fail.
> generators fail
> power cords come loose.
> power cords get unplugged by someone who thinks they are unplugging a 
> different system
> people bump power switches on power strips.
> power supplies are defective
> 
> I had one production outage where a visiting tech pulled a power cord from 
> an overhead plug and dropped it on the ground, where it happened to hit 
> the power switch on a power strip.
> 
> I've had high-end systems with redundant power supplies go down because of
> faulty hardware that decided to disable both power supplies at once (it
> turned out that there was a defect in the whole batch of servers, but it
> took IBM several weeks to figure out what was going on)
> 
> I've had UPS systems blow up (literally)
> 
> I've had a datacenter go down because it was running on generator
> power (due to other issues), and the refueling guy filled the tank
> incorrectly and got air bubbles into the fuel system. A few minutes
> later the 500 kW diesel generator couldn't maintain constant speed,
> and the safety triggers kicked in and disabled it.
> 
> It's amazing the things that happen in real life.




