[rsyslog] HUP restart or not - was: RE: rsyslog - what's next?
Rainer Gerhards
rgerhards at hq.adiscon.com
Tue Jul 14 10:37:48 CEST 2009
> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of david at lang.hm
> Sent: Monday, July 13, 2009 11:59 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslog - what's next?
>
> On Mon, 13 Jul 2009, RB wrote:
>
> > On Mon, Jul 13, 2009 at 14:38, <david at lang.hm> wrote:
> >> HUP does not interrupt services.
> >
> > I'm sorry; last October you noted that for certain configurations
> > SIGHUP will drop memory-queued messages. In addition to the other
> > notes in the thread (tearing down TCP connections, etc.), it sounds an
> > awful lot like a service interruption. Has that since changed for 3.x
> > or 4.2?
>
> yes, for 4.2 there is the option of 'HUPisRestart no', which makes HUP not
> do a complete teardown, halt, and restart, but instead just closes and
> re-opens files and connections so that no log messages are lost in a
> restart.
And that is the default setting!
>
> this matches the traditional use of HUP to get syslogd to release the
> files it's writing to so that they can be rotated away.
>
> it doesn't re-read the config file due to the fact that the rsyslog config
> file is so complex and can significantly alter the software by loading
> modules.
I am still tempted to remove the restart type of HUP. Actually, it is the
root cause for a lot of complexities (read performance bottlenecks) inside
the engine. And all this for something that is not strictly needed. However,
there are always many opponents when I intend to remove the traditional HUP
behavior.
For v5, I actually think of adding a "rsyslogd restart daemon" for those that
insist on traditional HUP semantics. That daemon would lie dormant in memory
until it receives HUP, in which case it does a shutdown and restart of the
real rsyslogd. With that, I could clean up a lot of complexity.
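A minimal sketch of such a restart daemon, written in Python for brevity and not taken from rsyslog. The class name `HupRestarter` and the command it supervises are hypothetical, and the sketch assumes the supervised command stays in the foreground. It blocks SIGHUP, sleeps in `sigwait`, and on each HUP shuts the child down and starts it again.

```python
import signal
import subprocess


class HupRestarter:
    """Keeps one child process running and restarts it on every SIGHUP."""

    def __init__(self, command):
        self.command = list(command)  # hypothetical: argv of the real daemon
        self.child = None
        self.restarts = 0

    def start_child(self):
        self.child = subprocess.Popen(self.command)

    def restart_child(self):
        # Ask the child to shut down (SIGTERM), wait for it, then start anew.
        self.child.terminate()
        self.child.wait()
        self.restarts += 1
        self.start_child()

    def run(self):
        # Block SIGHUP so it is queued instead of killing this process.
        # Note: the child may inherit the blocked mask; that is acceptable
        # in this sketch because the supervisor owns HUP handling.
        signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGHUP})
        self.start_child()
        while True:
            signal.sigwait({signal.SIGHUP})  # lie dormant until HUP arrives
            self.restart_child()
```

Calling `HupRestarter(argv).run()` with the argv of the real daemon would then give traditional HUP semantics from outside the daemon itself.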
One major issue is that under some circumstances I need to cancel output
threads when they do not finish their processing within the allotted
timeouts. Canceling threads is always a bit dangerous, but there currently is
no always-reliable way around this. To make it happen, a lot of
enable/disable cancel calls, including cancel cleanup handlers, are made
throughout the code. If we did not have restart-type HUPs, I could simply
cancel those threads and not care about resources not being cleaned up (the
process cleanup will take care of that). So I could also remove all the
enable/disable cancel logic and its helpers. Of course, this is not something
done as quickly as I write those lines and it must be made sure that any
inconsistency does not affect the rest of the shutdown. But without those
annoying restart-type HUPs, it is much simpler to do...
>
> >> the system calls to access it (and to first lookup if it should be
> >> accessed at all) take enough time to be significant at these speeds.
> >
> > May I gently prod you to substantiate this statement? I don't doubt
> > that it makes calls to external libraries, and that those calls are
> > likely more expensive than internal resolution, but "significant" is
> > not significant without numbers to back it up. Even a simple speed
> > comparison for a few million packets between 'no resolution' and 'in
> > /etc/hosts with a cold cache' would be useful.
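A rough sketch of the kind of comparison asked for here, under stated assumptions: the helper names `time_per_message` and `slowdown` are hypothetical, and the two handlers stand in for "no resolution" and "with resolution" processing paths. It measures average wall-clock time per message and reports the ratio; it is not a measurement of rsyslog.

```python
import time


def time_per_message(handle, messages):
    """Average wall-clock seconds spent in handle() per message."""
    start = time.perf_counter()
    for msg in messages:
        handle(msg)
    return (time.perf_counter() - start) / len(messages)


def slowdown(baseline, candidate, messages):
    """How many times slower candidate is than baseline on the same input."""
    return time_per_message(candidate, messages) / time_per_message(baseline, messages)


# Example use (assumption: each message is a sender IP address string):
#   import socket
#   ratio = slowdown(lambda ip: ip, socket.gethostbyaddr, ips)
```

For a cold-cache figure, each run would need a fresh process and a large enough message set to average out noise.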
>
> I would have to do a new set of tests to get you concrete numbers, but
> back in roughly the october timeframe last year I was doing extensive
> tests on this and with hosts files vs. no resolution I was seeing 4x or
> more differences (with a tiny, 5-line hosts file to parse)
>
> at the time I think I discussed it on a long performance thread on the
> website.
>
> at the time I did quite a number of strace dumps to show how much time was
> being eaten up in the system calls. RG has done a LOT of cleanup since
> then (to the point that the failsafe audit mode is in the ballpark of
> being as fast as the in-memory mode was back then)
>
> >> you want the ability to cache the name lookup no matter what method is
> >> used. only DNS provides a TTL, hosts files, NIS, LDAP, etc do not include
> >> a TTL.
> >
> > I have no problem expanding the scope of discussion to resolution
> > methods beyond DNS (which was the original premise), but if a name
> > service does not provide an explicit TTL, it must be assumed as an
> > implicit '0', which blind caching will break. That said, each of
> > those methods still provides some [internal] method of caching and
> > invalidation that, to be audit-grade correct, rsyslog will either have
> > to replicate or trust. I still can't see a problem with having
> > something to the effect of a "$BreakNSCache" configuration option, but
> > by default a syslog daemon should play as strictly by the rules as
> > possible.
>
> well, if you are really wanting audit-grade logging, will you use anything
> other than the IP address (or what is already in the message)? any lookup
> that you do is a potential for corruption.
>
> I'm fine with the default being to do a lookup for each item. I just want
> a way to avoid it, and if I can do that with a cache instead of having to
> disable DNS, I will do so.
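A minimal sketch of the cache being discussed, not rsyslog code. The class `LookupCache`, its `resolve` callback, and the `default_ttl` parameter are all hypothetical names. The callback returns a name and a TTL, or `None` for the TTL when the name service provides none. Setting `default_ttl` to 0 gives the strict behaviour of looking up every item; a positive value caches sources that carry no TTL.

```python
import time


class LookupCache:
    """Caches name lookups, with a fallback TTL for sources that have none."""

    def __init__(self, resolve, default_ttl=0.0, clock=time.monotonic):
        self.resolve = resolve          # ip -> (name, ttl_seconds or None)
        self.default_ttl = default_ttl  # used when the source gives no TTL
        self.clock = clock
        self.entries = {}               # ip -> (name, expiry time)

    def lookup(self, ip):
        now = self.clock()
        hit = self.entries.get(ip)
        if hit is not None and hit[1] > now:
            return hit[0]               # still fresh: no call to resolve()
        name, ttl = self.resolve(ip)
        if ttl is None:
            ttl = self.default_ttl
        if ttl > 0:
            self.entries[ip] = (name, now + ttl)
        return name
```

With the default of 0 nothing is ever stored, so every call goes to the resolver; the caching only takes effect when the administrator opts in.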
>
> it's very common for servers to need to disable DNS lookups for their
> logs. do a quick search for apache dns performance and you will find that
> it's a very common problem people have there if they enable lookups on the
> sender's IP address.
>
> > Then again, I'm RB, not RG. ;)
>
> more people speaking up is good. I have strong opinions, and real problems
> to address, but RG needs to hear from people other than me.
Definitely! I often simply do not realize where a need is, and I also often
make mistakes or have wrong perceptions. If nobody objects, I am lost and so
is the project ;)
Rainer
>
> David Lang
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com