[rsyslog] Rsyslog 4.4.2: server out-of-memory with gnutls
Rainer Gerhards
rgerhards at hq.adiscon.com
Tue Nov 10 09:44:25 CET 2009
And finally a reply to "the meat" ;)
> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com
> [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Mr. Demeanour
> Sent: Monday, November 09, 2009 5:25 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] Rsyslog 4.4.2: server out-of-memory with gnutls
>
> Rainer Gerhards wrote:
> > Jack,
> >
> > Quick update: I was able to run a (relatively quick) test on a
> > configuration that hopefully later becomes part of the testbench (I
> > am working towards that goal and thought doing a manual
> check at that
> > stage would not hurt). It is not based your config, and it is
> > relatively simple and straightforward. Still, it uses TLS, and uses
> > it in anon mode like you did. I used 4.4.2 on the server. I
> processed
> > around 500,000 message (not kept track of the actual number). I did
> > three such runs, all runing under the excellent valgrind[1] memory
> > debugger.
> >
> > For none of the runs, valgrind reported any memory leaks. While this
> > may not be an ultimate indication, valgrind is *very* effective in
> > finding leaks and based on what you wrote I would have expected a
> > small chunk of memory to be lost per message.
>
> OK (thanks).
>
> I have formed the impression that the problems are occurring
> after some
> period of running time; free memory decreases quite slowly, until some
> unknown event causes a rather rapid degeneration. That is: I suspect
> that any leak is not likely to be observable on a per-message basis,
> until this unknown event has occured.
This would explain what we currently see. It seems to be TLS-related, as you
said. So it is unlikely to be a message at all, but rather a TLS event that
causes the mem leak. With that said, I should probbably see if a connection
abort can trigger one...
> > So the bottom line is that I currently cannot reproduce the
> bug. This
> > may change when I finally import your config. However, it would be
> > useful for me if you could run valgrind on rsyslogd in your
> > environment and let me know if valgrind reports any memory leaks.
> > Doing so considerably slows down rsyslogd, but given your load, I'd
> > expect that it would be acceptable.
> >
> > To run under valgrind is relatively simply. Valgrind is available as
> > a package on almost all distros. All you need to do is run valgrind
> > and specify your usual rsyslogd command line as the parameter. It is
> > recommended to do this in the foreground (see rsyslog
> troubleshooting
> > doc).
> >
> > So, for example, if you start rsyslogd usually by
> >
> > /sbin/rsyslogd -c4 -...
>
> So "valgrind /usr/sbin/rsyslogd -c4" results in rsyslogd backgrounding
> promptly, at which point valgrind prints its report (which shows no
> leaks - unsurprisingly).
>
> "valgrind /usr/sbin/rsyslogd -c4 -n" results in a hang.
> CTRL-C fails to
> kill the foreground task.
That's intended behavior, ctrl-c is only enabled in debug mode (I took this
over from sysklogd and never question it".
> "kill -9 <pid>" kills the task, but no
> valgrind report is produced.
You must not use -9, which is untrappable. Just send the default SIGTERM:
$ kill <pid> # no signal specified at all!
>The same command without valgrind also
> results in a hung foreground task. If run under valgrind,
> memcheck-x86-li goes to 99% CPU on CTRL-C.
>
> I currently suspect problems with mySQL as the origin of
> these problems.
> I was this morning getting messages of the form
> "Lost connection to MySQL server at 'reading authorization packet'".
> I was also observing MySQL aborted clients and connects. I have
> increased the MySQL connect timeout, and can no longer reproduce these
> reports. For now, I assume that problem is fixed, but I can't
> yet say if
> the rsyslog hangs have stopped.
>
> I wonder if what was happening was that MySQL was "going away" in some
> sense, and that rsyslog was not reconnecting to it successfully, *and*
> not retrying?
Hard to guess...
>
> I noticed that although the ActionQueue for the mysql output module is
> not disk-assisted, the debug log records:
> action 4 queue: save on shutdown 1, max disk space allowed 0
The debug output just spits out the values of the variables. Saveonshutdown
is true by default, but if there is no disk queue, that won't help anything
at all ;) In short: that's OK.
> So I've set $ActionQueueSaveOnShutdown off. However this
> hasn't changed
> the hang behaviour with -n.
>
> Since I can't get rsyslog to run in the foreground under
> valgrind, I am
> now running daemonized without valgrind (but with encryption); perhaps
> these changes have fixed the problem. I should know by late
> this evening
> - when the problem is observed, the server never lives for more than a
> few hours.
I hope this -and the other- mail will help straighten out the issue. Any logs
you can send to my private mail address as you have already done ;)
Rainer
> --
> Jack.
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com
>
More information about the rsyslog
mailing list