[rsyslog] abort in 4.2.1

Rainer Gerhards rgerhards at hq.adiscon.com
Tue Aug 25 12:31:21 CEST 2009


On Mon, 2009-08-24 at 14:06 -0700, david at lang.hm wrote:
> > I'm testing to see if it has the problem I reported with 4.2.1 where it dies 
> > under load from malformed messages.
> 
> It finally died just like 4.2.1 did. It took a _lot_ longer (which may
> just be because the race window that triggers the crash is smaller; 5.x is
> _significantly_ more efficient than 4.x. Processing ~1800 messages/sec,
> writing them locally and relaying them to another machine, eats up <2% CPU
> according to top).
> 
> I restarted it in debug mode (this takes more cpu, almost 10% of a cpu)

The bad thing about debug mode is that it is not only slower, but it also
introduces some synchronization. So race bugs frequently disappear when
debug mode is turned on. Anyhow, sometimes they persist, and then the
debug log often provides good information (aka "definitely worth a
try" ;)).

I did some basic testing with the malformed message you provided in an
earlier message, but unfortunately I did not see anything that is not
clean. I still tend to assume that the malformedness of the message is
not a necessary condition for the segfault, but that remains to be seen.
No abort has happened (yet) in my lab.

If the issue is easier to reproduce in v4, I suggest you go back to
4.4.0 (the current v4-stable) and we try to nail it down there. It would
be good if we could find some predicate (like this or that traffic
pattern) that would enable me to reproduce the problem in the lab (in
case the debug log does not help).

Please let me know your thoughts.

Rainer



