From theinric at redhat.com Mon Jan 5 15:52:38 2009
From: theinric at redhat.com (Tomas Heinrich)
Date: Mon, 05 Jan 2009 15:52:38 +0100
Subject: [rsyslog] suggested tweak to rsyslog
In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F916@grfint2.intern.adiscon.com>
References: <1229626907.12594.19.camel@localhost.localdomain><1229627751.12594.23.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F916@grfint2.intern.adiscon.com>
Message-ID: <49621EB6.9010504@redhat.com>

On 12/19/2008 12:57 PM, Rainer Gerhards wrote:
> David,
>
> one thing I can do rather quickly. Maybe it's good enough. I've done a
> tester, which lacks proper configuration, but I would appreciate
> feedback on it:
>
> http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=a185665be4cf699752589d81ef6e396dd61f68b6
>
> Details in git commit comment.
>
> Rainer

Hi,

I think there's a small bug in the new code:

- snprintf((char*)szRepMsg, sizeof(szRepMsg), "message repeated %d times: [%.800]",
+ snprintf((char*)szRepMsg, sizeof(szRepMsg), "message repeated %d times: [%.800s]",

Tomas

From theinric at redhat.com Tue Jan 6 18:02:37 2009
From: theinric at redhat.com (Tomas Heinrich)
Date: Tue, 06 Jan 2009 18:02:37 +0100
Subject: [rsyslog] redundant message in log files
Message-ID: <49638EAD.5080104@redhat.com>

Hi,

we've received a bug report [1] regarding a message that started to
appear in the log files. The bug first appeared in version 3.21.5.
This patch [2] should fix it.

Tomas

[1] https://bugzilla.redhat.com/show_bug.cgi?id=478612
[2] http://pastebin.ca/1301001

From mikel at irontec.com Sun Jan 11 21:41:11 2009
From: mikel at irontec.com (Mikel Jimenez Fernandez)
Date: Sun, 11 Jan 2009 21:41:11 +0100
Subject: [rsyslog] [Fwd: Re: milliseconds timestamp]
Message-ID: <496A5967.1050805@irontec.com>

Dear Andre and Rainer

Any progress in this?

Thanks

From rgerhards at hq.adiscon.com Mon Jan 12 08:56:10 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Mon, 12 Jan 2009 08:56:10 +0100
Subject: [rsyslog] [Fwd: Re: milliseconds timestamp]
In-Reply-To: <496A5967.1050805@irontec.com>
References: <496A5967.1050805@irontec.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F964@grfint2.intern.adiscon.com>

Hi,

please quote what exactly you are looking for, I am no longer able to trace the question back to an issue ;)

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Mikel Jimenez Fernandez
> Sent: Sunday, January 11, 2009 9:41 PM
> To: rsyslog-users
> Subject: [rsyslog] [Fwd: Re: milliseconds timestamp]
>
> Dear Andre and Rainer
>
> Any progress in this?
>
> Thanks

From rgerhards at hq.adiscon.com Mon Jan 12 11:11:10 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Mon, 12 Jan 2009 11:11:10 +0100
Subject: [rsyslog] suggested tweak to rsyslog
In-Reply-To: <49621EB6.9010504@redhat.com>
References: <1229626907.12594.19.camel@localhost.localdomain><1229627751.12594.23.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F916@grfint2.intern.adiscon.com> <49621EB6.9010504@redhat.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F96B@grfint2.intern.adiscon.com>

Hi Tomas,

thanks for the patch, looks like I have forgotten a commit ;)

David and others: do you find this functionality useful? If I do not receive any further comments, I'll conclude it is not and will not further work on it.

Thanks all,
Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Tomas Heinrich
> Sent: Monday, January 05, 2009 3:53 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] suggested tweak to rsyslog
>
> On 12/19/2008 12:57 PM, Rainer Gerhards wrote:
> > David,
> >
> > one thing I can do rather quickly. Maybe it's good enough.
> > I've done a tester, which lacks proper configuration, but I would
> > appreciate feedback on it:
> >
> > http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=a185665be4cf699752589d81ef6e396dd61f68b6
> >
> > Details in git commit comment.
> >
> > Rainer
>
> Hi,
>
> I think there's a small bug in the new code:
>
> - snprintf((char*)szRepMsg, sizeof(szRepMsg), "message repeated %d times: [%.800]",
> + snprintf((char*)szRepMsg, sizeof(szRepMsg), "message repeated %d times: [%.800s]",
>
> Tomas
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com

From pieter.thysebaert at intec.ugent.be Wed Jan 14 13:37:31 2009
From: pieter.thysebaert at intec.ugent.be (pieter.thysebaert at intec.ugent.be)
Date: Wed, 14 Jan 2009 13:37:31 +0100 (CET)
Subject: [rsyslog] Property filter - output formatting
Message-ID: <29691.212.190.198.36.1231936651.squirrel@webserver6.intec.ugent.be>

Hello,

I've started exploring rsyslog 3.20.2

As I have been toying around and looking at the example configurations, I
have not been able to solve the following problem:

how can I use a property filter to select an output file AND format the
output using a defined template

For instance:

$template testtemplate,"%msg%"

:syslogtag, contains, "test" /tmp/test.log;testtemplate

Doesn't seem to be a supported syntax (it works when I leave off the
;testtemplate).

I'm sorry if this is obvious, but how can I filter based on properties AND
specify output formatting at the same time?

Thanks,
Pieter

From rgerhards at hq.adiscon.com Wed Jan 14 00:08:27 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Wed, 14 Jan 2009 00:08:27 +0100
Subject: [rsyslog] Property filter - output formatting
In-Reply-To: <29691.212.190.198.36.1231936651.squirrel@webserver6.intec.ugent.be>
References: <29691.212.190.198.36.1231936651.squirrel@webserver6.intec.ugent.be>
Message-ID: <1231888107.22744.19.camel@localhost.localdomain>

Hi Pieter,

I just tried this out in the lab. For me, it works. If I generate a message with

logger -t test my message

the message is properly dispatched. I guess that the problem actually is the tag, which I guess does not contain what you think it does (a frequent problem with many senders). Try this template

$template testtemplate,"tag: '%syslogtag%', rawmsg: '%rawmsg%'\n"
*.* /some/file;testtemplate

and let us know the result.

HTH
Rainer

On Wed, 2009-01-14 at 13:37 +0100, pieter.thysebaert at intec.ugent.be wrote:
> Hello,
>
> I've started exploring rsyslog 3.20.2
>
> As I have been toying around and looking at the example configurations, I
> have not been able to solve the following problem:
>
> how can I use a property filter to select an output file AND format the
> output using a defined template
>
> For instance:
>
> $template testtemplate,"%msg%"
>
> :syslogtag, contains, "test" /tmp/test.log;testtemplate
>
> Doesn't seem to be a supported syntax (it works when I leave off the
> ;testtemplate).
>
> I'm sorry if this is obvious, but how can I filter based on properties AND
> specify output formatting at the same time?
>
> Thanks,
> Pieter
>
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com

From rgerhards at hq.adiscon.com Wed Jan 14 17:14:45 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Wed, 14 Jan 2009 17:14:45 +0100
Subject: [rsyslog] rsyslog on LinkedIn
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F996@grfint2.intern.adiscon.com>

Hi all,

please pardon the shameless self-promotion. I have just created a rsyslog group on LinkedIn:

http://www.linkedin.com/e/gis/1761607

It is an experiment. I've seen so many projects creating groups on that platform that I wonder if it would make sense to create one for rsyslog. My intent is not to replace any of our technical and discussion forums, but to open a new networking opportunity for those that are interested. I do not yet know if that's a good idea or not, but why not give it a try? ;)

Back to our regular programming...

Rainer

From ray at jhax.net Wed Jan 14 12:50:48 2009
From: ray at jhax.net (Ray Whitmer)
Date: Wed, 14 Jan 2009 04:50:48 -0700
Subject: [rsyslog] Use of application-level acks in RELP.
Message-ID: <20090114045048.o2wpiannk4okcgw4@webmail.xmission.com>

In my research of rsyslog to determine its suitability for a
particular situation I have some questions left unanswered. I need
relatively-guaranteed delivery. I will continue to review the
available info including source code to see if I can answer the
questions, but I hope it may be productive to ask questions here.

In the documentation, you describe the situation where syslog silently
loses tcp messages, not because the tcp protocol permits it but
because the send function returns after delivering the message to a
local buffer before it is actually delivered.

But there is a more-fundamental reason an application-level ack is
required.
An application can fail (someone trips over the power cord)
between when the application receives the data and when it records it.

1. Does rsyslog send the ack in the RELP protocol after the
message has been safely recorded in whatever queue has been configured
or forwarded on so its delivery status is as safe as it will get (of
course how safe depends upon options chosen), or was it only intended
to solve the case of TCP buffering-based unreliability?

2. Presumably there is a client API that speaks RELP. Can it be
configured to return an error to the client if there is no ACK (i.e.
if the log it sent did not make it into the configured safe location
which could be on a disk-based queue), or does it only retry? Where is
this API?

Certainly the TCP caching case you mention in your pages is one a user
is more likely to be able to reproduce, but that is all the more
reason for me to be concerned that the less-reproducible situations
that could cause a message to occasionally become lost are handled
correctly.

From rgerhards at hq.adiscon.com Thu Jan 15 09:16:36 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Thu, 15 Jan 2009 09:16:36 +0100
Subject: [rsyslog] Use of application-level acks in RELP.
In-Reply-To: <20090114045048.o2wpiannk4okcgw4@webmail.xmission.com>
References: <20090114045048.o2wpiannk4okcgw4@webmail.xmission.com>
Message-ID: <1232007397.22744.27.camel@localhost.localdomain>

Hi Ray,

thanks for your excellent questions. I've also made a blog post out of them, as I think this needs some better visibility (and can be used for future reference). Just if you are curious:

http://blog.gerhards.net/2009/01/use-of-application-level-acks-in-relp.html

(no need to read, all answers are inline below)

On Wed, 2009-01-14 at 04:50 -0700, Ray Whitmer wrote:
> In my research of rsyslog to determine its suitability for a
> particular situation I have some questions left unanswered. I need
> relatively-guaranteed delivery.
> I will continue to review the
> available info including source code to see if I can answer the
> questions, but I hope it may be productive to ask questions here.
>
> In the documentation, you describe the situation where syslog silently
> loses tcp messages, not because the tcp protocol permits it but
> because the send function returns after delivering the message to a
> local buffer before it is actually delivered.
>
> But there is a more-fundamental reason an application-level ack is
> required. An application can fail (someone trips over the power cord)
> between when the application receives the data and when it records it.
>
> 1. Does rsyslog send the ack in the RELP protocol after the
> message has been safely recorded in whatever queue has been configured
> or forwarded on so its delivery status is as safe as it will get (of
> course how safe depends upon options chosen), or was it only intended
> to solve the case of TCP buffering-based unreliability?

RELP is designed to provide end-to-end reliability. The TCP buffering issue is just highlighted because it is so subtle that most people tend to overlook it. An application abort seems to be more obvious and RELP handles that.

HOWEVER, that does not mean messages are necessarily recorded when the ACK is sent. It depends on the configuration. In RELP, the acknowledgment is sent after the reception callback has been called. This can be seen in the relevant RELP module. For rsyslog's imrelp, this means the callback returns after the message has been enqueued in the main message queue.

It now depends on how that queue is configured. By default, messages are buffered in main memory. So when rsyslog aborts for some reason (or is terminated by user request) before this message has been processed, it is lost - while the sender still got a positive ACK. This is how things are done by default, and it is useful for many scenarios. Of course, it does not provide the audit-grade reliability that RELP aims for.
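
The point above, that a sender can hold a positive ACK for a message that is later lost, can be sketched with a toy model. This is only an illustration under stated assumptions, not rsyslog or librelp code: the `Receiver` class, the `ack_after_persist` flag and the `acked_but_lost` helper are hypothetical names, the "disk" is just a list that survives the simulated crash, and real behaviour depends on how the queue is configured.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Toy receiver (hypothetical; not rsyslog's or librelp's real API)."""

    ack_after_persist: bool
    memory_queue: list = field(default_factory=list)  # lost on a crash
    disk: list = field(default_factory=list)          # survives a crash

    def receive(self, msg: str) -> bool:
        """Accept one message; the True return value stands in for the ACK."""
        if self.ack_after_persist:
            self.disk.append(msg)          # persist first, then ACK
        else:
            self.memory_queue.append(msg)  # ACK as soon as it is enqueued
        return True

    def crash(self) -> None:
        """Simulate an abort: everything held only in memory is gone."""
        self.memory_queue.clear()


def acked_but_lost(ack_after_persist: bool, messages: list) -> list:
    """Return the messages that were ACKed but did not survive a crash."""
    receiver = Receiver(ack_after_persist)
    acked = [m for m in messages if receiver.receive(m)]
    receiver.crash()
    return [m for m in acked if m not in receiver.disk]


if __name__ == "__main__":
    msgs = ["msg1", "msg2", "msg3"]
    print("ACK after enqueue, lost:", acked_but_lost(False, msgs))
    print("ACK after persist, lost:", acked_but_lost(True, msgs))
```

In this model every acknowledged message is lost when the ACK is sent right after the in-memory enqueue, and none is lost when the message is persisted before the ACK. The second mode only stands in for a queue configured to persist messages; it says nothing about how rsyslog implements that.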

But the default config needs to take care of the usual use case and this is not audit-grade reliability (just think of the numerous home systems that run rsyslog and should do so in the least intrusive way). If you are serious about your logs, you need to configure the engine to be fully reliable. The most important thing is a good understanding of the queue engine. You need to read and understand the rsyslog queue ( http://www.rsyslog.com/doc-queues.html ) docs, as they form the basis on which reliability can be built.

The other thing you need to know is your exact requirements. Asking for reliability is easy, implementing it is not. The more you near 100% reliability (which you will never reach for one reason or the other) the more complex scenarios get. I am sure the original poster knows quite well what he wants, but I am often approached by people who just want to have it "totally reliable" ... but don't want to spend the fortune it requires (really - ever thought about the redundant data centers, power plants, satellite and sea links et al. you need for that?). So it is absolutely vital to have good requirements, which also include when loss is acceptable, and at what cost this comes.

Once you have these requirements, a rsyslog configuration that matches them can be designed.

At this point, I'd like to note that it may also be useful to consider rsyslog professional services ( http://www.rsyslog.com/doc-professional_support.html ) as it provides valuable aid during design and probably deployment of a solution (I can't go into the full depth of enterprise requirements here).

To go back to the original question: RELP has almost everything that is needed, but configuring the whole system in an audit-grade way requires (ample) work.

> 2. Presumably there is a client API that speaks RELP. Can it be
> configured to return an error to the client if there is no ACK (i.e.
> if the log it sent did not make it into the configured safe location
> which could be on a disk-based queue), or does it only retry? Where is
> this API?

The API is in librelp ( http://www.librelp.com/ ). But actually this is not what you are looking for. In rsyslog, an output module (here: omrelp) provides the status back to the caller. Then, configuration decides what happens. Messages may be discarded, sent to a different destination or retried.

With omrelp, I think we have some hardcoded ways to preserve the message, but I have had no time yet to look this up in detail. In any case, RELP will not lose messages but may duplicate a few of them (within the current unacked window) if the remote peer simply dies. Again, this requires proper configuration of the rsyslog components.

Even with that, you may lose messages if the local rsyslogd dies (not terminates, but dies for some unexpected reason, e.g. a segfault, kill -9 or whatever) but still has messages in a not persisted queue. Again, this can be mitigated by proper configuration, but that must be designed. Also, it is very costly in terms of performance. A good reading on the subtleties can be found in the rsyslog mailing list archive ( http://lists.adiscon.net/pipermail/rsyslog/2008-October/001224.html ). I suggest having a look at it.

>
> Certainly the TCP caching case you mention in your pages is one a user
> is more likely to be able to reproduce, but that is all the more
> reason for me to be concerned that the less-reproducible situations
> that could cause a message to occasionally become lost are handled
> correctly.

I don't think app-abort is less reproducible:

kill -9 `cat /var/run/rsyslog.pid`

will do nicely. Actually, from feedback I received, many users seem to understand the implications of a program/system abort. But far fewer understand the issues inherent in TCP. Thus I am focusing so much on the latter. But of course, everything needs to be considered.

Read the thread about the reliable queue (really!). It goes to great lengths, but still does not offer a full solution. Getting things reliable (or secure) is very, very challenging and requires in-depth knowledge.

So I am glad you asked and provided an opportunity for this to be written :)

Rainer

From rgerhards at hq.adiscon.com Thu Jan 15 13:00:37 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Thu, 15 Jan 2009 13:00:37 +0100
Subject: [rsyslog] redundant message in log files
In-Reply-To: <49638EAD.5080104@redhat.com>
References: <49638EAD.5080104@redhat.com>
Message-ID: <1232020837.22744.28.camel@localhost.localdomain>

Thanks, this one now finally is corrected, too (still catching up with vacation mail ;)). Will release it as part of 3.21.10.

Rainer

On Tue, 2009-01-06 at 18:02 +0100, Tomas Heinrich wrote:
> Hi,
>
> we've received a bug report [1] regarding a message that started to
> appear in the log files. The bug first appeared in version 3.21.5.
> This patch [2] should fix it.
>
> Tomas
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=478612
> [2] http://pastebin.ca/1301001
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com

From fjianella at gmail.com Thu Jan 15 15:45:53 2009
From: fjianella at gmail.com (Frank Ianella)
Date: Thu, 15 Jan 2009 09:45:53 -0500
Subject: [rsyslog] uclibc compile failure
Message-ID: <9f1ad2df0901150645u5cd90986k6b92a473beb73257@mail.gmail.com>

hello all

compiling stable and dev versions of rsyslog against uclibc-0.9.30 results in the following error:

/home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:2995: undefined reference to `rpl_malloc'
rsyslogd-syslogd.o: In function `legacyOptsEnq':
/home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:1742: undefined reference to `rpl_malloc'
rsyslogd-syslogd.o: In function `crunch_list':
/home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:490: undefined reference to `rpl_malloc'
/home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:502: undefined reference to `rpl_malloc'
/home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:512: undefined reference to `rpl_malloc'
rsyslogd-syslogd.o:/home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:1319: more undefined references to `rpl_malloc' follow
../runtime/.libs/librsyslog.a(librsyslog_la-wtp.o): In function `wtpStartWrkr':
/home/build/project/sources/rsyslog-3.21.9/runtime/wtp.c:487: undefined reference to `pthread_yield'
../runtime/.libs/librsyslog.a(librsyslog_la-wtp.o): In function `wtpConstructFinalize':
/home/build/project/sources/rsyslog-3.21.9/runtime/wtp.c:109: undefined reference to `rpl_malloc'
../runtime/.libs/librsyslog.a(librsyslog_la-wti.o): In function `wtiSetDbgHdr':
/home/build/project/sources/rsyslog-3.21.9/runtime/wti.c:456: undefined reference to `rpl_malloc'
../runtime/.libs/librsyslog.a(librsyslog_la-wti.o): In function `wtiWorker':
/home/build/project/sources/rsyslog-3.21.9/runtime/wti.c:370: undefined reference to `pthread_yield'
../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `queueAddLinkedList':
/home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:528: undefined reference to `rpl_malloc'
/home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:528: undefined reference to `rpl_malloc'
../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `qConstructFixedArray':
/home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:459: undefined reference to `rpl_malloc'
../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `queueSetFilePrefix':
/home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:2081: undefined reference to `rpl_malloc'
../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `queueStart':
/home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:1794: undefined reference to `rpl_malloc'
../runtime/.libs/librsyslog.a(librsyslog_la-threads.o):/home/build/project/sources/rsyslog-3.21.9/runtime/../threads.c:60: more undefined references to `rpl_malloc' follow

I recompiled uclibc with MALLOC_GLIBC_COMPAT=y but the result was the same. The only reference to this that I can find is in the rsyslog bug tracker but the patch listed there does not allow it to compile. Just wondering if anybody has a working patch or suggestion.

TIA
-Frank

From danson at rackspace.com Thu Jan 15 18:28:57 2009
From: danson at rackspace.com (Daniel Anson)
Date: Thu, 15 Jan 2009 11:28:57 -0600
Subject: [rsyslog] Baclogged files to disk are pretty slow
Message-ID: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP>

I have been dealing with this problem for a few days now and perhaps I
will be able to solicit some advice or help. Here is the issue. I have
an rsyslog relay writing to a remote database server and caching to
disk.
The write to the database uses a MySQL stored procedure that can
write about 4000 records per second. The rsyslog.conf parts are set up
like so:

$ModLoad immark
$ModLoadd imudp
$UDPServerAddress 172.16.12.138
$UDPServerRun 514
$ModLoad imtcp
$ModLoad imuxsock
$ModLoad imklog
$ModLoad ommysql.so

$template template1,"CALL SAT2_RSYSLOG_EVENT_INSERT('%timestamp:::date-mysql%', '%timegenerated:::date-mysql%', '%syslogfacility%', '%syslogpriority%', '%hostname%', '%syslogtag%', '%msg%')", sql

$WorkDirectory /rsyslog/work
$ActionQueueType LinkedList # use asynchronous processing
$ActionQueueFileName dbq # set file name, also enables disk mode
$ActionResumeRetryCount -1 # infinite retries on insert failure

*.* >172.16.2.238,rsyslog,syslogwriter,topsecret;template1

If I turn off the database, in this case I turned it off for almost a
day, it backlogs nearly 1 GB worth of information. The problem is
that it takes nearly 6 hours to catch back up from this. While catching
up, it only uses about 1% of the proc. Bandwidth is not an issue as the
fibre link is only about 50% saturated. Is there a way to force
rsyslogd to consume more of the proc and move faster? I have placed a
-20 nice value on the process in hopes that would help but it really has
not. Is there a way to force rsyslogd to use a pool of MySQL
connections or initiate a new connection each time a record is written?


Daniel M. Anson
Linux Systems Engineer
Rackspace Managed Hosting
danson at rackspace.com

Confidentiality Notice: This e-mail message (including any attached or
embedded documents) is intended for the exclusive and confidential use of the
individual or entity to which this message is addressed, and unless otherwise
expressly indicated, is confidential and privileged information of Rackspace.
Any dissemination, distribution or copying of the enclosed material is prohibited.
If you receive this transmission in error, please notify us immediately by e-mail
at abuse at rackspace.com, and delete the original message.
Your cooperation is appreciated.

From rgerhards at hq.adiscon.com Thu Jan 15 18:45:09 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Thu, 15 Jan 2009 18:45:09 +0100
Subject: [rsyslog] Baclogged files to disk are pretty slow
In-Reply-To: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP>
References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9A8@grfint2.intern.adiscon.com>

Mhhh... with the current design, it submits messages individually to the database. I think what you experience is simply the turn-around from the database call (no other idea what it could be). It doesn't use more CPU because the database layer seems not to return any faster.

There has been some discussion on batching multiple statements together, but this is non-trivial. I lost funding and things like this need a corporate sponsor now (they are not of importance for the non-commercial user field...).

You could try to run the action on its own queue and with multiple workers. That could (could!) improve performance. But it is just a guess. Do you have any chance to see how long the query takes inside the SQL engine?

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Daniel Anson
> Sent: Thursday, January 15, 2009 6:29 PM
> To: rsyslog at lists.adiscon.com
> Subject: [rsyslog] Baclogged files to disk are pretty slow
>
> I have been dealing with this problem for a few days now and perhaps I
> will be able to solicit some advice or help. Here is the issue. I have
> an rsyslog relay writing to a remote database server and caching to
> disk.
> The write to the database uses a MySQL stored procedure that can
> write about 4000 records per second. The rsyslog.conf parts are set up
> like so:
>
> $ModLoad immark
> $ModLoadd imudp
> $UDPServerAddress 172.16.12.138
> $UDPServerRun 514
> $ModLoad imtcp
> $ModLoad imuxsock
> $ModLoad imklog
> $ModLoad ommysql.so
>
> $template template1,"CALL
> SAT2_RSYSLOG_EVENT_INSERT('%timestamp:::date-mysql%',
> '%timegenerated:::date-mysql%', '%syslogfacility%', '%syslogpriority%',
> '%hostname%', '%syslogtag%', '%msg%')", sql
>
> $WorkDirectory /rsyslog/work
> $ActionQueueType LinkedList # use asynchronous processing
> $ActionQueueFileName dbq # set file name, also enables disk mode
> $ActionResumeRetryCount -1 # infinite retries on insert failure
>
> *.* >172.16.2.238,rsyslog,syslogwriter,topsecret;template1
>
> If I turn off the database, in this case I turned it off for almost a
> day, it backlogs nearly 1 GB worth of information. The problem is
> that it takes nearly 6 hours to catch back up from this. While
> catching up, it only uses about 1% of the proc. Bandwidth is not an
> issue as the fibre link is only about 50% saturated. Is there a way to
> force rsyslogd to consume more of the proc and move faster? I have
> placed a -20 nice value on the process in hopes that would help but it
> really has not. Is there a way to force rsyslogd to use a pool of MySQL
> connections or initiate a new connection each time a record is written?
>
>
> Daniel M. Anson
> Linux Systems Engineer
> Rackspace Managed Hosting
> danson at rackspace.com
>
> Confidentiality Notice: This e-mail message (including any attached or
> embedded documents) is intended for the exclusive and confidential use
> of the individual or entity to which this message is addressed, and
> unless otherwise expressly indicated, is confidential and privileged
> information of Rackspace.
> Any dissemination, distribution or copying of the enclosed material is
> prohibited.
> If you receive this transmission in error, please notify us immediately
> by e-mail at abuse at rackspace.com, and delete the original message.
> Your cooperation is appreciated.
>
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com

From lorenzo at sancho.ccd.uniroma2.it Thu Jan 15 18:58:37 2009
From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci)
Date: Thu, 15 Jan 2009 18:58:37 +0100 (CET)
Subject: [rsyslog] rsyslog still crashes
Message-ID:

I've just tried rsyslog again on my 8-core mail server, and got the very same crash as in September/October. I've restarted the server under valgrind control, and all seems to be running well...

A good 2009 to all!

Yours,

lorenzo

+-------------------------+----------------------------------------------+
| Lorenzo M. Catucci      | Centro di Calcolo e Documentazione           |
| catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
|                         | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY  |
| Tel. +39 06 7259 2255   | Fax. +39 06 7259 2125                        |
+-------------------------+----------------------------------------------+
-------------- next part --------------
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done.
done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done.
done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done.
done.
Loaded symbols for /lib/librt.so.1
Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done.
done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /usr/lib/rsyslog/lmnet.so...done.
Loaded symbols for /usr/lib/rsyslog/lmnet.so
Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done.
done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /usr/lib/rsyslog/imuxsock.so...done.
Loaded symbols for /usr/lib/rsyslog/imuxsock.so
Reading symbols from /usr/lib/rsyslog/imklog.so...done.
Loaded symbols for /usr/lib/rsyslog/imklog.so
Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done.
done.
Loaded symbols for /lib/libnss_compat.so.2
Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done.
done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done.
done.
Loaded symbols for /lib/libnss_nis.so.2
Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done.
Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so
Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done.
Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so
Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done.
Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so
Core was generated by `rsyslogd -c4'.
Program terminated with signal 6, Aborted.
[New process 22774]
[New process 22776]
[New process 22775]
[New process 22773]
[New process 22772]
#0 0x00002b6037978ed5 in raise () from /lib/libc.so.6
(gdb)
Thread 5 (process 22772):
#0 0x00002b6037a0fce2 in select () from /lib/libc.so.6
#1 0x000000000040db53 in mainThread () at syslogd.c:2704
#2 0x000000000040ee56 in realMain (argc=, argv=) at syslogd.c:3631
#3 0x00002b60379651a6 in __libc_start_main () from /lib/libc.so.6
#4 0x000000000040a219 in _start ()

Thread 4 (process 22773):
#0 0x00002b6037327fad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1 0x0000000000432f5f in wtiWorker (pThis=0x685140) at wti.c:406
#2 0x000000000043172a in wtpWorker (arg=0x685140) at wtp.c:425
#3 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0
#4 0x00002b6037a165ad in clone () from /lib/libc.so.6
#5 0x0000000000000000 in ?? ()

Thread 3 (process 22775):
#0 0x00002b6037a0fce2 in select () from /lib/libc.so.6
#1 0x00002b60380b59fd in runInput (pThrd=) at imuxsock.c:280
#2 0x00000000004436ff in thrdStarter (arg=0x6a5c80) at ../threads.c:139
#3 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0
#4 0x00002b6037a165ad in clone () from /lib/libc.so.6
#5 0x0000000000000000 in ?? ()

Thread 2 (process 22776):
#0 0x00002b603732a7db in read () from /lib/libpthread.so.0
#1 0x00002b60382ba1ef in klogLogKMsg () at linux.c:449
#2 0x00002b60382b9594 in runInput (pThrd=0x6aafc0) at imklog.c:224
#3 0x00000000004436ff in thrdStarter (arg=0x6aafc0) at ../threads.c:139
#4 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0
#5 0x00002b6037a165ad in clone () from /lib/libc.so.6
#6 0x0000000000000000 in ?? ()

Thread 1 (process 22774):
#0 0x00002b6037978ed5 in raise () from /lib/libc.so.6
#1 0x00002b603797a3f3 in abort () from /lib/libc.so.6
#2 0x0000000000423657 in sigsegvHdlr (signum=6) at debug.c:759
#3
#4 0x00002b6037978ed5 in raise () from /lib/libc.so.6
#5 0x00002b603797a3f3 in abort () from /lib/libc.so.6
#6 0x00002b6037971dc9 in __assert_fail () from /lib/libc.so.6
#7 0x000000000041ce78 in msgDestruct (ppThis=0x68ace8) at msg.c:330
#8 0x0000000000443036 in actionCallAction (pAction=0x68ac70, pMsg=0x6b2010) at ../action.c:774
#9 0x000000000040b2c7 in processMsgDoActions (pData=0x68ac70, pParam=0x41000e90) at syslogd.c:1140
#10 0x000000000041de78 in llExecFunc (pThis=0x68aae0, pFunc=0x40b270 , pParam=0x41000e90) at linkedlist.c:391
#11 0x000000000040add9 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183
#12 0x000000000043c4f7 in queueConsumerReg (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=) at queue.c:1598
#13 0x0000000000432fd0 in wtiWorker (pThis=0x6a3bb0) at wti.c:416
#14 0x000000000043172a in wtpWorker (arg=0x6a3bb0) at wtp.c:425
#15 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0
#16 0x00002b6037a165ad in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()
(gdb) quit

From hks.private at gmail.com Thu Jan 15 19:44:45 2009
From: hks.private at gmail.com ((private) HKS)
Date: Thu, 15 Jan 2009 13:44:45 -0500
Subject: [rsyslog] rsyslog still crashes
In-Reply-To:
References:
Message-ID:

On Thu, Jan 15, 2009 at 12:58 PM, Lorenzo M. Catucci wrote:
> I've just tried again rsyslog on my 8 core mail server, and got the very
> same crash from september/october. I've restarted the server under valgrind
> control, and all seems to be running well...
>
> A good 2009 to all!
>
> Yours,
>
> lorenzo

Version you're using?
-HKS From aoz.syn at gmail.com Thu Jan 15 20:11:09 2009 From: aoz.syn at gmail.com (RB) Date: Thu, 15 Jan 2009 12:11:09 -0700 Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> Message-ID: <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> Many more measurements are needed before declaring a conclusive cause, but on the surface it seems that your bottleneck is not rsyslog or the sending server but the database itself. Comments below. On Thu, Jan 15, 2009 at 10:28, Daniel Anson wrote: > I have been dealing with this problem for a few days now and perhaps I > will be able to solicit some advice or help. Here is the issue. I have > an rsyslog relay writing to a remote database server and caching to > disk. The write to the database uses a MySQL stored procedure that can > write about 4000 records per second. The rsyslog.conf parts are set up Is that 4000 TPS burst or sustained speed? > If I turn off the database, in this case I turned it off for almost a > day, it backlogs nearly a 1 GB worth of information. The problem is Roughly how many records? > that it takes nearly 6 hours to catch back up from this. While catching > up, it only uses about 1% of the proc. Bandwidth is not an issue as the What's the processor and disk load look like on your MySQL server? > fibre link is only about 50% saturated. Is there a way to force Presuming 50% is your bps, what was your PPS? Depending on how large your average event/transaction are, you may never see 100% due to small packets. > not. Is there a way to force rsyslogd to use a pool of MySQL > connections or intiate a new connection each time a record is written? 
Rainer confirmed my suspicion that rsyslog executes a single transaction per event, which is (as he also notes) sub-optimal for performance. Batching really should be about the same logic as the MARK functionality: every N foo, output "bar". Multiple actions per transaction (batching) is a classic query tuning technique and can be approached many ways, but you probably need to verify your database I/O is indeed the bottleneck. From danson at rackspace.com Fri Jan 16 00:01:19 2009 From: danson at rackspace.com (Daniel Anson) Date: Thu, 15 Jan 2009 17:01:19 -0600 Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> Message-ID: <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> A few things about the MySQL server itself: I have eliminated bandwidth, proc speed, and disk I/O as potential bottlenecks. The obvious bottleneck is the MySQL server. For a temporary solution, I have placed an rsyslog relay on the MySQL server. So: Client_message -> local_datacenter_relay -> remote_datacenter_relay -> MySQL_server The messages are traveling much faster (kudos to the socket programming there) as the remote relay writes to a local MySQL server. I do not believe this to be an optimal solution. In an earlier email, Rainer mentions and I quote: "You could try to run the action on its own queue and with multiple workers. That could (could!) improve performance. But it is just a guess. Do you have any chance to see how long the query takes inside the SQL engine?" MySQL will run about 4000 inserts per second (constant speed). I am willing to try what Rainer suggests; however, I am unsure how to direct specific actions to act on a queue. Any help is appreciated.
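A minimal sketch, in C, of the batching idea raised earlier in this thread: several events go into one statement instead of one statement per event. It is an illustration only and says nothing about how ommysql is implemented. The table name "events", the column "msg", and the helper build_batch_insert are all hypothetical, and the messages are assumed to be SQL-escaped already.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write one multi-row INSERT for `count` messages
 * into `buf`. The messages are assumed to be SQL-escaped by the caller.
 * Returns the statement length, or -1 if `buf` is too small. */
int build_batch_insert(char *buf, size_t buflen,
                       const char *const *msgs, size_t count)
{
    /* "events" and "msg" are made-up names for this sketch. */
    int n = snprintf(buf, buflen, "INSERT INTO events (msg) VALUES ");
    if (n < 0 || (size_t)n >= buflen)
        return -1;

    size_t used = (size_t)n;
    for (size_t i = 0; i < count; i++) {
        /* Rows are separated by commas: ('a'),('b'),... */
        n = snprintf(buf + used, buflen - used, "%s('%s')",
                     i ? "," : "", msgs[i]);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Sending the resulting string as a single statement costs one round trip for N events. Whether that helps in practice depends on the database really being the bottleneck, which is the open question in this thread.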
I know I could add the two following lines and create worker threads: $ActionQueueWorkerThreads 20 $MainMsgQueueWorkerThreads 20 Would I have to add additional lines to the config. My config once again looks like so: $ModLoad immark $ModLoadd imudp $UDPServerAddress 172.16.12.138 $UDPServerRun 514 $ModLoad imtcp $ModLoad imuxsock $ModLoad imklog $ModLoad ommysql.so $template template1,"CALL SAT2_RSYSLOG_EVENT_INSERT('%timestamp:::date-mysql%', '%timegenerated:::date-mysql%', '%syslogfacility%', syslogpriority%', '%hostname%', '%syslogtag%', '%msg%')", sql $WorkDirectory /rsyslog/work $ActionQueueType LinkedList # use asynchronous processing $ActionQueueFileName dbq # set file name, also enables disk mode $ActionResumeRetryCount -1 # infinite retries on insert failure *.* >172.16.2.238,rsyslog,syslogwriter,topsecret;template1 I would hope that there is an easy solution as my next idea is to write some type of daemonized process that can insert messages from a pool of MySQL connections. I can achieve this in C but would rather hopefully find a solution inside of the configuration. Daniel M. Anson Linux Systems Engineer Rackspace Managed Hosting danson at rackspace.com
From mbiebl at gmail.com Fri Jan 16 01:20:22 2009 From: mbiebl at gmail.com (Michael Biebl) Date: Fri, 16 Jan 2009 01:20:22 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: 2009/1/15 (private) HKS : > On Thu, Jan 15, 2009 at 12:58 PM, Lorenzo M. Catucci > wrote: >> I've just tried again rsyslog on my 8 core mail server, and got the very >> same crash from september/october. I've restarted the server under valgrind >> control, and all seems to be running well... >> >> A good 2009 to all! >> >> Yours, >> >> lorenzo > > > Version you're using? Given the -c4 command line argument, I'd expect it to be 4.1.3. Sounds familiar to http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=509292 (which is 3.18.6). It seems to be a more general problem with multi core (= very fast??) systems. Cheers, Michael -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 01:37:14 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 01:37:14 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: On Thu, 15 Jan 2009, (private) HKS wrote: pH> pH> Version you're using? pH> git origin/master branch as of today. Sorry for forgetting to mention! +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Thu Jan 15 20:06:06 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 20:06:06 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: <1232046366.22744.34.camel@localhost.localdomain> On Fri, 2009-01-16 at 01:20 +0100, Michael Biebl wrote: > Given the -c4 command line argument, I'd expect it to be 4.1.3. > > Sounds familiar to > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=509292 (which is > 3.18.6). > > It seems to be a more general problem with multi core (= very fast??) systems. Yes, that is what my analysis so far points to. It's also part of the problem, because I do not have very fast hardware to reproduce the issue (and it is also not easy to reliably reproduce if you have...). I've gotten a couple of reports (I think most on the mailing list) on such problems and all they have in common is 4+ core machines. I'll try to get hold based on what Lorenzo submits. In his environment, the problem seems to occur most reliably (he probably has the fastest machine...). Lorenzo: details follow soon. Rainer From rgerhards at hq.adiscon.com Thu Jan 15 20:14:18 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 20:14:18 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: <1232046859.22744.39.camel@localhost.localdomain> On Thu, 2009-01-15 at 18:58 +0100, Lorenzo M. Catucci wrote: > I've just tried again rsyslog on my 8 core mail server, and got the very > same crash from september/october. So, without valgrind, can you reproduce the issue each time you start it? That would be very useful. > I've restarted the server under > valgrind control, and all seems to be running well... I guess the issue here is that valgrind slows down things and also simulates (I think) 2 CPUs only. > A good 2009 to all! same to you! 
Thanks for being persistent with this issue (it is beginning to drive me crazy). From what I have learned so far we seem to have a race condition that causes memory corruption. The backtrace you include also points in that direction. Those few cases where I got a usable backtrace all point to the very same location. However, that does not mean this location has the bug. It seems to occur some time earlier, and manifests when the message is destructed. It could be a double-free or even some wild memory access that accidentally overwrites some structures. If we are able to get a stable repro, and we are able to run with at least some minimal diagnostics, we may be much better off tackling that beast. First step is to see that we get a stable repro. If we do, I need to think about minimal debug. The full debugging system makes the bug disappear, I think because it changes the timing. Rainer From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 12:28:59 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 12:28:59 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1232046859.22744.39.camel@localhost.localdomain> References: <1232046859.22744.39.camel@localhost.localdomain> Message-ID: On Thu, 15 Jan 2009, Rainer Gerhards wrote: RG> On Thu, 2009-01-15 at 18:58 +0100, Lorenzo M. Catucci wrote: RG> > I've just tried again rsyslog on my 8 core mail server, and got the very RG> > same crash from september/october. RG> RG> So, without valgrind, can you reproduce the issue each time you start RG> it? That would be very useful. RG> Yes: any time I start a free-running instance, I get the very same segmentation fault and core-file to backtrace. RG> RG> > I've restarted the server under RG> > valgrind control, and all seems to be running well... RG> RG> I guess the issue here is that valgrind slows down things and also RG> simulates (I think) 2 CPUs only.
RG> Right, I didn't know valgrind both limited the CPU bandwidth and the (v)CPU number, but either of them would hide the existing race condition. RG> RG> From what I have learned so far we seem to have a race condition that RG> causes memory corrupt. The backtrace you include also points into that RG> direction. Those few cases where I got a usable backtrace all point to RG> the very same location. However, that does not mean this location has RG> the bug. It seems to occur some time earlier, and manifests when the RG> message is destructed. It could be a double-free or even some wild RG> memory access that accidently overwrites some structures. RG> RG> If we are able to get a stable repro, and we are able to run with at RG> least some minimal diagnostics, we may be much better of tackeling that RG> beast. RG> RG> First step is to see that we get a stable repro. If we do, I need to RG> think about minimal debug. The full debugging system makes the bug RG> disappear, I think because it changes the timing. RG> I don't think we could hope for a stable reproducer for a heisenbug... all I can provide is a very high throughput system generating a very high local message rate. As a matter of fact, this rsyslog instance is acting as a forwarder to a remote instance that didn't suffer any crash. The only differences between the engines' configurations are: 1. the remote logs to a postgres instance instead of spool files, 2. the remote runs just the postgresql instance and the logger My gut feeling is that the different behaviour doesn't come from any of these differences, but from the different memory path taken by the messages, which in the remote case are serialised from the underlying network transport. We'll see! Yours, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O.
Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 12:44:53 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 12:44:53 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 12:29 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Thu, 15 Jan 2009, Rainer Gerhards wrote: > > RG> On Thu, 2009-01-15 at 18:58 +0100, Lorenzo M. Catucci wrote: > RG> > I've just tried again rsyslog on my 8 core mail server, and got > the very > RG> > same crash from september/october. > RG> > RG> So, without valgrind, can you reproduce the issue each time you > start > RG> it? That would be very useful. > RG> > > Yes: any time I start a free-running instance, I get the very same > segmentation fault and core-file to backtrace. > > RG> > RG> > I've restarted the server under > RG> > valgrind control, and all seems to be running well... > RG> > RG> I guess the issue here is that valgrind slows down things and also > RG> simulates (I think) 2 CPUs only. > RG> > > Right, I didn't know valgrind both limited the CPU bandwidth and the > (v)CPU number, but any of them would hide the existing race condition Actually, valgrind executes the app in a virtual CPU/Memory environment. So this is *quite different* from the real machine, but nevertheless extremely useful in most cases. So while in theory the actual hardware should not affect the valgrind outcome, my earlier debugging has shown it does. Thus my first try is always valgrind.
But it seems not to help here as we have seen... > RG> > RG> From what I have learned so far we seem to have a race condition > that > RG> causes memory corrupt. The backtrace you include also points into > that > RG> direction. Those few cases where I got a usable backtrace all point > to > RG> the very same location. However, that does not mean this location > has > RG> the bug. It seems to occur some time earlier, and manifests when > the > RG> message is destructed. It could be a double-free or even some wild > RG> memory access that accidently overwrites some structures. > RG> > RG> If we are able to get a stable repro, and we are able to run with > at > RG> least some minimal diagnostics, we may be much better of tackeling > that > RG> beast. > RG> > RG> First step is to see that we get a stable repro. If we do, I need > to > RG> think about minimal debug. The full debugging system makes the bug > RG> disappear, I think because it changes the timing. > RG> > > I don't think we could hope for a stable reproducer for an heisen- > bug... Of course not 100%. But what you have sounds good enough. I must now see that/how I can change the system so that we have some additional instrumentation while the bug is still there. I'll first look at some compile options. Is it OK for you if I just send some messages to stdout? > all I can provide is a very high throughput system generating a very > high > local message rate. As a matter of facts, this rsyslog instance is > acting as a forwader to a remote instance that didn't suffer any crash. > > The only differences between the engines' configurations are: > 1. the remote logs to a postgres instance instead of spool files, > 2. 
the remote does just run the postgresql instance and the logger > > My gut feeling is that the different behaviour doesn't come from any of > these differences, but from the different memory-path taken from the > messages, which in the remote case are serialised from the underlying > network transport. This may be... Rainer From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 13:01:47 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 13:01:47 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> RG> Of course not 100%. But what you have sounds good enough. I must now see RG> that/how I can change the system so that we have some additional RG> instrumentation while the bug is still there. I'll first look at some RG> compile options. Is it OK for you if I just send some messages to RG> stdout? RG> Yes, be it stdout... I'm eager to have an rsyslog instance running well, since I've really liked what I've seen (with the small exception of the crashes!) See you soon, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 15:22:02 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 15:22:02 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> Lorenzo, I have created a new branch "raceDebug" and done a first commit to it. The change is very lightweight. Please pull, compile as usual and give it a try. It spits out some info to stdout from time to time (hopefully). I am not sure if it aborts, depending on the output it may or may not. Even if we get messages, they are probably not enough to pinpoint the bug, but I wanted to do something very light to see if the bug stays. Feedback appreciated. Rainer
From pieter.thysebaert at intec.ugent.be Fri Jan 16 15:07:19 2009 From: pieter.thysebaert at intec.ugent.be (pieter.thysebaert at intec.ugent.be) Date: Fri, 16 Jan 2009 15:07:19 +0100 (CET) Subject: [rsyslog] (no subject) Message-ID: <56908.212.190.198.36.1232114839.squirrel@webserver6.intec.ugent.be> Hello, I've found on-line claims that rsyslog can be compiled (and maybe even runs ok?) on HP-UX. However, I've not found too much information about this, so I'd like to ask: has anyone been able to compile (and run) rsyslog 3.20.2 on HP-UX 11? If so, does it need patching? What packages are required to build it successfully? (only HP software or gcc + gnu tools?) I'm asking because a colleague briefly attempted to configure the package on hpux UX11.11, and configure ended with > checking for pthread.h... yes > checking for pthread_create in -lpthread... no Any success stories out there? Thanks! Pieter From aoz.syn at gmail.com Fri Jan 16 16:19:39 2009 From: aoz.syn at gmail.com (RB) Date: Fri, 16 Jan 2009 08:19:39 -0700 Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> Message-ID: <4255c2570901160719o4aa3bc6bk9813225374bfc53c@mail.gmail.com> On Thu, Jan 15, 2009 at 16:01, Daniel Anson wrote: > I would hope that there is an easy solution as my next idea is to write > some type of daemonized process that can insert messages from a pool of > MySQL connections.
I can achieve this in C but would rather hopefully > find a solution inside of the configuration. Short of implementing the queue/worker configuration (no idea how), it seems the only current option would be to implement something of the sort, either by an update to the ommysql module (optimal, as it gets your code supported by someone else for its lifetime) or by some external program. I'd think an optimal external solution would be some sort of relp2mysql bridge, but suspect that would end up reimplementing a good chunk of rsyslog. From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 16:22:45 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 16:22:45 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> Lorenzo, RG> RG> I have created a new branch "raceDebug" and done a first commit to it. RG> The change is very lightweight. Please pull, compile as usual and give RG> it a try. It spits out some info to stdout from time to time RG> (hopefully). I am not sure if it aborts, depending on the output it RG> may or may not. Even if we get messages, they are probably not enough RG> to pinpoint the bug, but I wanted to do something very light to see if RG> the bug stays. RG> RG> Feedback appreciated. RG> Rainer, I've just checked out the branch; I've run configure with the following command line: ./configure --prefix=/usr --enable-mysql --enable-pgsql --enable-mail --enable-imfile --enable-debug --enable-rtinst --enable-valgrind --no-create --no-recursion From "git diff -r HEAD^ HEAD" I've seen an #if 0 section in the commit. Let me know if you'd prefer that I change it to #if 1.
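The "#if 0" versus "#if 1" question above is ordinary C conditional compilation. The following is a minimal sketch of such a switch and assumes nothing about what the raceDebug commit actually contains: RACE_DEBUG, struct msg and msg_release are hypothetical names, and the example is single-threaded.

```c
#include <stdio.h>

/* Hypothetical switch: set to 1 to compile the extra diagnostics in.
 * With 0 the preprocessor drops the guarded lines entirely, so they
 * cost nothing at run time and do not change the timing. */
#define RACE_DEBUG 0

/* Hypothetical stand-in for a reference-counted message object. */
struct msg {
    int refcount;
};

/* Drops one reference; returns 1 when the caller should free the object. */
int msg_release(struct msg *m)
{
#if RACE_DEBUG
    /* Unlocked write to stdout: cheap, but not itself race-free. */
    if (m->refcount <= 0)
        fprintf(stdout, "msg %p: release with refcount %d\n",
                (void *)m, m->refcount);
#endif
    m->refcount -= 1;
    return m->refcount == 0;
}
```

The design trade-off is the one discussed in this thread: unlocked diagnostics disturb the timing less than a mutex-protected debug system, at the price of possibly interleaved or lost output.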
I've just started rsyslogd with rsyslogd -c4 -n in a screen session, with the same configuration files I've been using since September. Since both the "rsyslogd -c4 -n" and the later "rsyslogd -c4 -d" invocations crashed very quickly, I've restarted it once more with stdout redirected to a logfile, and now it's running. Will let you know if it crashes once more. Yours, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 16:33:04 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 16:33:04 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 4:23 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> Lorenzo, > RG> > RG> I have created a new branch "raceDebug" and done a first commit to > it. > RG> The change is very lightweight. Please pull, compile as usual and > give > RG> it a try. It spits out some info to stdout from time to time > RG> (hopefully). I am not sure if it aborts, depending on the output it > RG> may or may not.
Even if we get messages, they are probably not > enough > RG> to pinpoint the bug, but I wanted to do something very light to see > if > RG> the bug stays. > RG> > RG> Feedback appreciated. > RG> > > Rainer, I've just checked-out the branch; I've run configure with the > following command line: > > ./configure --prefix=/usr --enable-mysql --enable-pgsql --enable-mail > --enable-imfile --enable-debug --enable-rtinst --enable-valgrind > --no-create --no-recursion > > From "git diff -r HEAD^ HEAD" I've seen an #if 0 section in the > commit. > Let me know if you'd prefer if I change it to #if 1. Mmmhh... you can use debug. Yes, please then change it to 1. > > I've just started rsyslogd with rsyslogd -c4 -n on a screen session, > with > the same configuration files I'm using since september. > > Since both the "rsyslogd -c4 -n" and the later "rsyslogd -c4 -d" > invocation crashed very quickly, I've restarted it once more with > stdout > redirected to a a logfile, and now it's running. Will let you know if > it > crashes once more. That sounds good. Do you happen to have the output from those crashes? Anyway, I will be interested in what it now comes up with. As a side-note, I have introduced another race by calling the library functions. There is always some good and bad. The regular debugging system prevents this problem by protecting the writes with mutexes. That, however, affects the timing and thus we do not see the real issue. So what I have done is bad, but may be useful. I forgot to mention that with my last post... Rainer > > Yours, > > lorenzo > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. 
+39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 17:07:30 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 17:07:30 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> RG> That sounds good. Do you happen to have the output from those crashes? RG> The -n crash was completely silent; the -d run was chatty (as expected); with stdout redirected, it took a lot more time to crash, but here are both the logfile and the gdb backtrace. Yours, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. 
Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done. Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so Core was generated by `rsyslogd -c4 -n'. Program terminated with signal 11, Segmentation fault. 
[New process 19309] [New process 19311] [New process 19310] [New process 19308] [New process 19307] #0 0x000000000041cb79 in msgDestruct (ppThis=0x68ae18) at msg.c:354 354 if(strcmp((char*)(((obj_t*)pThis)->pObjInfo->pszID), "msg")) { (gdb) Thread 5 (process 19307): #0 0x00002af4d1020ce2 in select () from /lib/libc.so.6 #1 0x000000000040db93 in mainThread () at syslogd.c:2704 #2 0x000000000040ee96 in realMain (argc=, argv=) at syslogd.c:3631 #3 0x00002af4d0f761a6 in __libc_start_main () from /lib/libc.so.6 #4 0x000000000040a259 in _start () Thread 4 (process 19308): #0 0x00002af4d0938fad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0 #1 0x0000000000432f9f in wtiWorker (pThis=0x685270) at wti.c:406 #2 0x000000000043176a in wtpWorker (arg=0x685270) at wtp.c:425 #3 0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002af4d10275ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 3 (process 19310): #0 0x00002af4d1020ce2 in select () from /lib/libc.so.6 #1 0x00002af4d16c69fd in runInput (pThrd=) at imuxsock.c:280 #2 0x000000000044373f in thrdStarter (arg=0x6a5db0) at ../threads.c:139 #3 0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002af4d10275ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 19311): #0 0x00002af4d093b7db in read () from /lib/libpthread.so.0 #1 0x00002af4d18cb1ef in klogLogKMsg () at linux.c:449 #2 0x00002af4d18ca594 in runInput (pThrd=0x6a9020) at imklog.c:224 #3 0x000000000044373f in thrdStarter (arg=0x6a9020) at ../threads.c:139 #4 0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002af4d10275ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 19309): #0 0x000000000041cb79 in msgDestruct (ppThis=0x68ae18) at msg.c:354 #1 0x0000000000443076 in actionCallAction (pAction=0x68ada0, pMsg=0x2aaaac0008c0) at ../action.c:774 #2 0x000000000040b307 in processMsgDoActions (pData=0x68ada0, pParam=0x41000e90) at syslogd.c:1140 #3 0x000000000041deb8 in llExecFunc (pThis=0x68ac10, pFunc=0x40b2b0 , pParam=0x41000e90) at linkedlist.c:391 #4 0x000000000040ae19 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183 #5 0x000000000043c537 in queueConsumerReg (pThis=0x690050, pWti=0x6a3ce0, iCancelStateSave=) at queue.c:1598 #6 0x0000000000433010 in wtiWorker (pThis=0x6a3ce0) at wti.c:416 #7 0x000000000043176a in wtpWorker (arg=0x6a3ce0) at wtp.c:425 #8 0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0 #9 0x00002af4d10275ad in clone () from /lib/libc.so.6 #10 0x0000000000000000 in ?? () (gdb) quit -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. 
Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done. Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so Core was generated by `rsyslogd -c4 -d'. Program terminated with signal 11, Segmentation fault. 
[New process 20676] [New process 20678] [New process 20677] [New process 20675] [New process 20674] #0 0x000000000041cb79 in msgDestruct (ppThis=0x68ace8) at msg.c:354 354 if(strcmp((char*)(((obj_t*)pThis)->pObjInfo->pszID), "msg")) { (gdb) Thread 5 (process 20674): #0 0x00002ab1af573ce2 in select () from /lib/libc.so.6 #1 0x000000000040db93 in mainThread () at syslogd.c:2704 #2 0x000000000040ee96 in realMain (argc=, argv=) at syslogd.c:3631 #3 0x00002ab1af4c91a6 in __libc_start_main () from /lib/libc.so.6 #4 0x000000000040a259 in _start () Thread 4 (process 20675): #0 0x00002ab1aee8bfad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0 #1 0x0000000000432f9f in wtiWorker (pThis=0x685140) at wti.c:406 #2 0x000000000043176a in wtpWorker (arg=0x685140) at wtp.c:425 #3 0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002ab1af57a5ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 3 (process 20677): #0 0x00002ab1af573ce2 in select () from /lib/libc.so.6 #1 0x00002ab1afc199fd in runInput (pThrd=) at imuxsock.c:280 #2 0x000000000044373f in thrdStarter (arg=0x6a5c80) at ../threads.c:139 #3 0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002ab1af57a5ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 20678): #0 0x00002ab1aee8e7db in read () from /lib/libpthread.so.0 #1 0x00002ab1afe1e1ef in klogLogKMsg () at linux.c:449 #2 0x00002ab1afe1d594 in runInput (pThrd=0x6a8ef0) at imklog.c:224 #3 0x000000000044373f in thrdStarter (arg=0x6a8ef0) at ../threads.c:139 #4 0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002ab1af57a5ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 20676): #0 0x000000000041cb79 in msgDestruct (ppThis=0x68ace8) at msg.c:354 #1 0x0000000000443076 in actionCallAction (pAction=0x68ac70, pMsg=0x6aee30) at ../action.c:774 #2 0x000000000040b307 in processMsgDoActions (pData=0x68ac70, pParam=0x41000e90) at syslogd.c:1140 #3 0x000000000041deb8 in llExecFunc (pThis=0x68aae0, pFunc=0x40b2b0 , pParam=0x41000e90) at linkedlist.c:391 #4 0x000000000040ae19 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183 #5 0x000000000043c537 in queueConsumerReg (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=) at queue.c:1598 #6 0x0000000000433010 in wtiWorker (pThis=0x6a3bb0) at wti.c:416 #7 0x000000000043176a in wtpWorker (arg=0x6a3bb0) at wtp.c:425 #8 0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0 #9 0x00002ab1af57a5ad in clone () from /lib/libc.so.6 #10 0x0000000000000000 in ?? () (gdb) quit -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. 
Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done. Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so Core was generated by `rsyslogd -c4 -d'. Program terminated with signal 6, Aborted. 
[New process 21096] [New process 21098] [New process 21097] [New process 21095] [New process 21094] #0 0x00002ac0a65dded5 in raise () from /lib/libc.so.6 (gdb) Thread 5 (process 21094): #0 0x00002ac0a6674ce2 in select () from /lib/libc.so.6 #1 0x000000000040db93 in mainThread () at syslogd.c:2704 #2 0x000000000040ee96 in realMain (argc=, argv=) at syslogd.c:3631 #3 0x00002ac0a65ca1a6 in __libc_start_main () from /lib/libc.so.6 #4 0x000000000040a259 in _start () Thread 4 (process 21095): #0 0x00002ac0a5f8cfad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0 #1 0x0000000000432f9f in wtiWorker (pThis=0x685140) at wti.c:406 #2 0x000000000043176a in wtpWorker (arg=0x685140) at wtp.c:425 #3 0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002ac0a667b5ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 3 (process 21097): #0 0x00002ac0a6674ce2 in select () from /lib/libc.so.6 #1 0x00002ac0a6d1a9fd in runInput (pThrd=) at imuxsock.c:280 #2 0x000000000044373f in thrdStarter (arg=0x6a5c80) at ../threads.c:139 #3 0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002ac0a667b5ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 21098): #0 0x00002ac0a5f8f7db in read () from /lib/libpthread.so.0 #1 0x00002ac0a6f1f1ef in klogLogKMsg () at linux.c:449 #2 0x00002ac0a6f1e594 in runInput (pThrd=0x6a8ef0) at imklog.c:224 #3 0x000000000044373f in thrdStarter (arg=0x6a8ef0) at ../threads.c:139 #4 0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002ac0a667b5ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 21096): #0 0x00002ac0a65dded5 in raise () from /lib/libc.so.6 #1 0x00002ac0a65df3f3 in abort () from /lib/libc.so.6 #2 0x0000000000423697 in sigsegvHdlr (signum=6) at debug.c:759 #3 <signal handler called> #4 0x00002ac0a65dded5 in raise () from /lib/libc.so.6 #5 0x00002ac0a65df3f3 in abort () from /lib/libc.so.6 #6 0x00002ac0a65d6dc9 in __assert_fail () from /lib/libc.so.6 #7 0x000000000043a4be in queueChkDiscardMsg (pThis=0x68ff20, iQueueSize=0, bRunsDA=0, pUsr=0x2aaaac002e30) at queue.c:1393 #8 0x000000000043bde3 in queueDequeueConsumable (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=0) at queue.c:1478 #9 0x000000000043c4f1 in queueConsumerReg (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=0) at queue.c:1597 #10 0x0000000000433010 in wtiWorker (pThis=0x6a3bb0) at wti.c:416 #11 0x000000000043176a in wtpWorker (arg=0x6a3bb0) at wtp.c:425 #12 0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0 #13 0x00002ac0a667b5ad in clone () from /lib/libc.so.6 #14 0x0000000000000000 in ?? () (gdb) quit From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 17:10:29 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 17:10:29 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote: LMC> LMC> The -n crash was completely silent; the -d run was chatty (as expected); LMC> with stdout redirected, it took a lot more time to crash, but here are LMC> both the logfile and the gdb backtrace. LMC> As for the last crash, I found in the screen session the following line: rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed.
since I forgot to redirect stderr too. Yours, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 17:17:25 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 17:17:25 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C7@grfint2.intern.adiscon.com> Ok, this together with the others is evidence that something runs really wild and overwrites memory blocks. The reason this message did not appear earlier is that I disabled the check in DestroyMsg() and permitted it to return even though I then know memory is corrupted. So what you see here is a follow-up error. The good news, I think, is that it looks (though I may be fooled) as if the corruption happens in close temporal proximity to the abort. That would be really good news. Let me think a bit about the situation; I'll probably come up with further instrumentation. The issue is that I'd potentially need to output one or even two log lines per message, and that creates other sync issues. Plus, I don't know whether that would overrun your disk (depending on the workload, which seems to be quite high). Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M.
Catucci > Sent: Friday, January 16, 2009 5:10 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote: > > LMC> > LMC> The -n crash was completely silent; the -d run was chatty (as > expected); > LMC> with stdout redirected, it took a lot more time to crash, but here > are > LMC> both the logfile and the gdb backtrace. > LMC> > > As for the last crash, I found in the screen session the following line: > > rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) > ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed. > > since I forgot to redirect stderr too. > > Yours, > > lorenzo > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From rgerhards at hq.adiscon.com Fri Jan 16 17:19:34 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 17:19:34 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> Lorenzo, one thing: can you change the action queue type to "direct" ($ActionQueueType Direct) just for a short period? I would be very interested to see what happens. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M.
Catucci > Sent: Friday, January 16, 2009 5:10 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote: > > LMC> > LMC> The -n crash was completely silent; the -d run was chatty (as > expected); > LMC> with stdout redirected, it took a lot more time to crash, but here > are > LMC> both the logfile and the gdb backtrace. > LMC> > > As for the last crash, I found in the screen session the following line: > > rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) > ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed. > > since I forgot to redirect stderr too. > > Yours, > > lorenzo > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From rgerhards at hq.adiscon.com Fri Jan 16 17:47:02 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 17:47:02 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com> Lorenzo and others: I got a system today on which I hope to reproduce the problem. I am setting it up right now.
I have also written a stub wiki page with information useful for hunting this bug: http://wiki.rsyslog.com/index.php/V3_Race_Condition_Hunt_Page Lorenzo, can you please double-check that I have indeed used the right config? All others: if you can add scenarios/information, please do. I'll try to repro the problem as soon as the system is ready. Hope it will work... Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > Sent: Friday, January 16, 2009 5:20 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > Lorenzo, > > one thing: can you change the action queue type to "direct" > ($ActionQueueType Direct) just for a short period? I would be very > interested to see what happens. > > Rainer > > > -----Original Message----- > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > > Sent: Friday, January 16, 2009 5:10 PM > > To: rsyslog-users > > Subject: Re: [rsyslog] rsyslog still crashes > > > > On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote: > > > > LMC> > > LMC> The -n crash was completely silent; the -d run was chatty (as > > expected); > > LMC> with stdout redirected, it took a lot more time to crash, but > here > > are > > LMC> both the logfile and the gdb backtrace. > > LMC> > > > > As for the last crash, I found in the screen session the following line: > > > > rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) > > ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed. > > > > since I forgot to redirect stderr too. > > > > Yours, > > > > lorenzo > > > > +-------------------------+------------------------------------------ > -- > > --+ > > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > > | > > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor > > Vergata" | > > | | Via O. Raimondo 18 ** I-00173 ROMA ** > > ITALY | > > | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 > > | > > +-------------------------+------------------------------------------ > -- > > --+ > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 17:52:28 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 17:52:28 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> Lorenzo, RG> RG> one thing: can you change the action queue type to "direct" RG> ($ActionQueueType Direct) just for a short period? I would be very RG> interested to see what happens. RG> Very short period... it crashed almost as soon as it started... I'm enclosing both the log and the backtrace. See you soon, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"...
Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. 
Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Core was generated by `rsyslogd -c4 -d'. Program terminated with signal 11, Segmentation fault. [New process 27339] [New process 27341] [New process 27340] [New process 27338] #0 0x00002b034900f030 in strlen () from /lib/libc.so.6 (gdb) Thread 4 (process 27338): #0 0x00002b03489774c5 in __lll_unlock_wake () from /lib/libpthread.so.0 #1 0x00002b0348973ff9 in _L_unlock_56 () from /lib/libpthread.so.0 #2 0x00002b0348973c56 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0 #3 0x0000000000422a09 in dbgprint (pObj=, pszMsg=0x7fff6256dea0 " X ", lenMsg=3) at debug.c:157 #4 0x0000000000422c33 in dbgprintf (fmt=) at debug.c:892 #5 0x000000000040d522 in init () at syslogd.c:2207 #6 0x000000000040da79 in mainThread () at syslogd.c:2954 #7 0x000000000040ee96 in realMain (argc=, argv=) at syslogd.c:3631 #8 0x00002b0348fb21a6 in __libc_start_main () from /lib/libc.so.6 #9 0x000000000040a259 in _start () Thread 3 (process 27340): #0 0x00002b034905cce2 in select () from /lib/libc.so.6 #1 0x00002b03497029fd in runInput (pThrd=) at imuxsock.c:280 #2 0x000000000044377f in thrdStarter (arg=0x6a3b90) at ../threads.c:139 #3 0x00002b0348970fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002b03490635ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 27341): #0 0x00002b03489777db in read () from /lib/libpthread.so.0 #1 0x00002b03499071ef in klogLogKMsg () at linux.c:449 #2 0x00002b0349906594 in runInput (pThrd=0x6a6b90) at imklog.c:224 #3 0x000000000044377f in thrdStarter (arg=0x6a6b90) at ../threads.c:139 #4 0x00002b0348970fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002b03490635ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 27339): #0 0x00002b034900f030 in strlen () from /lib/libc.so.6 #1 0x00002b0348fdbcb1 in vfprintf () from /lib/libc.so.6 #2 0x00002b0348fe1c08 in fprintf () from /lib/libc.so.6 #3 0x000000000041ce7d in msgDestruct (ppThis=) at msg.c:350 #4 0x000000000044283a in actionCallDoAction (pAction=0x6856d0, pMsg=0x6a41f0) at ../action.c:495 #5 0x0000000000439c9c in qAddDirect (pThis=0x6857e0, pUsr=0x6a41f0) at queue.c:939 #6 0x000000000043dd83 in queueEnqObj (pThis=0x6857e0, flowCtlType=, pUsr=0x6a41f0) at queue.c:1016 #7 0x0000000000442d63 in actionWriteToAction (pAction=0x6856d0) at ../action.c:672 #8 0x00000000004430d0 in actionCallAction (pAction=0x6856d0, pMsg=0x6a41f0) at ../action.c:778 #9 0x000000000040b307 in processMsgDoActions (pData=0x6856d0, pParam=0x407ffe90) at syslogd.c:1140 #10 0x000000000041def8 in llExecFunc (pThis=0x685540, pFunc=0x40b2b0 , pParam=0x407ffe90) at linkedlist.c:391 #11 0x000000000040ae19 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183 #12 0x000000000043c577 in queueConsumerReg (pThis=0x68cc80, pWti=0x6a1030, iCancelStateSave=) at queue.c:1598 #13 0x0000000000433050 in wtiWorker (pThis=0x6a1030) at wti.c:416 #14 0x00000000004317aa in wtpWorker (arg=0x6a1030) at wtp.c:425 #15 0x00002b0348970fc7 in start_thread () from /lib/libpthread.so.0 #16 0x00002b03490635ad in clone () from /lib/libc.so.6 #17 0x0000000000000000 in ?? () (gdb) quit -------------- next part -------------- 4037.620068405:main thread: Writing pidfile /var/run/rsyslogd.pid. 4037.620491470:main thread: rsyslog 4.1.3 - called init() 4037.620502795:main thread: Unloading non-static modules. 4037.620513481:main thread: module lmnet NOT unloaded because it still has a refcount of 3 4037.620522445:main thread: Clearing templates. 
4037.620569724:main thread: cfline: '$ModLoad imuxsock # provides support for local system logging' 4037.620585477:main thread: Requested to load module 'imuxsock' 4037.620596298:main thread: loading module '/usr/lib/rsyslog/imuxsock.so' 4037.620662954:main thread: imuxsock version 4.1.3 initializing 4037.620699263:main thread: module of type 0 being loaded. 4037.620712772:main thread: cfline: '$ModLoad imklog # provides kernel logging support (previously done by rklogd)' 4037.620724718:main thread: Requested to load module 'imklog' 4037.620733972:main thread: loading module '/usr/lib/rsyslog/imklog.so' 4037.620847557:main thread: module of type 0 being loaded. 4037.620864846:main thread: cfline: '$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat' 4037.620884928:main thread: cfline: '$FileOwner root' 4037.621151637:main thread: uid 0 obtained for user 'root' 4037.621164483:main thread: cfline: '$FileGroup adm' 4037.621221737:main thread: gid 4 obtained for group 'adm' 4037.621233731:main thread: cfline: '$FileCreateMode 0640' 4037.621247204:main thread: cfline: '$IncludeConfig /etc/rsyslog.d/*.conf' 4037.621306972:main thread: requested to include config file '/etc/rsyslog.d/remote.conf' 4037.621334470:main thread: cfline: '$WorkDirectory /var/log/rsyslog' 4037.621352254:main thread: cfline: '$ActionQueueType Direct # use synchronous processing' 4037.621692792:main thread: action queue type set to DIRECT (no queueing at all) 4037.621705098:main thread: cfline: '$ActionQueueFileName srvrfwd # set file name, also enables disk mode' 4037.621720665:main thread: cfline: '$ActionResumeRetryCount -1 # infinite retries on insert failure' 4037.621734291:main thread: cfline: '$ActionQueueSaveOnShutdown on # save in-memory data if rsyslog shuts down' 4037.621748715:main thread: cfline: 'mail.* @@xx.yy.zz.tt:514' 4037.621761573:main thread: - traditional PRI filter 4037.621771329:main thread: symbolic name: * ==> 255 4037.621783748:main thread: symbolic name: mail ==> 
16 4037.621800473:main thread: tried selector action for builtin-file: -2001 4037.621816553:main thread: caller requested object 'netstrms', not found (iRet -3003) 4037.621829132:main thread: Requested to load module 'lmnetstrms' 4037.621839089:main thread: loading module '/usr/lib/rsyslog/lmnetstrms.so' 4037.621919155:main thread: module of type 2 being loaded. 4037.621932301:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 1 4037.621945375:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 2 4037.621960807:main thread: caller requested object 'tcpclt', not found (iRet -3003) 4037.621970535:main thread: Requested to load module 'lmtcpclt' 4037.621979727:main thread: loading module '/usr/lib/rsyslog/lmtcpclt.so' 4037.622039220:main thread: module of type 2 being loaded. 4037.622051937:main thread: source file omfwd.c requested reference for module 'lmtcpclt', reference count now 1 4037.622064386:main thread: hostname 'xx.yy.zz.tt', port '514' 4037.622084093:main thread: tried selector action for builtin-fwd: 0 4037.622095973:main thread: Module builtin-fwd processed this config line. 4037.622111045:main thread: template: 'RSYSLOG_TraditionalForwardFormat' assigned 4037.622134550:main thread: action 1 queue: save on shutdown 1, max disk space allowed 0 4037.622153957:main thread: action 1 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622166394:main thread: Action 0x6838c0: queue 0x683d60 created 4037.622179432:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended on' 4037.622192407:main thread: cfline: '& /data/var_syslog/failover.log' 4037.622218048:main thread: tried selector action for builtin-file: 0 4037.622239084:main thread: Module builtin-file processed this config line. 
4037.622249944:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622264904:main thread: action 2 queue: save on shutdown 1, max disk space allowed 0 4037.622278185:main thread: action 2 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622289525:main thread: Action 0x684b30: queue 0x684d70 created 4037.622300676:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended off' 4037.622315313:main thread: selector line successfully processed 4037.622335353:main thread: cfline: 'auth,authpriv.* /var/log/auth.log' 4037.622346713:main thread: - traditional PRI filter 4037.622355695:main thread: symbolic name: * ==> 255 4037.622367074:main thread: symbolic name: auth ==> 32 4037.622378090:main thread: symbolic name: authpriv ==> 80 4037.622399801:main thread: tried selector action for builtin-file: 0 4037.622409569:main thread: Module builtin-file processed this config line. 4037.622419853:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622431973:main thread: action 3 queue: save on shutdown 1, max disk space allowed 0 4037.622445019:main thread: action 3 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622456983:main thread: Action 0x685160: queue 0x685220 created 4037.622467966:main thread: cfline: '*.*;auth,authpriv.none -/var/log/syslog' 4037.622477221:main thread: selector line successfully processed 4037.622486077:main thread: - traditional PRI filter 4037.622494606:main thread: symbolic name: * ==> 255 4037.622506225:main thread: symbolic name: none ==> 16 4037.622517007:main thread: symbolic name: auth ==> 32 4037.622527927:main thread: symbolic name: authpriv ==> 80 4037.622547618:main thread: tried selector action for builtin-file: 0 4037.622557092:main thread: Module builtin-file processed this config line. 
4037.622567055:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622578953:main thread: action 4 queue: save on shutdown 1, max disk space allowed 0 4037.622591601:main thread: action 4 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622603373:main thread: Action 0x6856d0: queue 0x6857e0 created 4037.622614425:main thread: cfline: 'daemon.* -/var/log/daemon.log' 4037.622623611:main thread: selector line successfully processed 4037.622632946:main thread: - traditional PRI filter 4037.622641538:main thread: symbolic name: * ==> 255 4037.622652635:main thread: symbolic name: daemon ==> 24 4037.622672048:main thread: tried selector action for builtin-file: 0 4037.622681333:main thread: Module builtin-file processed this config line. 4037.622690864:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622704736:main thread: action 5 queue: save on shutdown 1, max disk space allowed 0 4037.622718299:main thread: action 5 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622730175:main thread: Action 0x685cb0: queue 0x685dc0 created 4037.622740990:main thread: cfline: 'kern.* -/var/log/kern.log' 4037.622749924:main thread: selector line successfully processed 4037.622759053:main thread: - traditional PRI filter 4037.622767804:main thread: symbolic name: * ==> 255 4037.622779282:main thread: symbolic name: kern ==> 0 4037.622799130:main thread: tried selector action for builtin-file: 0 4037.622808619:main thread: Module builtin-file processed this config line. 
4037.622818753:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622830206:main thread: action 6 queue: save on shutdown 1, max disk space allowed 0 4037.622842911:main thread: action 6 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622854803:main thread: Action 0x686290: queue 0x6863a0 created 4037.622865624:main thread: cfline: 'lpr.* -/var/log/lpr.log' 4037.622874702:main thread: selector line successfully processed 4037.622883912:main thread: - traditional PRI filter 4037.622904459:main thread: symbolic name: * ==> 255 4037.622915496:main thread: symbolic name: lpr ==> 48 4037.622935076:main thread: tried selector action for builtin-file: 0 4037.622944394:main thread: Module builtin-file processed this config line. 4037.622953982:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622965406:main thread: action 7 queue: save on shutdown 1, max disk space allowed 0 4037.622978123:main thread: action 7 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622989985:main thread: Action 0x686870: queue 0x686980 created 4037.623000683:main thread: cfline: 'mail.* -/var/log/mail.log' 4037.623009707:main thread: selector line successfully processed 4037.623018565:main thread: - traditional PRI filter 4037.623027088:main thread: symbolic name: * ==> 255 4037.623038884:main thread: symbolic name: mail ==> 16 4037.623058105:main thread: tried selector action for builtin-file: 0 4037.623067588:main thread: Module builtin-file processed this config line. 
4037.623077685:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623093423:main thread: action 8 queue: save on shutdown 1, max disk space allowed 0 4037.623107052:main thread: action 8 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623118908:main thread: Action 0x686e50: queue 0x686f60 created 4037.623129726:main thread: cfline: 'user.* -/var/log/user.log' 4037.623138774:main thread: selector line successfully processed 4037.623147684:main thread: - traditional PRI filter 4037.623156198:main thread: symbolic name: * ==> 255 4037.623167187:main thread: symbolic name: user ==> 8 4037.623186686:main thread: tried selector action for builtin-file: 0 4037.623196019:main thread: Module builtin-file processed this config line. 4037.623205766:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623217211:main thread: action 9 queue: save on shutdown 1, max disk space allowed 0 4037.623229541:main thread: action 9 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623240500:main thread: Action 0x6873f0: queue 0x687500 created 4037.623252272:main thread: cfline: 'mail.info -/var/log/mail.info' 4037.623261136:main thread: selector line successfully processed 4037.623269866:main thread: - traditional PRI filter 4037.623278671:main thread: symbolic name: info ==> 6 4037.623289546:main thread: symbolic name: mail ==> 16 4037.623308401:main thread: tried selector action for builtin-file: 0 4037.623317689:main thread: Module builtin-file processed this config line. 
4037.623327277:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623338569:main thread: action 10 queue: save on shutdown 1, max disk space allowed 0 4037.623351333:main thread: action 10 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623362865:main thread: Action 0x6879d0: queue 0x687ae0 created 4037.623373563:main thread: cfline: 'mail.warn -/var/log/mail.warn' 4037.623382608:main thread: selector line successfully processed 4037.623391311:main thread: - traditional PRI filter 4037.623399873:main thread: symbolic name: warn ==> 4 4037.623410589:main thread: symbolic name: mail ==> 16 4037.623429414:main thread: tried selector action for builtin-file: 0 4037.623438681:main thread: Module builtin-file processed this config line. 4037.623451643:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623463664:main thread: action 11 queue: save on shutdown 1, max disk space allowed 0 4037.623476036:main thread: action 11 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623486893:main thread: Action 0x687fb0: queue 0x6880c0 created 4037.623497465:main thread: cfline: 'mail.err /var/log/mail.err' 4037.623506468:main thread: selector line successfully processed 4037.623515453:main thread: - traditional PRI filter 4037.623523865:main thread: symbolic name: err ==> 3 4037.623545812:main thread: symbolic name: mail ==> 16 4037.623566230:main thread: tried selector action for builtin-file: 0 4037.623575947:main thread: Module builtin-file processed this config line. 
4037.623585871:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623597019:main thread: action 12 queue: save on shutdown 1, max disk space allowed 0 4037.623609634:main thread: action 12 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623621292:main thread: Action 0x688590: queue 0x6886a0 created 4037.623632775:main thread: cfline: 'news.crit /var/log/news/news.crit' 4037.623642228:main thread: selector line successfully processed 4037.623651312:main thread: - traditional PRI filter 4037.623660168:main thread: symbolic name: crit ==> 2 4037.623671004:main thread: symbolic name: news ==> 56 4037.623692517:main thread: tried selector action for builtin-file: 0 4037.623701901:main thread: Module builtin-file processed this config line. 4037.623711765:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623723191:main thread: action 13 queue: save on shutdown 1, max disk space allowed 0 4037.623735872:main thread: action 13 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623747536:main thread: Action 0x688b70: queue 0x688c80 created 4037.623758651:main thread: cfline: 'news.err /var/log/news/news.err' 4037.623767741:main thread: selector line successfully processed 4037.623776690:main thread: - traditional PRI filter 4037.623785240:main thread: symbolic name: err ==> 3 4037.623796478:main thread: symbolic name: news ==> 56 4037.623819517:main thread: tried selector action for builtin-file: 0 4037.623829048:main thread: Module builtin-file processed this config line. 
4037.623838879:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623850438:main thread: action 14 queue: save on shutdown 1, max disk space allowed 0 4037.623862924:main thread: action 14 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623873871:main thread: Action 0x689150: queue 0x689260 created 4037.623884569:main thread: cfline: 'news.notice -/var/log/news/news.notice' 4037.623893560:main thread: selector line successfully processed 4037.623902664:main thread: - traditional PRI filter 4037.623911415:main thread: symbolic name: notice ==> 5 4037.623922467:main thread: symbolic name: news ==> 56 4037.623942264:main thread: tried selector action for builtin-file: 0 4037.623951402:main thread: Module builtin-file processed this config line. 4037.623961122:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623972360:main thread: action 15 queue: save on shutdown 1, max disk space allowed 0 4037.623985014:main thread: action 15 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623996926:main thread: Action 0x689730: queue 0x689840 created 4037.624009085:main thread: cfline: '*.=debug;auth,authpriv.none;news.none;mail.none -/var/log/debug' 4037.624018460:main thread: selector line successfully processed 4037.624027550:main thread: - traditional PRI filter 4037.624036394:main thread: symbolic name: debug ==> 7 4037.624047617:main thread: symbolic name: none ==> 16 4037.624058183:main thread: symbolic name: auth ==> 32 4037.624069187:main thread: symbolic name: authpriv ==> 80 4037.624080178:main thread: symbolic name: none ==> 16 4037.624090699:main thread: symbolic name: news ==> 56 4037.624101499:main thread: symbolic name: none ==> 16 4037.624112416:main thread: symbolic name: mail ==> 16 4037.624131976:main thread: tried selector action for builtin-file: 0 4037.624141360:main thread: Module builtin-file processed this config line. 
4037.624151527:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.624166254:main thread: action 16 queue: save on shutdown 1, max disk space allowed 0 4037.624179048:main thread: action 16 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624203996:main thread: Action 0x689d10: queue 0x689e20 created 4037.624216560:main thread: cfline: '*.=info;*.=notice;*.=warn;auth,authpriv.none;cron,daemon.none;mail,news.none -/var/log/messages' 4037.624225710:main thread: selector line successfully processed 4037.624234992:main thread: - traditional PRI filter 4037.624243941:main thread: symbolic name: info ==> 6 4037.624255317:main thread: symbolic name: notice ==> 5 4037.624266620:main thread: symbolic name: warn ==> 4 4037.624277663:main thread: symbolic name: none ==> 16 4037.624288730:main thread: symbolic name: auth ==> 32 4037.624299497:main thread: symbolic name: authpriv ==> 80 4037.624310429:main thread: symbolic name: none ==> 16 4037.624321088:main thread: symbolic name: cron ==> 72 4037.624331828:main thread: symbolic name: daemon ==> 24 4037.624342664:main thread: symbolic name: none ==> 16 4037.624353199:main thread: symbolic name: mail ==> 16 4037.624363960:main thread: symbolic name: news ==> 56 4037.624383361:main thread: tried selector action for builtin-file: 0 4037.624392931:main thread: Module builtin-file processed this config line. 
4037.624402870:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.624414390:main thread: action 17 queue: save on shutdown 1, max disk space allowed 0 4037.624427209:main thread: action 17 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624438942:main thread: Action 0x68a2f0: queue 0x68a400 created 4037.624450554:main thread: cfline: '*.emerg *' 4037.624459350:main thread: selector line successfully processed 4037.624468485:main thread: - traditional PRI filter 4037.624477275:main thread: symbolic name: emerg ==> 0 4037.624489113:main thread: tried selector action for builtin-file: -2001 4037.624498587:main thread: tried selector action for builtin-fwd: -2001 4037.624509258:main thread: tried selector action for builtin-shell: -2001 4037.624519854:main thread: tried selector action for builtin-discard: -2001 4037.624531161:main thread: write-alltried selector action for builtin-usrmsg: 0 4037.624543715:main thread: Module builtin-usrmsg processed this config line. 
4037.624553426:main thread: template: ' WallFmt' assigned 4037.624568261:main thread: action 18 queue: save on shutdown 1, max disk space allowed 0 4037.624581266:main thread: action 18 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624592975:main thread: Action 0x68ad40: queue 0x68af50 created 4037.624608143:main thread: cfline: 'daemon.*;mail.*;news.err;*.=debug;*.=info;*.=notice;*.=warn |/dev/xconsole' 4037.624617917:main thread: selector line successfully processed 4037.624627063:main thread: - traditional PRI filter 4037.624635829:main thread: symbolic name: * ==> 255 4037.624646719:main thread: symbolic name: daemon ==> 24 4037.624657687:main thread: symbolic name: * ==> 255 4037.624668442:main thread: symbolic name: mail ==> 16 4037.624679359:main thread: symbolic name: err ==> 3 4037.624689994:main thread: symbolic name: news ==> 56 4037.624700698:main thread: symbolic name: debug ==> 7 4037.624711852:main thread: symbolic name: info ==> 6 4037.624722777:main thread: symbolic name: notice ==> 5 4037.624733886:main thread: symbolic name: warn ==> 4 4037.624753131:main thread: Error opening log file: /dev/xconsole 4037.624764081:main thread: Called LogError, msg: /dev/xconsole rsyslogd: /dev/xconsole 4037.624834841:main thread: tried selector action for builtin-file: 0 4037.624844138:main thread: Module builtin-file processed this config line. 
4037.624854248:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.624866050:main thread: action 19 queue: save on shutdown 1, max disk space allowed 0 4037.624878512:main thread: action 19 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624889537:main thread: Action 0x68c870: queue 0x68c980 created 4037.624901089:main thread: selector line successfully processed 4037.624925545:main thread: main queue: is NOT disk-assisted 4037.624949380:main thread: main queue: type 0, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624967146:main thread: main queue:Reg: finalizing construction of worker thread pool 4037.624985371:main thread: main queue:Reg/w0: finalizing construction of worker instance data 4037.624994322:main thread: main queue: queue starts up without (loading) any DA disk state (this is normal for the DA queue itself!) 4037.625008485:main thread: main queue:Reg: high activity - starting 1 additional worker thread(s). 4037.625021502:main thread: main queue:Reg/w0: receiving command 2 4037.625062410:main thread: main queue:Reg: started with state 0, num workers now 1 4037.625097359:main thread: Main processing queue is initialized and running 4037.625132246:main thread: Opened UNIX socket '/dev/log' (fd 3). 
4037.625198155:main thread: main queue: entry added, size now 1 entries 4037.625212867:main thread: wtpAdviseMaxWorkers signals busy 4037.625224705:main thread: main queue: EnqueueMsg advised worker start 4037.625241685:40800950: main queue:Reg/w0: receiving command 4 4037.625272671:imuxsock.c: --------imuxsock calling select, active file descriptors (max 3): 3 4037.625309667:main thread: Active selectors: 4037.625319477:main thread: Selector 1: 4037.625327307:main thread: X X FF X X X X X X X X X X X X X X X X X X X X X X Actions: 4037.625400575:main thread: builtin-fwd: Instance data: 0x680d20 4037.625426870:main thread: RepeatedMsgReduction: 0 4037.625435459:main thread: Resume Interval: 30 4037.625443472:main thread: Suspended: 0 4037.625454034:main thread: Disabled: 0 4037.625462161:main thread: Exec only when previous is suspended: 0 4037.625470180:main thread: 4037.625477854:main thread: 4037.625486236:main thread: builtin-file: Instance data: 0x684870 4037.625499685:main thread: RepeatedMsgReduction: 0 4037.625508049:main thread: Resume Interval: 30 4037.625516113:main thread: Suspended: 0 4037.625526223:main thread: Disabled: 0 4037.625534110:main thread: Exec only when previous is suspended: 1 4037.625542227:main thread: 4037.625549973:main thread: 4037.625558091:main thread: 4037.625565903:main thread: Selector 2: 4037.625573421:main thread: X X X X FF X X X X X FF X X X 4037.625647001:main queue:Reg/w0: main queue: entry deleted, state 0, size now 0 entries 4037.625668214:main queue:Reg/w0: Called action, logging to builtin-file 4037.625702210:main queue:Reg/w0: (/var/log/syslog) From rgerhards at hq.adiscon.com Fri Jan 16 17:54:29 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 17:54:29 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: 
<1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> OK, maybe we can simplify the config; that would remove code paths from the potential bug candidate list. Could you comment out all the $ActionQueue* settings? Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 5:52 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> Lorenzo, > RG> > RG> one thing: can you change the actionqueuemode to "direct" just for a > RG> short period. I would be very interested to see what happens. > RG> > > Very short period... it crashed about as soon as started... I'm enclosing > both the log and the backtrace. > > See you soon, > > lorenzo > > > +-------------------------+----------------------------------------------+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | > +-------------------------+----------------------------------------------+ From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 18:07:50 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M.
Catucci) Date: Fri, 16 Jan 2009 18:07:50 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> OK, maybe we can simplify the config, that would remove code paths RG> from the potential bug candidate list. Could you comment out all the RG> $ActionQueue* settings? RG> Done, it's still crashing immediately! Here are the logs. lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done.
Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Core was generated by `rsyslogd -c4 -d'. Program terminated with signal 11, Segmentation fault. 
[New process 4397] [New process 4399] [New process 4398] [New process 4396] #0 0x00002ac177165030 in strlen () from /lib/libc.so.6 (gdb) Thread 4 (process 4396): #0 0x00002ac176acd4c5 in __lll_unlock_wake () from /lib/libpthread.so.0 #1 0x00002ac176ac9ff9 in _L_unlock_56 () from /lib/libpthread.so.0 #2 0x00002ac176ac9c56 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0 #3 0x0000000000422a09 in dbgprint (pObj=, pszMsg=0x7fff34417d50 "FF ", lenMsg=3) at debug.c:157 #4 0x0000000000422c33 in dbgprintf (fmt=) at debug.c:892 #5 0x000000000040d549 in init () at syslogd.c:2209 #6 0x000000000040da79 in mainThread () at syslogd.c:2954 #7 0x000000000040ee96 in realMain (argc=, argv=) at syslogd.c:3631 #8 0x00002ac1771081a6 in __libc_start_main () from /lib/libc.so.6 #9 0x000000000040a259 in _start () Thread 3 (process 4398): #0 0x00002ac1771b2ce2 in select () from /lib/libc.so.6 #1 0x00002ac1778589fd in runInput (pThrd=) at imuxsock.c:280 #2 0x000000000044377f in thrdStarter (arg=0x6a3a30) at ../threads.c:139 #3 0x00002ac176ac6fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002ac1771b95ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 4399): #0 0x00002ac176acd7db in read () from /lib/libpthread.so.0 #1 0x00002ac177a5d1ef in klogLogKMsg () at linux.c:449 #2 0x00002ac177a5c594 in runInput (pThrd=0x6a6a30) at imklog.c:224 #3 0x000000000044377f in thrdStarter (arg=0x6a6a30) at ../threads.c:139 #4 0x00002ac176ac6fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002ac1771b95ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 4397): #0 0x00002ac177165030 in strlen () from /lib/libc.so.6 #1 0x00002ac177131cb1 in vfprintf () from /lib/libc.so.6 #2 0x00002ac177137c08 in fprintf () from /lib/libc.so.6 #3 0x000000000041ce7d in msgDestruct (ppThis=) at msg.c:350 #4 0x000000000044283a in actionCallDoAction (pAction=0x685570, pMsg=0x6a4090) at ../action.c:495 #5 0x0000000000439c9c in qAddDirect (pThis=0x685680, pUsr=0x6a4090) at queue.c:939 #6 0x000000000043dd83 in queueEnqObj (pThis=0x685680, flowCtlType=, pUsr=0x6a4090) at queue.c:1016 #7 0x0000000000442d63 in actionWriteToAction (pAction=0x685570) at ../action.c:672 #8 0x00000000004430d0 in actionCallAction (pAction=0x685570, pMsg=0x6a4090) at ../action.c:778 #9 0x000000000040b307 in processMsgDoActions (pData=0x685570, pParam=0x407ffe90) at syslogd.c:1140 #10 0x000000000041def8 in llExecFunc (pThis=0x6853e0, pFunc=0x40b2b0 , pParam=0x407ffe90) at linkedlist.c:391 #11 0x000000000040ae19 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183 #12 0x000000000043c577 in queueConsumerReg (pThis=0x68cb20, pWti=0x6a0ed0, iCancelStateSave=) at queue.c:1598 #13 0x0000000000433050 in wtiWorker (pThis=0x6a0ed0) at wti.c:416 #14 0x00000000004317aa in wtpWorker (arg=0x6a0ed0) at wtp.c:425 #15 0x00002ac176ac6fc7 in start_thread () from /lib/libpthread.so.0 #16 0x00002ac1771b95ad in clone () from /lib/libc.so.6 #17 0x0000000000000000 in ?? () (gdb) quit -------------- next part -------------- 5437.595245610:main thread: Writing pidfile /var/run/rsyslogd.pid. 5437.595686368:main thread: rsyslog 4.1.3 - called init() 5437.595698050:main thread: Unloading non-static modules. 5437.595709554:main thread: module lmnet NOT unloaded because it still has a refcount of 3 5437.595719067:main thread: Clearing templates. 
5437.595771624:main thread: cfline: '$ModLoad imuxsock # provides support for local system logging' 5437.595788522:main thread: Requested to load module 'imuxsock' 5437.595799718:main thread: loading module '/usr/lib/rsyslog/imuxsock.so' 5437.595870056:main thread: imuxsock version 4.1.3 initializing 5437.595908971:main thread: module of type 0 being loaded. 5437.595923470:main thread: cfline: '$ModLoad imklog # provides kernel logging support (previously done by rklogd)' 5437.595935908:main thread: Requested to load module 'imklog' 5437.595945421:main thread: loading module '/usr/lib/rsyslog/imklog.so' 5437.596063430:main thread: module of type 0 being loaded. 5437.596081982:main thread: cfline: '$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat' 5437.596103387:main thread: cfline: '$FileOwner root' 5437.596366370:main thread: uid 0 obtained for user 'root' 5437.596380167:main thread: cfline: '$FileGroup adm' 5437.596439758:main thread: gid 4 obtained for group 'adm' 5437.596452445:main thread: cfline: '$FileCreateMode 0640' 5437.596466524:main thread: cfline: '$IncludeConfig /etc/rsyslog.d/*.conf' 5437.596530495:main thread: requested to include config file '/etc/rsyslog.d/remote.conf' 5437.596560987:main thread: cfline: '$WorkDirectory /var/log/rsyslog' 5437.596580414:main thread: cfline: '$ActionResumeRetryCount -1 # infinite retries on insert failure' 5437.596596212:main thread: cfline: 'mail.* @@xx.yy.zz.tt:514' 5437.596612292:main thread: - traditional PRI filter 5437.596622579:main thread: symbolic name: * ==> 255 5437.596635854:main thread: symbolic name: mail ==> 16 5437.596652432:main thread: tried selector action for builtin-file: -2001 5437.596668871:main thread: caller requested object 'netstrms', not found (iRet -3003) 5437.596678996:main thread: Requested to load module 'lmnetstrms' 5437.596688740:main thread: loading module '/usr/lib/rsyslog/lmnetstrms.so' 5437.596773657:main thread: module of type 2 being loaded. 
5437.596787910:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 1 5437.596801209:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 2 5437.596819848:main thread: caller requested object 'tcpclt', not found (iRet -3003) 5437.596830324:main thread: Requested to load module 'lmtcpclt' 5437.596839704:main thread: loading module '/usr/lib/rsyslog/lmtcpclt.so' 5437.596905755:main thread: module of type 2 being loaded. 5437.596919522:main thread: source file omfwd.c requested reference for module 'lmtcpclt', reference count now 1 5437.596932436:main thread: hostname 'xx.yy.zz.tt', port '514' 5437.596953352:main thread: tried selector action for builtin-fwd: 0 5437.596966354:main thread: Module builtin-fwd processed this config line. 5437.596982080:main thread: template: 'RSYSLOG_TraditionalForwardFormat' assigned 5437.597007211:main thread: action 1 queue: save on shutdown 1, max disk space allowed 0 5437.597027685:main thread: action 1 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597040903:main thread: Action 0x683630: queue 0x683ad0 created 5437.597054232:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended on' 5437.597069310:main thread: cfline: '& /data/var_syslog/failover.log' 5437.597096292:main thread: tried selector action for builtin-file: 0 5437.597106887:main thread: Module builtin-file processed this config line. 
5437.597117030:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597129750:main thread: action 2 queue: save on shutdown 1, max disk space allowed 0 5437.597143076:main thread: action 2 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597154860:main thread: Action 0x6849d0: queue 0x684c10 created 5437.597184145:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended off' 5437.597199760:main thread: selector line successfully processed 5437.597220784:main thread: cfline: 'auth,authpriv.* /var/log/auth.log' 5437.597232670:main thread: - traditional PRI filter 5437.597241793:main thread: symbolic name: * ==> 255 5437.597253664:main thread: symbolic name: auth ==> 32 5437.597265178:main thread: symbolic name: authpriv ==> 80 5437.597288145:main thread: tried selector action for builtin-file: 0 5437.597298717:main thread: Module builtin-file processed this config line. 5437.597311914:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597324910:main thread: action 3 queue: save on shutdown 1, max disk space allowed 0 5437.597338377:main thread: action 3 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597350305:main thread: Action 0x685000: queue 0x6850c0 created 5437.597361812:main thread: cfline: '*.*;auth,authpriv.none -/var/log/syslog' 5437.597371760:main thread: selector line successfully processed 5437.597381303:main thread: - traditional PRI filter 5437.597390459:main thread: symbolic name: * ==> 255 5437.597402201:main thread: symbolic name: none ==> 16 5437.597413358:main thread: symbolic name: auth ==> 32 5437.597424743:main thread: symbolic name: authpriv ==> 80 5437.597445059:main thread: tried selector action for builtin-file: 0 5437.597455309:main thread: Module builtin-file processed this config line. 
5437.597465506:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597477596:main thread: action 4 queue: save on shutdown 1, max disk space allowed 0 5437.597490499:main thread: action 4 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597501863:main thread: Action 0x685570: queue 0x685680 created 5437.597513116:main thread: cfline: 'daemon.* -/var/log/daemon.log' 5437.597522704:main thread: selector line successfully processed 5437.597532007:main thread: - traditional PRI filter 5437.597540904:main thread: symbolic name: * ==> 255 5437.597552373:main thread: symbolic name: daemon ==> 24 5437.597573067:main thread: tried selector action for builtin-file: 0 5437.597583540:main thread: Module builtin-file processed this config line. 5437.597593506:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597605173:main thread: action 5 queue: save on shutdown 1, max disk space allowed 0 5437.597618478:main thread: action 5 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597630567:main thread: Action 0x685b50: queue 0x685c60 created 5437.597641754:main thread: cfline: 'kern.* -/var/log/kern.log' 5437.597651414:main thread: selector line successfully processed 5437.597660795:main thread: - traditional PRI filter 5437.597669852:main thread: symbolic name: * ==> 255 5437.597681123:main thread: symbolic name: kern ==> 0 5437.597705051:main thread: tried selector action for builtin-file: 0 5437.597715490:main thread: Module builtin-file processed this config line. 
5437.597725735:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597737798:main thread: action 6 queue: save on shutdown 1, max disk space allowed 0 5437.597751004:main thread: action 6 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597762995:main thread: Action 0x686130: queue 0x686240 created 5437.597774134:main thread: cfline: 'lpr.* -/var/log/lpr.log' 5437.597783830:main thread: selector line successfully processed 5437.597793046:main thread: - traditional PRI filter 5437.597801811:main thread: symbolic name: * ==> 255 5437.597813298:main thread: symbolic name: lpr ==> 48 5437.597833524:main thread: tried selector action for builtin-file: 0 5437.597843772:main thread: Module builtin-file processed this config line. 5437.597853705:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597865321:main thread: action 7 queue: save on shutdown 1, max disk space allowed 0 5437.597890979:main thread: action 7 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597903456:main thread: Action 0x686710: queue 0x686820 created 5437.597914697:main thread: cfline: 'mail.* -/var/log/mail.log' 5437.597924591:main thread: selector line successfully processed 5437.597934092:main thread: - traditional PRI filter 5437.597943242:main thread: symbolic name: * ==> 255 5437.597954096:main thread: symbolic name: mail ==> 16 5437.597974738:main thread: tried selector action for builtin-file: 0 5437.597985043:main thread: Module builtin-file processed this config line. 
5437.597995450:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598012859:main thread: action 8 queue: save on shutdown 1, max disk space allowed 0 5437.598027103:main thread: action 8 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598039193:main thread: Action 0x686cf0: queue 0x686e00 created 5437.598050464:main thread: cfline: 'user.* -/var/log/user.log' 5437.598059877:main thread: selector line successfully processed 5437.598069162:main thread: - traditional PRI filter 5437.598078222:main thread: symbolic name: * ==> 255 5437.598089760:main thread: symbolic name: user ==> 8 5437.598110994:main thread: tried selector action for builtin-file: 0 5437.598121194:main thread: Module builtin-file processed this config line. 5437.598130959:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598142863:main thread: action 9 queue: save on shutdown 1, max disk space allowed 0 5437.598156515:main thread: action 9 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598168587:main thread: Action 0x687290: queue 0x6873a0 created 5437.598180692:main thread: cfline: 'mail.info -/var/log/mail.info' 5437.598190523:main thread: selector line successfully processed 5437.598199946:main thread: - traditional PRI filter 5437.598208868:main thread: symbolic name: info ==> 6 5437.598220223:main thread: symbolic name: mail ==> 16 5437.598240955:main thread: tried selector action for builtin-file: 0 5437.598251116:main thread: Module builtin-file processed this config line. 
5437.598261157:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598272995:main thread: action 10 queue: save on shutdown 1, max disk space allowed 0 5437.598286279:main thread: action 10 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598298537:main thread: Action 0x687870: queue 0x687980 created 5437.598309727:main thread: cfline: 'mail.warn -/var/log/mail.warn' 5437.598319450:main thread: selector line successfully processed 5437.598329097:main thread: - traditional PRI filter 5437.598338166:main thread: symbolic name: warn ==> 4 5437.598349602:main thread: symbolic name: mail ==> 16 5437.598369906:main thread: tried selector action for builtin-file: 0 5437.598379983:main thread: Module builtin-file processed this config line. 5437.598389949:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598405150:main thread: action 11 queue: save on shutdown 1, max disk space allowed 0 5437.598419093:main thread: action 11 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598430433:main thread: Action 0x687e50: queue 0x687f60 created 5437.598441455:main thread: cfline: 'mail.err /var/log/mail.err' 5437.598450704:main thread: selector line successfully processed 5437.598459923:main thread: - traditional PRI filter 5437.598468857:main thread: symbolic name: err ==> 3 5437.598480887:main thread: symbolic name: mail ==> 16 5437.598501595:main thread: tried selector action for builtin-file: 0 5437.598515449:main thread: Module builtin-file processed this config line. 
5437.598525751:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598537496:main thread: action 12 queue: save on shutdown 1, max disk space allowed 0 5437.598550707:main thread: action 12 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598573321:main thread: Action 0x688430: queue 0x688540 created 5437.598585363:main thread: cfline: 'news.crit /var/log/news/news.crit' 5437.598595176:main thread: selector line successfully processed 5437.598604833:main thread: - traditional PRI filter 5437.598613572:main thread: symbolic name: crit ==> 2 5437.598624768:main thread: symbolic name: news ==> 56 5437.598647705:main thread: tried selector action for builtin-file: 0 5437.598657971:main thread: Module builtin-file processed this config line. 5437.598668150:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598680177:main thread: action 13 queue: save on shutdown 1, max disk space allowed 0 5437.598693176:main thread: action 13 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598705332:main thread: Action 0x688a10: queue 0x688b20 created 5437.598716744:main thread: cfline: 'news.err /var/log/news/news.err' 5437.598726596:main thread: selector line successfully processed 5437.598736043:main thread: - traditional PRI filter 5437.598744979:main thread: symbolic name: err ==> 3 5437.598756160:main thread: symbolic name: news ==> 56 5437.598777286:main thread: tried selector action for builtin-file: 0 5437.598787129:main thread: Module builtin-file processed this config line. 
5437.598800314:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598812803:main thread: action 14 queue: save on shutdown 1, max disk space allowed 0 5437.598826177:main thread: action 14 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598837804:main thread: Action 0x688ff0: queue 0x689100 created 5437.598849081:main thread: cfline: 'news.notice -/var/log/news/news.notice' 5437.598858618:main thread: selector line successfully processed 5437.598867741:main thread: - traditional PRI filter 5437.598876750:main thread: symbolic name: notice ==> 5 5437.598888111:main thread: symbolic name: news ==> 56 5437.598908859:main thread: tried selector action for builtin-file: 0 5437.598919188:main thread: Module builtin-file processed this config line. 5437.598929240:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598940865:main thread: action 15 queue: save on shutdown 1, max disk space allowed 0 5437.598953981:main thread: action 15 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598966119:main thread: Action 0x6895d0: queue 0x6896e0 created 5437.598978587:main thread: cfline: '*.=debug;auth,authpriv.none;news.none;mail.none -/var/log/debug' 5437.598988430:main thread: selector line successfully processed 5437.598997799:main thread: - traditional PRI filter 5437.599006913:main thread: symbolic name: debug ==> 7 5437.599018781:main thread: symbolic name: none ==> 16 5437.599030057:main thread: symbolic name: auth ==> 32 5437.599041136:main thread: symbolic name: authpriv ==> 80 5437.599052494:main thread: symbolic name: none ==> 16 5437.599063705:main thread: symbolic name: news ==> 56 5437.599075069:main thread: symbolic name: none ==> 16 5437.599086205:main thread: symbolic name: mail ==> 16 5437.599107133:main thread: tried selector action for builtin-file: 0 5437.599117174:main thread: Module builtin-file processed this config line. 
5437.599127409:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.599139610:main thread: action 16 queue: save on shutdown 1, max disk space allowed 0 5437.599152729:main thread: action 16 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599164081:main thread: Action 0x689bb0: queue 0x689cc0 created 5437.599176261:main thread: cfline: '*.=info;*.=notice;*.=warn;auth,authpriv.none;cron,daemon.none;mail,news.none -/var/log/messages' 5437.599185849:main thread: selector line successfully processed 5437.599198395:main thread: - traditional PRI filter 5437.599207875:main thread: symbolic name: info ==> 6 5437.599231109:main thread: symbolic name: notice ==> 5 5437.599243598:main thread: symbolic name: warn ==> 4 5437.599255067:main thread: symbolic name: none ==> 16 5437.599266446:main thread: symbolic name: auth ==> 32 5437.599277561:main thread: symbolic name: authpriv ==> 80 5437.599294223:main thread: symbolic name: none ==> 16 5437.599305491:main thread: symbolic name: cron ==> 72 5437.599316587:main thread: symbolic name: daemon ==> 24 5437.599327972:main thread: symbolic name: none ==> 16 5437.599338829:main thread: symbolic name: mail ==> 16 5437.599349656:main thread: symbolic name: news ==> 56 5437.599370203:main thread: tried selector action for builtin-file: 0 5437.599380253:main thread: Module builtin-file processed this config line. 
5437.599390312:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.599402081:main thread: action 17 queue: save on shutdown 1, max disk space allowed 0 5437.599414977:main thread: action 17 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599426824:main thread: Action 0x68a190: queue 0x68a2a0 created 5437.599438617:main thread: cfline: '*.emerg *' 5437.599447848:main thread: selector line successfully processed 5437.599457043:main thread: - traditional PRI filter 5437.599465704:main thread: symbolic name: emerg ==> 0 5437.599477968:main thread: tried selector action for builtin-file: -2001 5437.599487949:main thread: tried selector action for builtin-fwd: -2001 5437.599498509:main thread: tried selector action for builtin-shell: -2001 5437.599509125:main thread: tried selector action for builtin-discard: -2001 5437.599520609:main thread: write-alltried selector action for builtin-usrmsg: 0 5437.599533671:main thread: Module builtin-usrmsg processed this config line. 
5437.599543706:main thread: template: ' WallFmt' assigned 5437.599558706:main thread: action 18 queue: save on shutdown 1, max disk space allowed 0 5437.599572392:main thread: action 18 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599584044:main thread: Action 0x68abe0: queue 0x68adf0 created 5437.599599727:main thread: cfline: 'daemon.*;mail.*;news.err;*.=debug;*.=info;*.=notice;*.=warn |/dev/xconsole' 5437.599609933:main thread: selector line successfully processed 5437.599619050:main thread: - traditional PRI filter 5437.599627543:main thread: symbolic name: * ==> 255 5437.599639018:main thread: symbolic name: daemon ==> 24 5437.599650199:main thread: symbolic name: * ==> 255 5437.599661098:main thread: symbolic name: mail ==> 16 5437.599672207:main thread: symbolic name: err ==> 3 5437.599683163:main thread: symbolic name: news ==> 56 5437.599694127:main thread: symbolic name: debug ==> 7 5437.599705530:main thread: symbolic name: info ==> 6 5437.599716852:main thread: symbolic name: notice ==> 5 5437.599728234:main thread: symbolic name: warn ==> 4 5437.599747710:main thread: Error opening log file: /dev/xconsole 5437.599759170:main thread: Called LogError, msg: /dev/xconsole rsyslogd: /dev/xconsole 5437.599828730:main thread: tried selector action for builtin-file: 0 5437.599838531:main thread: Module builtin-file processed this config line. 
5437.599848509:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.599860758:main thread: action 19 queue: save on shutdown 1, max disk space allowed 0 5437.599874021:main thread: action 19 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599885441:main thread: Action 0x68c710: queue 0x68c820 created 5437.599897609:main thread: selector line successfully processed 5437.599922620:main thread: main queue: is NOT disk-assisted 5437.599936522:main thread: main queue: type 0, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599955023:main thread: main queue:Reg: finalizing construction of worker thread pool 5437.599972840:main thread: main queue:Reg/w0: finalizing construction of worker instance data 5437.599983459:main thread: main queue: queue starts up without (loading) any DA disk state (this is normal for the DA queue itself!) 5437.600011680:main thread: main queue:Reg: high activity - starting 1 additional worker thread(s). 5437.600025603:main thread: main queue:Reg/w0: receiving command 2 5437.600068026:main thread: main queue:Reg: started with state 0, num workers now 1 5437.600104178:main thread: Main processing queue is initialized and running 5437.600141621:main thread: Opened UNIX socket '/dev/log' (fd 3). 
5437.600209693:main thread: main queue: entry added, size now 1 entries 5437.600224753:main thread: wtpAdviseMaxWorkers signals busy 5437.600237254:main thread: main queue: EnqueueMsg advised worker start 5437.600255410:40800950: main queue:Reg/w0: receiving command 4 5437.600288919:imuxsock.c: --------imuxsock calling select, active file descriptors (max 3): 3 5437.600327454:main thread: Active selectors: 5437.600338062:main thread: Selector 1: 5437.600345985:main thread: X X FF X X X X X X X X X X X X X X X X X X X X X X Actions: 5437.600417150:main thread: builtin-fwd: Instance data: 0x680a90 5437.600444615:main thread: RepeatedMsgReduction: 0 5437.600453239:main thread: Resume Interval: 30 5437.600461504:main thread: Suspended: 0 5437.600472064:main thread: Disabled: 0 5437.600480533:main thread: Exec only when previous is suspended: 0 5437.600489317:main thread: 5437.600497369:main thread: 5437.600506120:main thread: builtin-file: Instance data: 0x684710 5437.600520046:main thread: RepeatedMsgReduction: 0 5437.600528425:main thread: Resume Interval: 30 5437.600536885:main thread: Suspended: 0 5437.600547397:main thread: Disabled: 0 5437.600555448:main thread: Exec only when previous is suspended: 1 5437.600563851:main thread: 5437.600571822:main thread: 5437.600579955:main thread: 5437.600587890:main thread: Selector 2: 5437.600595939:main thread: X X X X FF X X X X X FF X X X X X X X X X X X X X X Actions: 5437.600664965:main thread: builtin-file: Instance data: 0x67f920 5437.600677232:main thread: RepeatedMsgReduction: 0 5437.600685740:main thread: Resume Interval: 30 5437.600694011:main thread: Suspended: 0 5437.600704478:main thread: Disabled: 0 5437.600712497:main thread: Exec only when previous is suspended: 0 5437.600720972:main thread: 5437.600728721:main thread: 5437.600736893:main thread: 5437.600744893:main thread: Selector 3: 5437.600752783:main thread: FF FF FF FF X FF FF FF FF FF X FF FF FF FF FF FF FF FF FF FF FF FF 5437.600852964:main 
queue:Reg/w0: main queue: entry deleted, state 0, size now 0 entries
5437.600874750:main queue:Reg/w0: Called action, logging to builtin-file
5437.600907327:main queue:Reg/w0: (/var/log/syslog)

From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 18:17:30 2009
From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci)
Date: Fri, 16 Jan 2009 18:17:30 +0100 (CET)
Subject: [rsyslog] rsyslog still crashes
In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com>
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com>
Message-ID:

On Fri, 16 Jan 2009, Rainer Gerhards wrote:

RG> OK, maybe we can simplify the config, that would remove code paths
RG> from the potential bug candidate list. Could you comment out all the
RG> $ActionQueue* settings?
RG>

I've just restored the #if 0 in runtime/msg.c; it seems the immediate
crashes came from those two lines. Now logging.

Servus,

lorenzo

+-------------------------+----------------------------------------------+
| Lorenzo M. Catucci | Centro di Calcolo e Documentazione |
| catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
| | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY |
| Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 |
+-------------------------+----------------------------------------------+

From rgerhards at hq.adiscon.com Fri Jan 16 18:21:34 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Fri, 16 Jan 2009 18:21:34 +0100
Subject: [rsyslog] rsyslog still crashes
In-Reply-To:
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com>

Ah, ok. Side-note: I got my machine up and it is running some tests.
Unfortunately no aborts so far, but it has only 4 cores... I hope
something turns out...

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci
> Sent: Friday, January 16, 2009 6:18 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslog still crashes
>
> On Fri, 16 Jan 2009, Rainer Gerhards wrote:
>
> RG> OK, maybe we can simplify the config, that would remove code paths
> RG> from the potential bug candidate list. Could you comment out all the
> RG> $ActionQueue* settings?
> RG>
>
> I've just restored the #if 0 in runtime/msg.c; it seems the immediate
> crashes came from those two lines. Now logging.
>
> Servus,
>
> lorenzo
>
>
> +-------------------------+----------------------------------------------+
> | Lorenzo M. Catucci | Centro di Calcolo e Documentazione |
> | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
> | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY |
> | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125
> |
> +-------------------------+--------------------------------------------
> --+

From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 18:29:26 2009
From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci)
Date: Fri, 16 Jan 2009 18:29:26 +0100 (CET)
Subject: [rsyslog] rsyslog still crashes
In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com>
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com>
Message-ID:

On Fri, 16 Jan 2009, Rainer Gerhards wrote:

RG> Ah, ok. Side-note: I got my machine up and it is running some tests.
RG> Unfortunately no aborts so far, but it has only 4 cores... I hope
RG> something turns out...
RG>

I think the real problem is in keeping those cores very busy... I'd try to
spawn something like 20 loggers, each spawning a couple of "workers" per
second and logging startup/shutdown of any child. Maybe make each worker
sleep for a random time before exiting.

I don't have any Fedora/RedHat system; if nothing else, I'd suggest doing
your tests on a debian/testing system too.

Yours,

lorenzo

PS still running...

+-------------------------+----------------------------------------------+
| Lorenzo M. Catucci | Centro di Calcolo e Documentazione |
| catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
| | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY |
| Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 |
+-------------------------+----------------------------------------------+

From rgerhards at hq.adiscon.com Fri Jan 16 18:30:51 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Fri, 16 Jan 2009 18:30:51 +0100
Subject: [rsyslog] rsyslog still crashes
In-Reply-To:
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9CD@grfint2.intern.adiscon.com>

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci
> Sent: Friday, January 16, 2009 6:29 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslog still crashes
>
> On Fri, 16 Jan 2009, Rainer Gerhards wrote:
>
> RG> Ah, ok. Side-note: I got my machine up and it is running some tests.
> RG> Unfortunately no aborts so far, but it has only 4 cores... I hope
> RG> something turns out...
> RG>
>
> I think the real problem is in keeping those cores very busy... I'd try to
> spawn something like 20 loggers each spawning a couple "workers" per
> second and logging startup/shutdown of any child. Maybe make each worker
> sleep for a random time before exiting.

Good suggestion, thanks.

>
> I don't have any Fedora/RedHat system; if nothing else, I'd suggest doing
> your tests on a debian/testing system too.

That's what I am running on that machine - with components downloaded today.
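
[Editorial sketch, not part of the original thread.] The load pattern Lorenzo describes (around 20 loggers, each spawning short-lived workers and logging every start and exit) could look roughly like the following. This is only an illustration under stated assumptions: it is written in Python rather than taken from any script posted to the list, and the function names, the `stress-N` ident, the daemon facility, and all timing values are hypothetical choices. It needs a Unix system, since it relies on `os.fork` and the `syslog` module.

```python
import os
import random
import syslog
import time


def run_logger(logger_id, n_workers, max_sleep=0.2, spawn_interval=0.3):
    """Spawn n_workers children one after another, logging each start and exit.

    Returns the number of children that were reaped.
    """
    # Ident and facility are arbitrary choices for this sketch.
    syslog.openlog(ident="stress-%d" % logger_id, facility=syslog.LOG_DAEMON)
    reaped = 0
    for worker in range(n_workers):
        pid = os.fork()
        if pid == 0:
            # Worker: sleep for a random time, then exit immediately.
            time.sleep(random.uniform(0, max_sleep))
            os._exit(0)
        syslog.syslog(syslog.LOG_INFO, "worker %d started (pid %d)" % (worker, pid))
        os.waitpid(pid, 0)
        syslog.syslog(syslog.LOG_INFO, "worker %d exited (pid %d)" % (worker, pid))
        reaped += 1
        time.sleep(spawn_interval)
    syslog.closelog()
    return reaped


def run_stress(n_loggers=20, n_workers=10, max_sleep=0.2, spawn_interval=0.3):
    """Run n_loggers logger processes in parallel.

    Returns how many logger processes finished cleanly.
    """
    pids = []
    for logger_id in range(n_loggers):
        pid = os.fork()
        if pid == 0:
            # Logger child: never return into the caller's code path.
            status = 1
            try:
                if run_logger(logger_id, n_workers, max_sleep, spawn_interval) == n_workers:
                    status = 0
            finally:
                os._exit(status)
        pids.append(pid)
    clean = 0
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
            clean += 1
    return clean
```

Calling `run_stress()` with the defaults starts 20 parallel loggers; raising `n_workers` or lowering `spawn_interval` increases the message rate. Whether this is enough to trigger the race discussed in the thread is not something the sketch can promise.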
Rainer

From david at lang.hm Sat Jan 17 00:26:04 2009
From: david at lang.hm (david at lang.hm)
Date: Fri, 16 Jan 2009 15:26:04 -0800 (PST)
Subject: [rsyslog] Baclogged files to disk are pretty slow
In-Reply-To: <4255c2570901160719o4aa3bc6bk9813225374bfc53c@mail.gmail.com>
References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> <4255c2570901160719o4aa3bc6bk9813225374bfc53c@mail.gmail.com>
Message-ID:

On Fri, 16 Jan 2009, RB wrote:

> On Thu, Jan 15, 2009 at 16:01, Daniel Anson wrote:
>> I would hope that there is an easy solution as my next idea is to write
>> some type of daemonized process that can insert messages from a pool of
>> MySQL connections. I can achieve this in C but would rather hopefully
>> find a solution inside of the configuration.
>
> Short of implementing the queue/worker configuration (no idea how), it
> seems the only current option would be to implement something of the
> sort, either by an update to the ommysql module (optimal, as it gets
> your code supported by someone else for its lifetime) or by some
> external program.

multiple workers will help if MySQL can handle more transactions at a time
if they hit in parallel. the fact that you are doing 4000/sec indicates that
you are not doing an fsync for each insert, so it is unlikely to help (if you
are fsync limited the data rates are probably going to be closer to
100-200/sec depending on your drives)

> I'd think an optimal external solution would be some sort of
> relp2mysql bridge, but suspect that would end up reimplementing a good
> chunk of rsyslog.

actually, the optimal solution is to modify rsyslog to be able to handle
multiple messages at once in the output queues. that is a major effort (2-4
man weeks) that will require a sponsor.
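
[Editorial sketch, not part of the original thread.] The difference between committing every message on its own and handing the database several messages at once can be illustrated outside rsyslog. The following is a minimal sketch, assuming Python's built-in `sqlite3` module as a stand-in for MySQL and a hypothetical one-column `log` table; the function names are invented for the example. It is not rsyslog or ommysql code, and it makes no claim about the size of any speedup.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical one-column log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (msg TEXT)")
    return conn


def insert_one_by_one(conn, messages):
    """Insert each message in its own transaction; returns the commit count."""
    commits = 0
    for msg in messages:
        conn.execute("INSERT INTO log (msg) VALUES (?)", (msg,))
        conn.commit()
        commits += 1
    return commits


def insert_batched(conn, messages, batch_size=100):
    """Insert messages in groups, one commit per group; returns the commit count."""
    commits = 0
    for start in range(0, len(messages), batch_size):
        batch = messages[start:start + batch_size]
        conn.executemany("INSERT INTO log (msg) VALUES (?)", [(m,) for m in batch])
        conn.commit()
        commits += 1
    return commits
```

Both functions store the same rows; what changes is how many commits (and, on a real disk-backed database, potentially how many syncs) are needed for a given number of messages.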
Once this is implemented I would expect the throughput to go up by 2-3
orders of magnitude for database inserts.

David Lang

From david at lang.hm Sat Jan 17 03:31:24 2009
From: david at lang.hm (david at lang.hm)
Date: Fri, 16 Jan 2009 18:31:24 -0800 (PST)
Subject: [rsyslog] rsyslog still crashes
In-Reply-To: <1232046366.22744.34.camel@localhost.localdomain>
References: <1232046366.22744.34.camel@localhost.localdomain>
Message-ID:

On Thu, 15 Jan 2009, Rainer Gerhards wrote:

> On Fri, 2009-01-16 at 01:20 +0100, Michael Biebl wrote:
>> Given the -c4 command line argument, I'd expect it to be 4.1.3.
>>
>> Sounds familiar to
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=509292 (which is
>> 3.18.6).
>>
>> It seems to be a more general problem with multi core (= very fast??) systems.
>
> Yes, that is what my analysis so far points to. It's also part of the
> problem, because I do not have very fast hardware to reproduce the issue
> (and it is also not easy to reliably reproduce if you have...).
>
> I've gotten a couple of reports (I think most on the mailing list) on
> such problems and all they have in common is 4+ core machines.
>
> I'll try to get hold based on what Lorenzo submits. In his environment,
> the problem seems to occur most reliably (he probably has the fastest
> machine...).
>
> Lorenzo: details follow soon.

I just got some time to work on this sort of thing again.

my test system is a 4-socket (dual core) Opteron system with 16 GB of RAM.

I've done a fair amount of stress testing of the system without lockups
(around the time the 4.1 branch started).

if you can describe a test setup I can see about reproducing it.
David Lang From david at lang.hm Sat Jan 17 03:40:22 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 16 Jan 2009 18:40:22 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: > Lorenzo and others: > > I hopefully got a system today where I can reproduce. I am setting it up right now. I also have written a stub wiki page with information useful to hunt this bug: one other thing that you can do for this sort of thing is to use the amazon cloud. to quote a message from Rob Landley to the linux-kernel mailing list > My friend Mark's been experimenting with the amazon "cloud" thing, > feeding in an image with a qemu instance and distcc+cross-compiler, and > running builds under that. Renting an 8-way ~2.5 ghz server with 7 > gigabytes of ram and 1.6 terabytes of disk is 80 cents/hour through them > plus another few cents/day for bandwidth and persistent storage and > such. That's likely to get cheaper as time goes on. > > We're still planning to buy a build server of our own to have something > in- house, but for running nightly builds it's almost to the point where > depreciation on the hardware is more than buying time from a server > farm. Just _one_ of those 8-way servers is enough hardware to build an > entire distro in an hour or so. > > What this really allows us to do is experiment with "how parallel can we > get our build"? 
Because renting ten 8-way servers in a cluster is > $8/hour, and distcc already scales trivially over that. Down the road > what Firmware Linux is working towards is multiple qemu instances > running in parallel with a central instance distributing builds to each > one, so each can do its own ./configure in parallel, distribute > compilation to the distccd instances as it has stuff to compile, and > then package up the resulting binary into one of those portage tarballs > and send it back to the central node to install on a network mount that > the lot of 'em can mount as build context, so the packages can get their > dependencies right. (You don't want your build taking place in a > network mount, but your OS being on one you never write to isn't so bad > as long as you have local storage to build in.) > > We'll probably leverage the heck out of Portage for this, and might wind > up modifying it heavily. Dunno yet. (We can even force dependencies on > portage so it doesn't need to calculate 'em, the central node can do > that and then say "you have these packages, _build_"...) > > But yeah, hobbyists with a laptop, network access, and a monthly budget > of $20 can do cluster builds these days. would it make sense to start a fund to pay for some time for you to use like this? David Lang > http://wiki.rsyslog.com/index.php/V3_Race_Condition_Hunt_Page > > Lorenzo, can you please double-check I have used the right config indeed. > > All others: if you can add scenarios/information, please do. I'll try to repro the problem as soon as the system is ready. Hope it will work... > > Rainer > >> -----Original Message----- >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >> Sent: Friday, January 16, 2009 5:20 PM >> To: rsyslog-users >> Subject: Re: [rsyslog] rsyslog still crashes >> >> Lorenzo, >> >> one thing: can you change the actionqueuemode to "direct" just for a >> short period. 
I would be very interested to see what happens. >> >> Rainer >> >>> -----Original Message----- >>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >>> bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci >>> Sent: Friday, January 16, 2009 5:10 PM >>> To: rsyslog-users >>> Subject: Re: [rsyslog] rsyslog still crashes >>> >>> On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote: >>> >>> LMC> >>> LMC> The -n crash was completely silent; the -d run was chatty (as >>> expected); >>> LMC> with stdout redirected, it took a lot more time to crash, but >> here >>> are >>> LMC> both the logfile and the gdb backtrace. >>> LMC> >>> >>> As for the last crash, I found on the screen session the line: >>> >>> rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) >>> ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed. >>> >>> since I forgot redirecting stderr too. >>> >>> Yours, >>> >>> lorenzo >>> >>> +-------------------------+------------------------------------------ >> -- >>> --+ >>> | Lorenzo M. Catucci | Centro di Calcolo e Documentazione >>> | >>> | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor >>> Vergata" | >>> | | Via O. Raimondo 18 ** I-00173 ROMA ** >>> ITALY | >>> | Tel. +39 06 7259 2255 | Fax. 
+39 06 7259 2125 >>> | >>> +-------------------------+------------------------------------------ >> -- >>> --+ >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From david at lang.hm Sat Jan 17 03:42:09 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 16 Jan 2009 18:42:09 -0800 (PST) Subject: [rsyslog] rsyslog on AIX Message-ID: We are looking at using rsyslog on AIX and the sysadmins are reporting 'problems getting it to compile' (unfortunately no details yet). Has anyone tried this? David Lang From mbiebl at gmail.com Sat Jan 17 11:10:39 2009 From: mbiebl at gmail.com (Michael Biebl) Date: Sat, 17 Jan 2009 11:10:39 +0100 Subject: [rsyslog] Is rsyslog leaking memory? Message-ID: Hi, I'm running rsyslog 3.20.2 I noticed the following: # /etc/init.d/rsyslog restart VSZ RSS (as reported by ps) 27100 1184 # logger foo 27100 1196 # logger foo (1000x) 27100 1200 # logger foo (1000x) 27100 1204 # logger foo (1000x) 27100 1208 and so on. This made me wonder if rsyslog is leaking memory somewhere. I also noticed that for each loaded module, rsyslog reserves exactly 8 MB of anonymous memory (pmap -d `pgrep rsyslog`). With a couple of loaded modules you easily get over 50 MB VSZ. Michael -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? From david at lang.hm Sun Jan 18 01:55:56 2009 From: david at lang.hm (david at lang.hm) Date: Sat, 17 Jan 2009 16:55:56 -0800 (PST) Subject: [rsyslog] Is rsyslog leaking memory?
In-Reply-To: References: Message-ID: On Sat, 17 Jan 2009, Michael Biebl wrote: > Hi, > > I'm running rsyslog 3.20.2 > > I noticed the following: > # /etc/init.d/rsyslog restart > VSZ RSS (as reported by ps) > 27100 1184 > # logger foo > 27100 1196 > # logger foo (1000x) > 27100 1200 > # logger foo (1000x) > 27100 1204 > # logger foo (1000x) > 27100 1208 > > and so on. > > > This made me wonder, if rsyslog is leaking memory somewhere. I have run rsyslog through stress tests where I have sent it 1B log messages and do not think that there is a memory leak. what I think that you are seeing is that the default rsyslog memory queue only uses as much ram as it needs to hold the data (even though it's described as a array it seems to grow dynamicly, I'm not sure about it shrinking) when you log a bunch of messages via logger you push data into the array faster then it gets extracted, so it takes more memory (up until you hit the max size of the array, which I think is 1000 entries) > I also noticed, that for each loaded module, rsyslog resevers exactly > 8 Mb of anoymous memory (pmap -d `pgrep rsyslog`) > With a couple of loaded modules you easily get over 50Mb VSZ. I haven't tried doing stuff with different modules, so I don't know about this. David Lang From rgerhards at hq.adiscon.com Sun Jan 18 12:01:53 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 12:01:53 +0100 Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: References: Message-ID: <1232276513.22744.45.camel@localhost.localdomain> On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: > On Sat, 17 Jan 2009, Michael Biebl wrote: > > > Hi, > > > > I'm running rsyslog 3.20.2 > > > > I noticed the following: > > # /etc/init.d/rsyslog restart > > VSZ RSS (as reported by ps) > > 27100 1184 > > # logger foo > > 27100 1196 > > # logger foo (1000x) > > 27100 1200 > > # logger foo (1000x) > > 27100 1204 > > # logger foo (1000x) > > 27100 1208 > > > > and so on. 
> > > > > > This made me wonder, if rsyslog is leaking memory somewhere. > > I have run rsyslog through stress tests where I have sent it 1B log > messages and do not think that there is a memory leak. I am using valgrind's excellent memory debugger routinely during development. That brings up leaks and invalid memory accesses rather quickly. In fact, code quality has much improved since I started to use valgrind routinely roughly a year ago. From time to time I also do specific tests for leaks, using both valgrind and traditional analysis techniques. From what I have seen so far, I, too, doubt there is a leak. However, there are various levels of testing. For example, the postgres output module and the GSSAPI code are contributed and I do not even have a test environment for them. So these are not checked using that procedure. The libdbi code is only checked every now and then and not with all backends (e.g. no Oracle at hand ... and so on...). If I ever get around to a full testing suite (no collaborators found so far...), I'll probably be able to do more consistent testing of all modules. > > what I think that you are seeing is that the default rsyslog memory queue > only uses as much ram as it needs to hold the data (even though it's > described as a array it seems to grow dynamicly, I'm not sure about it > shrinking) If you use "fixedarray" mode, the pointer array is allocated statically, no matter how many messages are in the queue. HOWEVER, this is only the pointers, so quite little memory. Actual messages are dynamically allocated and freed when processed - in any mode. > > when you log a bunch of messages via logger you push data into the array > faster then it gets extracted, so it takes more memory (up until you hit > the max size of the array, which I think is 1000 entries) > > > I also noticed, that for each loaded module, rsyslog resevers exactly > > 8 Mb of anoymous memory (pmap -d `pgrep rsyslog`) > > With a couple of loaded modules you easily get over 50Mb VSZ.
> > I haven't tried doing stuff with different modules, so I don't know about > this. I am not sure where it comes from, but I'd look in the dlload direction. It could also very well be the runtime stack for each thread (I have not dug into the details). Rainer From david at lang.hm Sun Jan 18 13:33:16 2009 From: david at lang.hm (david at lang.hm) Date: Sun, 18 Jan 2009 04:33:16 -0800 (PST) Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: <1232276513.22744.45.camel@localhost.localdomain> References: <1232276513.22744.45.camel@localhost.localdomain> Message-ID: On Sun, 18 Jan 2009, Rainer Gerhards wrote: > On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: > >> >> what I think that you are seeing is that the default rsyslog memory queue >> only uses as much ram as it needs to hold the data (even though it's >> described as a array it seems to grow dynamicly, I'm not sure about it >> shrinking) > > If you use "fixedarray" mode, the pointer array is allocated statically, > no matter how many messages are in the queue. HOWEVER, this is only the > pointers, so quite few memory. Actual messages are dynamically allocated > and freed when processed - in any mode. That makes sense. It would be interesting to see what would happen to the enqueue/dequeue timings if the message memory were statically allocated. From what I remember seeing of the memory footprint, it does appear as if you allocate the max size for the message each time, not the minimum size needed to hold the message. If that shows a noticeable difference, it may be worth allocating the memory in chunks substantially larger than a single message. David Lang From rgerhards at hq.adiscon.com Sun Jan 18 12:21:24 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 12:21:24 +0100 Subject: [rsyslog] Is rsyslog leaking memory?
In-Reply-To: References: <1232276513.22744.45.camel@localhost.localdomain> Message-ID: <1232277684.22744.48.camel@localhost.localdomain> On Sun, 2009-01-18 at 04:33 -0800, david at lang.hm wrote: > On Sun, 18 Jan 2009, Rainer Gerhards wrote: > > > On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: > > > >> > >> what I think that you are seeing is that the default rsyslog memory queue > >> only uses as much ram as it needs to hold the data (even though it's > >> described as a array it seems to grow dynamicly, I'm not sure about it > >> shrinking) > > > > If you use "fixedarray" mode, the pointer array is allocated statically, > > no matter how many messages are in the queue. HOWEVER, this is only the > > pointers, so quite few memory. Actual messages are dynamically allocated > > and freed when processed - in any mode. > > that makes sense. It would be interesting to see what would happen to the > enqueue/dequeue timings if the message memory was staticly allocated > > from what I remember seeing of the memory footprint it does appear as if > you allocate the max size for the message each time, not the minimum sized > needed to hold the message > yes, that's right. This is done to prevent an additional copy to clean things up (realloc might work, too) and memory fragmentation. The later is really nasty, I've seen that some memory areas remain allocated for quite some while due to fragmentation. > if that shows a noticable difference it may be worth allocating the memory > in chunks substantially larger than a single message That's a good suggestion. The basic classes are able to trim strings. It may be worth putting a config option into it. The current approach works well for small queues, but obviously does provide sub-optimal performance as soon as the queues grow considerably. So it may even make sense to start trimming messages only after a certain amount of messages are in-queue. 
Rainer From rgerhards at hq.adiscon.com Sun Jan 18 12:26:51 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 12:26:51 +0100 Subject: [rsyslog] rsyslog on AIX In-Reply-To: References: Message-ID: <1232278011.22744.50.camel@localhost.localdomain> On Fri, 2009-01-16 at 18:42 -0800, david at lang.hm wrote: > we are looking at using rsyslog on AIX and the sysadmins are reporting > 'problems getting it to compile' (unfortunantly no details yet) > > has anyone tried this? All I know is that it doesn't work. No idea on how hard it is to get this right. Some time ago I was interested in porting (and had the time...) but found neither (virtual) hardware/software nor anyone interested in it. So I dropped the idea. I'd still be very interested in a port, but now unfortunately have much less time... Rainer From david at lang.hm Mon Jan 19 02:30:58 2009 From: david at lang.hm (david at lang.hm) Date: Sun, 18 Jan 2009 17:30:58 -0800 (PST) Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: <1232277684.22744.48.camel@localhost.localdomain> References: <1232276513.22744.45.camel@localhost.localdomain> <1232277684.22744.48.camel@localhost.localdomain> Message-ID: On Sun, 18 Jan 2009, Rainer Gerhards wrote: > On Sun, 2009-01-18 at 04:33 -0800, david at lang.hm wrote: >> On Sun, 18 Jan 2009, Rainer Gerhards wrote: >> >>> On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: >>> >>>> >>>> what I think that you are seeing is that the default rsyslog memory queue >>>> only uses as much ram as it needs to hold the data (even though it's >>>> described as a array it seems to grow dynamicly, I'm not sure about it >>>> shrinking) >>> >>> If you use "fixedarray" mode, the pointer array is allocated statically, >>> no matter how many messages are in the queue. HOWEVER, this is only the >>> pointers, so quite few memory. Actual messages are dynamically allocated >>> and freed when processed - in any mode. >> >> that makes sense. 
It would be interesting to see what would happen to the >> enqueue/dequeue timings if the message memory was staticly allocated >> >> from what I remember seeing of the memory footprint it does appear as if >> you allocate the max size for the message each time, not the minimum sized >> needed to hold the message >> > yes, that's right. This is done to prevent an additional copy to clean > things up (realloc might work, too) and memory fragmentation. The later > is really nasty, I've seen that some memory areas remain allocated for > quite some while due to fragmentation. > >> if that shows a noticable difference it may be worth allocating the memory >> in chunks substantially larger than a single message > > That's a good suggestion. The basic classes are able to trim strings. It > may be worth putting a config option into it. The current approach works > well for small queues, but obviously does provide sub-optimal > performance as soon as the queues grow considerably. So it may even make > sense to start trimming messages only after a certain amount of messages > are in-queue. I'm not sure that we're saying the same thing. Let me try again. What I was thinking was that instead of allocating memory for one message at a time, initially allocate memory for 100 messages, then if this needs to be extended, increase the allocation by 50-100%. This minimizes the number of allocations needed and the fragmentation of system memory. Just like the fixed-array queue option is significantly faster than the linked-list queue option (I assume from a combination of having to chase pointers and allocate/deallocate memory), there may be similar benefits from doing the same thing for the message content itself.
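A minimal sketch of that idea in C, assuming fixed-size message slots (a simplification, since real messages have variable-size fields). All names here (msg_pool_t, pool_get, SLOT_SIZE and so on) are hypothetical and not part of rsyslog, and the code is single-threaded, for illustration only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOT_SIZE     1024  /* hypothetical fixed per-message buffer size */
#define INITIAL_SLOTS 100   /* "initially allocate memory for 100 messages" */

/* A slot is either on the free list or handed out as a message buffer. */
typedef union slot {
    union slot *next_free;
    char        data[SLOT_SIZE];
} slot_t;

/* One malloc()ed run of slots; chunks never move, so buffers stay valid. */
typedef struct chunk {
    struct chunk *next;
    slot_t       *slots;
} chunk_t;

typedef struct msg_pool {
    chunk_t *chunks;
    slot_t  *free_list;
    size_t   capacity;  /* total slots in all chunks */
    size_t   n_chunks;  /* number of bulk allocations so far */
} msg_pool_t;

/* Add one chunk of n_slots slots and push them all onto the free list. */
static int pool_grow(msg_pool_t *p, size_t n_slots)
{
    chunk_t *c = malloc(sizeof *c);
    if (c == NULL)
        return -1;
    c->slots = malloc(n_slots * sizeof *c->slots);
    if (c->slots == NULL) {
        free(c);
        return -1;
    }
    for (size_t i = 0; i < n_slots; i++) {
        c->slots[i].next_free = p->free_list;
        p->free_list = &c->slots[i];
    }
    c->next = p->chunks;
    p->chunks = c;
    p->capacity += n_slots;
    p->n_chunks++;
    return 0;
}

int pool_init(msg_pool_t *p)
{
    assert(p != NULL);
    p->chunks = NULL;
    p->free_list = NULL;
    p->capacity = 0;
    p->n_chunks = 0;
    return pool_grow(p, INITIAL_SLOTS);
}

/* Hand out one buffer; when the pool is exhausted, grow it by 50%. */
void *pool_get(msg_pool_t *p)
{
    if (p->free_list == NULL) {
        size_t grow = p->capacity / 2;
        if (grow == 0)
            grow = INITIAL_SLOTS;
        if (pool_grow(p, grow) != 0)
            return NULL;
    }
    slot_t *s = p->free_list;
    p->free_list = s->next_free;
    return s->data;
}

/* Return a buffer to the pool; no free() call is involved. */
void pool_put(msg_pool_t *p, void *buf)
{
    slot_t *s = buf;
    s->next_free = p->free_list;
    p->free_list = s;
}

/* Release all chunks; outstanding buffers become invalid. */
void pool_destroy(msg_pool_t *p)
{
    while (p->chunks != NULL) {
        chunk_t *c = p->chunks;
        p->chunks = c->next;
        free(c->slots);
        free(c);
    }
    p->free_list = NULL;
    p->capacity = 0;
    p->n_chunks = 0;
}
```

With this layout the first 100 messages cost one bulk allocation, and the 101st triggers a second one that adds 50 more slots. Whether this beats the system malloc()/free() would have to be measured.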
David Lang From rgerhards at hq.adiscon.com Sun Jan 18 16:45:56 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 16:45:56 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> Message-ID: <1232293556.2453.3.camel@rf10up.intern.adiscon.com> Hi Lorenzo, I've gone through the material once more. Indeed, it looks like the previous tests (with the #if 1) were not really useful. Sorry for that. Please let me know the outcome of this run here. Also, I thought about one shot we may give it at reducing complexity. I am not sure if it works out, but if it does, that would be a big benefit. Could you please try the following: Use the master branch (the one you previously used). Reduce rsyslog.conf to just the necessary inputs (ideally only imuxsock) and a SINGLE file writer, no further actions. Let that run and tell us if it aborts, too. If it does, we have outruled a lot of code and we can focus much better in our troubleshooting. On my box, I unfortunately had no success yet in reproducing the issue - even though I put a lot of stress on the machine. Will be trying more today, hopefully that brings up some results... Rainer On Fri, 2009-01-16 at 18:17 +0100, Lorenzo M. Catucci wrote: > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> OK, maybe we can simplify the config, that would remove code pathes > RG> from the potential bug candidate list. Could you comment out all the > RG> $ActionQueue* settings? > RG> > > I've just restored the #if 0 in runtime/msg.c; it seems the immediate > crashes came from those two lines. Now logging. 
> > Servus, > > lorenzo > > > +-------------------------+----------------------------------------------+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | > +-------------------------+----------------------------------------------+ > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com From rgerhards at hq.adiscon.com Sun Jan 18 16:57:03 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 16:57:03 +0100 Subject: [rsyslog] increasing alloc performance - was: Is rsyslog leaking memory? In-Reply-To: References: <1232276513.22744.45.camel@localhost.localdomain> <1232277684.22744.48.camel@localhost.localdomain> Message-ID: <1232294223.2453.9.camel@rf10up.intern.adiscon.com> On Sun, 2009-01-18 at 17:30 -0800, david at lang.hm wrote: > I'm not sure that we're saying the same thing. let me try again. You are right, we weren't... > > what I was thinking was that instead of allocating memory for one message > at a time, initially allocate memory for 100 messages, then if this needs > to be extended increase the allocation by 50-100%. this minimizes the > number of allocations needed and the fragmentation of system memory. > > just like the fixed-array queue option is significantly faster than the > linked list queue option (I assume from a combination of having to chase > pointers and allocate/deallocate memory), there may be similar benifits > from doing the same thing for the message content itself. I have to admit I am skeptical about this. The reason is that there are many non-fixed fields within the message and they are allocated as needed (with some initial size that fits most messages, but it may extend on an as-needed basis).
So there is no real fixed size for any message. It also depends on template formatting and other factors. I think if I tried to preallocate at least the initial chunks, I'd probably do pretty much the same as the malloc()/free() runtime does. That, however, would probably be less performant than the runtime is (at least I hope so, these parts of the code should be heavily tweaked). This is also an error-prone task. There may be a compromise in between (e.g. allocating a fixed chunk of message text together with the message blobs), but I still think the necessary complexity is not outweighed by the benefits. All in all, I think, we have seen that the in-user-space computing needs (and malloc counts as such) are not really the bottlenecks. Implementing e.g. a "bunch writer" (which enables submission of multiple messages at once to an action) seems to be (just) equally complex but promises far better results. In any case, I'd finally like to track down that dangling race before I do any further optimization. It looks like Lorenzo has a relatively stable environment for reproduction and I'd like to take advantage of that. Rainer From david at lang.hm Mon Jan 19 09:29:35 2009 From: david at lang.hm (david at lang.hm) Date: Mon, 19 Jan 2009 00:29:35 -0800 (PST) Subject: [rsyslog] increasing alloc performance - was: Is rsyslog leaking memory? In-Reply-To: <1232294223.2453.9.camel@rf10up.intern.adiscon.com> References: <1232276513.22744.45.camel@localhost.localdomain> <1232277684.22744.48.camel@localhost.localdomain> <1232294223.2453.9.camel@rf10up.intern.adiscon.com> Message-ID: On Sun, 18 Jan 2009, Rainer Gerhards wrote: > On Sun, 2009-01-18 at 17:30 -0800, david at lang.hm wrote: >> >> what I was thinking was that instead of allocating memory for one message >> at a time, initially allocate memory for 100 messages, then if this needs >> to be extended increase the allocation by 50-100%.
this minimizes the >> number of allocations needed and the fragmentation of system memory. >> >> just like the fixed-array queue option is significantly faster than the >> linked list queue option (I assume from a combination of having to chase >> pointers and allocate/deallocate memory), there may be similar benifits >> from doing the same thing for the message content itself. > > I have to admit I am skeptic about this. The reason is that there are > many non-fixed fields within the message and they are allocated as > needed (with some initial size that fits most messages, but it may > extend on an as-needed basis). So there is no real fixed size for any > message. It also depends on template formatting and other factors. > > I think if I'd try to prealloc at least the initial chunks, I'd probably > do pretty much the same that the malloc()/free() runtime does. That, > however, will probably be less performant than the runtime is (at least > I hope so, these parts of the code should be heavily tweaked). This is > also an error-prone task. > > There may be a compromise in between (e.g. allocating a fixed chunk of > message text together with the message blobs), but I still think the > necessary complexity is not outweight by similar benefits. > > All in all, I think, we have seen that the in-user-space computing needs > (and malloc counts as such) are not really the bottlenecks. Implementing > e.g. a "bunch writer" (which enables submission of multiple messages at > once to an action) seems to be (just) equally complex but promises far > better results. always possible. > In any case, I'd finally like to track down that dangling race before I > do any further optimization. It looks like Lorenzo seems to have a > relatively stable environment for reproduction and I'd like to take > advantage of that. agreed, tracking down a reproducible problem takes precedence over new improvements/tweaks any day.
David Lang From rgerhards at hq.adiscon.com Mon Jan 19 10:17:18 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 19 Jan 2009 10:17:18 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com> Message-ID: <1232356638.2536.3.camel@rf10up.intern.adiscon.com> Hi David, On Fri, 2009-01-16 at 18:40 -0800, david at lang.hm wrote: > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > > Lorenzo and others: > > > > I hopefully got a system today where I can reproduce. I am setting it up right now. I also have written a stub wiki page with information useful to hunt this bug: > > one other thing that you can do for this sort of thing is to use the > amazon cloud. > > to quote a message from Rob Landley to the linux-kernel mailing list > > > My friend Mark's been experimenting with the amazon "cloud" thing, > > feeding in an image with a qemu instance and distcc+cross-compiler, and > > running builds under that. Renting an 8-way ~2.5 ghz server with 7 > > gigabytes of ram and 1.6 terabytes of disk is 80 cents/hour through them > > plus another few cents/day for bandwidth and persistent storage and > > such. That's likely to get cheaper as time goes on. > > > > We're still planning to buy a build server of our own to have something > > in- house, but for running nightly builds it's almost to the point where > > depreciation on the hardware is more than buying time from a server > > farm. Just _one_ of those 8-way servers is enough hardware to build an > > entire distro in an hour or so. 
> > > > What this really allows us to do is experiment with "how parallel can we > > get our build"? Because renting ten 8-way servers in a cluster is > > $8/hour, and distcc already scales trivially over that. Down the road > > what Firmware Linux is working towards is multiple qemu instances > > running in parallel with a central instance distributing builds to each > > one, so each can do its own ./configure in parallel, distribute > > compilation to the distccd instances as it has stuff to compile, and > > then package up the resulting binary into one of those portage tarballs > > and send it back to the central node to install on a network mount that > > the lot of 'em can mount as build context, so the packages can get their > > dependencies right. (You don't want your build taking place in a > > network mount, but your OS being on one you never write to isn't so bad > > as long as you have local storage to build in.) > > > > We'll probably leverage the heck out of Portage for this, and might wind > > up modifying it heavily. Dunno yet. (We can even force dependencies on > > portage so it doesn't need to calculate 'em, the central node can do > > that and then say "you have these packages, _build_"...) > > > > But yeah, hobbyists with a laptop, network access, and a monthly budget > > of $20 can do cluster builds these days. > > would it make sense to start a fund to pay for some time for you to use > like this? That's a very interesting idea, thanks for sharing. At present, however, I think I'll try to stick with Lorenzo's system, because it seems to be able to reproduce the issue somewhat reliably. My 4 core machine unfortunately runs flawlessly, so I suspect that it really depends on the mix of components, where a fast machine is a necessary prerequisite, but not a sufficient one. Some other things seem to need to go into the mix and I've unfortunately not yet identified them...
But the cloud sounds like an interesting long-term idea, it would definitely be useful to be able to conduct some testing on high-end machines. Rainer From patrick.shen at net-m.de Mon Jan 19 10:21:19 2009 From: patrick.shen at net-m.de (Patrick Shen) Date: Mon, 19 Jan 2009 17:21:19 +0800 Subject: [rsyslog] A weird issue Message-ID: <4974460F.2040903@net-m.de> Hi all, Recently I encountered a weird problem. Let me explain below: I have a client which uses a traditional syslog (NOT rsyslog) app for storing and forwarding logs to the loghost. Here are some "snmpd" logs for example: ########################################################################################## Jan 19 10:03:09 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:34289 Jan 19 10:03:09 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:34289 Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 Jan 19 10:04:10 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:58181 Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 *Jan 19 10:04:10 athos last message repeated 25 times* ########################################################################################## Please take note of the last line. I also have a loghost that receives the logs using rsyslog v3.20.2, with the following dynamic template to store them ########################################################################################## $template d_hosts,"/var/rsyslog/HOSTS/%hostname%/%$year%/%$month%/%syslogfacility-text%_%hostname%_%$year%_%$month%_%$day%.log" ########################################################################################## and I also enabled a debug template with the following configuration in rsyslog.conf.
########################################################################################## $template DEBUG,"Debug line with all properties:\nFROMHOST: '%FROMHOST%', HOSTNAME: '%HOSTNAME%', PRI: %PRI%,\nsyslogtag '%syslogtag%', programname: '%programname%', APP-NAME: '%APP-NAME%', PROCID: '%PROCID%', MSGID: '%MSGID%', FACILITY-TEXT: '%syslogfacility-text%'\nTIMESTAMP: '%TIMESTAMP%', STRUCTURED-DATA: '%STRUCTURED-DATA%',\nmsg: '%msg%'\nrawmsg: '%rawmsg%'\n\n" *.* -/var/rsyslog/debug;DEBUG # or whatever file you like ########################################################################################## I'm monitoring on the server-side now, and checking the last line by raw message. ########################################################################################## Debug line with all properties: FROMHOST: 'athos', HOSTNAME: '*last*', PRI: 30, syslogtag 'message', programname: 'message', APP-NAME: 'message', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'daemon' TIMESTAMP: 'Jan 19 09:59:09', STRUCTURED-DATA: '-', msg: ' repeated 25 times' rawmsg: '<30>last message repeated 25 times' ########################################################################################## Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often). Thanks, Patrick From rgerhards at hq.adiscon.com Mon Jan 19 11:00:27 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 19 Jan 2009 11:00:27 +0100 Subject: [rsyslog] A weird issue In-Reply-To: <4974460F.2040903@net-m.de> References: <4974460F.2040903@net-m.de> Message-ID: <1232359227.2536.6.camel@rf10up.intern.adiscon.com> On Mon, 2009-01-19 at 17:21 +0800, Patrick Shen wrote: > Hi all, > > Recently I encountered a weird problem. Let me explain below: > > I've a client which is using traditional syslog (NOT rsyslog) app for storing and forwarding > logs to loghost. 
> > Here are some "snmpd" logs for example: > ########################################################################################## > Jan 19 10:03:09 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:34289 > Jan 19 10:03:09 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:34289 > Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 > Jan 19 10:04:10 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:58181 > Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 > *Jan 19 10:04:10 athos last message repeated 25 times* > ########################################################################################## > > Please take into account the last line. > > And I've a loghost host for receiving by using rsyslog v3.20.2 and used following dynamic templates to > store logs > ########################################################################################## > $template d_hosts,"/var/rsyslog/HOSTS/%hostname%/%$year%/%$month%/%syslogfacility-text%_%hostname%_%$year%_%$month%_%$day%.log" > ########################################################################################## > > and also opened debug template by following > configures in rsyslog.conf. > ########################################################################################## > $template DEBUG,"Debug line with all properties:\nFROMHOST: '%FROMHOST%', HOSTNAME: '%HOSTNAME%', PRI: %PRI%,\nsyslogtag '%syslogtag%', programname: '%programname%', APP-NAME: '%APP-NAME%', PROCID: > '%PROCID%', MSGID: '%MSGID%', FACILITY-TEXT: '%syslogfacility-text%'\nTIMESTAMP: '%TIMESTAMP%', STRUCTURED-DATA: '%STRUCTURED-DATA%',\nmsg: '%msg%'\nrawmsg: '%rawmsg%'\n\n" > *.* -/var/rsyslog/debug;DEBUG # or whatever file you like > ########################################################################################## > > I'm monitoring on the server-side now, and checking the last line by raw message. 
> ##########################################################################################
> Debug line with all properties:
> FROMHOST: 'athos', HOSTNAME: '*last*', PRI: 30,
> syslogtag 'message', programname: 'message', APP-NAME: 'message', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'daemon'
> TIMESTAMP: 'Jan 19 09:59:09', STRUCTURED-DATA: '-',
> msg: ' repeated 25 times'
> rawmsg: '<30>last message repeated 25 times'
> ##########################################################################################
>
> Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often).

Yes, unfortunately ;) The reason simply is that sysklogd does emit malformed messages with the "last message repeated..." line. If you look at a packet capture, you'll see that they do not contain a hostname. What you see in your sysklogd log is a hostname that is locally appended.

You can do a similar thing in rsyslog with the fromhost property - it does not contain the hostname from the message but rather the system that sent the message. In non-relay cases that should be the same, but in relay scenarios you see only the last hop (thus rsyslog by default uses RFC 3164 format).

If you need the relay scenario, there is no way around putting rsyslog on the sending systems, too (or fixing sysklogd, which I guess you need to do yourself or it won't happen...).

Rainer

From rgerhards at hq.adiscon.com Mon Jan 19 11:10:50 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Mon, 19 Jan 2009 11:10:50 +0100
Subject: [rsyslog] rsyslog 3.20.3 released
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9D6@grfint2.intern.adiscon.com>

Hi all,

Rsyslog 3.20.3, a member of the v3-stable branch, has been released today. It is a bug-fixing release that addresses a potential segfault that could happen if the $AllowedSenders configuration directive is used. It also addresses a doc bug, where the v3-compatibility document had an invalid directive name.
This is a recommended update for all users of the v3-stable branch. Change Log: http://www.rsyslog.com/Article339.phtml Download: http://www.rsyslog.com/Downloads-req-viewdownloaddetails-lid-146.phtml I hope this release is useful. Feedback is appreciated. Best regards, Rainer Gerhards From patrick.shen at net-m.de Mon Jan 19 15:21:26 2009 From: patrick.shen at net-m.de (Patrick Shen) Date: Mon, 19 Jan 2009 22:21:26 +0800 Subject: [rsyslog] A weird issue In-Reply-To: <1232359227.2536.6.camel@rf10up.intern.adiscon.com> References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com> Message-ID: <49748C66.7070102@net-m.de> Rainer Gerhards wrote: > On Mon, 2009-01-19 at 17:21 +0800, Patrick Shen wrote: >> Hi all, >> >> Recently I encountered a weird problem. Let me explain below: >> >> I've a client which is using traditional syslog (NOT rsyslog) app for storing and forwarding >> logs to loghost. >> >> Here are some "snmpd" logs for example: >> ########################################################################################## >> Jan 19 10:03:09 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:34289 >> Jan 19 10:03:09 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:34289 >> Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 >> Jan 19 10:04:10 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:58181 >> Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 >> *Jan 19 10:04:10 athos last message repeated 25 times* >> ########################################################################################## >> >> Please take into account the last line. 
>> >> And I've a loghost host for receiving by using rsyslog v3.20.2 and used following dynamic templates to >> store logs >> ########################################################################################## >> $template d_hosts,"/var/rsyslog/HOSTS/%hostname%/%$year%/%$month%/%syslogfacility-text%_%hostname%_%$year%_%$month%_%$day%.log" >> ########################################################################################## >> >> and also opened debug template by following >> configures in rsyslog.conf. >> ########################################################################################## >> $template DEBUG,"Debug line with all properties:\nFROMHOST: '%FROMHOST%', HOSTNAME: '%HOSTNAME%', PRI: %PRI%,\nsyslogtag '%syslogtag%', programname: '%programname%', APP-NAME: '%APP-NAME%', PROCID: >> '%PROCID%', MSGID: '%MSGID%', FACILITY-TEXT: '%syslogfacility-text%'\nTIMESTAMP: '%TIMESTAMP%', STRUCTURED-DATA: '%STRUCTURED-DATA%',\nmsg: '%msg%'\nrawmsg: '%rawmsg%'\n\n" >> *.* -/var/rsyslog/debug;DEBUG # or whatever file you like >> ########################################################################################## >> >> I'm monitoring on the server-side now, and checking the last line by raw message. >> ########################################################################################## >> Debug line with all properties: >> FROMHOST: 'athos', HOSTNAME: '*last*', PRI: 30, >> syslogtag 'message', programname: 'message', APP-NAME: 'message', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'daemon' >> TIMESTAMP: 'Jan 19 09:59:09', STRUCTURED-DATA: '-', >> msg: ' repeated 25 times' >> rawmsg: '<30>last message repeated 25 times' >> ########################################################################################## >> >> Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often). 
>
> Yes, unfortunately ;) The reason simply is that sysklogd does emit
> malformed messages with the "last message repeated..." line. If you look
> at a packet capture, you'll see that they do not contain a hostname.
> What you see in your sysklogd log is a hostname that is locally
> appended.

Ah, so simple. I'm surprised. Could you please recommend an app for packet capture?

And I'd like to share two more log examples.

######################################################################################
Debug line with all properties:
FROMHOST: 'helios', HOSTNAME: 'helios', PRI: 171,
syslogtag '', programname: '', APP-NAME: '', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5'
TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-',
msg: ' at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)'
rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)'
######################################################################################

You can see some *spaces* between '<171>' and 'at net ...'. And the HOSTNAME property is "helios".
###################################################################################### Debug line with all properties: FROMHOST: 'helios', HOSTNAME: 'Caused', PRI: 171, syslogtag 'by:', programname: 'by', APP-NAME: 'by', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', msg: ' java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorderid=0, refOrderId= 0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/APP/ME-utf8/content/ import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-OK][CONTENT-320-OK] nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnulln ullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnu ll[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume=NULL, track=0, it m_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemidprovider=NULL, itm _viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEvaluation=0, numEval uations=0, licenseprovider_id=2131264, importSt' rawmsg: '<171>Caused by: java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorde rid=0, refOrderId=0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/AP P/ME-utf8/content/import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-O K][CONTENT-320-OK]nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnul lnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnull 
nullnullnullnullnull[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume =NULL, track=0, itm_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemid provider=NULL, itm_viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEva luation=0, numEvaluations=0, licenseprovider_id=2131264, importSt' ###################################################################################### But in above example: Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'. I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided, then HOSTNAME will be set correctly? > You can do a similar thing in rsyslog with the fromhost property - it > does not contain the hostname but rather the system that send the > message. In non-relay cases that should be the same, but in relay > scenarios you see only the last hop (thus rsyslog by default uses RFC > 3164 format). And I thought I could use 'FROMHOST' property, but I have another scenario. ###################################################################################### Debug line with all properties: FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. 
HTTP/1.1" 200 87#012' rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. HTTP/1.1" 200 87#012' ###################################################################################### You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. And I do have reverse zone for that ip in dns setting. Any ideas? > If you need the relay scenario, there is no way around putting rsyslog > on the sending systems, too (or fixing sysklogd, which I guess you need > to do yourself or it won't happen...). > > Rainer Thanks a lot for your information. Best regards, Patrick From jules at visionintel.com Mon Jan 19 15:23:27 2009 From: jules at visionintel.com (Jules Pagna Disso) Date: Mon, 19 Jan 2009 14:23:27 +0000 Subject: [rsyslog] client Message-ID: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> hi there, Is there an example of client sending alert to syslog? is it possible to create and send an alert from the command prompt to syslog? thanks, Jules From patrick.shen at net-m.de Mon Jan 19 15:48:11 2009 From: patrick.shen at net-m.de (Patrick Shen) Date: Mon, 19 Jan 2009 22:48:11 +0800 Subject: [rsyslog] client In-Reply-To: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> Message-ID: <497492AB.5030901@net-m.de> Jules Pagna Disso wrote: > hi there, > > Is there an example of client sending alert to syslog? 
> > is it possible to create and send an alert from the command prompt to > syslog? > > thanks, > Jules Do you mean 'logger' ? Try 'man logger'. Best regards, Patrick From lists at luigirosa.com Mon Jan 19 15:45:46 2009 From: lists at luigirosa.com (Luigi Rosa) Date: Mon, 19 Jan 2009 15:45:46 +0100 Subject: [rsyslog] client In-Reply-To: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> Message-ID: <4974921A.5040108@luigirosa.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Jules Pagna Disso said the following on 19/01/09 15:23: > Is there an example of client sending alert to syslog? You mean something like the logger utility? http://linux.about.com/library/cmd/blcmdl1_logger.htm Ciao, luigi - -- / +--[Luigi Rosa]-- \ She was a lovely girl. Our courtship was fast and furious. I was fast and she was furious. --Max Kauffmann -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAkl0khQACgkQ3kWu7Tfl6ZTTtwCgrgL4RTPoLiZoKaa0uw2mz9y/ KAYAnj/1BMfinxINNSgttd9TIOGfi/z4 =LxGV -----END PGP SIGNATURE----- From mrdemeanour at jackpot.uk.net Mon Jan 19 15:46:14 2009 From: mrdemeanour at jackpot.uk.net (Mr. Demeanour) Date: Mon, 19 Jan 2009 14:46:14 +0000 Subject: [rsyslog] client In-Reply-To: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> Message-ID: <49749236.6060108@jackpot.uk.net> Jules Pagna Disso wrote: > hi there, > > Is there an example of client sending alert to syslog? > > is it possible to create and send an alert from the command prompt to > syslog? Try: $ logger "Test log message" Regards, Jack. 
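Note on the logger suggestions above: the sketch below shows, in Python, what a well-formed RFC 3164 line looks like on the wire, which is the header layout the "A weird issue" thread finds missing in the sysklogd "last message repeated" lines. It is only an illustration under stated assumptions: the function names `pri` and `build_rfc3164` and the small facility/severity tables are made up for this example and are not part of rsyslog or logger. The PRI arithmetic (facility * 8 + severity) reproduces the values seen in the debug output in this archive: 30 for daemon.info, 171 for local5.err, 174 for local5.info.

```python
from datetime import datetime

# Subset of the numeric codes defined in RFC 3164 (hypothetical helper tables).
FACILITY = {"kern": 0, "user": 1, "daemon": 3, "local5": 21}
SEVERITY = {"err": 3, "warning": 4, "info": 6, "debug": 7}
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def pri(facility, severity):
    """Return the PRI value: facility code times 8 plus severity code."""
    return FACILITY[facility] * 8 + SEVERITY[severity]


def build_rfc3164(facility, severity, hostname, tag, text, when):
    """Format '<PRI>Mmm dd hh:mm:ss HOSTNAME TAG: text'.

    The day of month is space-padded to two characters, as RFC 3164 asks.
    """
    stamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri(facility, severity)}>{stamp} {hostname} {tag}: {text}"


if __name__ == "__main__":
    # Rebuilds one of the snmpd lines quoted earlier, with a full header.
    print(build_rfc3164("daemon", "info", "athos", "snmpd[1104]",
                        "Connection from UDP: [192.168.23.7]:34289",
                        datetime(2009, 1, 19, 10, 3, 9)))
```

Compare the result with the rawmsg '<30>last message repeated 25 times' shown earlier: that line has the PRI but neither timestamp nor hostname, so a receiver has to guess where the header fields are.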
From rgerhards at hq.adiscon.com Mon Jan 19 14:45:41 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Mon, 19 Jan 2009 14:45:41 +0100
Subject: [rsyslog] A weird issue
In-Reply-To: <49748C66.7070102@net-m.de>
References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com> <49748C66.7070102@net-m.de>
Message-ID: <1232372741.2536.15.camel@rf10up.intern.adiscon.com>

On Mon, 2009-01-19 at 22:21 +0800, Patrick Shen wrote:
> >> Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often).
> >
> > Yes, unfortunately ;) The reason simply is that sysklogd does emit
> > malformed messages with the "last message repeated..." line. If you look
> > at a packet capture, you'll see that they do not contain a hostname.
> > What you see in your sysklogd log is a hostname that is locally
> > appended.
>
> Ah, so simple. I'm surprised. Could you please recommend which app for packet capture?

Actually, I should have read your mail more carefully. You already use rawmsg, which is the second best thing after the packet capture. But in this case, you'll see exactly the same thing (if you don't trust me, use Wireshark, an excellent open source capture app).

Look at this:

rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)'

Compare that to the header that is described in RFC 3164 and you will see that there is nothing close to a real header inside that message. As the message is malformed, funny things can happen. In other words, results are unpredictable, and this is what you are seeing.

>
> And I'd like to share another 2 log examples.
> > ###################################################################################### > Debug line with all properties: > FROMHOST: 'helios', HOSTNAME: 'helios', PRI: 171, > syslogtag '', programname: '', APP-NAME: '', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', > msg: ' at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' > rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' > ###################################################################################### > > You could see some *spaces* between '<171>' and 'at net ...'. And HOSTNAME propety is "helios". > > > ###################################################################################### > Debug line with all properties: > FROMHOST: 'helios', HOSTNAME: 'Caused', PRI: 171, > syslogtag 'by:', programname: 'by', APP-NAME: 'by', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', > msg: ' java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorderid=0, refOrderId= > 0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/APP/ME-utf8/content/ > import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-OK][CONTENT-320-OK] > nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnulln > ullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnu > ll[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume=NULL, track=0, it > m_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemidprovider=NULL, itm > _viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, 
meanEvaluation=0, numEval > uations=0, licenseprovider_id=2131264, importSt' > rawmsg: '<171>Caused by: java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorde > rid=0, refOrderId=0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/AP > P/ME-utf8/content/import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-O > K][CONTENT-320-OK]nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnul > lnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnull > nullnullnullnullnull[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume > =NULL, track=0, itm_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemid > provider=NULL, itm_viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEva > luation=0, numEvaluations=0, licenseprovider_id=2131264, importSt' > ###################################################################################### > > But in above example: > Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'. > > I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided, > then HOSTNAME will be set correctly? that's probably the case with current code, but I don't guarantee that will stay. Again: invalid format => unpredictable results on all header fields > > > > You can do a similar thing in rsyslog with the fromhost property - it > > does not contain the hostname but rather the system that send the > > message. In non-relay cases that should be the same, but in relay > > scenarios you see only the last hop (thus rsyslog by default uses RFC > > 3164 format). 
> > And I thought I could use 'FROMHOST' property, but I have another scenario. > > ###################################################################################### > Debug line with all properties: > FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, > syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', > msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > HTTP/1.1" 200 87#012' > rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > HTTP/1.1" 200 87#012' > ###################################################################################### > that's a correctly formatted message > You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. > And I do have reverse zone for that ip in dns setting. Any ideas? To get the name, you indeed need to enable remote lookups. One solution would be to permit different settings for different remote hosts, but that would be a feature request. 
Would make sense, but I am currently rather busy. If you add it to the bugzilla http://bugzilla.adiscon.com I'll see that I implement it when nothing of higher priority is in front of it. Rainer > > > If you need the relay scenario, there is no way around putting rsyslog > > on the sending systems, too (or fixing sysklogd, which I guess you need > > to do yourself or it won't happen...). > > > > Rainer > > Thanks a lot for your information. > > Best regards, > Patrick > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From patrick.shen at net-m.de Tue Jan 20 04:05:20 2009 From: patrick.shen at net-m.de (Patrick Shen) Date: Tue, 20 Jan 2009 11:05:20 +0800 Subject: [rsyslog] A weird issue In-Reply-To: <1232372741.2536.15.camel@rf10up.intern.adiscon.com> References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com> <49748C66.7070102@net-m.de> <1232372741.2536.15.camel@rf10up.intern.adiscon.com> Message-ID: <49753F70.5050601@net-m.de> Rainer Gerhards wrote: > On Mon, 2009-01-19 at 22:21 +0800, Patrick Shen wrote: >> Ah, so simple. I'm surprised. Could you please recommend which app for packet capture? > > Actually, I should have read your mail more careful. You already use > rawmsg, which is the second best thing after the packet capture. But in > this case, you'll see exactly the same thing (if you don't trust me, use > WireShark, an excellent open source capture app). > > Look at this: > > rawmsg: '<171> at > net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' > > Compare that the the header that is describe in RFC 3164 and you will > see that there is nothing close to a real header inside that message. As > the message is malformed, funny things can happen. In other words, > results are unpredictable, and this is what you are seeing. > >> And I'd like to share another 2 log examples. 
>> >> ###################################################################################### >> Debug line with all properties: >> FROMHOST: 'helios', HOSTNAME: 'helios', PRI: 171, >> syslogtag '', programname: '', APP-NAME: '', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' >> TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', >> msg: ' at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' >> rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' >> ###################################################################################### >> >> You could see some *spaces* between '<171>' and 'at net ...'. And HOSTNAME propety is "helios". >> >> >> ###################################################################################### >> Debug line with all properties: >> FROMHOST: 'helios', HOSTNAME: 'Caused', PRI: 171, >> syslogtag 'by:', programname: 'by', APP-NAME: 'by', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' >> TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', >> msg: ' java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorderid=0, refOrderId= >> 0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/APP/ME-utf8/content/ >> import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-OK][CONTENT-320-OK] >> nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnulln >> ullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnu >> ll[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume=NULL, track=0, it >> m_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemidprovider=NULL, itm >> _viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, 
externalinfo=NULL, authorizedAge=0, meanEvaluation=0, numEval >> uations=0, licenseprovider_id=2131264, importSt' >> rawmsg: '<171>Caused by: java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorde >> rid=0, refOrderId=0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/AP >> P/ME-utf8/content/import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-O >> K][CONTENT-320-OK]nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnul >> lnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnull >> nullnullnullnullnull[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume >> =NULL, track=0, itm_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemid >> provider=NULL, itm_viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEva >> luation=0, numEvaluations=0, licenseprovider_id=2131264, importSt' >> ###################################################################################### >> >> But in above example: >> Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'. >> >> I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided, >> then HOSTNAME will be set correctly? > > that's probably the case with current code, but I don't guarantee that > will stay. Again: invalid format => unpredictable results on all header > fields OK, now I see the malformed format messages will cause unpredictable results in rsyslog. That's quite helpful. >> >> And I thought I could use 'FROMHOST' property, but I have another scenario. 
>> >> ###################################################################################### >> Debug line with all properties: >> FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, >> syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' >> TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', >> msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. >> HTTP/1.1" 200 87#012' >> rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. >> HTTP/1.1" 200 87#012' >> ###################################################################################### >> > that's a correctly formatted message > >> You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. >> And I do have reverse zone for that ip in dns setting. Any ideas? > > To get the name, you indeed need to enable remote lookups. One solution > would be to permit different settings for different remote hosts, but > that would be a feature request. Would make sense, but I am currently > rather busy. 
If you add it to the bugzilla http://bugzilla.adiscon.com
> I'll see that I implement it when nothing of higher priority is in front
> of it.

I've filed a bugzilla report [1] for your information. One more question: if I use rsyslog on the client side, will that avoid sending out malformed or invalidly formatted messages?

[1]: http://bugzilla.adiscon.com/show_bug.cgi?id=116

Thanks a lot for your help,
Patrick

From rgerhards at hq.adiscon.com Mon Jan 19 18:16:08 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Mon, 19 Jan 2009 18:16:08 +0100
Subject: [rsyslog] A weird issue
In-Reply-To: <49753F70.5050601@net-m.de>
References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com> <49748C66.7070102@net-m.de> <1232372741.2536.15.camel@rf10up.intern.adiscon.com> <49753F70.5050601@net-m.de>
Message-ID: <1232385368.2536.22.camel@rf10up.intern.adiscon.com>

On Tue, 2009-01-20 at 11:05 +0800, Patrick Shen wrote:
> >> But in above example:
> >> Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'.
> >>
> >> I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided,
> >> then HOSTNAME will be set correctly?
> >
> > that's probably the case with current code, but I don't guarantee that
> > will stay. Again: invalid format => unpredictable results on all header
> > fields
>
> OK, now I see the malformed format messages will cause unpredictable results in rsyslog.
> That's quite helpful.
>
> >>
> >> And I thought I could use 'FROMHOST' property, but I have another scenario.
> >> > >> ###################################################################################### > >> Debug line with all properties: > >> FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, > >> syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > >> TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', > >> msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > >> HTTP/1.1" 200 87#012' > >> rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > >> HTTP/1.1" 200 87#012' > >> ###################################################################################### > >> > > that's a correctly formatted message > > > >> You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. > >> And I do have reverse zone for that ip in dns setting. Any ideas? > > > > To get the name, you indeed need to enable remote lookups. One solution > > would be to permit different settings for different remote hosts, but > > that would be a feature request. 
Would make sense, but I am currently > > rather busy. If you add it to the bugzilla http://bugzilla.adiscon.com > > I'll see that I implement it when nothing of higher priority is in front > > of it. > > I've filed a bugzilla report [1] for your information. Anyway, one more question, if I use rsyslog at > the client side, will it avoid malformed/invalid format message sending out? I have tweaked the feature request a bit so that it matches the actual request ;) As far as rsyslog on the client side is concerned, you need to do nothing. If you use the default templates, it emits correctly formatted messages. Rainer From rgerhards at hq.adiscon.com Tue Jan 20 14:00:00 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 20 Jan 2009 14:00:00 +0100 Subject: [rsyslog] Anyone in Computer Forensics? Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> Hi all, are there some folks on this list who are working in the computer forensics space? I wonder how syslog, and rsyslog in specific, works in forensics. Most importantly, I am interested in what stops acceptance in the forensics field (or what nurtures it). I am interested in feedback to help shape the medium to long term schedule for rsyslog (including those initiatives that I should learn more about). Any feedback is appreciated. Thanks, Rainer From rgerhards at hq.adiscon.com Tue Jan 20 15:27:57 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 20 Jan 2009 15:27:57 +0100 Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: References: Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9F4@grfint2.intern.adiscon.com> FYI: Based on a forum thread, I just created this page: http://wiki.rsyslog.com/index.php/Reducing_memory_usage I think it actually describes the source of the 8MB memory blocks. 
Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Michael Biebl > Sent: Saturday, January 17, 2009 11:11 AM > To: rsyslog-users > Subject: [rsyslog] Is rsyslog leaking memory? > > Hi, > > I'm running rsyslog 3.20.2 > > I noticed the following: > # /etc/init.d/rsyslog restart > VSZ RSS (as reported by ps) > 27100 1184 > # logger foo > 27100 1196 > # logger foo (1000x) > 27100 1200 > # logger foo (1000x) > 27100 1204 > # logger foo (1000x) > 27100 1208 > > and so on. > > > This made me wonder, if rsyslog is leaking memory somewhere. > > I also noticed, that for each loaded module, rsyslog resevers exactly > 8 Mb of anoymous memory (pmap -d `pgrep rsyslog`) > With a couple of loaded modules you easily get over 50Mb VSZ. > > > Michael > -- > Why is it that all of the instruments seeking intelligent life in the > universe are pointed away from Earth? > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From aoz.syn at gmail.com Tue Jan 20 16:39:34 2009 From: aoz.syn at gmail.com (RB) Date: Tue, 20 Jan 2009 08:39:34 -0700 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> Message-ID: <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> On Tue, Jan 20, 2009 at 06:00, Rainer Gerhards wrote: > are there some folks on this list who are working in the computer > forensics space? I wonder how syslog, and rsyslog in specific, works in > forensics. Could you clarify what you're asking here? There are two clearly delineated portions of the computer forensics space: that which is analyzed and that which performs the analysis. Are you looking more to improve analysis of rsyslog instances or to integrate into back-end tools? 
> Most importantly, I am interested in what stops acceptance in > the forensics field (or what nurtures it). I am interested in feedback > to help shape the medium to long term schedule for rsyslog (including > those initiatives that I should learn more about). Law Enforcement. LE is by far the biggest driver in industry acceptance, nearly regardless of technology. The "primary" forensics tool, EnCase, is a perfect example: there are many arguably better products on the market, but because huge numbers of extremely non-technical police officers are comfortable with it (since Guidance gives steep LE discounts), it is by far the biggest player. There isn't a huge amount of logging to be done in the analysis space. Although centralized solutions are becoming more prevalent, most of the critical logs are being (or will be) stored with the encrypted/signed forensic data for non-repudiation. Even so, there is more effort going into improving analysis (carvers, documenting formats, etc.) than building up proper logging and storage. From david at lang.hm Tue Jan 20 20:54:13 2009 From: david at lang.hm (david at lang.hm) Date: Tue, 20 Jan 2009 11:54:13 -0800 (PST) Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> Message-ID: On Tue, 20 Jan 2009, RB wrote: > On Tue, Jan 20, 2009 at 06:00, Rainer Gerhards wrote: >> are there some folks on this list who are working in the computer >> forensics space? I wonder how syslog, and rsyslog in specific, works in >> forensics. > > Could you clarify what you're asking here? There are two clearly > delineated portions of the computer forensics space: that which is > analyzed and that which performs the analysis. Are you looking more > to improve analysis of rsyslog instances or to integrate into back-end > tools? 
> >> Most importantly, I am interested in what stops acceptance in >> the forensics field (or what nurtures it). I am interested in feedback >> to help shape the medium to long term schedule for rsyslog (including >> those initiatives that I should learn more about). I think that what he is asking about is what makes logs acceptable or not acceptable when doing forensics, and what configurations of rsyslog would be acceptable. for example, rsyslog can be configured to use disk-based queues on redundant drives and RELP for network communication, and the result will be that rsyslog is _very_ reliable in terms of preserving messages that get to it (at the cost of performance, but you can throw hardware at it to deal with that) this is probably acceptable as a log for forensics type work. but what about the more normal settings? (tcp or udp network communications with memory-based queues). those settings can lose data, but won't under normal conditions (assuming the network isn't so busy that it drops UDP packets) David Lang From jules at visionintel.com Tue Jan 20 20:14:58 2009 From: jules at visionintel.com (Jules Pagna Disso) Date: Tue, 20 Jan 2009 19:14:58 +0000 Subject: [rsyslog] client In-Reply-To: <497492AB.5030901@net-m.de> References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> <497492AB.5030901@net-m.de> Message-ID: <69544300901201114r7671a2f9t3b8b8f27797b1cd7@mail.gmail.com> hi there, thanks for the answer; it helped and does what I wanted. Now, I am wondering if there is sample code showing how to send logs from C/C++ code to the syslog daemon. thanks Jules 2009/1/19 Patrick Shen > Jules Pagna Disso wrote: > > hi there, > > > > Is there an example of client sending alert to syslog? > > > > is it possible to create and send an alert from the command prompt to > > syslog? > > > > thanks, > > Jules > > Do you mean 'logger' ? > > Try 'man logger'.
> > Best regards, > Patrick > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From danson at rackspace.com Tue Jan 20 20:27:45 2009 From: danson at rackspace.com (Daniel Anson) Date: Tue, 20 Jan 2009 13:27:45 -0600 Subject: [rsyslog] client In-Reply-To: <69544300901201114r7671a2f9t3b8b8f27797b1cd7@mail.gmail.com> References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com><497492AB.5030901@net-m.de> <69544300901201114r7671a2f9t3b8b8f27797b1cd7@mail.gmail.com> Message-ID: <3435_1232479837_n0KJUY8N004510_96AF20FDF4301D419B33CCE8E3A0132B0ACED7E8@SAT4MX07.RACKSPACE.CORP> I use this:

>gcc -o syslog_writer syslog_writer.c
>./syslog_writer 300    <-- This is the number of messages it will write

    #include <stdio.h>
    #include <stdlib.h>
    #include <syslog.h>

    int main(int argc, char **argv)
    {
        int num_syslogs, i;

        /* The message count is the only (required) argument. */
        if (argc != 2) {
            fprintf(stderr, "usage: %s <number-of-messages>\n", argv[0]);
            return 1;
        }
        num_syslogs = atoi(argv[1]);

        /* Tag every message with the program name and PID, facility local1. */
        openlog("syslog_writer", LOG_CONS | LOG_PID, LOG_LOCAL1);
        for (i = 0; i < num_syslogs; i++) {
            syslog(LOG_NOTICE, "syslog_writer: log number %d", i);
        }
        closelog();
        return 0;   /* 0 signals success to the shell */
    }

-----Original Message----- From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Jules Pagna Disso Sent: Tuesday, January 20, 2009 1:15 PM To: rsyslog-users Subject: Re: [rsyslog] client hi there, thanks for the answer it helped and does what I wanted. Now, I am wonder if there is a sample code how to send log file from a c/c++ code to syslog deamon. thanks Jules 2009/1/19 Patrick Shen > Jules Pagna Disso wrote: > > hi there, > > > > Is there an example of client sending alert to syslog? > > > > is it possible to create and send an alert from the command prompt to > > syslog? > > > > thanks, > > Jules > > Do you mean 'logger' ? > > Try 'man logger'.
> > Best regards, > Patrick > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From aoz.syn at gmail.com Wed Jan 21 18:59:42 2009 From: aoz.syn at gmail.com (RB) Date: Wed, 21 Jan 2009 10:59:42 -0700 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> Message-ID: <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> On Tue, Jan 20, 2009 at 12:54, wrote: > I think that what he is asking about is what makes logs acceptable or not > acceptable when doing forensics, and what configurations of rsyslog would > be acceptable. That's still unclear as to whether the logging instances are being analyzed or they are part of the analysis process (i.e. logging investigator actions, "interesting" items, etc.).
> for example, rsyslog can be configured to use disk-based queues on > redundant drives and RELP for network communication, and the result will > be that rsyslog is _very_ reliable in terms of preserving messages that > get to it (at the cost of performance, but you can throw hardware at it to > deal with that) > > this is probably acceptable as a log for forensics type work. > > but what about the more normal settings? (tcp or udp network > communications with memory-based queues). those settings can loose data, > but won't under normal conditions (assuming the network isn't so busy that > it drops UDP packets) Generally speaking, forensics prefers the "save everything, impossible to lose" approach. A single lost message probably won't break a given case, but the possibility is definitely there. RELP with disk queues on hardware-redundant drives would probably be a good start if you're looking to ease future analysis, but it is my opinion that networked logging of the forensic process is both unlikely and overkill, as most analysis processes want their logs integrated instead of held as a separate source. One item I have had on my wish-list for quite some time is the ability to log directly to a UDF VAT filesystem (incremental writes on write-once optical media). Poor man's WORM, if you will. It would enable physical assurance that log data is unmodified up to the point of compromise. Add in the idea of incremental checksums or signing, and you have an extremely controlled, verifiable log source. Of course, it doesn't have to be solved in rsyslog-space, but it'd definitely be useful. RB From david at lang.hm Wed Jan 21 20:55:25 2009 From: david at lang.hm (david at lang.hm) Date: Wed, 21 Jan 2009 11:55:25 -0800 (PST) Subject: [rsyslog] Anyone in Computer Forensics? 
In-Reply-To: <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> Message-ID: On Wed, 21 Jan 2009, RB wrote: > On Tue, Jan 20, 2009 at 12:54, wrote: >> I think that what he is asking about is what makes logs acceptable or not >> acceptable when doing forensics, and what configurations of rsyslog would >> be acceptable. > > That's still unclear as to whether the logging instances are being > analyzed or they are part of the analysis process (i.e. logging > investigator actions, "interesting" items, etc.). I think it's the logs being analysed, not logging investigator actions (other than the extent that things the investigators do would be logged if anyone did them) >> for example, rsyslog can be configured to use disk-based queues on >> redundant drives and RELP for network communication, and the result will >> be that rsyslog is _very_ reliable in terms of preserving messages that >> get to it (at the cost of performance, but you can throw hardware at it to >> deal with that) >> >> this is probably acceptable as a log for forensics type work. >> >> but what about the more normal settings? (tcp or udp network >> communications with memory-based queues). those settings can loose data, >> but won't under normal conditions (assuming the network isn't so busy that >> it drops UDP packets) > > Generally speaking, forensics prefers the "save everything, impossible > to lose" approach. A single lost message probably won't break a given > case, but the possibility is definitely there. 
this is the most paranoid/conservative view, and by this definition there are basicly no logs in existance that meet the forensics requirements > RELP with disk queues > on hardware-redundant drives would probably be a good start if you're > looking to ease future analysis, but it is my opinion that networked > logging of the forensic process is both unlikely and overkill, as most > analysis processes want their logs integrated instead of held as a > separate source. > > One item I have had on my wish-list for quite some time is the ability > to log directly to a UDF VAT filesystem (incremental writes on > write-once optical media). Poor man's WORM, if you will. It would > enable physical assurance that log data is unmodified up to the point > of compromise. Add in the idea of incremental checksums or signing, > and you have an extremely controlled, verifiable log source. Of > course, it doesn't have to be solved in rsyslog-space, but it'd > definitely be useful. frankly, if you really need write-only media, the best thing to do (volume permitting) is to dump to a printer. David Lang From aoz.syn at gmail.com Wed Jan 21 21:59:28 2009 From: aoz.syn at gmail.com (RB) Date: Wed, 21 Jan 2009 13:59:28 -0700 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> Message-ID: <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> On Wed, Jan 21, 2009 at 12:55, wrote: > this is the most paranoid/conservative view, and by this definition there > are basicly no logs in existance that meet the forensics requirements Rather than set an unattainable standard, my intent was to communicate the conservative approach forensics would rather take. Edge cases and mitigating controls are acceptable as long as they are well-documented - that's basic security practice. 
I would rather see a solution that has 100 well-documented lossy edge cases than one that claims to be lossless with no proofs to back it. > frankly, if you really need write-only media, the best thing to do (volume > permitting) is to dump to a printer. You may want to recalculate; even 6-point font on large (14.875x11.5") tractor-feed paper only fits ~80MB per 3500-sheet box. Or, put another way, 2 512-byte events per second will burn through a $70 case per day. Or 6.5 reams of US Letter per day. Extremely limited volume. From david at lang.hm Wed Jan 21 23:19:01 2009 From: david at lang.hm (david at lang.hm) Date: Wed, 21 Jan 2009 14:19:01 -0800 (PST) Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> Message-ID: On Wed, 21 Jan 2009, RB wrote: > On Wed, Jan 21, 2009 at 12:55, wrote: >> this is the most paranoid/conservative view, and by this definition there >> are basicly no logs in existance that meet the forensics requirements > > Rather than set an unattainable standard, my intent was to communicate > the conservative approach forensics would rather take. Edge cases and > mitigating controls are acceptable as long as they are well-documented > - that's basic security practice. I would rather see a solution that > has 100 well-documented lossy edge cases than one that claims to be > lossless with no proofs to back it. the problem is that so many forensics people list the perfect situation and tell people that anything less won't stand up in court. 
like everything else, it's a reliability/performance/cost trade-off but we really aren't answering the initial question here (or rather we are demonstrating that there isn't a clear answer to the question) >> frankly, if you really need write-only media, the best thing to do (volume >> permitting) is to dump to a printer. > > You may want to recalculate; even 6-point font on large (14.875x11.5") > tractor-feed paper only fits ~80MB per 3500-sheet box. Or, put > another way, 2 512-byte events per second will burn through a $70 case > per day. Or 6.5 reams of US Letter per day. Extremely limited > volume. that's why I said volume permitting (and for your most critical logs the volume is probably fairly low) David Lang From rgerhards at hq.adiscon.com Wed Jan 21 22:21:08 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Wed, 21 Jan 2009 22:21:08 +0100 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com><4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com><4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA0B@grfint2.intern.adiscon.com> Hi all, Sorry for posting the question and then being offline. I had a meeting and after that was a bit more swamped than I expected ;) Thanks for the good answers so far. My question was vague, but that reflected that I actually do not exactly know what to ask for. While I took a look at forensics every now and then, this is not an area where I really have any deep expertise. However, I should have stated that I am primarily interested in the event detection/gathering, transmission and storage part of the picture. That's where rsyslog can play a role (that limits the "event detection" process to listening to whoever wants to talk to it).
The analysis part is beyond my scope right now (and probably will be for quite some time). As I said, I do not have an immediate need, but would like to understand the needs a bit better (and you have already provided good advice so far :)). The root cause of my question is that I would like to refine my medium, maybe long-term, vision. While I think I cannot implement any of the outcome, it helps me tune the implementation of things I do in a way that facilitates forensic needs (at least in cases where I have a choice). Without that information, I would probably do things in ways that will require much more effort once I get to "forensics-readiness". I hope this clarifies things, and sorry for not replying sooner. I will probably be a bit swamped 'til the end of the week, but will try to be more responsive now :) Thanks again for all that fine information, please keep it flowing. It is very useful. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com > [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of RB > Sent: Wednesday, January 21, 2009 9:59 PM > To: rsyslog-users > Subject: Re: [rsyslog] Anyone in Computer Forensics? > > On Wed, Jan 21, 2009 at 12:55, wrote: > > this is the most paranoid/conservative view, and by this > definition there > > are basicly no logs in existance that meet the forensics > requirements > > Rather than set an unattainable standard, my intent was to communicate > the conservative approach forensics would rather take. Edge cases and > mitigating controls are acceptable as long as they are well-documented > - that's basic security practice. I would rather see a solution that > has 100 well-documented lossy edge cases than one that claims to be > lossless with no proofs to back it. > > > frankly, if you really need write-only media, the best > thing to do (volume > > permitting) is to dump to a printer.
> > You may want to recalculate; even 6-point font on large (14.875x11.5") > tractor-feed paper only fits ~80MB per 3500-sheet box. Or, put > another way, 2 512-byte events per second will burn through a $70 case > per day. Or 6.5 reams of US Letter per day. Extremely limited > volume. > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From milton at calnek.com Thu Jan 22 02:24:48 2009 From: milton at calnek.com (Milton Calnek) Date: Wed, 21 Jan 2009 19:24:48 -0600 Subject: [rsyslog] Multiple devices with same ip address. Message-ID: <4977CAE0.1040403@calnek.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, I'm running a test lab with gear where every piece of gear under test has the same ip address. I have separated them via vlans, but I want to be able to send syslog from these devices to a central host... but with everything having the same ip address, there doesn't seem to be a way easily separate the logs. I see how to log based on ip, but not MAC nor interface. Before I invest in the development time, I was wondering if you folks have any suggestions? Thanks. - -- Milton Calnek BSc, A/Slt(Ret.) milton at calnek.com 306-717-8737 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org iD8DBQFJd8rgHgnbf2T2QqMRArhdAKCCisNIrs+ohNoq2AUiaaiZJdT6SwCfSS3u 4r5JOPJn6SBPWlzMXUBjfQE= =eVoR -----END PGP SIGNATURE----- -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From rsyslog at lists.bod.org Thu Jan 22 03:31:39 2009 From: rsyslog at lists.bod.org (Paul Chambers) Date: Wed, 21 Jan 2009 18:31:39 -0800 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977CAE0.1040403@calnek.com> References: <4977CAE0.1040403@calnek.com> Message-ID: <4977DA8B.3010309@lists.bod.org> Couldn't you use NAT on the vlan interfaces? 
that way traffic on each interface could be mapped to a different IP address as seen by the logging machine. -- Paul Milton Calnek wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi, > > I'm running a test lab with gear where every piece of gear > under test has the same ip address. > > I have separated them via vlans, but I want to be able to send syslog > from these devices to a central host... but with everything having the > same ip address, there doesn't seem to be a way easily separate the logs. > I see how to log based on ip, but not MAC nor interface. > > Before I invest in the development time, I was wondering if you folks > have any suggestions? > > Thanks. > - -- > Milton Calnek BSc, A/Slt(Ret.) > milton at calnek.com > 306-717-8737 > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.5 (GNU/Linux) > Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org > > iD8DBQFJd8rgHgnbf2T2QqMRArhdAKCCisNIrs+ohNoq2AUiaaiZJdT6SwCfSS3u > 4r5JOPJn6SBPWlzMXUBjfQE= > =eVoR > -----END PGP SIGNATURE----- > > From milton at calnek.com Thu Jan 22 04:26:25 2009 From: milton at calnek.com (Milton Calnek) Date: Wed, 21 Jan 2009 21:26:25 -0600 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977DA8B.3010309@lists.bod.org> References: <4977CAE0.1040403@calnek.com> <4977DA8B.3010309@lists.bod.org> Message-ID: <4977E761.7070903@calnek.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Paul Chambers wrote: > Couldn't you use NAT on the vlan interfaces? that way traffic on each > interface could be mapped to a different IP address as seen by the > logging machine. I tried that. It didn't work for me. I don't remember the details just now, but it had something to do with the order things happen on the linux IP stack. If you can suggest a set of commands, I'll try it out. Thanks. - -- Milton Calnek BSc, A/Slt(Ret.) 
milton at calnek.com 306-717-8737 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org iD8DBQFJd+dhHgnbf2T2QqMRArc9AKCf1tk2gW5XGOM4cCNevVj8QKwV5gCdHKAT 8OETLsF4Csv6d4/gFVlLtjU= =23Dv -----END PGP SIGNATURE----- -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From rsyslog at lists.bod.org Thu Jan 22 05:19:07 2009 From: rsyslog at lists.bod.org (Paul Chambers) Date: Wed, 21 Jan 2009 20:19:07 -0800 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977E761.7070903@calnek.com> References: <4977CAE0.1040403@calnek.com> <4977DA8B.3010309@lists.bod.org> <4977E761.7070903@calnek.com> Message-ID: <4977F3BB.6080205@lists.bod.org> Hard to give you specifics without a lot more information (and time's scarce, sorry). Something that helped me understand how netfilter handles packets, and the order the various tables/chains happen, is the documentation for ebtables, specifically: http://ebtables.sourceforge.net/br_fw_ia/br_fw_ia.html I'd be amazed if it's not possible to masquerade/source-NAT each vlan interface to a unique IP addresses. Between netfilter and ebtables, there's an enormous amount of flexibility. -- Paul Milton Calnek wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > > > Paul Chambers wrote: > >> Couldn't you use NAT on the vlan interfaces? that way traffic on each >> interface could be mapped to a different IP address as seen by the >> logging machine. >> > > I tried that. It didn't work for me. I don't remember the details just now, > but it had something to do with the order things happen on the linux IP stack. > > If you can suggest a set of commands, I'll try it out. > > Thanks. > - -- > Milton Calnek BSc, A/Slt(Ret.) 
> milton at calnek.com > 306-717-8737 > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.5 (GNU/Linux) > Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org > > iD8DBQFJd+dhHgnbf2T2QqMRArc9AKCf1tk2gW5XGOM4cCNevVj8QKwV5gCdHKAT > 8OETLsF4Csv6d4/gFVlLtjU= > =23Dv > -----END PGP SIGNATURE----- > > From david at lang.hm Thu Jan 22 07:48:49 2009 From: david at lang.hm (david at lang.hm) Date: Wed, 21 Jan 2009 22:48:49 -0800 (PST) Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977CAE0.1040403@calnek.com> References: <4977CAE0.1040403@calnek.com> Message-ID: On Wed, 21 Jan 2009, Milton Calnek wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi, > > I'm running a test lab with gear where every piece of gear > under test has the same ip address. > > I have separated them via vlans, but I want to be able to send syslog > from these devices to a central host... but with everything having the > same ip address, there doesn't seem to be a way easily separate the logs. > I see how to log based on ip, but not MAC nor interface. > > Before I invest in the development time, I was wondering if you folks > have any suggestions? if you are running rsyslog on the systems under test, try changing the template that rsyslog uses to send the messages out, so that each system puts something unique in its logs. David Lang From rgerhards at hq.adiscon.com Thu Jan 22 08:46:48 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 22 Jan 2009 08:46:48 +0100 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: References: <4977CAE0.1040403@calnek.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA0D@grfint2.intern.adiscon.com> David is right, this is probably the best way to do it. Even if the senders in question are not powered by rsyslog, it is most often possible to put something unique into the messages.
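[Editor's sketch, not part of the original message:] For a sender that can run custom C code, "putting something unique into the messages" might look roughly like the following. It is only a minimal sketch under stated assumptions: the helper names make_device_tag and log_with_tag, the "dut-vlan" tag prefix, and the choice of LOG_LOCAL1 are all hypothetical and appear nowhere in the thread. Only the standard openlog()/syslog()/closelog() calls are used.

```c
#include <stdio.h>
#include <syslog.h>

/* Hypothetical helper: build a per-device tag such as "dut-vlan12".
 * Returns 0 on success, -1 if the tag does not fit into buf. */
int make_device_tag(char *buf, size_t len, int vlan_id)
{
    int n = snprintf(buf, len, "dut-vlan%d", vlan_id);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Hypothetical helper: send one message carrying the per-device tag.
 * The facility (LOG_LOCAL1) is an arbitrary choice for illustration. */
int log_with_tag(int vlan_id, const char *text)
{
    /* openlog() may keep the ident pointer, so the buffer is static. */
    static char tag[32];

    if (make_device_tag(tag, sizeof tag, vlan_id) != 0)
        return -1;

    openlog(tag, LOG_PID, LOG_LOCAL1);
    syslog(LOG_NOTICE, "%s", text);
    closelog();
    return 0;
}
```

A device on VLAN 12 would call log_with_tag(12, "link up"); the receiver could then sort on the tag instead of the (identical) source IP address.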
If there are few devices (<= 8), you can also use the local syslog facilities to identify the instances (almost all senders allow to configure that). In any case, you can then use the unique identifier to sort out messages to different bins on the receiver. HTH Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > Sent: Thursday, January 22, 2009 7:49 AM > To: rsyslog-users > Subject: Re: [rsyslog] Multiple devices with same ip address. > > On Wed, 21 Jan 2009, Milton Calnek wrote: > > > -----BEGIN PGP SIGNED MESSAGE----- > > Hash: SHA1 > > > > Hi, > > > > I'm running a test lab with gear where every piece of gear > > under test has the same ip address. > > > > I have separated them via vlans, but I want to be able to send syslog > > from these devices to a central host... but with everything having > the > > same ip address, there doesn't seem to be a way easily separate the > logs. > > I see how to log based on ip, but not MAC nor interface. > > > > Before I invest in the development time, I was wondering if you folks > > have any suggestions? > > if you are running rsyslog on the systems under test, try changing the > template that rsyslog uses to sent the messages out from > each system puts something unique in it's logs. 
> > David Lang > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Thu Jan 22 16:58:24 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 22 Jan 2009 16:58:24 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> Hi folks, just an update on this matter. Lorenzo needed to change his system setup after some problems. We are in contact and expect to conduct further testing soon (hopefully the bug will reappear). Even better news is that I have been able to reproduce the bug 4 times in my lab today. It's not as easy as I would hope, but at least I can get results with some patience. I am also experimenting a bit with Twitter and actually found it useful to keep track of the troubleshooting process. Those of your interested can follow it at http://twitter.com/rgerhards I don't promise (yet) to keep it current at all times, but I will use it during the troubleshooting effort. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 6:29 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> Ah, ok. Side-note: I got my machine up and it is running some test. 
> RG> Unfortunately no aborts so far, but is has only 4 cores... I hope > RG> something turns out... > RG> > > I think the real problem is in keeping those cores very busy... I'd try > to > spawn something like 20 loggers each spawning a couple "workers" per > second and logging startup/shutdown of any child. Maybe make each > worker > sleep for a random time before exiting. > > I don't have any Fedora/RedHat system; if nothing else, I'd suggest > doing > your tests on a debian/testing system too. > > Yours, > > lorenzo > > PS still running... > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From lorenzo at sancho.ccd.uniroma2.it Thu Jan 22 17:19:15 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Thu, 22 Jan 2009 17:19:15 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> Message-ID: On Thu, 22 Jan 2009, Rainer Gerhards wrote: RG> Hi folks, RG> RG> just an update on this matter. Lorenzo needed to change his system RG> setup after some problems. We are in contact and expect to conduct RG> further testing soon (hopefully the bug will reappear). RG> Some administration chores the last couple of days; almost finished, big hopes for the week-end!!! RG> RG> Even better news is that I have been able to reproduce the bug 4 times RG> in my lab today. It's not as easy as I would hope, but at least I can RG> get results with some patience. 
I am also experimenting a bit with RG> Twitter and actually found it useful to keep track of the RG> troubleshooting process. Those of you interested can follow it at RG> This is really great news! Really, since rsyslog has been running this well for a long time on "normal" systems, and I've been (almost) alone in experiencing the crashes, the critters must have been hiding very well! See you soon, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Thu Jan 22 18:53:44 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 22 Jan 2009 18:53:44 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com> OK, an update, full history at http://twitter.com/rgerhards It looks like there is some trouble with GCC atomic operation support. Has anyone seen this race on a non-Debian platform? I am asking because that may narrow down (or not ;)) the issue. Of course, I am not sure if atomic operations are really the root cause. However, replacing them is not very practical at some places and definitely time-consuming. So I'd like to have some feedback before I take that route. Does anyone know if there is a problem with atomic operation support in Debian (no bashing, honest question ;))? Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M.
Catucci > Sent: Thursday, January 22, 2009 5:19 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Thu, 22 Jan 2009, Rainer Gerhards wrote: > > RG> Hi folks, > RG> > RG> just an update on this matter. Lorenzo needed to change his system > RG> setup after some problems. We are in contact and expect to conduct > RG> further testing soon (hopefully the bug will reappear). > RG> > > Some administration chores the last couple of days; almost finished, > big hopes for the week-end!!! > > RG> > RG> Even better news is that I have been able to reproduce the bug 4 > times > RG> in my lab today. It's not as easy as I would hope, but at least I > can > RG> get results with some patience. I am also experimenting a bit with > RG> Twitter and actually found it useful to keep track of the > RG> troubleshooting process. Those of your interested can follow it at > RG> > > This is really great news! Really, since rsyslog is been running this > well > since a long time on "normal" systems, and I've been (almost) alone in > experiencing the crashes, the critters should have been hiding very > well! > > See you soon, > > lorenzo > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. 
+39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From mbiebl at gmail.com Thu Jan 22 19:46:30 2009 From: mbiebl at gmail.com (Michael Biebl) Date: Thu, 22 Jan 2009 19:46:30 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com> Message-ID: 2009/1/22 Rainer Gerhards : > OK, an update, full history at http://twitter.com/rgerhards > > It looks like there is some trouble with GCC atomic operation support. Has anyone seen this race on a non-Debian platform? I am asking because that may narrow down (or not ;)) the issue. Of course, I am not sure if atomic operations are really the root cause. However, replacing them is not very practical at some places and definitely time-consuming. So I'd like to have some feedback before I take that route. > > Does anyone know if there is a problem with atomic operation support in Debian (no bashing, honest question ;))? This would be a compiler (GCC) problem then, right? I'm not aware of any such problem. FWIW Debian is using GCC 4.3 in lenny/sid I've checked the bugs reported against the Debian gcc package [1] and the Debian specific patches on top of gcc [2], but I didn't find anything obvious. Rainer, if you have a more specific question, I could forward that question to the Debian GCC maintainers. Cheers, Michael [1] http://bugs.debian.org/cgi-bin/pkgreport.cgi?src=gcc-4.3&repeatmerged=no [2] http://patch-tracking.debian.net/package/gcc-4.3/4.3.2-1.1 -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? 
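[Editorial sketch, not part of the original thread.] The question above is whether GCC's atomic operations can be replaced by mutex calls. The following minimal C example contrasts the two approaches on a shared counter. It is only an illustration under stated assumptions: the names `worker`, `run_workers`, `atomic_counter`, `mutex_counter` and the constants are hypothetical and are not taken from the rsyslog sources. It uses the GCC `__sync_fetch_and_add` builtin and plain POSIX mutexes, and each counter is protected by exactly one mechanism.

```c
#include <assert.h>
#include <pthread.h>

#define N_THREADS 4       /* hypothetical number of worker threads */
#define N_ITERS   100000  /* hypothetical increments per thread */

/* Updated only through the GCC __sync builtin, never under the mutex. */
static int atomic_counter = 0;

/* Updated only while counter_lock is held, never through a builtin. */
static int mutex_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker increments both counters N_ITERS times. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < N_ITERS; i++) {
        /* Variant 1: atomic read-modify-write provided by the compiler. */
        __sync_fetch_and_add(&atomic_counter, 1);

        /* Variant 2: the same increment, serialized by a mutex. */
        pthread_mutex_lock(&counter_lock);
        mutex_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts all workers and waits for them; returns 0 on success. */
static int run_workers(void)
{
    pthread_t tid[N_THREADS];

    for (int i = 0; i < N_THREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < N_THREADS; i++) {
        if (pthread_join(tid[i], NULL) != 0)
            return -1;
    }
    return 0;
}
```

After `run_workers()` returns, both counters should equal `N_THREADS * N_ITERS`. The sketch assumes a GCC-compatible compiler and linking with `-pthread`; it says nothing about which variant rsyslog actually uses in any given place.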
From rgerhards at hq.adiscon.com Thu Jan 22 21:18:19 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 22 Jan 2009 21:18:19 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA1B@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com > [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Michael Biebl > Sent: Thursday, January 22, 2009 7:47 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > 2009/1/22 Rainer Gerhards : > > OK, an update, full history at http://twitter.com/rgerhards > > > > It looks like there is some trouble with GCC atomic > operation support. Has anyone seen this race on a non-Debian > platform? I am asking because that may narrow down (or not > ;)) the issue. Of course, I am not sure if atomic operations > are really the root cause. However, replacing them is not > very practical at some places and definitely time-consuming. > So I'd like to have some feedback before I take that route. > > > > Does anyone know if there is a problem with atomic > operation support in Debian (no bashing, honest question ;))? > > This would be a compiler (GCC) problem then, right? Exactly > > I'm not aware of any such problem. FWIW Debian is using GCC > 4.3 in lenny/sid > I've checked the bugs reported against the Debian gcc package [1] and > the Debian specific patches on top of gcc [2], > but I didn't find anything obvious. > > Rainer, if you have a more specific question, I could forward that > question to the Debian GCC maintainers. Thanks, Michael. But I think before we ask others for their time, I'll try to do my homework. So far, I am just guessing. As I now seem to be able to repro the problem, I can look further into it.
Tomorrow, I'll first check what it takes to replace the atomic operations by mutex calls. I think that's quite some work, but hopefully I am wrong. Thanks to the info you provided, this seems to be useful work. I keep you posted. Rainer > > Cheers, > Michael > > [1] > http://bugs.debian.org/cgi-bin/pkgreport.cgi?src=gcc-4.3&repea > tmerged=no > [2] http://patch-tracking.debian.net/package/gcc-4.3/4.3.2-1.1 > > -- > Why is it that all of the instruments seeking intelligent life in the > universe are pointed away from Earth? > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From danson at rackspace.com Mon Jan 26 21:51:13 2009 From: danson at rackspace.com (Daniel Anson) Date: Mon, 26 Jan 2009 14:51:13 -0600 Subject: [rsyslog] UNIX timestamp Message-ID: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP> Is there a convention in rsyslog whereby I can get a UNIX timestamp instead of the other RFC time standards? Daniel M. Anson Linux Systems Engineer Rackspace danson at rackspace.com Office: (210)312-5114 Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated. 
From hks.private at gmail.com Mon Jan 26 22:10:06 2009 From: hks.private at gmail.com ((private) HKS) Date: Mon, 26 Jan 2009 16:10:06 -0500 Subject: [rsyslog] UNIX timestamp In-Reply-To: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP> References: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP> Message-ID: On Mon, Jan 26, 2009 at 3:51 PM, Daniel Anson wrote: > Is there a convention in rsyslog whereby I can get a UNIX timestamp > instead of the other RFC time standards? > > > > Daniel M. Anson > Linux Systems Engineer > Rackspace > danson at rackspace.com > Office: (210)312-5114 Unfortunately, no. You can find a serious discussion about it at http://kb.monitorware.com/post14653.html, but in a word, it's complicated. -HKS > > > > > Confidentiality Notice: This e-mail message (including any attached or > embedded documents) is intended for the exclusive and confidential use of the > individual or entity to which this message is addressed, and unless otherwise > expressly indicated, is confidential and privileged information of Rackspace. > Any dissemination, distribution or copying of the enclosed material is prohibited. > If you receive this transmission in error, please notify us immediately by e-mail > at abuse at rackspace.com, and delete the original message. > Your cooperation is appreciated. 
> > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From danson at rackspace.com Mon Jan 26 22:16:18 2009 From: danson at rackspace.com (Daniel Anson) Date: Mon, 26 Jan 2009 15:16:18 -0600 Subject: [rsyslog] UNIX timestamp In-Reply-To: References: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP> Message-ID: <15897_1233004899_n0QLLcFR018661_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFEB@SAT4MX07.RACKSPACE.CORP> I figured as much but I thought I would ask. In essence, writing a UNIX timestamp would go against the RFC standard especially if an rsyslog server were set up as a relay. I am using MySQL UNIX_TIMESTAMP() function to get what I need but thought this may be available locally in rsyslog. Thx for the reply, Daniel -----Original Message----- From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of (private) HKS Sent: Monday, January 26, 2009 3:10 PM To: rsyslog-users Subject: Re: [rsyslog] UNIX timestamp On Mon, Jan 26, 2009 at 3:51 PM, Daniel Anson wrote: > Is there a convention in rsyslog whereby I can get a UNIX timestamp > instead of the other RFC time standards? > > > > Daniel M. Anson > Linux Systems Engineer > Rackspace > danson at rackspace.com > Office: (210)312-5114 Unfortunately, no. You can find a serious discussion about it at http://kb.monitorware.com/post14653.html, but in a word, it's complicated. -HKS > > > > > Confidentiality Notice: This e-mail message (including any attached or > embedded documents) is intended for the exclusive and confidential use of the > individual or entity to which this message is addressed, and unless otherwise > expressly indicated, is confidential and privileged information of Rackspace. > Any dissemination, distribution or copying of the enclosed material is prohibited. 
> If you receive this transmission in error, please notify us immediately by e-mail > at abuse at rackspace.com, and delete the original message. > Your cooperation is appreciated. > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated. From sur5r at sur5r.net Tue Jan 27 19:07:09 2009 From: sur5r at sur5r.net (Jakob Haufe) Date: Tue, 27 Jan 2009 19:07:09 +0100 Subject: [rsyslog] Is rsyslog leaking memory? References: <1232276513.22744.45.camel@localhost.localdomain> Message-ID: <20090127190709.40a2b81b@mp-atlantis3.ziti.uni-heidelberg.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sun, 18 Jan 2009 12:01:53 +0100 Rainer Gerhards wrote: > From what I have seen so far, I, too, doubt there is a leak. However, > there are various levels of testing. For example, the postgres output > module and the GSSAPI code is contributed and I do not even have a > test environment. So these are not checked using that procedure. The > libdbi code is only checked every now and then and not with all > backends (e.g. no Oracle at hand ... and so on...). 
If I ever get > over to a full testing suite (no collaborators found so far...), I'll > probably be able to do more consistent testing of all modules. As I'm the one who wrote (or rather ported) the postgres module, I would be willing to help with debugging/valgrinding it. Unfortunately, I have not yet completely understood how the files in tests/ work. To be honest, I have just started looking at it. What would you suggest as a way to test ompgsql in particular? Simply run rsyslogd with valgrind and throw messages against it? Regarding GSSAPI: As I'm a big fan of Kerberos I will definitely give it a try as soon as I have some spare time; maybe I can help in valgrinding it, too. Regards, Jakob (aka sur5r) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAkl/TU0ACgkQ1YAhDic+ada31QCgu1f54fx4XMNpLjrASZ2fGIJ8 V8sAoKD8hRx7tuRzpwkajg5PPCDkwnLY =luw3 -----END PGP SIGNATURE----- From rsyslog at clark-communications.com Wed Jan 28 02:19:45 2009 From: rsyslog at clark-communications.com (Don Jackson) Date: Tue, 27 Jan 2009 17:19:45 -0800 Subject: [rsyslog] UPDATE: sysutils/rsyslog-3.20.3 Message-ID: Port updated to the recent 3.20.3 release of rsyslog. Tested on OpenBSD 4.4, amd64 and i386. It would be great if someone would commit this to the OpenBSD ports tree. $ cat ./pkg/DESCR A syslogd replacement -------------- next part -------------- From rgerhards at hq.adiscon.com Wed Jan 28 18:32:04 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Wed, 28 Jan 2009 18:32:04 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Hi all, thanks to Lorenzo's help, we made good progress.
It is too much to post inside a mail, please have a look at my analysis of the bug: http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html The short story is that we have at least improved the situation very much and I hope to have fixes for all branches within the next couple of days. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > Sent: Friday, January 16, 2009 3:22 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > Lorenzo, > > I have created a new branch "raceDebug" and done a first commit to it. > The change is very lightweight. Please pull, compile as usual and give > it a try. It spits out some info to stdout from time to time > (hopefully). I am not sure if it aborts, depending on the output it may > or may not. Even if we get messages, they are probably not enough to > pinpoint the bug, but I wanted to do something very light to see if the > bug stays. > > Feedback appreciated. > > Rainer From david at lang.hm Thu Jan 29 09:36:41 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 00:36:41 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Message-ID: On Wed, 28 Jan 2009, Rainer Gerhards wrote: > Hi all, > > thanks to Lorenzo's help, we made good progress. 
It is too much to post > inside a mail, please have a look at my analysis of the bug: > > http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > > The short story is that we have at least improved the situation very > much and I hope to have fixes for all branches within the next couple of > days. I just finished reading through this excellent write-up. one small thing: you quote the spec, "Accesses to cacheable memory that are split across bus widths, cache lines, and page boundaries are not guaranteed to be atomic", and then conclude that "So aligned word-access does not guarantee (not even enhance the chance) of atomicity." I read that to mean that the alignment requirements are more complicated, not that alignment is useless. you should also look at the code that's generated by -Os; with the heavily cached systems that we have nowadays it's common that the code being smaller (and therefore more of the code fitting into the L1 cache) is more of an advantage than the optimizations that -O3 provides. congratulations on tracking down a nasty and subtle issue. David Lang > Rainer > >> -----Original Message----- >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >> Sent: Friday, January 16, 2009 3:22 PM >> To: rsyslog-users >> Subject: Re: [rsyslog] rsyslog still crashes >> >> Lorenzo, >> >> I have created a new branch "raceDebug" and done a first commit to it. >> The change is very lightweight. Please pull, compile as usual and give >> it a try. It spits out some info to stdout from time to time >> (hopefully). I am not sure if it aborts, depending on the output it > may >> or may not. Even if we get messages, they are probably not enough to >> pinpoint the bug, but I wanted to do something very light to see if > the >> bug stays. >> >> Feedback appreciated.
>> >> Rainer > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From rgerhards at hq.adiscon.com Thu Jan 29 10:42:48 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 10:42:48 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com> Hi all, I had another interesting discussion with Lorenzo today. Those of you interested in details my find the chatlog interesting: http://blog.gerhards.net/2009/01/some-more-on-rsyslog-data-race.html Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > Sent: Wednesday, January 28, 2009 6:32 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > Hi all, > > thanks to Lorenzo's help, we made good progress. It is too much to post > inside a mail, please have a look at my analysis of the bug: > > http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > > The short story is that we have at least improved the situation very > much and I hope to have fixes for all branches within the next couple > of > days. 
> > Rainer > > > -----Original Message----- > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > > Sent: Friday, January 16, 2009 3:22 PM > > To: rsyslog-users > > Subject: Re: [rsyslog] rsyslog still crashes > > > > Lorenzo, > > > > I have created a new branch "raceDebug" and done a first commit to > it. > > The change is very lightweight. Please pull, compile as usual and > give > > it a try. It spits out some info to stdout from time to time > > (hopefully). I am not sure if it aborts, depending on the output it > may > > or may not. Even if we get messages, they are probably not enough to > > pinpoint the bug, but I wanted to do something very light to see if > the > > bug stays. > > > > Feedback appreciated. > > > > Rainer > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From david at lang.hm Thu Jan 29 12:06:03 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 03:06:03 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, Rainer Gerhards wrote: > Hi all, > > I had another interesting discussion with Lorenzo today. Those of you > interested in details my find the chatlog interesting: > > http://blog.gerhards.net/2009/01/some-more-on-rsyslog-data-race.html so, distilling this down I think I am reading the following. 1. mixing mutex and atomic operations is a problem, one or the other is safe 2. 
reliable duplication of the problem requires:
   - a fast machine
   - multiple cores _not_ sharing L1 cache (early Intel 4-core machines or multi-socket machines)
   - a complex rsyslog config that uses multiple threads heavily
   - high traffic log volume to heavily load rsyslog
   - high system load external to rsyslog increases the chances of the race
question: have you tried enabling/disabling preemption in the kernel on these systems to see if that affects the probability of having a problem? I'm eagerly waiting for the fixes to appear in the 4.1 branch to test them out. David Lang > Rainer > >> -----Original Message----- >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >> Sent: Wednesday, January 28, 2009 6:32 PM >> To: rsyslog-users >> Subject: Re: [rsyslog] rsyslog still crashes >> >> Hi all, >> >> thanks to Lorenzo's help, we made good progress. It is too much to > post >> inside a mail, please have a look at my analysis of the bug: >> >> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html >> >> The short story is that we have at least improved the situation very >> much and I hope to have fixes for all branches within the next couple >> of >> days. >> >> Rainer >> >>> -----Original Message----- >>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >>> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >>> Sent: Friday, January 16, 2009 3:22 PM >>> To: rsyslog-users >>> Subject: Re: [rsyslog] rsyslog still crashes >>> >>> Lorenzo, >>> >>> I have created a new branch "raceDebug" and done a first commit to >> it. >>> The change is very lightweight. Please pull, compile as usual and >> give >>> it a try. It spits out some info to stdout from time to time >>> (hopefully). I am not sure if it aborts, depending on the output it >> may >>> or may not.
Even if we get messages, they are probably not enough to >>> pinpoint the bug, but I wanted to do something very light to see if >> the >>> bug stays. >>> >>> Feedback appreciated. >>> >>> Rainer >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From rgerhards at hq.adiscon.com Thu Jan 29 11:08:04 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 11:08:04 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA9B@grfint2.intern.adiscon.com> A full answer follows soon, but in essence you got it :) I will be working on the 4.1 version today, thus the brief reply ;) > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > Sent: Thursday, January 29, 2009 12:06 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Thu, 29 Jan 2009, Rainer Gerhards wrote: > > > Hi all, > > > > I had another interesting discussion with Lorenzo today. Those of you > > interested in details my find the chatlog interesting: > > > > http://blog.gerhards.net/2009/01/some-more-on-rsyslog-data-race.html > > so, distilling this down I think I am reading the following. > > 1. mixing mutex and atomic operations is a problem, one or the other is > safe > > 2. 
reliable duplication of the problem requires > > fast machine > multiple cores _not_ sharing L1 cache (early Intel 4-core machines or > multi-socket machines) > a complex rsyslog config that uses multiple thread heavily > high traffic log volume to heavily load rsyslog > high system load external to rsyslog increases the chancesof the race > > question, have you tried enabling/disabling preemption in the kernel on > these systems to see if that affects the probability of having a > problem? > > I'm eagerly waiting for the fixes to appear in the 4.1 branch to test > them > out. > > David Lang > > > > Rainer > > > >> -----Original Message----- > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > >> Sent: Wednesday, January 28, 2009 6:32 PM > >> To: rsyslog-users > >> Subject: Re: [rsyslog] rsyslog still crashes > >> > >> Hi all, > >> > >> thanks to Lorenzo's help, we made good progress. It is too much to > > post > >> inside a mail, please have a look at my analysis of the bug: > >> > >> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > >> > >> The short story is that we have at least improved the situation very > >> much and I hope to have fixes for all branches within the next > couple > >> of > >> days. > >> > >> Rainer > >> > >>> -----Original Message----- > >>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >>> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > >>> Sent: Friday, January 16, 2009 3:22 PM > >>> To: rsyslog-users > >>> Subject: Re: [rsyslog] rsyslog still crashes > >>> > >>> Lorenzo, > >>> > >>> I have created a new branch "raceDebug" and done a first commit to > >> it. > >>> The change is very lightweight. Please pull, compile as usual and > >> give > >>> it a try. It spits out some info to stdout from time to time > >>> (hopefully). I am not sure if it aborts, depending on the output it > >> may > >>> or may not. 
Even if we get messages, they are probably not enough > to > >>> pinpoint the bug, but I wanted to do something very light to see if > >> the > >>> bug stays. > >>> > >>> Feedback appreciated. > >>> > >>> Rainer > >> _______________________________________________ > >> rsyslog mailing list > >> http://lists.adiscon.net/mailman/listinfo/rsyslog > >> http://www.rsyslog.com > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From mrdemeanour at jackpot.uk.net Thu Jan 29 12:12:41 2009 From: mrdemeanour at jackpot.uk.net (Mr. Demeanour) Date: Thu, 29 Jan 2009 11:12:41 +0000 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Message-ID: <49818F29.7070000@jackpot.uk.net> Rainer Gerhards wrote: > Hi all, > > thanks to Lorenzo's help, we made good progress. It is too much to post > inside a mail, please have a look at my analysis of the bug: > > http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > > The short story is that we have at least improved the situation very > much and I hope to have fixes for all branches within the next couple of > days. Bravo, Rainer! That is the most challenging and tricky to nail of all kinds of bug, and I'm very impressed. -- Jack. 
From friedl at hq.adiscon.com Thu Jan 29 17:16:57 2009
From: friedl at hq.adiscon.com (Florian Riedl)
Date: Thu, 29 Jan 2009 17:16:57 +0100
Subject: [rsyslog] rsyslog 4.1.4 (devel) released
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FABC@grfint2.intern.adiscon.com>

Hi all,

rsyslog 4.1.4, a member of the development branch, has been released today.
It is primarily a stability update. Most importantly, this version addresses a
potential segfault which occurred only rarely and primarily on very fast and
busy systems. The only other change is a fix for the $PreserveFQDN config
directive, which did not properly affect locally emitted messages.

This is a recommended update for all users of the development branch.

Download
http://www.rsyslog.com/Downloads-req-viewdownloaddetails-lid-147.phtml

Changelog
http://www.rsyslog.com/Article341.phtml

As always, feedback is appreciated.

Florian Riedl

--
Support
=======
Improving rsyslog is costly, but you can help! We are looking for
organizations that find rsyslog useful and wish to contribute back. You can
contribute by reporting bugs, improving the software, or donating money or
equipment.

Commercial support contracts for rsyslog are available, and they help finance
continued maintenance. Adiscon GmbH, a privately held German company, is
currently funding rsyslog development. We are always looking for interesting
development projects. For details on how to help, please see
http://www.rsyslog.com/doc-how2help.html .
From rgerhards at hq.adiscon.com Thu Jan 29 17:36:41 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 17:36:41 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Message-ID: <1233247001.19733.14.camel@rf10up.intern.adiscon.com> On Thu, 2009-01-29 at 00:36 -0800, david at lang.hm wrote: > On Wed, 28 Jan 2009, Rainer Gerhards wrote: > > > Hi all, > > > > thanks to Lorenzo's help, we made good progress. It is too much to post > > inside a mail, please have a look at my analysis of the bug: > > > > http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > > > > The short story is that we have at least improved the situation very > > much and I hope to have fixes for all branches within the next couple of > > days. > > I just finished reading through this excellant write-up > > one small thing. > > you quote the spec > > Accesses to cacheable memory that are split across bus widths, cache > lines, and page boundaries are not guaranteed to be atomic > > and then conclude that > > So aligned word-access does not guarantee (not even enhance the chance) of > atomicity. > > I read that to mean that the alignment requirements are more complicated, > not that alignment is useless. I should probably have quoted more of Intel's manual. But in essence you need to read at least the first full two pages to get the in-depth idea. The issue is not alignment requirements. As hardware gets more and more parallel, and caches get to more and more levels, and on-chip cores coexist with those from other sockets ... keeping memory coherent is a costly job. In early CPUs, Intel made memory access atomic if some alignment requirements were met. That was cheap. 
In new CPUs that atomicity is expensive. On the other hand, most data accesses do not need atomicity. So why incur the cost for many operations when only a few need it? As a result, Intel has removed guaranteed atomicity from those memory accesses. In order to get atomicity, the program must tell the CPU *explicitly* that it wants that feature. To do so, a "LOCK" prefix (opcode) must be placed before the actual opcode (note that this is only supported for some operations). So you get the best of both worlds: fast execution time for the majority of code and atomicity where you need it (but it then incurs the cost). The bottom line is that what was an atomic operation on an old CPU is no longer an atomic operation on a new CPU. If you need that, you need to include that extra "LOCK" opcode. As I briefly said in the blogpost, I have not checked the old Intel manuals. So I do not know if they formerly guaranteed, as part of the instruction set architecture, that these operations were atomic. I guess they did not. If so, I as a programmer made some assumptions about the micro-architecture that no longer hold true. My fault... But even if it is Intel's fault, the C programming language does not guarantee atomicity nor does the compiler guarantee a specific translation to machine code. So I, working on the C level, used assumptions that were not valid (and as I said I knew it was dangerous, but it worked too well for too long... ;)) > > you should also look at the code that's generated by -Os, with the heavily > cached systems that we have nowadays it's common that the code being > smaller (and therefore more of the code fitting into the L1 cache) is more > of an advantage than the optimizations that -O3 provides. That's a good reminder. I've just checked the gcc docs. There are some things that I do not like about -Os, especially as it disables proper alignment of many structures, including code. That can lead to sub-optimal cache performance.
On the other hand -O3 does things like loop unrolling, which definitely is a bad idea with modern cache systems. My preliminarily conclusion is that -O2 is probably best, and may be tuned by turning on and off specific optimizations via their specific compiler switches. > > congradulations on tracking down a nasty and subtle issue. Thanks - but let's first see if this was the only issue and if things run smooth everywhere. But it looks very promising. Rainer > > David Lang > > > > Rainer > > > >> -----Original Message----- > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > >> Sent: Friday, January 16, 2009 3:22 PM > >> To: rsyslog-users > >> Subject: Re: [rsyslog] rsyslog still crashes > >> > >> Lorenzo, > >> > >> I have created a new branch "raceDebug" and done a first commit to it. > >> The change is very lightweight. Please pull, compile as usual and give > >> it a try. It spits out some info to stdout from time to time > >> (hopefully). I am not sure if it aborts, depending on the output it > > may > >> or may not. Even if we get messages, they are probably not enough to > >> pinpoint the bug, but I wanted to do something very light to see if > > the > >> bug stays. > >> > >> Feedback appreciated. 
> >> > >> Rainer > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From david at lang.hm Fri Jan 30 04:51:28 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 19:51:28 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1233247001.19733.14.camel@rf10up.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, Rainer Gerhards wrote: > On Thu, 2009-01-29 at 00:36 -0800, david at lang.hm wrote: >> On Wed, 28 Jan 2009, Rainer Gerhards wrote: >> >>> Hi all, >>> >>> thanks to Lorenzo's help, we made good progress. It is too much to post >>> inside a mail, please have a look at my analysis of the bug: >>> >>> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html >>> >>> The short story is that we have at least improved the situation very >>> much and I hope to have fixes for all branches within the next couple of >>> days. >> >> I just finished reading through this excellant write-up >> >> one small thing. >> >> you quote the spec >> >> Accesses to cacheable memory that are split across bus widths, cache >> lines, and page boundaries are not guaranteed to be atomic >> >> and then conclude that >> >> So aligned word-access does not guarantee (not even enhance the chance) of >> atomicity. >> >> I read that to mean that the alignment requirements are more complicated, >> not that alignment is useless. > > I should probably have quoted more of Intel's manual. 
But in essence you > need to read at least the first full two pages to get the in-depth idea. > The issue is not alignment requirements. As hardware gets more and more > parallel, and caches get to more and more levels, and on-chip cores > coexist with those from other sockets ... keeping memory coherent is a > costly job. > > In early CPUs, Intel made memory access atomic if some alignment > requirements were met. That was cheap. In new CPUs that atomicity is > expensive. On the other hand, most data access do not need atomicity. So > why incur the cost for many operations when only few need it? In the end > result, Intel has remove guaranteed atomicity from those memory > accesses. In order to get atomicity, the program must tell the CPU > *explicitly* that it wants that feature. To do so, a "LOCK" prefix > (opcode) must be placed before the actual opcode (note that this is only > supported for some operations). So you get the best of two world: fast > execution time for the majority of code and atomicity where you need it > (but it then incurs the cost). > > The bottom line is that what was an atomic operation on an old CPU is no > longer an atomic operation on a new CPU. If you need that, you need to > include that extra "LOCK" opcode. > > As I briefly said in the blogpost, I have not check old Intel manuals. > So I do not know if they formerly guaranteed, as part of the instruction > set architecture, that these operations were atomic. I guess they did > not. If so, I as a programmer made some assumptions about the > micro-architecture that no longer hold true. My fault... But even if it > is Intel's fault, the C programming language does not guarantee > atomicity nor does the compiler guarantee a specific translation to > machine code. So I, working on the C level, used assumptions that were > not valid (and as I said I knew it was dangerous, but it worked too well > for too long... 
;)) the new C0x standard will add atomic ops and guarantees (some of which are not necessarily provided by the chip, but have to be provided by the compiler/library instead), so watch for it, but test the performance of them before you trust them >> >> you should also look at the code that's generated by -Os, with the heavily >> cached systems that we have nowadays it's common that the code being >> smaller (and therefore more of the code fitting into the L1 cache) is more >> of an advantage than the optimizations that -O3 provides. > > That's a good reminder. I've just checked the gcc docs. There are some > things that I do not like about -Os, especially as it disables proper > alignment of many structures, including code. That can lead to > sub-optimal cache performance. I know the linux kernel has many things where the alignment is critical for proper functioning, but they are still able to support -Os, so there is some way to specify alignment even for -Os > On the other hand -O3 does things like loop unrolling, which definitely > is a bad idea with modern cache systems. > > My preliminary conclusion is that -O2 is probably best, and may be > tuned by turning on and off specific optimizations via their specific > compiler switches. this has been the prevailing wisdom for many years, but I've seen myself many cases where -Os has ended up being faster in the real world, in spite of the various things that -O2 does 'better' is it the case that -Os would break things? or just that you think its alignment may not be as good? David Lang >> congratulations on tracking down a nasty and subtle issue. > > Thanks - but let's first see if this was the only issue and if things > run smooth everywhere. But it looks very promising.
> > Rainer >> >> David Lang >> >> >>> Rainer >>> >>>> -----Original Message----- >>>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >>>> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >>>> Sent: Friday, January 16, 2009 3:22 PM >>>> To: rsyslog-users >>>> Subject: Re: [rsyslog] rsyslog still crashes >>>> >>>> Lorenzo, >>>> >>>> I have created a new branch "raceDebug" and done a first commit to it. >>>> The change is very lightweight. Please pull, compile as usual and give >>>> it a try. It spits out some info to stdout from time to time >>>> (hopefully). I am not sure if it aborts, depending on the output it >>> may >>>> or may not. Even if we get messages, they are probably not enough to >>>> pinpoint the bug, but I wanted to do something very light to see if >>> the >>>> bug stays. >>>> >>>> Feedback appreciated. >>>> >>>> Rainer >>> _______________________________________________ >>> rsyslog mailing list >>> http://lists.adiscon.net/mailman/listinfo/rsyslog >>> http://www.rsyslog.com >>> >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From david at lang.hm Fri Jan 30 05:56:55 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 20:56:55 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1233247001.19733.14.camel@rf10up.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, Rainer Gerhards wrote: >> >> congradulations on tracking down a 
nasty and subtle issue. > > Thanks - but let's first see if this was the only issue and if things > run smooth everywhere. But it looks very promising. > bad news, on my system the HUP doesn't always reopen the files now. high speed box receiving messages via UDP, idle except for a gzip compressing the files (which are rotated once a min), the system runs fine for a few min (higher performance than before, it's now writing ~93,000 messages/sec instead of ~78,000 messages/sec), but it sometimes mangles handling a HUP and gets stuck. I have to do a kill -9 to kill and restart it. this is with the new HUP behavior. David Lang From david at lang.hm Fri Jan 30 06:13:07 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 21:13:07 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, david at lang.hm wrote: > On Thu, 29 Jan 2009, Rainer Gerhards wrote: > >>> >>> congradulations on tracking down a nasty and subtle issue. >> >> Thanks - but let's first see if this was the only issue and if things >> run smooth everywhere. But it looks very promising. >> > > bad news, on my system the HUP doesn't always reopen the files now. > > high speed box receiving messages via UDP, idle except for a gzip > compressing the files (which are rotated once a min), the system runs fine > for a few min (higher performance than before, it's now writing ~93,000 > messages/sec instead of ~78,000 messages/sec), but it sometimes mangles > handling a HUP and gets stuck. I have to do a kill -9 to kill and restart > it. > > this is with the new HUP behavior. interesting note on memory useage. 
I'm using the default fixed array queue type on this box with a 1K max message length. if I hammer the box with a steady ~120K messages/sec (while it can write 93K/sec) the queue builds up to where it takes ~12G of ram. at this point the throughput takes a nose dive (not just dropping inbound packets, but also the number of packets written is much less) if I kill the sender, it starts emptying its queue (interestingly, not quite as fast as if it is also receiving some messages), but the memory isn't freed up until I start sending it messages again. David Lang From rgerhards at hq.adiscon.com Thu Jan 29 19:34:50 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 19:34:50 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: <1233254090.19733.22.camel@rf10up.intern.adiscon.com> On Thu, 2009-01-29 at 21:13 -0800, david at lang.hm wrote: > On Thu, 29 Jan 2009, david at lang.hm wrote: > interesting note on memory usage. > > I'm using the default fixed array queue type on this box with a 1K max > message length. if I hammer the box with a steady ~120K messages/sec > (while it can write 93K/sec) the queue builds up to where it takes ~12G of > ram. at this point the throughput takes a nose dive (not just dropping > inbound packets, but also the number of packets written is much less) > > if I kill the sender, it starts emptying its queue (interestingly, not > quite as fast as if it is also receiving some messages), but the memory > isn't freed up until I start sending it messages again. This actually is expected behavior - and it has lots to do with "last message repeated n times".
In order to implement that functionality, I need to hold on to the last message until a new one comes in (so that I can compare new to old). As such, a message that is fully processed can not immediately be freed. It is freed only when the next message comes in, whenever that may be. Note that each output has separate "last message..." status, so each action keeps a copy of the previous message until a new one arrives. What now happens is that when the queue builds up, malloc extends the data segment size. It is fair to assume that the last message received on a very busy system will probably end up at a high location in the data segment (but note it is just a probability - it may even receive a very low location, if that was just freed immediately before). When the queue is now drained, we free everything but this message. As the message is still referenced for "last m...", it can not be freed. As it has a high address, the data segment size can not be reduced. As such, rsyslog still holds the whole data segment, with it containing almost no actually allocated memory. I do not know if the runtime system has a way to tell the OS it now uses a "sparse data segment", but I guess it doesn't do that. When the next message comes in (hours later?), the previous message can be freed, and the runtime can then reduce the data segment size (which should result in a sharp decrease in the memory usage seen). This is one of the reasons I don't like "last message...". I hope this clarifies.
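To make the retention rule described above concrete, here is a minimal sketch in C. It is not rsyslog's code: msg_t, action_t, action_process(), live_after_burst() and the live_msgs counter are hypothetical names invented for illustration. It shows only the ownership rule, namely that an action keeps its previous message until the next one arrives, so one message stays allocated even after everything else has been processed. Whether the allocator can then shrink the data segment depends on where that one block sits, which this sketch does not model.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT rsyslog's real types. */
typedef struct msg {
    char *text;
} msg_t;

typedef struct action {
    msg_t *prev;       /* last message seen, kept for comparison */
    unsigned repeats;  /* how often prev has been repeated so far */
} action_t;

static int live_msgs = 0;  /* number of messages currently allocated */

static msg_t *msg_new(const char *text)
{
    msg_t *m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->text = malloc(strlen(text) + 1);
    if (m->text == NULL) {
        free(m);
        return NULL;
    }
    strcpy(m->text, text);
    ++live_msgs;
    return m;
}

static void msg_free(msg_t *m)
{
    if (m == NULL)
        return;
    free(m->text);
    free(m);
    --live_msgs;
}

/* Process one message; takes ownership of m. */
static void action_process(action_t *a, msg_t *m)
{
    if (a->prev != NULL && strcmp(a->prev->text, m->text) == 0) {
        ++a->repeats;   /* duplicate: count it, keep the old copy */
        msg_free(m);
        return;
    }
    msg_free(a->prev);  /* only now can the previous message go */
    a->prev = m;
    a->repeats = 0;
}

/* Feed n distinct messages; return how many are still allocated. */
static int live_after_burst(action_t *a, int n)
{
    char buf[32];
    for (int i = 0; i < n; ++i) {
        snprintf(buf, sizeof buf, "msg %d", i);
        msg_t *m = msg_new(buf);
        assert(m != NULL);
        action_process(a, m);
    }
    return live_msgs;
}
```

Starting from an empty action, live_after_burst(&a, 1000) leaves exactly one message allocated: the one held for comparison. It is released only when a later, different message is processed.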
Rainer From rgerhards at hq.adiscon.com Thu Jan 29 20:40:33 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 20:40:33 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: <1233258033.19733.27.camel@rf10up.intern.adiscon.com> Hi David, thanks for this note, but I think it is not related to the fix (I'll think a bit harder about that, but so far I can not find any connection between the two). The way the HUP is done is sub-optimal. Under typical load (one hup a day), you don't see any issue. If you hup very frequently (like the once a min you do) and have heavy traffic, that's another story. To solve that case, some rework on the hup internals, actually even on the interface definition, is needed. I'd hold all such work unless I found a solution to the race bug - because it would have made the environment even more different. Now that I have at least one issue, I think I can go ahead and begin to introduce more intrusive changes again. In any case, I'll have a more in-depth look at the hup handlers. The new non-restart type of hup should be almost resistant against the issue you report. Rainer On Thu, 2009-01-29 at 20:56 -0800, david at lang.hm wrote: > On Thu, 29 Jan 2009, Rainer Gerhards wrote: > > >> > >> congradulations on tracking down a nasty and subtle issue. > > > > Thanks - but let's first see if this was the only issue and if things > > run smooth everywhere. But it looks very promising. > > > > bad news, on my system the HUP doesn't always reopen the files now. 
> > high speed box receiving messages via UDP, idle except for a gzip > compressing the files (which are rotated once a min), the system runs fine > for a few min (higher performance than before, it's now writing ~93,000 > messages/sec instead of ~78,000 messages/sec), but it sometimes mangles > handling a HUP and gets stuck. I have to do a kill -9 to kill and restart > it. > > this is with the new HUP behavior. > > David Lang > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Thu Jan 29 21:25:27 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 21:25:27 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: <1233260727.19733.71.camel@rf10up.intern.adiscon.com> On Thu, 2009-01-29 at 19:51 -0800, david at lang.hm wrote: > the new C0x standard will add atomic ops and guarentees (some of which are > not nessasarily provided by the chip, but have to be provided by the > compiler/library instead), so watch for it, but test the performance of > them before you trust them This is very important work, especially if you think about future advances in hardware design. However, I think we will be years away from the point where one can actually use this and hope to be somewhat portable. Same for performance: early implementation will probably be sub-optimal (though it should be fairly simple to map current compiler-specific options for atomic ops to the new standard once... but we know what happens when new standards come out...). 
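As a rough illustration of the mapping mentioned above, the sketch below hides the atomic primitive behind a few macros. The macro and type names (counter_t, COUNTER_INC, COUNTER_DEC, COUNTER_GET, bump_three_times) are made up for this example and are not taken from rsyslog. It assumes either a compiler offering the standard atomics that were later published as C11's <stdatomic.h>, or GCC's __sync builtins as the compiler-specific option. On x86, GCC typically implements both with LOCK-prefixed instructions, but that is a property of the compiler and target, not something the C code guarantees.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L \
    && !defined(__STDC_NO_ATOMICS__)
/* Standard atomics (C11 and later). */
#  include <stdatomic.h>
typedef atomic_int counter_t;
#  define COUNTER_INC(p) (atomic_fetch_add((p), 1) + 1)
#  define COUNTER_DEC(p) (atomic_fetch_sub((p), 1) - 1)
#  define COUNTER_GET(p) atomic_load(p)
#elif defined(__GNUC__)
/* Compiler-specific fallback: GCC __sync builtins. */
typedef int counter_t;
#  define COUNTER_INC(p) __sync_add_and_fetch((p), 1)
#  define COUNTER_DEC(p) __sync_sub_and_fetch((p), 1)
#  define COUNTER_GET(p) __sync_fetch_and_add((p), 0)
#else
#  error "no atomic primitives known for this compiler"
#endif

/* Increment three times; return the value after the last increment. */
static int bump_three_times(counter_t *c)
{
    assert(c != NULL);
    (void)COUNTER_INC(c);
    (void)COUNTER_INC(c);
    return COUNTER_INC(c);
}
```

Callers only ever use the COUNTER_* macros, so switching from the builtin to the standard form is a change in one header. A single-threaded call only checks the arithmetic; it says nothing about performance, which would still need measuring as suggested above.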
> > On the other hand -O3 does things like loop unrolling, which definitely > > is a bad idea with modern cache systems. > > > > My preliminary conclusion is that -O2 is probably best, and may be > > tuned by turning on and off specific optimizations via their specific > > compiler switches. > > this has been the prevailing wisdom for many years, but I've seen myself > many cases where -Os has ended up being faster in the real world, in spite > of the various things that -O2 does 'better' I think the phrase "it depends on the scenario" is very important here. > is it the case that -Os would break things? or just that you think its > alignment may not be as good? It does not break things. The alignment for any structures that are passed as part of the API should be properly contained in the header files. However, I have not specifically tested this. The point is just that, at least on some machines, non-aligned addresses severely hit cache performance. So optimizing for size, and as a side-effect generating unaligned data accesses, can be a real performance drawback. It may well cost more performance than the improved L1 (or trace cache) performance offers. In any case, if we go down to that level, I think there are better places to test and optimize - not to mention that on the upper layer (OS calls!) there is still room for improvement. One of my favorite CPU-level optimizations is the "exception system" that is currently in use in rsyslog. Thanks to your message, I've finally written down some information on it. I've done that on the forum, so that I can easily keep a permanent record of the discussion (and in an easier-to-follow form than with the mail archive): http://kb.monitorware.com/optimizing-exception-handling-t8911.html Feedback is appreciated.
Rainer From rgerhards at hq.adiscon.com Fri Jan 30 14:34:07 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 30 Jan 2009 14:34:07 +0100 Subject: [rsyslog] rsyslog 4.1.4 - one (small) bug left Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com> Hi all, I have now basically ported the race bugfix to all branches (verification and double-check still in the works). While doing this, I noticed that one small issue slipped my attention with yesterday's 4.1.4 version. If compiled with atomics, I unlock an already unlocked mutex (which is destroyed with the very next statement) in msgDestruct. That should not have any really bad effects (but you never know...). The master branch is now updated, so you may want to pull a fixed version from there. I will not do a new release just for this reason - it'll be included in the next version. Please note that git as of now already contains the race fixes for all branches, but mostly untested. Just in case you'd like to get them quickly. I will keep you posted.
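The kind of slip described above can be sketched as follows. This is not rsyslog's msgDestruct: obj_t, obj_new(), obj_ref() and obj_release() are hypothetical, and HAVE_ATOMIC_BUILTINS is an assumed name for the build switch. The point is only that when the atomic path is compiled in, the mutex is never locked, so the unlock has to sit inside the same conditional as the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* HAVE_ATOMIC_BUILTINS is an assumed build switch, e.g. set by configure. */

typedef struct obj {
    int refs;             /* reference count */
    pthread_mutex_t mut;  /* only locked when atomics are unavailable */
} obj_t;

static obj_t *obj_new(void)
{
    obj_t *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refs = 1;
    if (pthread_mutex_init(&o->mut, NULL) != 0) {
        free(o);
        return NULL;
    }
    return o;
}

static void obj_ref(obj_t *o)
{
    assert(o != NULL);
#ifdef HAVE_ATOMIC_BUILTINS
    (void)__sync_add_and_fetch(&o->refs, 1);
#else
    pthread_mutex_lock(&o->mut);
    ++o->refs;
    pthread_mutex_unlock(&o->mut);
#endif
}

/* Drop one reference; returns 1 if the object was destroyed. */
static int obj_release(obj_t *o)
{
    int remaining;

    assert(o != NULL);
#ifdef HAVE_ATOMIC_BUILTINS
    /* Atomic path: no lock is taken, so there is nothing to unlock. */
    remaining = __sync_sub_and_fetch(&o->refs, 1);
#else
    /* Fallback path: lock and unlock stay together as a pair. */
    pthread_mutex_lock(&o->mut);
    remaining = --o->refs;
    pthread_mutex_unlock(&o->mut);
#endif
    if (remaining > 0)
        return 0;
    pthread_mutex_destroy(&o->mut);
    free(o);
    return 1;
}
```

Keeping the lock/unlock pair inside one #else branch means neither build can unlock a mutex it never locked. The mutex is still initialized and destroyed in both builds, which matches the description of it being destroyed right after the stray unlock.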
Rainer From rgerhards at hq.adiscon.com Fri Jan 30 16:47:55 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 30 Jan 2009 16:47:55 +0100 Subject: [rsyslog] hang on HUP - was: rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: <1233330475.19733.88.camel@rf10up.intern.adiscon.com> On Thu, 2009-01-29 at 20:56 -0800, david at lang.hm wrote: > high speed box receiving messages via UDP, idle except for a gzip > compressing the files (which are rotated once a min), the system runs fine > for a few min (higher performance than before, it's now writing ~93,000 > messages/sec instead of ~78,000 messages/sec), but it sometimes mangles > handling a HUP and gets stuck. I have to do a kill -9 to kill and restart > it. > > this is with the new HUP behavior. I cross-checked the HUP processing. So far, I do not see why it hangs (and if it is related to the HUP processing). Can you reproduce it with debug log running. I guess no, but if so, could you provide me a log with ~1000 log lines before the hang? If debug log is no option, a stack trace from the abort would be great. 
Rainer From david at lang.hm Fri Jan 30 18:19:21 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 30 Jan 2009 09:19:21 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1233258033.19733.27.camel@rf10up.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> <1233258033.19733.27.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, Rainer Gerhards wrote: > Hi David, > > thanks for this note, but I think it is not related to the fix (I'll > think a bit harder about that, but so far I can not find any connection > between the two). > > The way the HUP is done is sub-optimal. Under typical load (one hup a > day), you don't see any issue. If you hup very frequently (like the once > a min you do) and have heavy traffic, that's another story. To solve > that case, some rework on the hup internals, actually even on the > interface definition, is needed. I'd hold all such work unless I found a > solution to the race bug - because it would have made the environment > even more different. Now that I have at least one issue, I think I can > go ahead and begin to introduce more intrusive changes again. > > In any case, I'll have a more in-depth look at the hup handlers. The new > non-restart type of hup should be almost resistant against the issue you > report. I was using the new non-restart type. I'll be doing more testing today and over the weekend. 
it's possible that I ended up with mixed versions with the modules again (just before going home last night I deleted them all and then did the install to make sure) David Lang > Rainer > > On Thu, 2009-01-29 at 20:56 -0800, david at lang.hm wrote: >> On Thu, 29 Jan 2009, Rainer Gerhards wrote: >> >>>> >>>> congratulations on tracking down a nasty and subtle issue. >>> >>> Thanks - but let's first see if this was the only issue and if things >>> run smooth everywhere. But it looks very promising. >>> >> >> bad news, on my system the HUP doesn't always reopen the files now. >> >> high speed box receiving messages via UDP, idle except for a gzip >> compressing the files (which are rotated once a min), the system runs fine >> for a few min (higher performance than before, it's now writing ~93,000 >> messages/sec instead of ~78,000 messages/sec), but it sometimes mangles >> handling a HUP and gets stuck. I have to do a kill -9 to kill and restart >> it. >> >> this is with the new HUP behavior. >> >> David Lang >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From david at lang.hm Fri Jan 30 18:28:56 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 30 Jan 2009 09:28:56 -0800 (PST) Subject: [rsyslog] rsyslog 4.1.4 - one (small) bug left In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com> Message-ID: On Fri, 30 Jan 2009, Rainer Gerhards wrote: > Hi all, > > I have now basically ported the race bugfix to all branches > (verification and double-check still in the works). While doing this, I > noticed that one small issue slipped my attention with yesterday's > 4.1.4 version.
If compiled with atomics, I unlock an already unlocked > mutex (which is destroyed with the very next statement) in msgDestruct. > That should not have any really bad effects (but you never know...). The > master branch is now updated, so you may want to pull a fixed version > from there. I will not do a new release just for this reason - it'll be > included in the next version. so 4.1.4 should be using the atomics for queue management not mutexes? David Lang > Please note that git as of now already contains all the race fix for all > branches, but mostly untested. Just in case if you'd like to get them > quickly. > > I will keep you posted. > > Rainer > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From rgerhards at hq.adiscon.com Fri Jan 30 17:28:47 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 30 Jan 2009 17:28:47 +0100 Subject: [rsyslog] rsyslog 4.1.4 - one (small) bug left In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FAE1@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > Sent: Friday, January 30, 2009 6:29 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog 4.1.4 - one (small) bug left > > On Fri, 30 Jan 2009, Rainer Gerhards wrote: > > > Hi all, > > > > I have now basically ported the race bugfix to all branches > > (verification and double-check still in the works). While doing this, > I > > noticed that one small issues slipped my attention with yesterday's > > 4.1.4 version. If compiled with atomics, I unlock an already unlocked > > mutex (which is destroyed with the very next statement) in > msgDestruct. > > That should not have any really bad effects (but you never know...). 
> The > > master branch is now updated, so you may want to pull a fixed version > > from there. I will not do a new release just for this reason - it'll > be > > included in the next version. > > so 4.1.4 should be using the atomics for queue management not mutexes? It depends... If atomics are available, they are the preferred method. If not available, the code falls back to mutexes. Rainer
From pieter.thysebaert at intec.ugent.be Wed Jan 14 13:37:31 2009 From: pieter.thysebaert at intec.ugent.be (pieter.thysebaert at intec.ugent.be) Date: Wed, 14 Jan 2009 13:37:31 +0100 (CET) Subject: [rsyslog] Property filter - output formatting Message-ID: <29691.212.190.198.36.1231936651.squirrel@webserver6.intec.ugent.be> Hello, I've started exploring rsyslog 3.20.2 As I have been toying around and looking at the example configurations, I have not been able to solve the following problem: how can I use a property filter to select an output file AND format the output using a defined template For instance: $template testtemplate,"%msg%" :syslogtag, contains, "test" /tmp/test.log;testtemplate Doesn't seem to be a supported syntax (it works when I leave off the ;testtemplate). I'm sorry if this is obvious, but how can I filter based on properties AND specify output formatting at the same time?
Thanks, Pieter From rgerhards at hq.adiscon.com Wed Jan 14 00:08:27 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Wed, 14 Jan 2009 00:08:27 +0100 Subject: [rsyslog] Property filter - output formatting In-Reply-To: <29691.212.190.198.36.1231936651.squirrel@webserver6.intec.ugent.be> References: <29691.212.190.198.36.1231936651.squirrel@webserver6.intec.ugent.be> Message-ID: <1231888107.22744.19.camel@localhost.localdomain> Hi Pieter, I just tried this out in lab. For me, it works. If I generate a message with logger -t test my message the message is properly dispatched. I guess that the problem actually is the tag, which I guess does not contain what you think it does (a frequent problem with many senders). Try this template $template testtemplate,"tag: '%syslogtag%', rawmsg: '%rawmsg%'\n" *.* /some/file;testtemplate and let us know the result. HTH Rainer On Wed, 2009-01-14 at 13:37 +0100, pieter.thysebaert at intec.ugent.be wrote: > Hello, > > I've started exploring rsyslog 3.20.2 > > As I have been toying around and looking at the example configurations, I > have not been able to solve the following problem: > > how can I use a property filter to select an output file AND format the > output using a defined template > > For instance: > > $template testtemplate,"%msg%" > > :syslogtag, contains, "test" /tmp/test.log;testtemplate > > Doesn't seem to be a supported syntax (it works when I leave off the > ;testtemplate). > > I'm sorry if this is obvious, but how can I filter based on properties AND > specify output formatting at the same time? 
> > Thanks, > Pieter > > > > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Wed Jan 14 17:14:45 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Wed, 14 Jan 2009 17:14:45 +0100 Subject: [rsyslog] rsyslog on LinkedIn Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F996@grfint2.intern.adiscon.com> Hi all, please pardon the shameless self-promotion. I have just created an rsyslog group on LinkedIn: http://www.linkedin.com/e/gis/1761607 It is an experiment. I've seen so many projects creating groups on that platform that I wonder if it would make sense to create one for rsyslog. My intent is not to replace any of our technical and discussion forums, but to open a new networking opportunity for those that are interested. I do not yet know if that's a good idea or not, but why not give it a try? ;) Back to our regular programming... Rainer From ray at jhax.net Wed Jan 14 12:50:48 2009 From: ray at jhax.net (Ray Whitmer) Date: Wed, 14 Jan 2009 04:50:48 -0700 Subject: [rsyslog] Use of application-level acks in RELP. Message-ID: <20090114045048.o2wpiannk4okcgw4@webmail.xmission.com> In my research of rsyslog to determine its suitability for a particular situation I have some questions left unanswered. I need relatively-guaranteed delivery. I will continue to review the available info including source code to see if I can answer the questions, but I hope it may be productive to ask questions here. In the documentation, you describe the situation where syslog silently loses TCP messages, not because the TCP protocol permits it but because the send function returns after delivering the message to a local buffer before it is actually delivered. But there is a more-fundamental reason an application-level ack is required.
An application can fail (someone trips over the power cord) between when the application receives the data and when it records it. 1. Does rsyslog send the ack in the RELP protocol after the message has been safely recorded in whatever queue has been configured or forwarded on so its delivery status is as safe as it will get (of course how safe depends upon options chosen), or was it only intended to solve the case of TCP buffering-based unreliability? 2. Presumably there is a client API that speaks RELP. Can it be configured to return an error to the client if there is no ACK (i.e. if the log it sent did not make it into the configured safe location which could be on a disk-based queue), or does it only retry? Where is this API? Certainly the TCP caching case you mention in your pages is one a user is more likely to be able to reproduce, but that is all the more reason for me to be concerned that the less-reproducible situations that could cause a message to occasionally become lost are handled correctly. From rgerhards at hq.adiscon.com Thu Jan 15 09:16:36 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 09:16:36 +0100 Subject: [rsyslog] Use of application-level acks in RELP. In-Reply-To: <20090114045048.o2wpiannk4okcgw4@webmail.xmission.com> References: <20090114045048.o2wpiannk4okcgw4@webmail.xmission.com> Message-ID: <1232007397.22744.27.camel@localhost.localdomain> Hi Ray, thanks for your excellent questions. I've also made a blog post out of them, as I think this needs some better visibility (and can be used for future reference). Just in case you are curious: http://blog.gerhards.net/2009/01/use-of-application-level-acks-in-relp.html (no need to read, all answers are inline below) On Wed, 2009-01-14 at 04:50 -0700, Ray Whitmer wrote: > In my research of rsyslog to determine its suitability for a > particular situation I have some questions left unanswered. I need > relatively-guaranteed delivery.
I will continue to review the > available info including source code to see if I can answer the > questions, but I hope it may be productive to ask questions here. > > In the documentation, you describe the situation where syslog silently > loses tcp messages, not because the tcp protocol permits it but > because the send function returns after delivering the message to a > local buffer before it is actually delivered. > > But there is a more-fundamental reason an application-level ack is > required. An application can fail (someone trips over the power cord) > between when the application receives the data and when it records it. > > 1. Does rsyslog send the ack in the RELP protocol occur after the > message has been safely recorded in whatever queue has been configured > or forwarded on so its delivery status is as safe as it will get (of > course how safe depends upon options chosen), or was it only intended > to solve the case of TCP buffering-based unreliability? RELP is designed to provide end-to-end reliability. The TCP buffering issue is just highlighted because it is so subtle that most people tend to overlook it. An application abort seems to be more obvious and RELP handles that. HOWEVER, that does not mean messages are necessarily recorded when the ACK is sent. It depends on the configuration. In RELP, the acknowledgment is sent after the reception callback has been called. This can be seen in the relevant RELP module. For rsyslog's imrelp, this means the callback returns after the message has been enqueued in the main message queue. It now depends on how that queue is configured. By default, messages are buffered in main memory. So when rsyslog aborts for some reason (or is terminated by user request) before this message is being processed, it is lost - while the sender still got a positive ACK. This is how things are done by default, and it is useful for many scenarios. Of course, it does not provide the audit-grade reliability that RELP aims for. 
But the default config needs to take care of the usual use case and this is not audit-grade reliability (just think of the numerous home systems that run rsyslog and should do so in the least intrusive way). If you are serious about your logs, you need to configure the engine to be fully reliable. The most important thing is a good understanding of the queue engine. You need to read and understand the rsyslog queue ( http://www.rsyslog.com/doc-queues.html ) docs, as they form the basis on which reliability can be built. The other thing you need to know is your exact requirements. Asking for reliability is easy, implementing it is not. The closer you get to 100% reliability (which you will never reach for one reason or the other) the more complex scenarios get. I am sure the original poster knows quite well what he wants, but I am often approached by people who just want to have it "totally reliable" ... but don't want to spend the fortune it requires (really - ever thought about the redundant data centers, power plants, satellite and sea links et al. you need for that?). So it is absolutely vital to have good requirements, which also include when loss is acceptable, and at what cost this comes. Once you have these requirements, an rsyslog configuration that matches them can be designed. At this point, I'd like to note that it may also be useful to consider rsyslog professional services ( http://www.rsyslog.com/doc-professional_support.html ) as it provides valuable aid during design and probably deployment of a solution (I can't go into the full depth of enterprise requirements here). To go back to the original question: RELP has almost everything that is needed, but configuring the whole system in an audit-grade way requires (ample) work. > 2. Presumably there is a client API that speaks RELP. Can it be > configured to return an error to the client if there is no ACK (i.e.
> if the log it sent did not make it into the configured safe location > which could be on a disk-based queue), or does it only retry? Where is > this API? The API is in librelp ( http://www.librelp.com/ ). But actually this is not what you are looking for. In rsyslog, an output module (here: omrelp) provides the status back to the caller. Then, configuration decides what happens. Messages may be discarded, sent to a different destination or retried. With omrelp, I think we have some hardcoded ways to preserve the message, but I have had no time yet to look this up in detail. In any case, RELP will not lose messages but may duplicate a few of them (within the current unacked window) if the remote peer simply dies. Again, this requires proper configuration of the rsyslog components. Even with that, you may lose messages if the local rsyslogd dies (not terminates, but dies for some unexpected reason, e.g. a segfault, kill -9 or whatever) but still has messages in a non-persisted queue. Again, this can be mitigated by proper configuration, but that must be designed. Also, it is very costly in terms of performance. A good reading on the subtleties can be found in the rsyslog mailing list archive (http://lists.adiscon.net/pipermail/rsyslog/2008-October/001224.html ). I suggest having a look at it. > > Certainly the TCP caching case you mention in your pages is one a user > is more likely to be able to reproduce, but that is all the more > reason for me to be concerned that the less-reproducible situations > that could cause a message to occasionally become lost are handled > correctly. I don't think app-abort is less reproducible: kill -9 `cat /var/run/rsyslog.pid` will do nicely. Actually, from feedback I received, many users seem to understand the implications of a program/system abort. But far fewer understand the issues inherent in TCP. Thus I am focusing so much on the latter. But of course, everything needs to be considered.
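As a rough illustration of the kind of configuration meant above, here is a minimal sketch of a main message queue that is kept on disk rather than in main memory, written with the legacy queue directives. It is not taken from this thread: the spool directory and the queue file name are made-up examples, and whether such a setup is sufficient depends entirely on the requirements.

    $WorkDirectory /var/spool/rsyslog     # example path (assumption), must exist and be writable
    $MainMsgQueueType Disk                # hold queued messages on disk instead of in main memory
    $MainMsgQueueFileName mainq           # example file name prefix (assumption) for the queue files
    $MainMsgQueueCheckpointInterval 1     # update the queue's bookkeeping information after every message

This trades a lot of throughput for durability, and it only covers the local queue; the sender side and the actions behind the queue need to be looked at separately.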
Read the thread about the reliable queue (really!). It goes great lengths, but still does not offer a full solution. Getting things reliable (or secure) is very, very challenging and requires in-depth knowledge. So I am glad you asked and provided an opportunity for this to be written :) Rainer From rgerhards at hq.adiscon.com Thu Jan 15 13:00:37 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 13:00:37 +0100 Subject: [rsyslog] redundant message in log files In-Reply-To: <49638EAD.5080104@redhat.com> References: <49638EAD.5080104@redhat.com> Message-ID: <1232020837.22744.28.camel@localhost.localdomain> Thanks, this one now finally is corrected, too (still catching up with vacation mail ;)). Will release it as part of 3.21.10. Rainer On Tue, 2009-01-06 at 18:02 +0100, Tomas Heinrich wrote: > Hi, > > we've received a bug report [1] regarding a message that started to > appear in the log files. The bug first appeared in version 3.21.5. > This patch [2] should fix it. 
> > Tomas > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=478612 > [2] http://pastebin.ca/1301001 > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From fjianella at gmail.com Thu Jan 15 15:45:53 2009 From: fjianella at gmail.com (Frank Ianella) Date: Thu, 15 Jan 2009 09:45:53 -0500 Subject: [rsyslog] uclibc compile failure Message-ID: <9f1ad2df0901150645u5cd90986k6b92a473beb73257@mail.gmail.com> hello all compiling stable and dev versions of rsyslog against uclibc-0.9.30 results in the following error: /home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:2995: undefined reference to `rpl_malloc' rsyslogd-syslogd.o: In function `legacyOptsEnq': /home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:1742: undefined reference to `rpl_malloc' rsyslogd-syslogd.o: In function `crunch_list': /home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:490: undefined reference to `rpl_malloc' /home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:502: undefined reference to `rpl_malloc' /home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:512: undefined reference to `rpl_malloc' rsyslogd-syslogd.o:/home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:1319: more undefined references to `rpl_malloc' follow ../runtime/.libs/librsyslog.a(librsyslog_la-wtp.o): In function `wtpStartWrkr': /home/build/project/sources/rsyslog-3.21.9/runtime/wtp.c:487: undefined reference to `pthread_yield' ../runtime/.libs/librsyslog.a(librsyslog_la-wtp.o): In function `wtpConstructFinalize': /home/build/project/sources/rsyslog-3.21.9/runtime/wtp.c:109: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-wti.o): In function `wtiSetDbgHdr': /home/build/project/sources/rsyslog-3.21.9/runtime/wti.c:456: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-wti.o): In function `wtiWorker': 
/home/build/project/sources/rsyslog-3.21.9/runtime/wti.c:370: undefined reference to `pthread_yield' ../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `queueAddLinkedList': /home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:528: undefined reference to `rpl_malloc' /home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:528: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `qConstructFixedArray': /home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:459: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `queueSetFilePrefix': /home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:2081: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `queueStart': /home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:1794: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-threads.o):/home/build/project/sources/rsyslog-3.21.9/runtime/../threads.c:60: more undefined references to `rpl_malloc' follow I recompiled uclibc with MALLOC_GLIBC_COMPAT=y but the result was the same. The only reference to this that I can find is in the rsyslog bug tracker but the patch listed there does not allow it to compile. Just wondering if anybody has a working patch or suggestion. TIA -Frank From danson at rackspace.com Thu Jan 15 18:28:57 2009 From: danson at rackspace.com (Daniel Anson) Date: Thu, 15 Jan 2009 11:28:57 -0600 Subject: [rsyslog] Baclogged files to disk are pretty slow Message-ID: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> I have been dealing with this problem for a few days now and perhaps I will be able to solicit some advice or help. Here is the issue. I have an rsyslog relay writing to a remote database server and caching to disk. 
The write to the database uses a MySQL stored procedure that can write about 4000 records per second. The rsyslog.conf parts are set up like so:

$ModLoad immark
$ModLoad imudp
$UDPServerAddress 172.16.12.138
$UDPServerRun 514
$ModLoad imtcp
$ModLoad imuxsock
$ModLoad imklog
$ModLoad ommysql.so

$template template1,"CALL SAT2_RSYSLOG_EVENT_INSERT('%timestamp:::date-mysql%', '%timegenerated:::date-mysql%', '%syslogfacility%', '%syslogpriority%', '%hostname%', '%syslogtag%', '%msg%')", sql

$WorkDirectory /rsyslog/work
$ActionQueueType LinkedList # use asynchronous processing
$ActionQueueFileName dbq # set file name, also enables disk mode
$ActionResumeRetryCount -1 # infinite retries on insert failure

*.* >172.16.2.238,rsyslog,syslogwriter,topsecret;template1

If I turn off the database, in this case I turned it off for almost a day, it backlogs nearly 1 GB worth of information. The problem is that it takes nearly 6 hours to catch back up from this. While catching up, it only uses about 1% of the proc. Bandwidth is not an issue as the fibre link is only about 50% saturated. Is there a way to force rsyslogd to consume more of the proc and move faster? I have placed a -20 nice value on the process in hopes that would help but it really has not. Is there a way to force rsyslogd to use a pool of MySQL connections or initiate a new connection each time a record is written?

Daniel M. Anson Linux Systems Engineer Rackspace Managed Hosting danson at rackspace.com Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited.
If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated. From rgerhards at hq.adiscon.com Thu Jan 15 18:45:09 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 18:45:09 +0100 Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9A8@grfint2.intern.adiscon.com> Mhhh... with the current design, it submits messages individually to the database. I think what you experience is simply the turn-around from the database call (no other idea what it could be). It doesn't use more CPU because the database layer seems not to return any faster. There has been some discussion on batching multiple statements together, but this is non-trivial. I lost funding and things like this need a corporate sponsor now (they are not of importance for the non-commercial user field...). You could try to run the action on its own queue and with multiple workers. That could (could!) improve performance. But it is just a guess. Do you have any chance to see how long the query takes inside the SQL engine? Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Daniel Anson > Sent: Thursday, January 15, 2009 6:29 PM > To: rsyslog at lists.adiscon.com > Subject: [rsyslog] Baclogged files to disk are pretty slow > > I have been dealing with this problem for a few days now and perhaps I > will be able to solicit some advice or help. Here is the issue. I > have > an rsyslog relay writing to a remote database server and caching to > disk. 
The write to the database uses a MySQL stored procedure that can > write about 4000 records per second. The rsyslog.conf parts are set up > like so: > > $ModLoad immark > $ModLoadd imudp > $UDPServerAddress 172.16.12.138 > $UDPServerRun 514 > $ModLoad imtcp > $ModLoad imuxsock > $ModLoad imklog > $ModLoad ommysql.so > > $template template1,"CALL > SAT2_RSYSLOG_EVENT_INSERT('%timestamp:::date-mysql%', > '%timegenerated:::date-mysql%', '%syslogfacility%', '%syslogpriority%', > '%hostname%', '%syslogtag%', '%msg%')", sql > > $WorkDirectory /rsyslog/work > $ActionQueueType LinkedList # use asynchronous processing > $ActionQueueFileName dbq # set file name, also enables disk mode > $ActionResumeRetryCount -1 # infinite retries on insert failure > > *.* >172.16.2.238,rsyslog,syslogwriter,topsecret;template1 > > If I turn off the database, in this case I turned it off for almost a > day, it backlogs nearly a 1 GB worth of information. The problem is > that it takes nearly 6 hours to catch back up from this. While > catching > up, it only uses about 1% of the proc. Bandwidth is not an issue as > the > fibre link is only about 50% saturated. Is there a way to force > rsyslogd to consume more of the proc and move faster. I have placed a > -20 nice value on the process in hopes that would help but it really > has > not. Is there a way to force rsyslogd to use a pool of MySQL > connections or intiate a new connection each time a record is written? > > > Daniel M. Anson > Linux Systems Engineer > Rackspace Managed Hosting > danson at rackspace.com > > > > > Confidentiality Notice: This e-mail message (including any attached or > embedded documents) is intended for the exclusive and confidential use > of the > individual or entity to which this message is addressed, and unless > otherwise > expressly indicated, is confidential and privileged information of > Rackspace. > Any dissemination, distribution or copying of the enclosed material is > prohibited. 
> If you receive this transmission in error, please notify us immediately > by e-mail > at abuse at rackspace.com, and delete the original message. > Your cooperation is appreciated. > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From lorenzo at sancho.ccd.uniroma2.it Thu Jan 15 18:58:37 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Thu, 15 Jan 2009 18:58:37 +0100 (CET) Subject: [rsyslog] rsyslog still crashes Message-ID: I've just tried rsyslog again on my 8 core mail server, and got the very same crash from September/October. I've restarted the server under valgrind control, and all seems to be running well... A good 2009 to all! Yours, lorenzo
+-------------------------+----------------------------------------------+
| Lorenzo M. Catucci      | Centro di Calcolo e Documentazione           |
| catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
|                         | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY  |
| Tel. +39 06 7259 2255   | Fax. +39 06 7259 2125                        |
+-------------------------+----------------------------------------------+
-------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done.
Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done. Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so Core was generated by `rsyslogd -c4'. Program terminated with signal 6, Aborted. 
[New process 22774] [New process 22776] [New process 22775] [New process 22773] [New process 22772] #0 0x00002b6037978ed5 in raise () from /lib/libc.so.6 (gdb) Thread 5 (process 22772): #0 0x00002b6037a0fce2 in select () from /lib/libc.so.6 #1 0x000000000040db53 in mainThread () at syslogd.c:2704 #2 0x000000000040ee56 in realMain (argc=, argv=) at syslogd.c:3631 #3 0x00002b60379651a6 in __libc_start_main () from /lib/libc.so.6 #4 0x000000000040a219 in _start () Thread 4 (process 22773): #0 0x00002b6037327fad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0 #1 0x0000000000432f5f in wtiWorker (pThis=0x685140) at wti.c:406 #2 0x000000000043172a in wtpWorker (arg=0x685140) at wtp.c:425 #3 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002b6037a165ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 3 (process 22775): #0 0x00002b6037a0fce2 in select () from /lib/libc.so.6 #1 0x00002b60380b59fd in runInput (pThrd=) at imuxsock.c:280 #2 0x00000000004436ff in thrdStarter (arg=0x6a5c80) at ../threads.c:139 #3 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002b6037a165ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 22776): #0 0x00002b603732a7db in read () from /lib/libpthread.so.0 #1 0x00002b60382ba1ef in klogLogKMsg () at linux.c:449 #2 0x00002b60382b9594 in runInput (pThrd=0x6aafc0) at imklog.c:224 #3 0x00000000004436ff in thrdStarter (arg=0x6aafc0) at ../threads.c:139 #4 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002b6037a165ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 22774): #0 0x00002b6037978ed5 in raise () from /lib/libc.so.6 #1 0x00002b603797a3f3 in abort () from /lib/libc.so.6 #2 0x0000000000423657 in sigsegvHdlr (signum=6) at debug.c:759 #3 #4 0x00002b6037978ed5 in raise () from /lib/libc.so.6 #5 0x00002b603797a3f3 in abort () from /lib/libc.so.6 #6 0x00002b6037971dc9 in __assert_fail () from /lib/libc.so.6 #7 0x000000000041ce78 in msgDestruct (ppThis=0x68ace8) at msg.c:330 #8 0x0000000000443036 in actionCallAction (pAction=0x68ac70, pMsg=0x6b2010) at ../action.c:774 #9 0x000000000040b2c7 in processMsgDoActions (pData=0x68ac70, pParam=0x41000e90) at syslogd.c:1140 #10 0x000000000041de78 in llExecFunc (pThis=0x68aae0, pFunc=0x40b270 , pParam=0x41000e90) at linkedlist.c:391 #11 0x000000000040add9 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183 #12 0x000000000043c4f7 in queueConsumerReg (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=) at queue.c:1598 #13 0x0000000000432fd0 in wtiWorker (pThis=0x6a3bb0) at wti.c:416 #14 0x000000000043172a in wtpWorker (arg=0x6a3bb0) at wtp.c:425 #15 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0 #16 0x00002b6037a165ad in clone () from /lib/libc.so.6 #17 0x0000000000000000 in ?? () (gdb) quit From hks.private at gmail.com Thu Jan 15 19:44:45 2009 From: hks.private at gmail.com ((private) HKS) Date: Thu, 15 Jan 2009 13:44:45 -0500 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: On Thu, Jan 15, 2009 at 12:58 PM, Lorenzo M. Catucci wrote: > I've just tried again rsyslog on my 8 core mail server, and got the very > same crash from september/october. I've restarted the server under valgrind > control, and all seems to be running well... > > A good 2009 to all! > > Yours, > > lorenzo Version you're using? 
-HKS From aoz.syn at gmail.com Thu Jan 15 20:11:09 2009 From: aoz.syn at gmail.com (RB) Date: Thu, 15 Jan 2009 12:11:09 -0700 Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> Message-ID: <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> Many more measurements are needed before declaring a conclusive cause, but on the surface it seems that your bottleneck is not rsyslog or the sending server but the database itself. Comments below. On Thu, Jan 15, 2009 at 10:28, Daniel Anson wrote: > I have been dealing with this problem for a few days now and perhaps I > will be able to solicit some advice or help. Here is the issue. I have > an rsyslog relay writing to a remote database server and caching to > disk. The write to the database uses a MySQL stored procedure that can > write about 4000 records per second. The rsyslog.conf parts are set up Is that 4000 TPS burst or sustained speed? > If I turn off the database, in this case I turned it off for almost a > day, it backlogs nearly a 1 GB worth of information. The problem is Roughly how many records? > that it takes nearly 6 hours to catch back up from this. While catching > up, it only uses about 1% of the proc. Bandwidth is not an issue as the What's the processor and disk load look like on your MySQL server? > fibre link is only about 50% saturated. Is there a way to force Presuming 50% is your bps, what was your PPS? Depending on how large your average event/transaction are, you may never see 100% due to small packets. > not. Is there a way to force rsyslogd to use a pool of MySQL > connections or intiate a new connection each time a record is written? 
Rainer confirmed my suspicion that rsyslog executes a single transaction per event, which is (as he also notes) sub-optimal for performance. Batching really should be about the same logic as the MARK functionality: every N foo, output "bar". Batching multiple actions per transaction is a classic query tuning technique and can be approached many ways, but you probably need to verify your database I/O is indeed the bottleneck. From danson at rackspace.com Fri Jan 16 00:01:19 2009 From: danson at rackspace.com (Daniel Anson) Date: Thu, 15 Jan 2009 17:01:19 -0600 Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> Message-ID: <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> A few things about the MySQL server itself: I have eliminated bandwidth, proc speed, and disk I/O as potential bottlenecks. The obvious bottleneck is the MySQL server. For a temporary solution, I have placed an rsyslog relay on the MySQL server. So: Client_message -> local_datacenter_relay -> remote_datacenter_relay -> MySQL_server The messages are traveling much faster (kudos to the socket programming there) as the remote relay writes to a local MySQL server. I do not believe this to be an optimal solution. In an earlier email, Rainer mentions and I quote: "You could try to run the action on its own queue and with multiple workers. That could (could!) improve performance. But it is just a guess. Do you have any chance to see how long the query takes inside the SQL engine?" MySQL will run about 4000 inserts per second (constant speed). I am willing to try what Rainer suggests; however, I am unsure how to direct specific actions to act on a queue. Any help is appreciated.
I know I could add the following two lines and create worker threads: $ActionQueueWorkerThreads 20 $MainMsgQueueWorkerThreads 20 Would I have to add additional lines to the config? My config once again looks like so: $ModLoad immark $ModLoad imudp $UDPServerAddress 172.16.12.138 $UDPServerRun 514 $ModLoad imtcp $ModLoad imuxsock $ModLoad imklog $ModLoad ommysql.so $template template1,"CALL SAT2_RSYSLOG_EVENT_INSERT('%timestamp:::date-mysql%', '%timegenerated:::date-mysql%', '%syslogfacility%', '%syslogpriority%', '%hostname%', '%syslogtag%', '%msg%')", sql $WorkDirectory /rsyslog/work $ActionQueueType LinkedList # use asynchronous processing $ActionQueueFileName dbq # set file name, also enables disk mode $ActionResumeRetryCount -1 # infinite retries on insert failure *.* >172.16.2.238,rsyslog,syslogwriter,topsecret;template1 I would hope that there is an easy solution as my next idea is to write some type of daemonized process that can insert messages from a pool of MySQL connections. I can achieve this in C but would rather find a solution inside of the configuration. Daniel M. Anson Linux Systems Engineer Rackspace Managed Hosting danson at rackspace.com -----Original Message----- From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of RB Sent: Thursday, January 15, 2009 1:11 PM To: rsyslog-users Subject: Re: [rsyslog] Baclogged files to disk are pretty slow Many more measurements are needed before declaring a conclusive cause, but on the surface it seems that your bottleneck is not rsyslog or the sending server but the database itself. Comments below. On Thu, Jan 15, 2009 at 10:28, Daniel Anson wrote: > I have been dealing with this problem for a few days now and perhaps I > will be able to solicit some advice or help. Here is the issue. I have > an rsyslog relay writing to a remote database server and caching to > disk.
The write to the database uses a MySQL stored procedure that can > write about 4000 records per second. The rsyslog.conf parts are set up Is that 4000 TPS burst or sustained speed? > If I turn off the database, in this case I turned it off for almost a > day, it backlogs nearly a 1 GB worth of information. The problem is Roughly how many records? > that it takes nearly 6 hours to catch back up from this. While catching > up, it only uses about 1% of the proc. Bandwidth is not an issue as the What's the processor and disk load look like on your MySQL server? > fibre link is only about 50% saturated. Is there a way to force Presuming 50% is your bps, what was your PPS? Depending on how large your average event/transaction are, you may never see 100% due to small packets. > not. Is there a way to force rsyslogd to use a pool of MySQL > connections or intiate a new connection each time a record is written? Ranier confirmed my suspicion that rsyslog executes a single transaction per event, which is (as he also notes) sub-optimal for performance. Batching really should be about the same logic as the MARK functionality: every N foo, output "bar". Multiple actions per transaction (batching) is a classic query tuning technique and can be approached many ways, but you probably need to verify your database I/O is indeed the bottleneck. _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com
From mbiebl at gmail.com Fri Jan 16 01:20:22 2009 From: mbiebl at gmail.com (Michael Biebl) Date: Fri, 16 Jan 2009 01:20:22 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: 2009/1/15 (private) HKS : > On Thu, Jan 15, 2009 at 12:58 PM, Lorenzo M. Catucci > wrote: >> I've just tried again rsyslog on my 8 core mail server, and got the very >> same crash from september/october. I've restarted the server under valgrind >> control, and all seems to be running well... >> >> A good 2009 to all! >> >> Yours, >> >> lorenzo > > > Version you're using? Given the -c4 command line argument, I'd expect it to be 4.1.3. Sounds familiar to http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=509292 (which is 3.18.6). It seems to be a more general problem with multi core (= very fast??) systems. Cheers, Michael -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 01:37:14 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 01:37:14 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: On Thu, 15 Jan 2009, (private) HKS wrote: pH> pH> Version you're using? pH> git origin/master branch as of today. Sorry for forgetting to mention! +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Thu Jan 15 20:06:06 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 20:06:06 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: <1232046366.22744.34.camel@localhost.localdomain> On Fri, 2009-01-16 at 01:20 +0100, Michael Biebl wrote: > Given the -c4 command line argument, I'd expect it to be 4.1.3. > > Sounds familiar to > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=509292 (which is > 3.18.6). > > It seems to be a more general problem with multi core (= very fast??) systems. Yes, that is what my analysis so far points to. It's also part of the problem, because I do not have very fast hardware to reproduce the issue (and it is also not easy to reliably reproduce if you have...). I've gotten a couple of reports (I think most on the mailing list) on such problems and all they have in common is 4+ core machines. I'll try to get hold based on what Lorenzo submits. In his environment, the problem seems to occur most reliably (he probably has the fastest machine...). Lorenzo: details follow soon. Rainer From rgerhards at hq.adiscon.com Thu Jan 15 20:14:18 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 20:14:18 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: <1232046859.22744.39.camel@localhost.localdomain> On Thu, 2009-01-15 at 18:58 +0100, Lorenzo M. Catucci wrote: > I've just tried again rsyslog on my 8 core mail server, and got the very > same crash from september/october. So, without valgrind, can you reproduce the issue each time you start it? That would be very useful. > I've restarted the server under > valgrind control, and all seems to be running well... I guess the issue here is that valgrind slows down things and also simulates (I think) 2 CPUs only. > A good 2009 to all! same to you! 
Thanks for being persistent with this issue (it is beginning to drive me crazy). From what I have learned so far we seem to have a race condition that causes memory corruption. The backtrace you include also points in that direction. Those few cases where I got a usable backtrace all point to the very same location. However, that does not mean this location has the bug. It seems to occur some time earlier, and manifests when the message is destructed. It could be a double-free or even some wild memory access that accidentally overwrites some structures. If we are able to get a stable repro, and we are able to run with at least some minimal diagnostics, we may be much better off tackling that beast. First step is to see that we get a stable repro. If we do, I need to think about minimal debug. The full debugging system makes the bug disappear, I think because it changes the timing. Rainer From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 12:28:59 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 12:28:59 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1232046859.22744.39.camel@localhost.localdomain> References: <1232046859.22744.39.camel@localhost.localdomain> Message-ID: On Thu, 15 Jan 2009, Rainer Gerhards wrote: RG> On Thu, 2009-01-15 at 18:58 +0100, Lorenzo M. Catucci wrote: RG> > I've just tried again rsyslog on my 8 core mail server, and got the very RG> > same crash from september/october. RG> RG> So, without valgrind, can you reproduce the issue each time you start RG> it? That would be very useful. RG> Yes: any time I start a free-running instance, I get the very same segmentation fault and core-file to backtrace. RG> RG> > I've restarted the server under RG> > valgrind control, and all seems to be running well... RG> RG> I guess the issue here is that valgrind slows down things and also RG> simulates (I think) 2 CPUs only.
RG> Right, I didn't know valgrind both limited the CPU bandwidth and the (v)CPU number, but either of them would hide the existing race condition. RG> RG> From what I have learned so far we seem to have a race condition that RG> causes memory corrupt. The backtrace you include also points into that RG> direction. Those few cases where I got a usable backtrace all point to RG> the very same location. However, that does not mean this location has RG> the bug. It seems to occur some time earlier, and manifests when the RG> message is destructed. It could be a double-free or even some wild RG> memory access that accidently overwrites some structures. RG> RG> If we are able to get a stable repro, and we are able to run with at RG> least some minimal diagnostics, we may be much better of tackeling that RG> beast. RG> RG> First step is to see that we get a stable repro. If we do, I need to RG> think about minimal debug. The full debugging system makes the bug RG> disappear, I think because it changes the timing. RG> I don't think we could hope for a stable reproducer for a heisenbug... all I can provide is a very high throughput system generating a very high local message rate. As a matter of fact, this rsyslog instance is acting as a forwarder to a remote instance that didn't suffer any crash. The only differences between the engines' configurations are: 1. the remote logs to a postgres instance instead of spool files, 2. the remote runs only the postgresql instance and the logger My gut feeling is that the different behaviour doesn't come from any of these differences, but from the different memory path taken by the messages, which in the remote case are serialised from the underlying network transport. We'll see! Yours, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O.
Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 12:44:53 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 12:44:53 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 12:29 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Thu, 15 Jan 2009, Rainer Gerhards wrote: > > RG> On Thu, 2009-01-15 at 18:58 +0100, Lorenzo M. Catucci wrote: > RG> > I've just tried again rsyslog on my 8 core mail server, and got > the very > RG> > same crash from september/october. > RG> > RG> So, without valgrind, can you reproduce the issue each time you > start > RG> it? That would be very useful. > RG> > > Yes: any time I start a free-running instance, I get the very same > segmentation fault and core-file to backtrace. > > RG> > RG> > I've restarted the server under > RG> > valgrind control, and all seems to be running well... > RG> > RG> I guess the issue here is that valgrind slows down things and also > RG> simulates (I think) 2 CPUs only. > RG> > > Right, I didn't know valgrind both limited the CPU bandwidth and the > (v)CPU number, but any of them would hide the existing race condition Actually, valgrind executes the app in a virtual CPU/Memory environment. So this is *quite different* from the real machine, but nevertheless extremely useful in most cases. So while in theory the actual hardware should not affect the valgrind outcome, my earlier debugging has shown that it does. Thus my first try is always valgrind.
But it seems not to help here as we have seen... > RG> > RG> From what I have learned so far we seem to have a race condition > that > RG> causes memory corrupt. The backtrace you include also points into > that > RG> direction. Those few cases where I got a usable backtrace all point > to > RG> the very same location. However, that does not mean this location > has > RG> the bug. It seems to occur some time earlier, and manifests when > the > RG> message is destructed. It could be a double-free or even some wild > RG> memory access that accidently overwrites some structures. > RG> > RG> If we are able to get a stable repro, and we are able to run with > at > RG> least some minimal diagnostics, we may be much better of tackeling > that > RG> beast. > RG> > RG> First step is to see that we get a stable repro. If we do, I need > to > RG> think about minimal debug. The full debugging system makes the bug > RG> disappear, I think because it changes the timing. > RG> > > I don't think we could hope for a stable reproducer for an heisen- > bug... Of course not 100%. But what you have sounds good enough. I must now see that/how I can change the system so that we have some additional instrumentation while the bug is still there. I'll first look at some compile options. Is it OK for you if I just send some messages to stdout? > all I can provide is a very high throughput system generating a very > high > local message rate. As a matter of facts, this rsyslog instance is > acting as a forwader to a remote instance that didn't suffer any crash. > > The only differences between the engines' configurations are: > 1. the remote logs to a postgres instance instead of spool files, > 2. 
the remote does just run the postgresql instance and the logger > > My gut feeling is that the different behaviour doesn't come from any of > these differences, but from the different memory-path taken from the > messages, which in the remote case are serialised from the underlying > network transport. This may be... Rainer From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 13:01:47 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 13:01:47 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> RG> Of course not 100%. But what you have sounds good enough. I must now see RG> that/how I can change the system so that we have some additional RG> instrumentation while the bug is still there. I'll first look at some RG> compile options. Is it OK for you if I just send some messages to RG> stdout? RG> Yes, be it stdout... I'm eager to have an rsyslog instance running well, since I've really liked what I've seen (with the small exception of the crashes!) See you soon, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. 
+39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 15:22:02 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 15:22:02 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> Lorenzo, I have created a new branch "raceDebug" and done a first commit to it. The change is very lightweight. Please pull, compile as usual and give it a try. It spits out some info to stdout from time to time (hopefully). I am not sure if it aborts, depending on the output it may or may not. Even if we get messages, they are probably not enough to pinpoint the bug, but I wanted to do something very light to see if the bug stays. Feedback appreciated. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 1:02 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > RG> > RG> Of course not 100%. But what you have sounds good enough. I must > now see > RG> that/how I can change the system so that we have some additional > RG> instrumentation while the bug is still there. I'll first look at > some > RG> compile options. Is it OK for you if I just send some messages to > RG> stdout? > RG> > > Yes, be it stdout... I'm eager to have an rsyslog instance running > well, > since I've really liked what I've seen (with the small exception of the > crashes!) > > See you soon, > > lorenzo > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Universit? 
degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From pieter.thysebaert at intec.ugent.be Fri Jan 16 15:07:19 2009 From: pieter.thysebaert at intec.ugent.be (pieter.thysebaert at intec.ugent.be) Date: Fri, 16 Jan 2009 15:07:19 +0100 (CET) Subject: [rsyslog] (no subject) Message-ID: <56908.212.190.198.36.1232114839.squirrel@webserver6.intec.ugent.be> Hello, I've found on-line claims that rsyslog can be compiled (and maybe even runs ok?) on HP-UX. However, I've not found too much information about this, so I'd like to ask: has anyone been able to compile (and run) rsyslog 3.20.2 on HP-UX 11? If so, does it need patching? What packages are required to build it successfully? (only HP software or gcc + gnu tools?) I'm asking because a colleague briefly attempted to configure the package on hpux UX11.11, and configure ended with > checking for pthread.h... yes > checking for pthread_create in -lpthread... no Any success stories out there? Thanks! Pieter From aoz.syn at gmail.com Fri Jan 16 16:19:39 2009 From: aoz.syn at gmail.com (RB) Date: Fri, 16 Jan 2009 08:19:39 -0700 Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> Message-ID: <4255c2570901160719o4aa3bc6bk9813225374bfc53c@mail.gmail.com> On Thu, Jan 15, 2009 at 16:01, Daniel Anson wrote: > I would hope that there is an easy solution as my next idea is to write > some type of daemonized process that can insert messages from a pool of > MySQL connections. 
I can achieve this in C but would rather hopefully > find a solution inside of the configuration. Short of implementing the queue/worker configuration (no idea how), it seems the only current option would be to implement something of the sort, either by an update to the ommysql module (optimal, as it gets your code supported by someone else for its lifetime) or by some external program. I'd think an optimal external solution would be some sort of relp2mysql bridge, but suspect that would end up reimplementing a good chunk of rsyslog. From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 16:22:45 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 16:22:45 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> Lorenzo, RG> RG> I have created a new branch "raceDebug" and done a first commit to it. RG> The change is very lightweight. Please pull, compile as usual and give RG> it a try. It spits out some info to stdout from time to time RG> (hopefully). I am not sure if it aborts, depending on the output it RG> may or may not. Even if we get messages, they are probably not enough RG> to pinpoint the bug, but I wanted to do something very light to see if RG> the bug stays. RG> RG> Feedback appreciated. RG> Rainer, I've just checked out the branch; I've run configure with the following command line: ./configure --prefix=/usr --enable-mysql --enable-pgsql --enable-mail --enable-imfile --enable-debug --enable-rtinst --enable-valgrind --no-create --no-recursion From "git diff -r HEAD^ HEAD" I've seen an #if 0 section in the commit. Let me know if you'd prefer that I change it to #if 1.
I've just started rsyslogd with rsyslogd -c4 -n in a screen session, with the same configuration files I've been using since September. Since both the "rsyslogd -c4 -n" and the later "rsyslogd -c4 -d" invocations crashed very quickly, I've restarted it once more with stdout redirected to a logfile, and now it's running. Will let you know if it crashes once more. Yours, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 16:33:04 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 16:33:04 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 4:23 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> Lorenzo, > RG> > RG> I have created a new branch "raceDebug" and done a first commit to > it. > RG> The change is very lightweight. Please pull, compile as usual and > give > RG> it a try. It spits out some info to stdout from time to time > RG> (hopefully). I am not sure if it aborts, depending on the output it > RG> may or may not.
Even if we get messages, they are probably not > enough > RG> to pinpoint the bug, but I wanted to do something very light to see > if > RG> the bug stays. > RG> > RG> Feedback appreciated. > RG> > > Rainer, I've just checked-out the branch; I've run configure with the > following command line: > > ./configure --prefix=/usr --enable-mysql --enable-pgsql --enable-mail > --enable-imfile --enable-debug --enable-rtinst --enable-valgrind > --no-create --no-recursion > > From "git diff -r HEAD^ HEAD" I've seen an #if 0 section in the > commit. > Let me know if you'd prefer if I change it to #if 1. Mmmhh... you can use debug. Yes, please then change it to 1. > > I've just started rsyslogd with rsyslogd -c4 -n on a screen session, > with > the same configuration files I'm using since september. > > Since both the "rsyslogd -c4 -n" and the later "rsyslogd -c4 -d" > invocation crashed very quickly, I've restarted it once more with > stdout > redirected to a a logfile, and now it's running. Will let you know if > it > crashes once more. That sounds good. Do you happen to have the output from those crashes? Anyway, I will be interested in what it now comes up with. As a side-note, I have introduced another race by calling the library functions. There is always some good and bad. The regular debugging system prevents this problem by protecting the writes with mutexes. That, however, affects the timing and thus we do not see the real issue. So what I have done is bad, but may be useful. I forgot to mention that with my last post... Rainer > > Yours, > > lorenzo > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. 
+39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 17:07:30 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 17:07:30 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> RG> That sounds good. Do you happen to have the output from those crashes? RG> The -n crash was completely silent; the -d run was chatty (as expected); with stdout redirected, it took a lot more time to crash, but here are both the logfile and the gdb backtrace. Yours, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done.
Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done. Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so Core was generated by `rsyslogd -c4 -n'. Program terminated with signal 11, Segmentation fault. 
[New process 19309] [New process 19311] [New process 19310] [New process 19308] [New process 19307] #0 0x000000000041cb79 in msgDestruct (ppThis=0x68ae18) at msg.c:354 354 if(strcmp((char*)(((obj_t*)pThis)->pObjInfo->pszID), "msg")) { (gdb) Thread 5 (process 19307): #0 0x00002af4d1020ce2 in select () from /lib/libc.so.6 #1 0x000000000040db93 in mainThread () at syslogd.c:2704 #2 0x000000000040ee96 in realMain (argc=, argv=) at syslogd.c:3631 #3 0x00002af4d0f761a6 in __libc_start_main () from /lib/libc.so.6 #4 0x000000000040a259 in _start () Thread 4 (process 19308): #0 0x00002af4d0938fad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0 #1 0x0000000000432f9f in wtiWorker (pThis=0x685270) at wti.c:406 #2 0x000000000043176a in wtpWorker (arg=0x685270) at wtp.c:425 #3 0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002af4d10275ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 3 (process 19310): #0 0x00002af4d1020ce2 in select () from /lib/libc.so.6 #1 0x00002af4d16c69fd in runInput (pThrd=) at imuxsock.c:280 #2 0x000000000044373f in thrdStarter (arg=0x6a5db0) at ../threads.c:139 #3 0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002af4d10275ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 19311): #0 0x00002af4d093b7db in read () from /lib/libpthread.so.0 #1 0x00002af4d18cb1ef in klogLogKMsg () at linux.c:449 #2 0x00002af4d18ca594 in runInput (pThrd=0x6a9020) at imklog.c:224 #3 0x000000000044373f in thrdStarter (arg=0x6a9020) at ../threads.c:139 #4 0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002af4d10275ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 19309): #0 0x000000000041cb79 in msgDestruct (ppThis=0x68ae18) at msg.c:354 #1 0x0000000000443076 in actionCallAction (pAction=0x68ada0, pMsg=0x2aaaac0008c0) at ../action.c:774 #2 0x000000000040b307 in processMsgDoActions (pData=0x68ada0, pParam=0x41000e90) at syslogd.c:1140 #3 0x000000000041deb8 in llExecFunc (pThis=0x68ac10, pFunc=0x40b2b0 , pParam=0x41000e90) at linkedlist.c:391 #4 0x000000000040ae19 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183 #5 0x000000000043c537 in queueConsumerReg (pThis=0x690050, pWti=0x6a3ce0, iCancelStateSave=) at queue.c:1598 #6 0x0000000000433010 in wtiWorker (pThis=0x6a3ce0) at wti.c:416 #7 0x000000000043176a in wtpWorker (arg=0x6a3ce0) at wtp.c:425 #8 0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0 #9 0x00002af4d10275ad in clone () from /lib/libc.so.6 #10 0x0000000000000000 in ?? () (gdb) quit -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. 
Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done. Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so Core was generated by `rsyslogd -c4 -d'. Program terminated with signal 11, Segmentation fault. 
[New process 20676] [New process 20678] [New process 20677] [New process 20675] [New process 20674] #0 0x000000000041cb79 in msgDestruct (ppThis=0x68ace8) at msg.c:354 354 if(strcmp((char*)(((obj_t*)pThis)->pObjInfo->pszID), "msg")) { (gdb) Thread 5 (process 20674): #0 0x00002ab1af573ce2 in select () from /lib/libc.so.6 #1 0x000000000040db93 in mainThread () at syslogd.c:2704 #2 0x000000000040ee96 in realMain (argc=, argv=) at syslogd.c:3631 #3 0x00002ab1af4c91a6 in __libc_start_main () from /lib/libc.so.6 #4 0x000000000040a259 in _start () Thread 4 (process 20675): #0 0x00002ab1aee8bfad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0 #1 0x0000000000432f9f in wtiWorker (pThis=0x685140) at wti.c:406 #2 0x000000000043176a in wtpWorker (arg=0x685140) at wtp.c:425 #3 0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002ab1af57a5ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 3 (process 20677): #0 0x00002ab1af573ce2 in select () from /lib/libc.so.6 #1 0x00002ab1afc199fd in runInput (pThrd=) at imuxsock.c:280 #2 0x000000000044373f in thrdStarter (arg=0x6a5c80) at ../threads.c:139 #3 0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002ab1af57a5ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 20678): #0 0x00002ab1aee8e7db in read () from /lib/libpthread.so.0 #1 0x00002ab1afe1e1ef in klogLogKMsg () at linux.c:449 #2 0x00002ab1afe1d594 in runInput (pThrd=0x6a8ef0) at imklog.c:224 #3 0x000000000044373f in thrdStarter (arg=0x6a8ef0) at ../threads.c:139 #4 0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002ab1af57a5ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 20676): #0 0x000000000041cb79 in msgDestruct (ppThis=0x68ace8) at msg.c:354 #1 0x0000000000443076 in actionCallAction (pAction=0x68ac70, pMsg=0x6aee30) at ../action.c:774 #2 0x000000000040b307 in processMsgDoActions (pData=0x68ac70, pParam=0x41000e90) at syslogd.c:1140 #3 0x000000000041deb8 in llExecFunc (pThis=0x68aae0, pFunc=0x40b2b0 , pParam=0x41000e90) at linkedlist.c:391 #4 0x000000000040ae19 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183 #5 0x000000000043c537 in queueConsumerReg (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=) at queue.c:1598 #6 0x0000000000433010 in wtiWorker (pThis=0x6a3bb0) at wti.c:416 #7 0x000000000043176a in wtpWorker (arg=0x6a3bb0) at wtp.c:425 #8 0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0 #9 0x00002ab1af57a5ad in clone () from /lib/libc.so.6 #10 0x0000000000000000 in ?? () (gdb) quit -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. 
Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done. Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so Core was generated by `rsyslogd -c4 -d'. Program terminated with signal 6, Aborted. 
[New process 21096] [New process 21098] [New process 21097] [New process 21095] [New process 21094] #0 0x00002ac0a65dded5 in raise () from /lib/libc.so.6 (gdb) Thread 5 (process 21094): #0 0x00002ac0a6674ce2 in select () from /lib/libc.so.6 #1 0x000000000040db93 in mainThread () at syslogd.c:2704 #2 0x000000000040ee96 in realMain (argc=, argv=) at syslogd.c:3631 #3 0x00002ac0a65ca1a6 in __libc_start_main () from /lib/libc.so.6 #4 0x000000000040a259 in _start () Thread 4 (process 21095): #0 0x00002ac0a5f8cfad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0 #1 0x0000000000432f9f in wtiWorker (pThis=0x685140) at wti.c:406 #2 0x000000000043176a in wtpWorker (arg=0x685140) at wtp.c:425 #3 0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002ac0a667b5ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 3 (process 21097): #0 0x00002ac0a6674ce2 in select () from /lib/libc.so.6 #1 0x00002ac0a6d1a9fd in runInput (pThrd=) at imuxsock.c:280 #2 0x000000000044373f in thrdStarter (arg=0x6a5c80) at ../threads.c:139 #3 0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002ac0a667b5ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 21098): #0 0x00002ac0a5f8f7db in read () from /lib/libpthread.so.0 #1 0x00002ac0a6f1f1ef in klogLogKMsg () at linux.c:449 #2 0x00002ac0a6f1e594 in runInput (pThrd=0x6a8ef0) at imklog.c:224 #3 0x000000000044373f in thrdStarter (arg=0x6a8ef0) at ../threads.c:139 #4 0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002ac0a667b5ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 21096): #0 0x00002ac0a65dded5 in raise () from /lib/libc.so.6 #1 0x00002ac0a65df3f3 in abort () from /lib/libc.so.6 #2 0x0000000000423697 in sigsegvHdlr (signum=6) at debug.c:759 #3 #4 0x00002ac0a65dded5 in raise () from /lib/libc.so.6 #5 0x00002ac0a65df3f3 in abort () from /lib/libc.so.6 #6 0x00002ac0a65d6dc9 in __assert_fail () from /lib/libc.so.6 #7 0x000000000043a4be in queueChkDiscardMsg (pThis=0x68ff20, iQueueSize=0, bRunsDA=0, pUsr=0x2aaaac002e30) at queue.c:1393 #8 0x000000000043bde3 in queueDequeueConsumable (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=0) at queue.c:1478 #9 0x000000000043c4f1 in queueConsumerReg (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=0) at queue.c:1597 #10 0x0000000000433010 in wtiWorker (pThis=0x6a3bb0) at wti.c:416 #11 0x000000000043176a in wtpWorker (arg=0x6a3bb0) at wtp.c:425 #12 0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0 #13 0x00002ac0a667b5ad in clone () from /lib/libc.so.6 #14 0x0000000000000000 in ?? () (gdb) quit From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 17:10:29 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 17:10:29 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote: LMC> LMC> The -n crash was completely silent; the -d run was chatty (as expected); LMC> with stdout redirected, it took a lot more time to crash, but here are LMC> both the logfile and the gdb backtrace. LMC> As for the last crash, I found on the screen session the line: rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed. 
since I forgot to redirect stderr too. Yours, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 17:17:25 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 17:17:25 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C7@grfint2.intern.adiscon.com> OK, this, together with the others, is evidence that something is running really wild and overwriting memory blocks. The reason this message did not appear earlier is that I disabled the check in DestroyMsg() and let it return even though I then know memory is corrupted. So what you see here is a follow-up error. The good news, I think, is that the corruption appears (though I may be fooling myself) to happen in close temporal proximity to the abort. That would be really good news. Let me think a bit about the situation; I'll probably come up with further instrumentation. The issue is that I'd potentially need to output one or even two log lines per message, and that creates other sync issues. Plus, I don't know whether that would overrun your disk (depending on the workload, which seems to be quite high). Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M.
Catucci > Sent: Friday, January 16, 2009 5:10 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote: > > LMC> > LMC> The -n crash was completely silent; the -d run was chatty (as > expected); > LMC> with stdout redirected, it took a lot more time to crash, but here > are > LMC> both the logfile and the gdb backtrace. > LMC> > > As for the last crash, I found on the screen session the line: > > rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) > ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed. > > since I forgot to redirect stderr too. > > Yours, > > lorenzo > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From rgerhards at hq.adiscon.com Fri Jan 16 17:19:34 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 17:19:34 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> Lorenzo, one thing: can you change the actionqueuemode to "direct" just for a short period? I would be very interested to see what happens. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M.
Catucci > Sent: Friday, January 16, 2009 5:10 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote: > > LMC> > LMC> The -n crash was completely silent; the -d run was chatty (as > expected); > LMC> with stdout redirected, it took a lot more time to crash, but here > are > LMC> both the logfile and the gdb backtrace. > LMC> > > As for the last crash, I found on the screen session the line: > > rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) > ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed. > > since I forgot to redirect stderr too. > > Yours, > > lorenzo > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From rgerhards at hq.adiscon.com Fri Jan 16 17:47:02 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 17:47:02 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com> Lorenzo and others: I have hopefully got hold of a system today on which I can reproduce the problem. I am setting it up right now.
I have also written a stub wiki page with information useful for hunting this bug: http://wiki.rsyslog.com/index.php/V3_Race_Condition_Hunt_Page Lorenzo, can you please double-check that I have indeed used the right config? All others: if you can add scenarios/information, please do. I'll try to reproduce the problem as soon as the system is ready. I hope it will work... Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > Sent: Friday, January 16, 2009 5:20 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > Lorenzo, > > one thing: can you change the actionqueuemode to "direct" just for a > short period? I would be very interested to see what happens. > > Rainer > > > -----Original Message----- > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > > Sent: Friday, January 16, 2009 5:10 PM > > To: rsyslog-users > > Subject: Re: [rsyslog] rsyslog still crashes > > > > On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote: > > > > LMC> > > LMC> The -n crash was completely silent; the -d run was chatty (as > > expected); > > LMC> with stdout redirected, it took a lot more time to crash, but > here > > are > > LMC> both the logfile and the gdb backtrace. > > LMC> > > > > As for the last crash, I found on the screen session the line: > > > > rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) > > ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed. > > > > since I forgot to redirect stderr too. > > > > Yours, > > > > lorenzo > > > > +-------------------------+------------------------------------------ > -- > > --+ > > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > > | > > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor > > Vergata" | > > | | Via O. Raimondo 18 ** I-00173 ROMA ** > > ITALY | > > | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 > > | > > +-------------------------+------------------------------------------ > -- > > --+ > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 17:52:28 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 17:52:28 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> Lorenzo, RG> RG> one thing: can you change the actionqueuemode to "direct" just for a RG> short period. I would be very interested to see what happens. RG> Very short period... it crashed almost as soon as it started... I'm enclosing both the log and the backtrace. See you soon, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"...
Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. 
Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Core was generated by `rsyslogd -c4 -d'. Program terminated with signal 11, Segmentation fault. [New process 27339] [New process 27341] [New process 27340] [New process 27338] #0 0x00002b034900f030 in strlen () from /lib/libc.so.6 (gdb) Thread 4 (process 27338): #0 0x00002b03489774c5 in __lll_unlock_wake () from /lib/libpthread.so.0 #1 0x00002b0348973ff9 in _L_unlock_56 () from /lib/libpthread.so.0 #2 0x00002b0348973c56 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0 #3 0x0000000000422a09 in dbgprint (pObj=, pszMsg=0x7fff6256dea0 " X ", lenMsg=3) at debug.c:157 #4 0x0000000000422c33 in dbgprintf (fmt=) at debug.c:892 #5 0x000000000040d522 in init () at syslogd.c:2207 #6 0x000000000040da79 in mainThread () at syslogd.c:2954 #7 0x000000000040ee96 in realMain (argc=, argv=) at syslogd.c:3631 #8 0x00002b0348fb21a6 in __libc_start_main () from /lib/libc.so.6 #9 0x000000000040a259 in _start () Thread 3 (process 27340): #0 0x00002b034905cce2 in select () from /lib/libc.so.6 #1 0x00002b03497029fd in runInput (pThrd=) at imuxsock.c:280 #2 0x000000000044377f in thrdStarter (arg=0x6a3b90) at ../threads.c:139 #3 0x00002b0348970fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002b03490635ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 27341): #0 0x00002b03489777db in read () from /lib/libpthread.so.0 #1 0x00002b03499071ef in klogLogKMsg () at linux.c:449 #2 0x00002b0349906594 in runInput (pThrd=0x6a6b90) at imklog.c:224 #3 0x000000000044377f in thrdStarter (arg=0x6a6b90) at ../threads.c:139 #4 0x00002b0348970fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002b03490635ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 27339): #0 0x00002b034900f030 in strlen () from /lib/libc.so.6 #1 0x00002b0348fdbcb1 in vfprintf () from /lib/libc.so.6 #2 0x00002b0348fe1c08 in fprintf () from /lib/libc.so.6 #3 0x000000000041ce7d in msgDestruct (ppThis=) at msg.c:350 #4 0x000000000044283a in actionCallDoAction (pAction=0x6856d0, pMsg=0x6a41f0) at ../action.c:495 #5 0x0000000000439c9c in qAddDirect (pThis=0x6857e0, pUsr=0x6a41f0) at queue.c:939 #6 0x000000000043dd83 in queueEnqObj (pThis=0x6857e0, flowCtlType=, pUsr=0x6a41f0) at queue.c:1016 #7 0x0000000000442d63 in actionWriteToAction (pAction=0x6856d0) at ../action.c:672 #8 0x00000000004430d0 in actionCallAction (pAction=0x6856d0, pMsg=0x6a41f0) at ../action.c:778 #9 0x000000000040b307 in processMsgDoActions (pData=0x6856d0, pParam=0x407ffe90) at syslogd.c:1140 #10 0x000000000041def8 in llExecFunc (pThis=0x685540, pFunc=0x40b2b0 , pParam=0x407ffe90) at linkedlist.c:391 #11 0x000000000040ae19 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183 #12 0x000000000043c577 in queueConsumerReg (pThis=0x68cc80, pWti=0x6a1030, iCancelStateSave=) at queue.c:1598 #13 0x0000000000433050 in wtiWorker (pThis=0x6a1030) at wti.c:416 #14 0x00000000004317aa in wtpWorker (arg=0x6a1030) at wtp.c:425 #15 0x00002b0348970fc7 in start_thread () from /lib/libpthread.so.0 #16 0x00002b03490635ad in clone () from /lib/libc.so.6 #17 0x0000000000000000 in ?? () (gdb) quit -------------- next part -------------- 4037.620068405:main thread: Writing pidfile /var/run/rsyslogd.pid. 4037.620491470:main thread: rsyslog 4.1.3 - called init() 4037.620502795:main thread: Unloading non-static modules. 4037.620513481:main thread: module lmnet NOT unloaded because it still has a refcount of 3 4037.620522445:main thread: Clearing templates. 
4037.620569724:main thread: cfline: '$ModLoad imuxsock # provides support for local system logging' 4037.620585477:main thread: Requested to load module 'imuxsock' 4037.620596298:main thread: loading module '/usr/lib/rsyslog/imuxsock.so' 4037.620662954:main thread: imuxsock version 4.1.3 initializing 4037.620699263:main thread: module of type 0 being loaded. 4037.620712772:main thread: cfline: '$ModLoad imklog # provides kernel logging support (previously done by rklogd)' 4037.620724718:main thread: Requested to load module 'imklog' 4037.620733972:main thread: loading module '/usr/lib/rsyslog/imklog.so' 4037.620847557:main thread: module of type 0 being loaded. 4037.620864846:main thread: cfline: '$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat' 4037.620884928:main thread: cfline: '$FileOwner root' 4037.621151637:main thread: uid 0 obtained for user 'root' 4037.621164483:main thread: cfline: '$FileGroup adm' 4037.621221737:main thread: gid 4 obtained for group 'adm' 4037.621233731:main thread: cfline: '$FileCreateMode 0640' 4037.621247204:main thread: cfline: '$IncludeConfig /etc/rsyslog.d/*.conf' 4037.621306972:main thread: requested to include config file '/etc/rsyslog.d/remote.conf' 4037.621334470:main thread: cfline: '$WorkDirectory /var/log/rsyslog' 4037.621352254:main thread: cfline: '$ActionQueueType Direct # use synchronous processing' 4037.621692792:main thread: action queue type set to DIRECT (no queueing at all) 4037.621705098:main thread: cfline: '$ActionQueueFileName srvrfwd # set file name, also enables disk mode' 4037.621720665:main thread: cfline: '$ActionResumeRetryCount -1 # infinite retries on insert failure' 4037.621734291:main thread: cfline: '$ActionQueueSaveOnShutdown on # save in-memory data if rsyslog shuts down' 4037.621748715:main thread: cfline: 'mail.* @@xx.yy.zz.tt:514' 4037.621761573:main thread: - traditional PRI filter 4037.621771329:main thread: symbolic name: * ==> 255 4037.621783748:main thread: symbolic name: mail ==> 
16 4037.621800473:main thread: tried selector action for builtin-file: -2001 4037.621816553:main thread: caller requested object 'netstrms', not found (iRet -3003) 4037.621829132:main thread: Requested to load module 'lmnetstrms' 4037.621839089:main thread: loading module '/usr/lib/rsyslog/lmnetstrms.so' 4037.621919155:main thread: module of type 2 being loaded. 4037.621932301:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 1 4037.621945375:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 2 4037.621960807:main thread: caller requested object 'tcpclt', not found (iRet -3003) 4037.621970535:main thread: Requested to load module 'lmtcpclt' 4037.621979727:main thread: loading module '/usr/lib/rsyslog/lmtcpclt.so' 4037.622039220:main thread: module of type 2 being loaded. 4037.622051937:main thread: source file omfwd.c requested reference for module 'lmtcpclt', reference count now 1 4037.622064386:main thread: hostname 'xx.yy.zz.tt', port '514' 4037.622084093:main thread: tried selector action for builtin-fwd: 0 4037.622095973:main thread: Module builtin-fwd processed this config line. 4037.622111045:main thread: template: 'RSYSLOG_TraditionalForwardFormat' assigned 4037.622134550:main thread: action 1 queue: save on shutdown 1, max disk space allowed 0 4037.622153957:main thread: action 1 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622166394:main thread: Action 0x6838c0: queue 0x683d60 created 4037.622179432:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended on' 4037.622192407:main thread: cfline: '& /data/var_syslog/failover.log' 4037.622218048:main thread: tried selector action for builtin-file: 0 4037.622239084:main thread: Module builtin-file processed this config line. 
4037.622249944:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622264904:main thread: action 2 queue: save on shutdown 1, max disk space allowed 0 4037.622278185:main thread: action 2 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622289525:main thread: Action 0x684b30: queue 0x684d70 created 4037.622300676:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended off' 4037.622315313:main thread: selector line successfully processed 4037.622335353:main thread: cfline: 'auth,authpriv.* /var/log/auth.log' 4037.622346713:main thread: - traditional PRI filter 4037.622355695:main thread: symbolic name: * ==> 255 4037.622367074:main thread: symbolic name: auth ==> 32 4037.622378090:main thread: symbolic name: authpriv ==> 80 4037.622399801:main thread: tried selector action for builtin-file: 0 4037.622409569:main thread: Module builtin-file processed this config line. 4037.622419853:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622431973:main thread: action 3 queue: save on shutdown 1, max disk space allowed 0 4037.622445019:main thread: action 3 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622456983:main thread: Action 0x685160: queue 0x685220 created 4037.622467966:main thread: cfline: '*.*;auth,authpriv.none -/var/log/syslog' 4037.622477221:main thread: selector line successfully processed 4037.622486077:main thread: - traditional PRI filter 4037.622494606:main thread: symbolic name: * ==> 255 4037.622506225:main thread: symbolic name: none ==> 16 4037.622517007:main thread: symbolic name: auth ==> 32 4037.622527927:main thread: symbolic name: authpriv ==> 80 4037.622547618:main thread: tried selector action for builtin-file: 0 4037.622557092:main thread: Module builtin-file processed this config line. 
4037.622567055:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622578953:main thread: action 4 queue: save on shutdown 1, max disk space allowed 0 4037.622591601:main thread: action 4 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622603373:main thread: Action 0x6856d0: queue 0x6857e0 created 4037.622614425:main thread: cfline: 'daemon.* -/var/log/daemon.log' 4037.622623611:main thread: selector line successfully processed 4037.622632946:main thread: - traditional PRI filter 4037.622641538:main thread: symbolic name: * ==> 255 4037.622652635:main thread: symbolic name: daemon ==> 24 4037.622672048:main thread: tried selector action for builtin-file: 0 4037.622681333:main thread: Module builtin-file processed this config line. 4037.622690864:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622704736:main thread: action 5 queue: save on shutdown 1, max disk space allowed 0 4037.622718299:main thread: action 5 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622730175:main thread: Action 0x685cb0: queue 0x685dc0 created 4037.622740990:main thread: cfline: 'kern.* -/var/log/kern.log' 4037.622749924:main thread: selector line successfully processed 4037.622759053:main thread: - traditional PRI filter 4037.622767804:main thread: symbolic name: * ==> 255 4037.622779282:main thread: symbolic name: kern ==> 0 4037.622799130:main thread: tried selector action for builtin-file: 0 4037.622808619:main thread: Module builtin-file processed this config line. 
4037.622818753:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622830206:main thread: action 6 queue: save on shutdown 1, max disk space allowed 0 4037.622842911:main thread: action 6 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622854803:main thread: Action 0x686290: queue 0x6863a0 created 4037.622865624:main thread: cfline: 'lpr.* -/var/log/lpr.log' 4037.622874702:main thread: selector line successfully processed 4037.622883912:main thread: - traditional PRI filter 4037.622904459:main thread: symbolic name: * ==> 255 4037.622915496:main thread: symbolic name: lpr ==> 48 4037.622935076:main thread: tried selector action for builtin-file: 0 4037.622944394:main thread: Module builtin-file processed this config line. 4037.622953982:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622965406:main thread: action 7 queue: save on shutdown 1, max disk space allowed 0 4037.622978123:main thread: action 7 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622989985:main thread: Action 0x686870: queue 0x686980 created 4037.623000683:main thread: cfline: 'mail.* -/var/log/mail.log' 4037.623009707:main thread: selector line successfully processed 4037.623018565:main thread: - traditional PRI filter 4037.623027088:main thread: symbolic name: * ==> 255 4037.623038884:main thread: symbolic name: mail ==> 16 4037.623058105:main thread: tried selector action for builtin-file: 0 4037.623067588:main thread: Module builtin-file processed this config line. 
4037.623077685:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623093423:main thread: action 8 queue: save on shutdown 1, max disk space allowed 0 4037.623107052:main thread: action 8 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623118908:main thread: Action 0x686e50: queue 0x686f60 created 4037.623129726:main thread: cfline: 'user.* -/var/log/user.log' 4037.623138774:main thread: selector line successfully processed 4037.623147684:main thread: - traditional PRI filter 4037.623156198:main thread: symbolic name: * ==> 255 4037.623167187:main thread: symbolic name: user ==> 8 4037.623186686:main thread: tried selector action for builtin-file: 0 4037.623196019:main thread: Module builtin-file processed this config line. 4037.623205766:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623217211:main thread: action 9 queue: save on shutdown 1, max disk space allowed 0 4037.623229541:main thread: action 9 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623240500:main thread: Action 0x6873f0: queue 0x687500 created 4037.623252272:main thread: cfline: 'mail.info -/var/log/mail.info' 4037.623261136:main thread: selector line successfully processed 4037.623269866:main thread: - traditional PRI filter 4037.623278671:main thread: symbolic name: info ==> 6 4037.623289546:main thread: symbolic name: mail ==> 16 4037.623308401:main thread: tried selector action for builtin-file: 0 4037.623317689:main thread: Module builtin-file processed this config line. 
4037.623327277:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623338569:main thread: action 10 queue: save on shutdown 1, max disk space allowed 0 4037.623351333:main thread: action 10 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623362865:main thread: Action 0x6879d0: queue 0x687ae0 created 4037.623373563:main thread: cfline: 'mail.warn -/var/log/mail.warn' 4037.623382608:main thread: selector line successfully processed 4037.623391311:main thread: - traditional PRI filter 4037.623399873:main thread: symbolic name: warn ==> 4 4037.623410589:main thread: symbolic name: mail ==> 16 4037.623429414:main thread: tried selector action for builtin-file: 0 4037.623438681:main thread: Module builtin-file processed this config line. 4037.623451643:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623463664:main thread: action 11 queue: save on shutdown 1, max disk space allowed 0 4037.623476036:main thread: action 11 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623486893:main thread: Action 0x687fb0: queue 0x6880c0 created 4037.623497465:main thread: cfline: 'mail.err /var/log/mail.err' 4037.623506468:main thread: selector line successfully processed 4037.623515453:main thread: - traditional PRI filter 4037.623523865:main thread: symbolic name: err ==> 3 4037.623545812:main thread: symbolic name: mail ==> 16 4037.623566230:main thread: tried selector action for builtin-file: 0 4037.623575947:main thread: Module builtin-file processed this config line. 
4037.623585871:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623597019:main thread: action 12 queue: save on shutdown 1, max disk space allowed 0 4037.623609634:main thread: action 12 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623621292:main thread: Action 0x688590: queue 0x6886a0 created 4037.623632775:main thread: cfline: 'news.crit /var/log/news/news.crit' 4037.623642228:main thread: selector line successfully processed 4037.623651312:main thread: - traditional PRI filter 4037.623660168:main thread: symbolic name: crit ==> 2 4037.623671004:main thread: symbolic name: news ==> 56 4037.623692517:main thread: tried selector action for builtin-file: 0 4037.623701901:main thread: Module builtin-file processed this config line. 4037.623711765:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623723191:main thread: action 13 queue: save on shutdown 1, max disk space allowed 0 4037.623735872:main thread: action 13 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623747536:main thread: Action 0x688b70: queue 0x688c80 created 4037.623758651:main thread: cfline: 'news.err /var/log/news/news.err' 4037.623767741:main thread: selector line successfully processed 4037.623776690:main thread: - traditional PRI filter 4037.623785240:main thread: symbolic name: err ==> 3 4037.623796478:main thread: symbolic name: news ==> 56 4037.623819517:main thread: tried selector action for builtin-file: 0 4037.623829048:main thread: Module builtin-file processed this config line. 
4037.623838879:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623850438:main thread: action 14 queue: save on shutdown 1, max disk space allowed 0 4037.623862924:main thread: action 14 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623873871:main thread: Action 0x689150: queue 0x689260 created 4037.623884569:main thread: cfline: 'news.notice -/var/log/news/news.notice' 4037.623893560:main thread: selector line successfully processed 4037.623902664:main thread: - traditional PRI filter 4037.623911415:main thread: symbolic name: notice ==> 5 4037.623922467:main thread: symbolic name: news ==> 56 4037.623942264:main thread: tried selector action for builtin-file: 0 4037.623951402:main thread: Module builtin-file processed this config line. 4037.623961122:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623972360:main thread: action 15 queue: save on shutdown 1, max disk space allowed 0 4037.623985014:main thread: action 15 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623996926:main thread: Action 0x689730: queue 0x689840 created 4037.624009085:main thread: cfline: '*.=debug;auth,authpriv.none;news.none;mail.none -/var/log/debug' 4037.624018460:main thread: selector line successfully processed 4037.624027550:main thread: - traditional PRI filter 4037.624036394:main thread: symbolic name: debug ==> 7 4037.624047617:main thread: symbolic name: none ==> 16 4037.624058183:main thread: symbolic name: auth ==> 32 4037.624069187:main thread: symbolic name: authpriv ==> 80 4037.624080178:main thread: symbolic name: none ==> 16 4037.624090699:main thread: symbolic name: news ==> 56 4037.624101499:main thread: symbolic name: none ==> 16 4037.624112416:main thread: symbolic name: mail ==> 16 4037.624131976:main thread: tried selector action for builtin-file: 0 4037.624141360:main thread: Module builtin-file processed this config line. 
4037.624151527:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.624166254:main thread: action 16 queue: save on shutdown 1, max disk space allowed 0 4037.624179048:main thread: action 16 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624203996:main thread: Action 0x689d10: queue 0x689e20 created 4037.624216560:main thread: cfline: '*.=info;*.=notice;*.=warn;auth,authpriv.none;cron,daemon.none;mail,news.none -/var/log/messages' 4037.624225710:main thread: selector line successfully processed 4037.624234992:main thread: - traditional PRI filter 4037.624243941:main thread: symbolic name: info ==> 6 4037.624255317:main thread: symbolic name: notice ==> 5 4037.624266620:main thread: symbolic name: warn ==> 4 4037.624277663:main thread: symbolic name: none ==> 16 4037.624288730:main thread: symbolic name: auth ==> 32 4037.624299497:main thread: symbolic name: authpriv ==> 80 4037.624310429:main thread: symbolic name: none ==> 16 4037.624321088:main thread: symbolic name: cron ==> 72 4037.624331828:main thread: symbolic name: daemon ==> 24 4037.624342664:main thread: symbolic name: none ==> 16 4037.624353199:main thread: symbolic name: mail ==> 16 4037.624363960:main thread: symbolic name: news ==> 56 4037.624383361:main thread: tried selector action for builtin-file: 0 4037.624392931:main thread: Module builtin-file processed this config line. 
4037.624402870:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.624414390:main thread: action 17 queue: save on shutdown 1, max disk space allowed 0 4037.624427209:main thread: action 17 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624438942:main thread: Action 0x68a2f0: queue 0x68a400 created 4037.624450554:main thread: cfline: '*.emerg *' 4037.624459350:main thread: selector line successfully processed 4037.624468485:main thread: - traditional PRI filter 4037.624477275:main thread: symbolic name: emerg ==> 0 4037.624489113:main thread: tried selector action for builtin-file: -2001 4037.624498587:main thread: tried selector action for builtin-fwd: -2001 4037.624509258:main thread: tried selector action for builtin-shell: -2001 4037.624519854:main thread: tried selector action for builtin-discard: -2001 4037.624531161:main thread: write-alltried selector action for builtin-usrmsg: 0 4037.624543715:main thread: Module builtin-usrmsg processed this config line. 
4037.624553426:main thread: template: ' WallFmt' assigned 4037.624568261:main thread: action 18 queue: save on shutdown 1, max disk space allowed 0 4037.624581266:main thread: action 18 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624592975:main thread: Action 0x68ad40: queue 0x68af50 created 4037.624608143:main thread: cfline: 'daemon.*;mail.*;news.err;*.=debug;*.=info;*.=notice;*.=warn |/dev/xconsole' 4037.624617917:main thread: selector line successfully processed 4037.624627063:main thread: - traditional PRI filter 4037.624635829:main thread: symbolic name: * ==> 255 4037.624646719:main thread: symbolic name: daemon ==> 24 4037.624657687:main thread: symbolic name: * ==> 255 4037.624668442:main thread: symbolic name: mail ==> 16 4037.624679359:main thread: symbolic name: err ==> 3 4037.624689994:main thread: symbolic name: news ==> 56 4037.624700698:main thread: symbolic name: debug ==> 7 4037.624711852:main thread: symbolic name: info ==> 6 4037.624722777:main thread: symbolic name: notice ==> 5 4037.624733886:main thread: symbolic name: warn ==> 4 4037.624753131:main thread: Error opening log file: /dev/xconsole 4037.624764081:main thread: Called LogError, msg: /dev/xconsole rsyslogd: /dev/xconsole 4037.624834841:main thread: tried selector action for builtin-file: 0 4037.624844138:main thread: Module builtin-file processed this config line. 
4037.624854248:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.624866050:main thread: action 19 queue: save on shutdown 1, max disk space allowed 0 4037.624878512:main thread: action 19 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624889537:main thread: Action 0x68c870: queue 0x68c980 created 4037.624901089:main thread: selector line successfully processed 4037.624925545:main thread: main queue: is NOT disk-assisted 4037.624949380:main thread: main queue: type 0, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624967146:main thread: main queue:Reg: finalizing construction of worker thread pool 4037.624985371:main thread: main queue:Reg/w0: finalizing construction of worker instance data 4037.624994322:main thread: main queue: queue starts up without (loading) any DA disk state (this is normal for the DA queue itself!) 4037.625008485:main thread: main queue:Reg: high activity - starting 1 additional worker thread(s). 4037.625021502:main thread: main queue:Reg/w0: receiving command 2 4037.625062410:main thread: main queue:Reg: started with state 0, num workers now 1 4037.625097359:main thread: Main processing queue is initialized and running 4037.625132246:main thread: Opened UNIX socket '/dev/log' (fd 3). 
4037.625198155:main thread: main queue: entry added, size now 1 entries 4037.625212867:main thread: wtpAdviseMaxWorkers signals busy 4037.625224705:main thread: main queue: EnqueueMsg advised worker start 4037.625241685:40800950: main queue:Reg/w0: receiving command 4 4037.625272671:imuxsock.c: --------imuxsock calling select, active file descriptors (max 3): 3 4037.625309667:main thread: Active selectors: 4037.625319477:main thread: Selector 1: 4037.625327307:main thread: X X FF X X X X X X X X X X X X X X X X X X X X X X Actions: 4037.625400575:main thread: builtin-fwd: Instance data: 0x680d20 4037.625426870:main thread: RepeatedMsgReduction: 0 4037.625435459:main thread: Resume Interval: 30 4037.625443472:main thread: Suspended: 0 4037.625454034:main thread: Disabled: 0 4037.625462161:main thread: Exec only when previous is suspended: 0 4037.625470180:main thread: 4037.625477854:main thread: 4037.625486236:main thread: builtin-file: Instance data: 0x684870 4037.625499685:main thread: RepeatedMsgReduction: 0 4037.625508049:main thread: Resume Interval: 30 4037.625516113:main thread: Suspended: 0 4037.625526223:main thread: Disabled: 0 4037.625534110:main thread: Exec only when previous is suspended: 1 4037.625542227:main thread: 4037.625549973:main thread: 4037.625558091:main thread: 4037.625565903:main thread: Selector 2: 4037.625573421:main thread: X X X X FF X X X X X FF X X X 4037.625647001:main queue:Reg/w0: main queue: entry deleted, state 0, size now 0 entries 4037.625668214:main queue:Reg/w0: Called action, logging to builtin-file 4037.625702210:main queue:Reg/w0: (/var/log/syslog) From rgerhards at hq.adiscon.com Fri Jan 16 17:54:29 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 17:54:29 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: 
<1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> OK, maybe we can simplify the config; that would remove code paths from the potential bug candidate list. Could you comment out all the $ActionQueue* settings? Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 5:52 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> Lorenzo, > RG> > RG> one thing: can you change the actionqueuemode to "direct" just for a > RG> short period. I would be very interested to see what happens. > RG> > > Very short period... it crashed about as soon as started... I'm enclosing > both the log and the backtrace. > > See you soon, > > lorenzo > > > +-------------------------+----------------------------------------------+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | > +-------------------------+----------------------------------------------+ From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 18:07:50 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M.
Catucci) Date: Fri, 16 Jan 2009 18:07:50 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> OK, maybe we can simplify the config; that would remove code paths RG> from the potential bug candidate list. Could you comment out all the RG> $ActionQueue* settings? RG> Done, it's still crashing immediately! Here are the logs. lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done.
Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Core was generated by `rsyslogd -c4 -d'. Program terminated with signal 11, Segmentation fault. 
[New process 4397] [New process 4399] [New process 4398] [New process 4396] #0 0x00002ac177165030 in strlen () from /lib/libc.so.6 (gdb) Thread 4 (process 4396): #0 0x00002ac176acd4c5 in __lll_unlock_wake () from /lib/libpthread.so.0 #1 0x00002ac176ac9ff9 in _L_unlock_56 () from /lib/libpthread.so.0 #2 0x00002ac176ac9c56 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0 #3 0x0000000000422a09 in dbgprint (pObj=, pszMsg=0x7fff34417d50 "FF ", lenMsg=3) at debug.c:157 #4 0x0000000000422c33 in dbgprintf (fmt=) at debug.c:892 #5 0x000000000040d549 in init () at syslogd.c:2209 #6 0x000000000040da79 in mainThread () at syslogd.c:2954 #7 0x000000000040ee96 in realMain (argc=, argv=) at syslogd.c:3631 #8 0x00002ac1771081a6 in __libc_start_main () from /lib/libc.so.6 #9 0x000000000040a259 in _start () Thread 3 (process 4398): #0 0x00002ac1771b2ce2 in select () from /lib/libc.so.6 #1 0x00002ac1778589fd in runInput (pThrd=) at imuxsock.c:280 #2 0x000000000044377f in thrdStarter (arg=0x6a3a30) at ../threads.c:139 #3 0x00002ac176ac6fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002ac1771b95ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 4399): #0 0x00002ac176acd7db in read () from /lib/libpthread.so.0 #1 0x00002ac177a5d1ef in klogLogKMsg () at linux.c:449 #2 0x00002ac177a5c594 in runInput (pThrd=0x6a6a30) at imklog.c:224 #3 0x000000000044377f in thrdStarter (arg=0x6a6a30) at ../threads.c:139 #4 0x00002ac176ac6fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002ac1771b95ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 4397): #0 0x00002ac177165030 in strlen () from /lib/libc.so.6 #1 0x00002ac177131cb1 in vfprintf () from /lib/libc.so.6 #2 0x00002ac177137c08 in fprintf () from /lib/libc.so.6 #3 0x000000000041ce7d in msgDestruct (ppThis=) at msg.c:350 #4 0x000000000044283a in actionCallDoAction (pAction=0x685570, pMsg=0x6a4090) at ../action.c:495 #5 0x0000000000439c9c in qAddDirect (pThis=0x685680, pUsr=0x6a4090) at queue.c:939 #6 0x000000000043dd83 in queueEnqObj (pThis=0x685680, flowCtlType=, pUsr=0x6a4090) at queue.c:1016 #7 0x0000000000442d63 in actionWriteToAction (pAction=0x685570) at ../action.c:672 #8 0x00000000004430d0 in actionCallAction (pAction=0x685570, pMsg=0x6a4090) at ../action.c:778 #9 0x000000000040b307 in processMsgDoActions (pData=0x685570, pParam=0x407ffe90) at syslogd.c:1140 #10 0x000000000041def8 in llExecFunc (pThis=0x6853e0, pFunc=0x40b2b0 , pParam=0x407ffe90) at linkedlist.c:391 #11 0x000000000040ae19 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183 #12 0x000000000043c577 in queueConsumerReg (pThis=0x68cb20, pWti=0x6a0ed0, iCancelStateSave=) at queue.c:1598 #13 0x0000000000433050 in wtiWorker (pThis=0x6a0ed0) at wti.c:416 #14 0x00000000004317aa in wtpWorker (arg=0x6a0ed0) at wtp.c:425 #15 0x00002ac176ac6fc7 in start_thread () from /lib/libpthread.so.0 #16 0x00002ac1771b95ad in clone () from /lib/libc.so.6 #17 0x0000000000000000 in ?? () (gdb) quit -------------- next part -------------- 5437.595245610:main thread: Writing pidfile /var/run/rsyslogd.pid. 5437.595686368:main thread: rsyslog 4.1.3 - called init() 5437.595698050:main thread: Unloading non-static modules. 5437.595709554:main thread: module lmnet NOT unloaded because it still has a refcount of 3 5437.595719067:main thread: Clearing templates. 
5437.595771624:main thread: cfline: '$ModLoad imuxsock # provides support for local system logging' 5437.595788522:main thread: Requested to load module 'imuxsock' 5437.595799718:main thread: loading module '/usr/lib/rsyslog/imuxsock.so' 5437.595870056:main thread: imuxsock version 4.1.3 initializing 5437.595908971:main thread: module of type 0 being loaded. 5437.595923470:main thread: cfline: '$ModLoad imklog # provides kernel logging support (previously done by rklogd)' 5437.595935908:main thread: Requested to load module 'imklog' 5437.595945421:main thread: loading module '/usr/lib/rsyslog/imklog.so' 5437.596063430:main thread: module of type 0 being loaded. 5437.596081982:main thread: cfline: '$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat' 5437.596103387:main thread: cfline: '$FileOwner root' 5437.596366370:main thread: uid 0 obtained for user 'root' 5437.596380167:main thread: cfline: '$FileGroup adm' 5437.596439758:main thread: gid 4 obtained for group 'adm' 5437.596452445:main thread: cfline: '$FileCreateMode 0640' 5437.596466524:main thread: cfline: '$IncludeConfig /etc/rsyslog.d/*.conf' 5437.596530495:main thread: requested to include config file '/etc/rsyslog.d/remote.conf' 5437.596560987:main thread: cfline: '$WorkDirectory /var/log/rsyslog' 5437.596580414:main thread: cfline: '$ActionResumeRetryCount -1 # infinite retries on insert failure' 5437.596596212:main thread: cfline: 'mail.* @@xx.yy.zz.tt:514' 5437.596612292:main thread: - traditional PRI filter 5437.596622579:main thread: symbolic name: * ==> 255 5437.596635854:main thread: symbolic name: mail ==> 16 5437.596652432:main thread: tried selector action for builtin-file: -2001 5437.596668871:main thread: caller requested object 'netstrms', not found (iRet -3003) 5437.596678996:main thread: Requested to load module 'lmnetstrms' 5437.596688740:main thread: loading module '/usr/lib/rsyslog/lmnetstrms.so' 5437.596773657:main thread: module of type 2 being loaded. 
5437.596787910:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 1 5437.596801209:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 2 5437.596819848:main thread: caller requested object 'tcpclt', not found (iRet -3003) 5437.596830324:main thread: Requested to load module 'lmtcpclt' 5437.596839704:main thread: loading module '/usr/lib/rsyslog/lmtcpclt.so' 5437.596905755:main thread: module of type 2 being loaded. 5437.596919522:main thread: source file omfwd.c requested reference for module 'lmtcpclt', reference count now 1 5437.596932436:main thread: hostname 'xx.yy.zz.tt', port '514' 5437.596953352:main thread: tried selector action for builtin-fwd: 0 5437.596966354:main thread: Module builtin-fwd processed this config line. 5437.596982080:main thread: template: 'RSYSLOG_TraditionalForwardFormat' assigned 5437.597007211:main thread: action 1 queue: save on shutdown 1, max disk space allowed 0 5437.597027685:main thread: action 1 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597040903:main thread: Action 0x683630: queue 0x683ad0 created 5437.597054232:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended on' 5437.597069310:main thread: cfline: '& /data/var_syslog/failover.log' 5437.597096292:main thread: tried selector action for builtin-file: 0 5437.597106887:main thread: Module builtin-file processed this config line. 
5437.597117030:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597129750:main thread: action 2 queue: save on shutdown 1, max disk space allowed 0 5437.597143076:main thread: action 2 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597154860:main thread: Action 0x6849d0: queue 0x684c10 created 5437.597184145:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended off' 5437.597199760:main thread: selector line successfully processed 5437.597220784:main thread: cfline: 'auth,authpriv.* /var/log/auth.log' 5437.597232670:main thread: - traditional PRI filter 5437.597241793:main thread: symbolic name: * ==> 255 5437.597253664:main thread: symbolic name: auth ==> 32 5437.597265178:main thread: symbolic name: authpriv ==> 80 5437.597288145:main thread: tried selector action for builtin-file: 0 5437.597298717:main thread: Module builtin-file processed this config line. 5437.597311914:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597324910:main thread: action 3 queue: save on shutdown 1, max disk space allowed 0 5437.597338377:main thread: action 3 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597350305:main thread: Action 0x685000: queue 0x6850c0 created 5437.597361812:main thread: cfline: '*.*;auth,authpriv.none -/var/log/syslog' 5437.597371760:main thread: selector line successfully processed 5437.597381303:main thread: - traditional PRI filter 5437.597390459:main thread: symbolic name: * ==> 255 5437.597402201:main thread: symbolic name: none ==> 16 5437.597413358:main thread: symbolic name: auth ==> 32 5437.597424743:main thread: symbolic name: authpriv ==> 80 5437.597445059:main thread: tried selector action for builtin-file: 0 5437.597455309:main thread: Module builtin-file processed this config line. 
5437.597465506:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597477596:main thread: action 4 queue: save on shutdown 1, max disk space allowed 0 5437.597490499:main thread: action 4 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597501863:main thread: Action 0x685570: queue 0x685680 created 5437.597513116:main thread: cfline: 'daemon.* -/var/log/daemon.log' 5437.597522704:main thread: selector line successfully processed 5437.597532007:main thread: - traditional PRI filter 5437.597540904:main thread: symbolic name: * ==> 255 5437.597552373:main thread: symbolic name: daemon ==> 24 5437.597573067:main thread: tried selector action for builtin-file: 0 5437.597583540:main thread: Module builtin-file processed this config line. 5437.597593506:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597605173:main thread: action 5 queue: save on shutdown 1, max disk space allowed 0 5437.597618478:main thread: action 5 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597630567:main thread: Action 0x685b50: queue 0x685c60 created 5437.597641754:main thread: cfline: 'kern.* -/var/log/kern.log' 5437.597651414:main thread: selector line successfully processed 5437.597660795:main thread: - traditional PRI filter 5437.597669852:main thread: symbolic name: * ==> 255 5437.597681123:main thread: symbolic name: kern ==> 0 5437.597705051:main thread: tried selector action for builtin-file: 0 5437.597715490:main thread: Module builtin-file processed this config line. 
5437.597725735:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597737798:main thread: action 6 queue: save on shutdown 1, max disk space allowed 0 5437.597751004:main thread: action 6 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597762995:main thread: Action 0x686130: queue 0x686240 created 5437.597774134:main thread: cfline: 'lpr.* -/var/log/lpr.log' 5437.597783830:main thread: selector line successfully processed 5437.597793046:main thread: - traditional PRI filter 5437.597801811:main thread: symbolic name: * ==> 255 5437.597813298:main thread: symbolic name: lpr ==> 48 5437.597833524:main thread: tried selector action for builtin-file: 0 5437.597843772:main thread: Module builtin-file processed this config line. 5437.597853705:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597865321:main thread: action 7 queue: save on shutdown 1, max disk space allowed 0 5437.597890979:main thread: action 7 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597903456:main thread: Action 0x686710: queue 0x686820 created 5437.597914697:main thread: cfline: 'mail.* -/var/log/mail.log' 5437.597924591:main thread: selector line successfully processed 5437.597934092:main thread: - traditional PRI filter 5437.597943242:main thread: symbolic name: * ==> 255 5437.597954096:main thread: symbolic name: mail ==> 16 5437.597974738:main thread: tried selector action for builtin-file: 0 5437.597985043:main thread: Module builtin-file processed this config line. 
5437.597995450:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598012859:main thread: action 8 queue: save on shutdown 1, max disk space allowed 0 5437.598027103:main thread: action 8 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598039193:main thread: Action 0x686cf0: queue 0x686e00 created 5437.598050464:main thread: cfline: 'user.* -/var/log/user.log' 5437.598059877:main thread: selector line successfully processed 5437.598069162:main thread: - traditional PRI filter 5437.598078222:main thread: symbolic name: * ==> 255 5437.598089760:main thread: symbolic name: user ==> 8 5437.598110994:main thread: tried selector action for builtin-file: 0 5437.598121194:main thread: Module builtin-file processed this config line. 5437.598130959:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598142863:main thread: action 9 queue: save on shutdown 1, max disk space allowed 0 5437.598156515:main thread: action 9 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598168587:main thread: Action 0x687290: queue 0x6873a0 created 5437.598180692:main thread: cfline: 'mail.info -/var/log/mail.info' 5437.598190523:main thread: selector line successfully processed 5437.598199946:main thread: - traditional PRI filter 5437.598208868:main thread: symbolic name: info ==> 6 5437.598220223:main thread: symbolic name: mail ==> 16 5437.598240955:main thread: tried selector action for builtin-file: 0 5437.598251116:main thread: Module builtin-file processed this config line. 
5437.598261157:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598272995:main thread: action 10 queue: save on shutdown 1, max disk space allowed 0 5437.598286279:main thread: action 10 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598298537:main thread: Action 0x687870: queue 0x687980 created 5437.598309727:main thread: cfline: 'mail.warn -/var/log/mail.warn' 5437.598319450:main thread: selector line successfully processed 5437.598329097:main thread: - traditional PRI filter 5437.598338166:main thread: symbolic name: warn ==> 4 5437.598349602:main thread: symbolic name: mail ==> 16 5437.598369906:main thread: tried selector action for builtin-file: 0 5437.598379983:main thread: Module builtin-file processed this config line. 5437.598389949:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598405150:main thread: action 11 queue: save on shutdown 1, max disk space allowed 0 5437.598419093:main thread: action 11 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598430433:main thread: Action 0x687e50: queue 0x687f60 created 5437.598441455:main thread: cfline: 'mail.err /var/log/mail.err' 5437.598450704:main thread: selector line successfully processed 5437.598459923:main thread: - traditional PRI filter 5437.598468857:main thread: symbolic name: err ==> 3 5437.598480887:main thread: symbolic name: mail ==> 16 5437.598501595:main thread: tried selector action for builtin-file: 0 5437.598515449:main thread: Module builtin-file processed this config line. 
5437.598525751:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598537496:main thread: action 12 queue: save on shutdown 1, max disk space allowed 0 5437.598550707:main thread: action 12 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598573321:main thread: Action 0x688430: queue 0x688540 created 5437.598585363:main thread: cfline: 'news.crit /var/log/news/news.crit' 5437.598595176:main thread: selector line successfully processed 5437.598604833:main thread: - traditional PRI filter 5437.598613572:main thread: symbolic name: crit ==> 2 5437.598624768:main thread: symbolic name: news ==> 56 5437.598647705:main thread: tried selector action for builtin-file: 0 5437.598657971:main thread: Module builtin-file processed this config line. 5437.598668150:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598680177:main thread: action 13 queue: save on shutdown 1, max disk space allowed 0 5437.598693176:main thread: action 13 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598705332:main thread: Action 0x688a10: queue 0x688b20 created 5437.598716744:main thread: cfline: 'news.err /var/log/news/news.err' 5437.598726596:main thread: selector line successfully processed 5437.598736043:main thread: - traditional PRI filter 5437.598744979:main thread: symbolic name: err ==> 3 5437.598756160:main thread: symbolic name: news ==> 56 5437.598777286:main thread: tried selector action for builtin-file: 0 5437.598787129:main thread: Module builtin-file processed this config line. 
5437.598800314:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598812803:main thread: action 14 queue: save on shutdown 1, max disk space allowed 0 5437.598826177:main thread: action 14 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598837804:main thread: Action 0x688ff0: queue 0x689100 created 5437.598849081:main thread: cfline: 'news.notice -/var/log/news/news.notice' 5437.598858618:main thread: selector line successfully processed 5437.598867741:main thread: - traditional PRI filter 5437.598876750:main thread: symbolic name: notice ==> 5 5437.598888111:main thread: symbolic name: news ==> 56 5437.598908859:main thread: tried selector action for builtin-file: 0 5437.598919188:main thread: Module builtin-file processed this config line. 5437.598929240:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598940865:main thread: action 15 queue: save on shutdown 1, max disk space allowed 0 5437.598953981:main thread: action 15 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598966119:main thread: Action 0x6895d0: queue 0x6896e0 created 5437.598978587:main thread: cfline: '*.=debug;auth,authpriv.none;news.none;mail.none -/var/log/debug' 5437.598988430:main thread: selector line successfully processed 5437.598997799:main thread: - traditional PRI filter 5437.599006913:main thread: symbolic name: debug ==> 7 5437.599018781:main thread: symbolic name: none ==> 16 5437.599030057:main thread: symbolic name: auth ==> 32 5437.599041136:main thread: symbolic name: authpriv ==> 80 5437.599052494:main thread: symbolic name: none ==> 16 5437.599063705:main thread: symbolic name: news ==> 56 5437.599075069:main thread: symbolic name: none ==> 16 5437.599086205:main thread: symbolic name: mail ==> 16 5437.599107133:main thread: tried selector action for builtin-file: 0 5437.599117174:main thread: Module builtin-file processed this config line. 
5437.599127409:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.599139610:main thread: action 16 queue: save on shutdown 1, max disk space allowed 0 5437.599152729:main thread: action 16 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599164081:main thread: Action 0x689bb0: queue 0x689cc0 created 5437.599176261:main thread: cfline: '*.=info;*.=notice;*.=warn;auth,authpriv.none;cron,daemon.none;mail,news.none -/var/log/messages' 5437.599185849:main thread: selector line successfully processed 5437.599198395:main thread: - traditional PRI filter 5437.599207875:main thread: symbolic name: info ==> 6 5437.599231109:main thread: symbolic name: notice ==> 5 5437.599243598:main thread: symbolic name: warn ==> 4 5437.599255067:main thread: symbolic name: none ==> 16 5437.599266446:main thread: symbolic name: auth ==> 32 5437.599277561:main thread: symbolic name: authpriv ==> 80 5437.599294223:main thread: symbolic name: none ==> 16 5437.599305491:main thread: symbolic name: cron ==> 72 5437.599316587:main thread: symbolic name: daemon ==> 24 5437.599327972:main thread: symbolic name: none ==> 16 5437.599338829:main thread: symbolic name: mail ==> 16 5437.599349656:main thread: symbolic name: news ==> 56 5437.599370203:main thread: tried selector action for builtin-file: 0 5437.599380253:main thread: Module builtin-file processed this config line. 
5437.599390312:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.599402081:main thread: action 17 queue: save on shutdown 1, max disk space allowed 0 5437.599414977:main thread: action 17 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599426824:main thread: Action 0x68a190: queue 0x68a2a0 created 5437.599438617:main thread: cfline: '*.emerg *' 5437.599447848:main thread: selector line successfully processed 5437.599457043:main thread: - traditional PRI filter 5437.599465704:main thread: symbolic name: emerg ==> 0 5437.599477968:main thread: tried selector action for builtin-file: -2001 5437.599487949:main thread: tried selector action for builtin-fwd: -2001 5437.599498509:main thread: tried selector action for builtin-shell: -2001 5437.599509125:main thread: tried selector action for builtin-discard: -2001 5437.599520609:main thread: write-alltried selector action for builtin-usrmsg: 0 5437.599533671:main thread: Module builtin-usrmsg processed this config line. 
5437.599543706:main thread: template: ' WallFmt' assigned 5437.599558706:main thread: action 18 queue: save on shutdown 1, max disk space allowed 0 5437.599572392:main thread: action 18 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599584044:main thread: Action 0x68abe0: queue 0x68adf0 created 5437.599599727:main thread: cfline: 'daemon.*;mail.*;news.err;*.=debug;*.=info;*.=notice;*.=warn |/dev/xconsole' 5437.599609933:main thread: selector line successfully processed 5437.599619050:main thread: - traditional PRI filter 5437.599627543:main thread: symbolic name: * ==> 255 5437.599639018:main thread: symbolic name: daemon ==> 24 5437.599650199:main thread: symbolic name: * ==> 255 5437.599661098:main thread: symbolic name: mail ==> 16 5437.599672207:main thread: symbolic name: err ==> 3 5437.599683163:main thread: symbolic name: news ==> 56 5437.599694127:main thread: symbolic name: debug ==> 7 5437.599705530:main thread: symbolic name: info ==> 6 5437.599716852:main thread: symbolic name: notice ==> 5 5437.599728234:main thread: symbolic name: warn ==> 4 5437.599747710:main thread: Error opening log file: /dev/xconsole 5437.599759170:main thread: Called LogError, msg: /dev/xconsole rsyslogd: /dev/xconsole 5437.599828730:main thread: tried selector action for builtin-file: 0 5437.599838531:main thread: Module builtin-file processed this config line. 
5437.599848509:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.599860758:main thread: action 19 queue: save on shutdown 1, max disk space allowed 0 5437.599874021:main thread: action 19 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599885441:main thread: Action 0x68c710: queue 0x68c820 created 5437.599897609:main thread: selector line successfully processed 5437.599922620:main thread: main queue: is NOT disk-assisted 5437.599936522:main thread: main queue: type 0, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599955023:main thread: main queue:Reg: finalizing construction of worker thread pool 5437.599972840:main thread: main queue:Reg/w0: finalizing construction of worker instance data 5437.599983459:main thread: main queue: queue starts up without (loading) any DA disk state (this is normal for the DA queue itself!) 5437.600011680:main thread: main queue:Reg: high activity - starting 1 additional worker thread(s). 5437.600025603:main thread: main queue:Reg/w0: receiving command 2 5437.600068026:main thread: main queue:Reg: started with state 0, num workers now 1 5437.600104178:main thread: Main processing queue is initialized and running 5437.600141621:main thread: Opened UNIX socket '/dev/log' (fd 3). 
5437.600209693:main thread: main queue: entry added, size now 1 entries 5437.600224753:main thread: wtpAdviseMaxWorkers signals busy 5437.600237254:main thread: main queue: EnqueueMsg advised worker start 5437.600255410:40800950: main queue:Reg/w0: receiving command 4 5437.600288919:imuxsock.c: --------imuxsock calling select, active file descriptors (max 3): 3 5437.600327454:main thread: Active selectors: 5437.600338062:main thread: Selector 1: 5437.600345985:main thread: X X FF X X X X X X X X X X X X X X X X X X X X X X Actions: 5437.600417150:main thread: builtin-fwd: Instance data: 0x680a90 5437.600444615:main thread: RepeatedMsgReduction: 0 5437.600453239:main thread: Resume Interval: 30 5437.600461504:main thread: Suspended: 0 5437.600472064:main thread: Disabled: 0 5437.600480533:main thread: Exec only when previous is suspended: 0 5437.600489317:main thread: 5437.600497369:main thread: 5437.600506120:main thread: builtin-file: Instance data: 0x684710 5437.600520046:main thread: RepeatedMsgReduction: 0 5437.600528425:main thread: Resume Interval: 30 5437.600536885:main thread: Suspended: 0 5437.600547397:main thread: Disabled: 0 5437.600555448:main thread: Exec only when previous is suspended: 1 5437.600563851:main thread: 5437.600571822:main thread: 5437.600579955:main thread: 5437.600587890:main thread: Selector 2: 5437.600595939:main thread: X X X X FF X X X X X FF X X X X X X X X X X X X X X Actions: 5437.600664965:main thread: builtin-file: Instance data: 0x67f920 5437.600677232:main thread: RepeatedMsgReduction: 0 5437.600685740:main thread: Resume Interval: 30 5437.600694011:main thread: Suspended: 0 5437.600704478:main thread: Disabled: 0 5437.600712497:main thread: Exec only when previous is suspended: 0 5437.600720972:main thread: 5437.600728721:main thread: 5437.600736893:main thread: 5437.600744893:main thread: Selector 3: 5437.600752783:main thread: FF FF FF FF X FF FF FF FF FF X FF FF FF FF FF FF FF FF FF FF FF FF 5437.600852964:main 
queue:Reg/w0: main queue: entry deleted, state 0, size now 0 entries 5437.600874750:main queue:Reg/w0: Called action, logging to builtin-file 5437.600907327:main queue:Reg/w0: (/var/log/syslog) From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 18:17:30 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 18:17:30 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> OK, maybe we can simplify the config, that would remove code pathes RG> from the potential bug candidate list. Could you comment out all the RG> $ActionQueue* settings? RG> I've just restored the #if 0 in runtime/msg.c; it seems the immediate crashes came from those two lines. Now logging. Servus, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. 
+39 06 7259 2125 |
+-------------------------+----------------------------------------------+

From rgerhards at hq.adiscon.com Fri Jan 16 18:21:34 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Fri, 16 Jan 2009 18:21:34 +0100
Subject: [rsyslog] rsyslog still crashes
In-Reply-To:
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com>

Ah, ok. Side-note: I got my machine up and it is running some tests. Unfortunately no aborts so far, but it has only 4 cores... I hope something turns up...

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci
> Sent: Friday, January 16, 2009 6:18 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslog still crashes
>
> On Fri, 16 Jan 2009, Rainer Gerhards wrote:
>
> RG> OK, maybe we can simplify the config, that would remove code pathes
> RG> from the potential bug candidate list. Could you comment out all
> the
> RG> $ActionQueue* settings?
> RG>
>
> I've just restored the #if 0 in runtime/msg.c; it seems the immediate
> crashes came from those two lines. Now logging.
>
> Servus,
>
> lorenzo
>
>
> +-------------------------+--------------------------------------------
> --+
> | Lorenzo M. Catucci | Centro di Calcolo e Documentazione
> |
> | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor
> Vergata" |
> | | Via O. Raimondo 18 ** I-00173 ROMA **
> ITALY |
> | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 18:29:26 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 18:29:26 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> Ah, ok. Side-note: I got my machine up and it is running some test. RG> Unfortunately no aborts so far, but is has only 4 cores... I hope RG> something turns out... RG> I think the real problem is in keeping those cores very busy... I'd try to spawn something like 20 loggers each spawning a couple "workers" per second and logging startup/shutdown of any child. Maybe make each worker sleep for a random time before exiting. I don't have any Fedora/RedHat system; if nothing else, I'd suggest doing your tests on a debian/testing system too. Yours, lorenzo PS still running... +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. 
+39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 18:30:51 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 18:30:51 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9CD@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 6:29 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> Ah, ok. Side-note: I got my machine up and it is running some test. > RG> Unfortunately no aborts so far, but is has only 4 cores... I hope > RG> something turns out... > RG> > > I think the real problem is in keeping those cores very busy... I'd try > to > spawn something like 20 loggers each spawning a couple "workers" per > second and logging startup/shutdown of any child. Maybe make each > worker > sleep for a random time before exiting. Good suggestion, thanks. > > I don't have any Fedora/RedHat system; if nothing else, I'd suggest > doing > your tests on a debian/testing system too. That's what I am running on that machine - with components downloaded today. 
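
A minimal sketch of the stress test Lorenzo suggests in the quoted text above, written in Python and not taken from the rsyslog tree: several "logger" processes each fork short-lived "workers" at a fixed rate, every worker sleeps a random time before exiting, and each startup/shutdown is sent through syslog(3). The function names, the `stresslog` ident and all parameter values are assumptions chosen for illustration; whether this load is enough to trigger the crash is untested.

```python
import os
import random
import syslog
import time


def _run_child(func, *args):
    """Fork, run func(*args) in the child, and return the child's pid."""
    pid = os.fork()
    if pid == 0:
        code = 1
        try:
            func(*args)
            code = 0
        finally:
            os._exit(code)  # the child must never fall back into the parent's code
    return pid


def run_worker(logger_id, seq, max_sleep):
    """Short-lived worker: log startup, sleep a random time, log shutdown."""
    syslog.syslog(syslog.LOG_INFO, f"logger {logger_id} worker {seq}: startup")
    time.sleep(random.uniform(0.0, max_sleep))
    syslog.syslog(syslog.LOG_INFO, f"logger {logger_id} worker {seq}: shutdown")


def run_logger(logger_id, duration, workers_per_second, max_sleep):
    """Spawn workers at a fixed rate for `duration` seconds; return how many."""
    syslog.openlog("stresslog", syslog.LOG_PID, syslog.LOG_DAEMON)
    pids = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        pid = _run_child(run_worker, logger_id, len(pids), max_sleep)
        syslog.syslog(syslog.LOG_INFO, f"logger {logger_id}: started child {pid}")
        pids.append(pid)
        time.sleep(1.0 / workers_per_second)
    for pid in pids:
        os.waitpid(pid, 0)  # reap the worker before reporting its exit
        syslog.syslog(syslog.LOG_INFO, f"logger {logger_id}: child {pid} exited")
    return len(pids)


def run_stress(loggers, duration, workers_per_second, max_sleep):
    """Run `loggers` logger processes in parallel; return their exit codes."""
    pids = [
        _run_child(run_logger, logger_id, duration, workers_per_second, max_sleep)
        for logger_id in range(loggers)
    ]
    return [os.WEXITSTATUS(os.waitpid(pid, 0)[1]) for pid in pids]
```

Something like `run_stress(loggers=20, duration=60.0, workers_per_second=2, max_sleep=1.0)` would roughly match the numbers mentioned in the thread; raising `loggers` beyond the core count is one way to keep all cores busy.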
Rainer

From david at lang.hm Sat Jan 17 00:26:04 2009
From: david at lang.hm (david at lang.hm)
Date: Fri, 16 Jan 2009 15:26:04 -0800 (PST)
Subject: [rsyslog] Baclogged files to disk are pretty slow
In-Reply-To: <4255c2570901160719o4aa3bc6bk9813225374bfc53c@mail.gmail.com>
References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> <4255c2570901160719o4aa3bc6bk9813225374bfc53c@mail.gmail.com>
Message-ID:

On Fri, 16 Jan 2009, RB wrote:

> On Thu, Jan 15, 2009 at 16:01, Daniel Anson wrote:
>> I would hope that there is an easy solution as my next idea is to write
>> some type of daemonized process that can insert messages from a pool of
>> MySQL connections. I can achieve this in C but would rather hopefully
>> find a solution inside of the configuration.
>
> Short of implementing the queue/worker configuration (no idea how), it
> seems the only current option would be to implement something of the
> sort, either by an update to the ommysql module (optimal, as it gets
> your code supported by someone else for its lifetime) or by some
> external program.

multiple workers will help if MySQL can handle more transactions at a time when they hit in parallel. the fact that you are doing 4000/sec indicates that you are not doing an fsync for each insert, so it is unlikely to help (if you are fsync limited, the data rates are probably going to be closer to 100-200/sec depending on your drives)

> I'd think an optimal external solution would be some sort of
> relp2mysql bridge, but suspect that would end up reimplementing a good
> chunk of rsyslog.

actually, the optimal solution is to modify rsyslog to be able to handle multiple messages at once in the output queues. that is a major effort (2-4 man weeks) that will require a sponsor.
Once this is implemented I would expect the throughput to go up by 2-3 orders of magnitude for database inserts.

David Lang

From david at lang.hm Sat Jan 17 03:31:24 2009
From: david at lang.hm (david at lang.hm)
Date: Fri, 16 Jan 2009 18:31:24 -0800 (PST)
Subject: [rsyslog] rsyslog still crashes
In-Reply-To: <1232046366.22744.34.camel@localhost.localdomain>
References: <1232046366.22744.34.camel@localhost.localdomain>
Message-ID:

On Thu, 15 Jan 2009, Rainer Gerhards wrote:

> On Fri, 2009-01-16 at 01:20 +0100, Michael Biebl wrote:
>> Given the -c4 command line argument, I'd expect it to be 4.1.3.
>>
>> Sounds familiar to
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=509292 (which is
>> 3.18.6).
>>
>> It seems to be a more general problem with multi core (= very fast??) systems.
>
> Yes, that is what my analysis so far points to. It's also part of the
> problem, because I do not have very fast hardware to reproduce the issue
> (and it is also not easy to reliably reproduce if you have...).
>
> I've gotten a couple of reports (I think most on the mailing list) on
> such problems and all they have in common is 4+ core machines.
>
> I'll try to get hold based on what Lorenzo submits. In his environment,
> the problem seems to occur most reliably (he probably has the fastest
> machine...).
>
> Lorenzo: details follow soon.

I just got some time to work on this sort of thing again.

my test system is a 4-socket (dual core) opteron system with 16g of ram

I've done a fair amount of stress testing of the system without lockups (around the time the 4.1 branch started)

if you can describe a test setup I can see about reproducing it.
David Lang From david at lang.hm Sat Jan 17 03:40:22 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 16 Jan 2009 18:40:22 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: > Lorenzo and others: > > I hopefully got a system today where I can reproduce. I am setting it up right now. I also have written a stub wiki page with information useful to hunt this bug: one other thing that you can do for this sort of thing is to use the amazon cloud. to quote a message from Rob Landley to the linux-kernel mailing list > My friend Mark's been experimenting with the amazon "cloud" thing, > feeding in an image with a qemu instance and distcc+cross-compiler, and > running builds under that. Renting an 8-way ~2.5 ghz server with 7 > gigabytes of ram and 1.6 terabytes of disk is 80 cents/hour through them > plus another few cents/day for bandwidth and persistent storage and > such. That's likely to get cheaper as time goes on. > > We're still planning to buy a build server of our own to have something > in- house, but for running nightly builds it's almost to the point where > depreciation on the hardware is more than buying time from a server > farm. Just _one_ of those 8-way servers is enough hardware to build an > entire distro in an hour or so. > > What this really allows us to do is experiment with "how parallel can we > get our build"? 
Because renting ten 8-way servers in a cluster is > $8/hour, and distcc already scales trivially over that. Down the road > what Firmware Linux is working towards is multiple qemu instances > running in parallel with a central instance distributing builds to each > one, so each can do its own ./configure in parallel, distribute > compilation to the distccd instances as it has stuff to compile, and > then package up the resulting binary into one of those portage tarballs > and send it back to the central node to install on a network mount that > the lot of 'em can mount as build context, so the packages can get their > dependencies right. (You don't want your build taking place in a > network mount, but your OS being on one you never write to isn't so bad > as long as you have local storage to build in.) > > We'll probably leverage the heck out of Portage for this, and might wind > up modifying it heavily. Dunno yet. (We can even force dependencies on > portage so it doesn't need to calculate 'em, the central node can do > that and then say "you have these packages, _build_"...) > > But yeah, hobbyists with a laptop, network access, and a monthly budget > of $20 can do cluster builds these days. would it make sense to start a fund to pay for some time for you to use like this? David Lang > http://wiki.rsyslog.com/index.php/V3_Race_Condition_Hunt_Page > > Lorenzo, can you please double-check I have used the right config indeed. > > All others: if you can add scenarios/information, please do. I'll try to repro the problem as soon as the system is ready. Hope it will work... > > Rainer > >> -----Original Message----- >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >> Sent: Friday, January 16, 2009 5:20 PM >> To: rsyslog-users >> Subject: Re: [rsyslog] rsyslog still crashes >> >> Lorenzo, >> >> one thing: can you change the actionqueuemode to "direct" just for a >> short period. 
I would be very interested to see what happens. >> >> Rainer >> >>> -----Original Message----- >>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >>> bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci >>> Sent: Friday, January 16, 2009 5:10 PM >>> To: rsyslog-users >>> Subject: Re: [rsyslog] rsyslog still crashes >>> >>> On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote: >>> >>> LMC> >>> LMC> The -n crash was completely silent; the -d run was chatty (as >>> expected); >>> LMC> with stdout redirected, it took a lot more time to crash, but >> here >>> are >>> LMC> both the logfile and the gdb backtrace. >>> LMC> >>> >>> As for the last crash, I found on the screen session the line: >>> >>> rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) >>> ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed. >>> >>> since I forgot redirecting stderr too. >>> >>> Yours, >>> >>> lorenzo >>> >>> +-------------------------+------------------------------------------ >> -- >>> --+ >>> | Lorenzo M. Catucci | Centro di Calcolo e Documentazione >>> | >>> | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor >>> Vergata" | >>> | | Via O. Raimondo 18 ** I-00173 ROMA ** >>> ITALY | >>> | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 >>> | >>> +-------------------------+------------------------------------------ >> -- >>> --+ >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From david at lang.hm Sat Jan 17 03:42:09 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 16 Jan 2009 18:42:09 -0800 (PST) Subject: [rsyslog] rsyslog on AIX Message-ID: we are looking at using rsyslog on AIX and the sysadmins are reporting 'problems getting it to compile' (unfortunantly no details yet) has anyone tried this? David Lang From mbiebl at gmail.com Sat Jan 17 11:10:39 2009 From: mbiebl at gmail.com (Michael Biebl) Date: Sat, 17 Jan 2009 11:10:39 +0100 Subject: [rsyslog] Is rsyslog leaking memory? Message-ID: Hi, I'm running rsyslog 3.20.2 I noticed the following: # /etc/init.d/rsyslog restart VSZ RSS (as reported by ps) 27100 1184 # logger foo 27100 1196 # logger foo (1000x) 27100 1200 # logger foo (1000x) 27100 1204 # logger foo (1000x) 27100 1208 and so on. This made me wonder, if rsyslog is leaking memory somewhere. I also noticed, that for each loaded module, rsyslog resevers exactly 8 Mb of anoymous memory (pmap -d `pgrep rsyslog`) With a couple of loaded modules you easily get over 50Mb VSZ. Michael -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? From david at lang.hm Sun Jan 18 01:55:56 2009 From: david at lang.hm (david at lang.hm) Date: Sat, 17 Jan 2009 16:55:56 -0800 (PST) Subject: [rsyslog] Is rsyslog leaking memory? 
In-Reply-To: References: Message-ID: On Sat, 17 Jan 2009, Michael Biebl wrote: > Hi, > > I'm running rsyslog 3.20.2 > > I noticed the following: > # /etc/init.d/rsyslog restart > VSZ RSS (as reported by ps) > 27100 1184 > # logger foo > 27100 1196 > # logger foo (1000x) > 27100 1200 > # logger foo (1000x) > 27100 1204 > # logger foo (1000x) > 27100 1208 > > and so on. > > > This made me wonder, if rsyslog is leaking memory somewhere. I have run rsyslog through stress tests where I have sent it 1B log messages and do not think that there is a memory leak. what I think that you are seeing is that the default rsyslog memory queue only uses as much ram as it needs to hold the data (even though it's described as a array it seems to grow dynamicly, I'm not sure about it shrinking) when you log a bunch of messages via logger you push data into the array faster then it gets extracted, so it takes more memory (up until you hit the max size of the array, which I think is 1000 entries) > I also noticed, that for each loaded module, rsyslog resevers exactly > 8 Mb of anoymous memory (pmap -d `pgrep rsyslog`) > With a couple of loaded modules you easily get over 50Mb VSZ. I haven't tried doing stuff with different modules, so I don't know about this. David Lang From rgerhards at hq.adiscon.com Sun Jan 18 12:01:53 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 12:01:53 +0100 Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: References: Message-ID: <1232276513.22744.45.camel@localhost.localdomain> On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: > On Sat, 17 Jan 2009, Michael Biebl wrote: > > > Hi, > > > > I'm running rsyslog 3.20.2 > > > > I noticed the following: > > # /etc/init.d/rsyslog restart > > VSZ RSS (as reported by ps) > > 27100 1184 > > # logger foo > > 27100 1196 > > # logger foo (1000x) > > 27100 1200 > > # logger foo (1000x) > > 27100 1204 > > # logger foo (1000x) > > 27100 1208 > > > > and so on. 
> > > > > > This made me wonder, if rsyslog is leaking memory somewhere. > > I have run rsyslog through stress tests where I have sent it 1B log > messages and do not think that there is a memory leak. I am using valgrind's excellent memory debugger routinely during development. That brings up leaks and invalid memory access rather quickly. In fact, code quality has improved a lot since I started to use valgrind routinely roughly a year ago. From time to time I also do specific tests for leaks, both using valgrind and the traditional analysis techniques. From what I have seen so far, I, too, doubt there is a leak. However, there are various levels of testing. For example, the postgres output module and the GSSAPI code are contributed and I do not even have a test environment. So these are not checked using that procedure. The libdbi code is only checked every now and then and not with all backends (e.g. no Oracle at hand ... and so on...). If I ever get over to a full testing suite (no collaborators found so far...), I'll probably be able to do more consistent testing of all modules. > > what I think that you are seeing is that the default rsyslog memory queue > only uses as much ram as it needs to hold the data (even though it's > described as a array it seems to grow dynamicly, I'm not sure about it > shrinking) If you use "fixedarray" mode, the pointer array is allocated statically, no matter how many messages are in the queue. HOWEVER, this is only the pointers, so quite little memory. Actual messages are dynamically allocated and freed when processed - in any mode. > > when you log a bunch of messages via logger you push data into the array > faster then it gets extracted, so it takes more memory (up until you hit > the max size of the array, which I think is 1000 entries) > > > I also noticed, that for each loaded module, rsyslog resevers exactly > > 8 Mb of anoymous memory (pmap -d `pgrep rsyslog`) > > With a couple of loaded modules you easily get over 50Mb VSZ.
> > I haven't tried doing stuff with different modules, so I don't know about > this. I am not sure where it comes from, but I'd think into the dlload direction. Could also very well be the runtime stack for each thread (not dug into the details). Rainer From david at lang.hm Sun Jan 18 13:33:16 2009 From: david at lang.hm (david at lang.hm) Date: Sun, 18 Jan 2009 04:33:16 -0800 (PST) Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: <1232276513.22744.45.camel@localhost.localdomain> References: <1232276513.22744.45.camel@localhost.localdomain> Message-ID: On Sun, 18 Jan 2009, Rainer Gerhards wrote: > On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: > >> >> what I think that you are seeing is that the default rsyslog memory queue >> only uses as much ram as it needs to hold the data (even though it's >> described as a array it seems to grow dynamicly, I'm not sure about it >> shrinking) > > If you use "fixedarray" mode, the pointer array is allocated statically, > no matter how many messages are in the queue. HOWEVER, this is only the > pointers, so quite few memory. Actual messages are dynamically allocated > and freed when processed - in any mode. that makes sense. It would be interesting to see what would happen to the enqueue/dequeue timings if the message memory was staticly allocated from what I remember seeing of the memory footprint it does appear as if you allocate the max size for the message each time, not the minimum sized needed to hold the message if that shows a noticable difference it may be worth allocating the memory in chunks substantially larger than a single message David Lang From rgerhards at hq.adiscon.com Sun Jan 18 12:21:24 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 12:21:24 +0100 Subject: [rsyslog] Is rsyslog leaking memory? 
In-Reply-To: References: <1232276513.22744.45.camel@localhost.localdomain> Message-ID: <1232277684.22744.48.camel@localhost.localdomain> On Sun, 2009-01-18 at 04:33 -0800, david at lang.hm wrote: > On Sun, 18 Jan 2009, Rainer Gerhards wrote: > > > On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: > > > >> > >> what I think that you are seeing is that the default rsyslog memory queue > >> only uses as much ram as it needs to hold the data (even though it's > >> described as a array it seems to grow dynamicly, I'm not sure about it > >> shrinking) > > > > If you use "fixedarray" mode, the pointer array is allocated statically, > > no matter how many messages are in the queue. HOWEVER, this is only the > > pointers, so quite few memory. Actual messages are dynamically allocated > > and freed when processed - in any mode. > > that makes sense. It would be interesting to see what would happen to the > enqueue/dequeue timings if the message memory was staticly allocated > > from what I remember seeing of the memory footprint it does appear as if > you allocate the max size for the message each time, not the minimum sized > needed to hold the message > Yes, that's right. This is done to prevent an additional copy to clean things up (realloc might work, too) and memory fragmentation. The latter is really nasty: I've seen some memory areas remain allocated for quite a while due to fragmentation. > if that shows a noticable difference it may be worth allocating the memory > in chunks substantially larger than a single message That's a good suggestion. The basic classes are able to trim strings. It may be worth putting a config option into it. The current approach works well for small queues, but obviously does provide sub-optimal performance as soon as the queues grow considerably. So it may even make sense to start trimming messages only after a certain number of messages are in the queue.
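The "allocate the maximum, trim only once the queue is large" idea can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not rsyslog code: `msg_buf`, `msg_buf_set`, `msg_buf_maybe_trim`, `MSG_MAX_SIZE` and `TRIM_THRESHOLD` are hypothetical names, and the two constants are made-up values.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical values, chosen only for this sketch. */
#define MSG_MAX_SIZE   2048  /* assumed maximum message size in bytes */
#define TRIM_THRESHOLD 1000  /* assumed queue depth at which trimming starts */

/* Hypothetical message buffer; not an rsyslog structure. */
struct msg_buf {
    char  *text;   /* NUL-terminated message text */
    size_t alloc;  /* bytes currently allocated for text */
};

/* Store a message in a full-size buffer, so no second copy is needed
 * while the message is being built. Returns 0 on success, -1 on error. */
int msg_buf_set(struct msg_buf *m, const char *src)
{
    assert(m != NULL && src != NULL);
    size_t len = strlen(src);
    if (len >= MSG_MAX_SIZE)
        return -1;
    m->text = malloc(MSG_MAX_SIZE);
    if (m->text == NULL)
        return -1;
    memcpy(m->text, src, len + 1);
    m->alloc = MSG_MAX_SIZE;
    return 0;
}

/* Shrink the buffer to the size actually needed, but only when the queue
 * already holds many messages; small queues keep the cheap full-size path. */
void msg_buf_maybe_trim(struct msg_buf *m, size_t queue_depth)
{
    assert(m != NULL && m->text != NULL);
    if (queue_depth < TRIM_THRESHOLD)
        return;
    size_t needed = strlen(m->text) + 1;
    if (needed == m->alloc)
        return;
    char *shrunk = realloc(m->text, needed);
    if (shrunk != NULL) {  /* on failure the original buffer stays valid */
        m->text = shrunk;
        m->alloc = needed;
    }
}
```

Whether shrinking with realloc() actually returns memory to the system, or just adds to fragmentation, depends on the C library's allocator, so an approach like this would need measuring before being adopted.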
Rainer From rgerhards at hq.adiscon.com Sun Jan 18 12:26:51 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 12:26:51 +0100 Subject: [rsyslog] rsyslog on AIX In-Reply-To: References: Message-ID: <1232278011.22744.50.camel@localhost.localdomain> On Fri, 2009-01-16 at 18:42 -0800, david at lang.hm wrote: > we are looking at using rsyslog on AIX and the sysadmins are reporting > 'problems getting it to compile' (unfortunantly no details yet) > > has anyone tried this? All I know is that it doesn't work. No idea on how hard it is to get this right. Some time ago I was interested in porting (and had the time...) but found neither (virtual) hardware/software nor anyone interested in it. So I dropped the idea. I'd still be very interested in a port, but now unfortunately have much less time... Rainer From david at lang.hm Mon Jan 19 02:30:58 2009 From: david at lang.hm (david at lang.hm) Date: Sun, 18 Jan 2009 17:30:58 -0800 (PST) Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: <1232277684.22744.48.camel@localhost.localdomain> References: <1232276513.22744.45.camel@localhost.localdomain> <1232277684.22744.48.camel@localhost.localdomain> Message-ID: On Sun, 18 Jan 2009, Rainer Gerhards wrote: > On Sun, 2009-01-18 at 04:33 -0800, david at lang.hm wrote: >> On Sun, 18 Jan 2009, Rainer Gerhards wrote: >> >>> On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: >>> >>>> >>>> what I think that you are seeing is that the default rsyslog memory queue >>>> only uses as much ram as it needs to hold the data (even though it's >>>> described as a array it seems to grow dynamicly, I'm not sure about it >>>> shrinking) >>> >>> If you use "fixedarray" mode, the pointer array is allocated statically, >>> no matter how many messages are in the queue. HOWEVER, this is only the >>> pointers, so quite few memory. Actual messages are dynamically allocated >>> and freed when processed - in any mode. >> >> that makes sense. 
It would be interesting to see what would happen to the >> enqueue/dequeue timings if the message memory was staticly allocated >> >> from what I remember seeing of the memory footprint it does appear as if >> you allocate the max size for the message each time, not the minimum sized >> needed to hold the message >> > yes, that's right. This is done to prevent an additional copy to clean > things up (realloc might work, too) and memory fragmentation. The later > is really nasty, I've seen that some memory areas remain allocated for > quite some while due to fragmentation. > >> if that shows a noticable difference it may be worth allocating the memory >> in chunks substantially larger than a single message > > That's a good suggestion. The basic classes are able to trim strings. It > may be worth putting a config option into it. The current approach works > well for small queues, but obviously does provide sub-optimal > performance as soon as the queues grow considerably. So it may even make > sense to start trimming messages only after a certain amount of messages > are in-queue. I'm not sure that we're saying the same thing. let me try again. what I was thinking was that instead of allocating memory for one message at a time, initially allocate memory for 100 messages, then if this needs to be extended increase the allocation by 50-100%. this minimizes the number of allocations needed and the fragmentation of system memory. just like the fixed-array queue option is significantly faster than the linked list queue option (I assume from a combination of having to chase pointers and allocate/deallocate memory), there may be similar benefits from doing the same thing for the message content itself.
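A minimal C sketch of the chunked allocation described above: start with room for 100 messages and grow by 50% when full. It assumes fixed-size message slots, which is a simplification, and every name here (`msg_pool`, `pool_init`, `pool_reserve`, `pool_slot`, `pool_destroy`) is hypothetical rather than part of rsyslog.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* "initially allocate memory for 100 messages" */
#define POOL_INITIAL_SLOTS 100

/* Hypothetical pool of fixed-size message slots; not an rsyslog structure. */
struct msg_pool {
    char  *slots;      /* one contiguous block holding all slots */
    size_t slot_size;  /* bytes reserved per message (assumed fixed) */
    size_t capacity;   /* number of slots currently allocated */
    size_t used;       /* number of slots handed out so far */
};

/* Allocate the initial block. Returns 0 on success, -1 on failure. */
int pool_init(struct msg_pool *p, size_t slot_size)
{
    assert(p != NULL && slot_size > 0);
    p->slots = malloc(POOL_INITIAL_SLOTS * slot_size);
    if (p->slots == NULL)
        return -1;
    p->slot_size = slot_size;
    p->capacity = POOL_INITIAL_SLOTS;
    p->used = 0;
    return 0;
}

/* Reserve the next slot and return its index, or -1 if growing failed.
 * A full pool grows by 50%, so the number of realloc() calls stays
 * logarithmic in the number of messages. */
long pool_reserve(struct msg_pool *p)
{
    assert(p != NULL);
    if (p->used == p->capacity) {
        size_t new_cap = p->capacity + p->capacity / 2;
        char *grown = realloc(p->slots, new_cap * p->slot_size);
        if (grown == NULL)
            return -1;  /* the old block is still valid */
        p->slots = grown;
        p->capacity = new_cap;
    }
    return (long)p->used++;
}

/* Translate a slot index into a pointer. The pointer is only valid until
 * the next pool_reserve() call, because realloc() may move the block. */
char *pool_slot(struct msg_pool *p, long idx)
{
    assert(p != NULL && idx >= 0 && (size_t)idx < p->used);
    return p->slots + (size_t)idx * p->slot_size;
}

/* Release the whole pool in one call. */
void pool_destroy(struct msg_pool *p)
{
    free(p->slots);
    p->slots = NULL;
    p->capacity = 0;
    p->used = 0;
}
```

Handing out indices instead of pointers is a deliberate choice in this sketch: a growing realloc() can move the block, which would invalidate any pointers held by the queue. The sketch also never reuses freed slots, which a real queue would need.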
David Lang From rgerhards at hq.adiscon.com Sun Jan 18 16:45:56 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 16:45:56 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> Message-ID: <1232293556.2453.3.camel@rf10up.intern.adiscon.com> Hi Lorenzo, I've gone through the material once more. Indeed, it looks like the previous tests (with the #if 1) were not really useful. Sorry for that. Please let me know the outcome of this run here. Also, I thought about one attempt we could make at reducing complexity. I am not sure if it works out, but if it does, that would be a big benefit. Could you please try the following: Use the master branch (the one you previously used). Reduce rsyslog.conf to just the necessary inputs (ideally only imuxsock) and a SINGLE file writer, no further actions. Let that run and tell us if it aborts, too. If it does, we have ruled out a lot of code and we can focus much better in our troubleshooting. On my box, I unfortunately had no success yet in reproducing the issue - even though I put a lot of stress on the machine. Will be trying more today, hopefully that brings up some results... Rainer On Fri, 2009-01-16 at 18:17 +0100, Lorenzo M. Catucci wrote: > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> OK, maybe we can simplify the config, that would remove code pathes > RG> from the potential bug candidate list. Could you comment out all the > RG> $ActionQueue* settings? > RG> > > I've just restored the #if 0 in runtime/msg.c; it seems the immediate > crashes came from those two lines. Now logging.
> > Servus, > > lorenzo > > > +-------------------------+----------------------------------------------+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | > +-------------------------+----------------------------------------------+ > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com From rgerhards at hq.adiscon.com Sun Jan 18 16:57:03 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 16:57:03 +0100 Subject: [rsyslog] increasing alloc performance - was: Is rsyslog leaking memory? In-Reply-To: References: <1232276513.22744.45.camel@localhost.localdomain> <1232277684.22744.48.camel@localhost.localdomain> Message-ID: <1232294223.2453.9.camel@rf10up.intern.adiscon.com> On Sun, 2009-01-18 at 17:30 -0800, david at lang.hm wrote: > I'm not sure that we're saying the same thing. let me try again. You are right, we weren't... > > what I was thinking was that instead of allocating memory for one message > at a time, initially allocate memory for 100 messages, then if this needs > to be extended increase the allocation by 50-100%. this minimizes the > number of allocations needed and the fragmentation of system memory. > > just like the fixed-array queue option is significantly faster than the > linked list queue option (I assume from a combination of having to chase > pointers and allocate/deallocate memory), there may be similar benifits > from doing the same thing for the message content itself. I have to admit I am skeptical about this. The reason is that there are many non-fixed fields within the message and they are allocated as needed (with some initial size that fits most messages, but it may extend on an as-needed basis).
So there is no real fixed size for any message. It also depends on template formatting and other factors. I think that if I tried to preallocate at least the initial chunks, I'd probably end up doing pretty much the same as the malloc()/free() runtime does. That, however, will probably be less performant than the runtime is (at least I hope so, these parts of the code should be heavily tweaked). This is also an error-prone task. There may be a compromise in between (e.g. allocating a fixed chunk of message text together with the message blobs), but I still think the necessary complexity is not outweighed by the benefits. All in all, I think, we have seen that the in-user-space computing needs (and malloc counts as such) are not really the bottlenecks. Implementing e.g. a "bunch writer" (which enables submission of multiple messages at once to an action) seems to be (just) equally complex but promises far better results. In any case, I'd finally like to track down that dangling race before I do any further optimization. It looks like Lorenzo has a relatively stable environment for reproduction and I'd like to take advantage of that. Rainer From david at lang.hm Mon Jan 19 09:29:35 2009 From: david at lang.hm (david at lang.hm) Date: Mon, 19 Jan 2009 00:29:35 -0800 (PST) Subject: [rsyslog] increasing alloc performance - was: Is rsyslog leaking memory? In-Reply-To: <1232294223.2453.9.camel@rf10up.intern.adiscon.com> References: <1232276513.22744.45.camel@localhost.localdomain> <1232277684.22744.48.camel@localhost.localdomain> <1232294223.2453.9.camel@rf10up.intern.adiscon.com> Message-ID: On Sun, 18 Jan 2009, Rainer Gerhards wrote: > On Sun, 2009-01-18 at 17:30 -0800, david at lang.hm wrote: >> >> what I was thinking was that instead of allocating memory for one message >> at a time, initially allocate memory for 100 messages, then if this needs >> to be extended increase the allocation by 50-100%.
this minimizes the >> number of allocations needed and the fragmentation of system memory. >> >> just like the fixed-array queue option is significantly faster than the >> linked list queue option (I assume from a combination of having to chase >> pointers and allocate/deallocate memory), there may be similar benifits >> from doing the same thing for the message content itself. > > I have to admit I am skeptic about this. The reason is that there are > many non-fixed fields within the message and they are allocated as > needed (with some initial size that fits most messages, but it may > extend on an as-needed basis). So there is no real fixed size for any > message. It also depends on template formatting and other factors. > > I think if I'd try to prealloc at least the initial chunks, I'd probably > do pretty much the same that the malloc()/free() runtime does. That, > however, will probably be less performant than the runtime is (at least > I hope so, these parts of the code should be heavily tweaked). This is > also an error-prone task. > > There may be a compromise in between (e.g. allocating a fixed chunk of > message text together with the message blobs), but I still think the > necessary complexity is not outweight by similar benefits. > > All in all, I think, we have seen that the in-user-space computing needs > (and malloc counts as such) are not really the bottlenecks. Implementing > e.g. a "bunch writer" (which enables submission of multiple messages at > once to an action) seems to be (just) equally complex but promises far > better results. always possible. > In any case, I'd finally like to track down that dangling race before I > do any further optimization. It looks like Lorenzo seems to have a > relatively stable environment for reproduction and I'd like to take > advantage of that. agreed, tracking down a reproducible problem takes precedence over new improvements/tweaks any day.
David Lang From rgerhards at hq.adiscon.com Mon Jan 19 10:17:18 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 19 Jan 2009 10:17:18 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com> Message-ID: <1232356638.2536.3.camel@rf10up.intern.adiscon.com> Hi David, On Fri, 2009-01-16 at 18:40 -0800, david at lang.hm wrote: > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > > Lorenzo and others: > > > > I hopefully got a system today where I can reproduce. I am setting it up right now. I also have written a stub wiki page with information useful to hunt this bug: > > one other thing that you can do for this sort of thing is to use the > amazon cloud. > > to quote a message from Rob Landley to the linux-kernel mailing list > > > My friend Mark's been experimenting with the amazon "cloud" thing, > > feeding in an image with a qemu instance and distcc+cross-compiler, and > > running builds under that. Renting an 8-way ~2.5 ghz server with 7 > > gigabytes of ram and 1.6 terabytes of disk is 80 cents/hour through them > > plus another few cents/day for bandwidth and persistent storage and > > such. That's likely to get cheaper as time goes on. > > > > We're still planning to buy a build server of our own to have something > > in- house, but for running nightly builds it's almost to the point where > > depreciation on the hardware is more than buying time from a server > > farm. Just _one_ of those 8-way servers is enough hardware to build an > > entire distro in an hour or so. 
> > > > What this really allows us to do is experiment with "how parallel can we > > get our build"? Because renting ten 8-way servers in a cluster is > > $8/hour, and distcc already scales trivially over that. Down the road > > what Firmware Linux is working towards is multiple qemu instances > > running in parallel with a central instance distributing builds to each > > one, so each can do its own ./configure in parallel, distribute > > compilation to the distccd instances as it has stuff to compile, and > > then package up the resulting binary into one of those portage tarballs > > and send it back to the central node to install on a network mount that > > the lot of 'em can mount as build context, so the packages can get their > > dependencies right. (You don't want your build taking place in a > > network mount, but your OS being on one you never write to isn't so bad > > as long as you have local storage to build in.) > > > > We'll probably leverage the heck out of Portage for this, and might wind > > up modifying it heavily. Dunno yet. (We can even force dependencies on > > portage so it doesn't need to calculate 'em, the central node can do > > that and then say "you have these packages, _build_"...) > > > > But yeah, hobbyists with a laptop, network access, and a monthly budget > > of $20 can do cluster builds these days. > > would it make sense to start a fund to pay for some time for you to use > like this? That's a very interesting idea, thanks for sharing. At present, however, I think I'll try to stick with Lorenzo's system, because it seems to be able to reproduce the issue somewhat reliably. My 4 core machine unfortunately runs flawlessly, so I suspect that it really depends on the mix of components, where a fast machine is a necessary prerequisite, but not a sufficient one. Some other things seem to need to go into the mix and I've unfortunately not yet identified them...
But the cloud sounds like an interesting long-term idea, it would definitely be useful to be able to conduct some testing on high-end machines. Rainer From patrick.shen at net-m.de Mon Jan 19 10:21:19 2009 From: patrick.shen at net-m.de (Patrick Shen) Date: Mon, 19 Jan 2009 17:21:19 +0800 Subject: [rsyslog] A weird issue Message-ID: <4974460F.2040903@net-m.de> Hi all, Recently I encountered a weird problem. Let me explain below: I have a client that uses a traditional syslog (NOT rsyslog) app for storing and forwarding logs to the loghost. Here are some "snmpd" logs for example: ########################################################################################## Jan 19 10:03:09 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:34289 Jan 19 10:03:09 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:34289 Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 Jan 19 10:04:10 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:58181 Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 *Jan 19 10:04:10 athos last message repeated 25 times* ########################################################################################## Please note the last line. I also have a loghost that receives the logs using rsyslog v3.20.2, with the following dynamic template to store them: ########################################################################################## $template d_hosts,"/var/rsyslog/HOSTS/%hostname%/%$year%/%$month%/%syslogfacility-text%_%hostname%_%$year%_%$month%_%$day%.log" ########################################################################################## and I also enabled a debug template with the following configuration in rsyslog.conf.
########################################################################################## $template DEBUG,"Debug line with all properties:\nFROMHOST: '%FROMHOST%', HOSTNAME: '%HOSTNAME%', PRI: %PRI%,\nsyslogtag '%syslogtag%', programname: '%programname%', APP-NAME: '%APP-NAME%', PROCID: '%PROCID%', MSGID: '%MSGID%', FACILITY-TEXT: '%syslogfacility-text%'\nTIMESTAMP: '%TIMESTAMP%', STRUCTURED-DATA: '%STRUCTURED-DATA%',\nmsg: '%msg%'\nrawmsg: '%rawmsg%'\n\n" *.* -/var/rsyslog/debug;DEBUG # or whatever file you like ########################################################################################## I'm monitoring on the server-side now, and checking the last line by raw message. ########################################################################################## Debug line with all properties: FROMHOST: 'athos', HOSTNAME: '*last*', PRI: 30, syslogtag 'message', programname: 'message', APP-NAME: 'message', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'daemon' TIMESTAMP: 'Jan 19 09:59:09', STRUCTURED-DATA: '-', msg: ' repeated 25 times' rawmsg: '<30>last message repeated 25 times' ########################################################################################## Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often). Thanks, Patrick From rgerhards at hq.adiscon.com Mon Jan 19 11:00:27 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 19 Jan 2009 11:00:27 +0100 Subject: [rsyslog] A weird issue In-Reply-To: <4974460F.2040903@net-m.de> References: <4974460F.2040903@net-m.de> Message-ID: <1232359227.2536.6.camel@rf10up.intern.adiscon.com> On Mon, 2009-01-19 at 17:21 +0800, Patrick Shen wrote: > Hi all, > > Recently I encountered a weird problem. Let me explain below: > > I've a client which is using traditional syslog (NOT rsyslog) app for storing and forwarding > logs to loghost. 
> > Here are some "snmpd" logs for example: > ########################################################################################## > Jan 19 10:03:09 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:34289 > Jan 19 10:03:09 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:34289 > Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 > Jan 19 10:04:10 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:58181 > Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 > *Jan 19 10:04:10 athos last message repeated 25 times* > ########################################################################################## > > Please take into account the last line. > > And I've a loghost host for receiving by using rsyslog v3.20.2 and used following dynamic templates to > store logs > ########################################################################################## > $template d_hosts,"/var/rsyslog/HOSTS/%hostname%/%$year%/%$month%/%syslogfacility-text%_%hostname%_%$year%_%$month%_%$day%.log" > ########################################################################################## > > and also opened debug template by following > configures in rsyslog.conf. > ########################################################################################## > $template DEBUG,"Debug line with all properties:\nFROMHOST: '%FROMHOST%', HOSTNAME: '%HOSTNAME%', PRI: %PRI%,\nsyslogtag '%syslogtag%', programname: '%programname%', APP-NAME: '%APP-NAME%', PROCID: > '%PROCID%', MSGID: '%MSGID%', FACILITY-TEXT: '%syslogfacility-text%'\nTIMESTAMP: '%TIMESTAMP%', STRUCTURED-DATA: '%STRUCTURED-DATA%',\nmsg: '%msg%'\nrawmsg: '%rawmsg%'\n\n" > *.* -/var/rsyslog/debug;DEBUG # or whatever file you like > ########################################################################################## > > I'm monitoring on the server-side now, and checking the last line by raw message. 
> ##########################################################################################
> Debug line with all properties:
> FROMHOST: 'athos', HOSTNAME: '*last*', PRI: 30,
> syslogtag 'message', programname: 'message', APP-NAME: 'message', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'daemon'
> TIMESTAMP: 'Jan 19 09:59:09', STRUCTURED-DATA: '-',
> msg: ' repeated 25 times'
> rawmsg: '<30>last message repeated 25 times'
> ##########################################################################################
>
> Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often).

Yes, unfortunately ;) The reason is simply that sysklogd emits malformed
messages for the "last message repeated..." line. If you look at a packet
capture, you'll see that they do not contain a hostname. The hostname you
see in your sysklogd log file is added locally.

You can do a similar thing in rsyslog with the fromhost property - it does
not contain the hostname from the message but rather the system that sent
the message. In non-relay cases that should be the same, but in relay
scenarios you see only the last hop (which is why rsyslog by default relies
on the RFC 3164 format).

If you need the relay scenario, there is no way around putting rsyslog on
the sending systems, too (or fixing sysklogd, which I guess you need to do
yourself or it won't happen...).

Rainer

From rgerhards at hq.adiscon.com Mon Jan 19 11:10:50 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Mon, 19 Jan 2009 11:10:50 +0100
Subject: [rsyslog] rsyslog 3.20.3 released
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9D6@grfint2.intern.adiscon.com>

Hi all,

Rsyslog 3.20.3, a member of the v3-stable branch, has been released today.
It is a bug-fixing release that addresses a potential segfault that could
happen if the $AllowedSenders configuration directive is used. It also
addresses a doc bug, where the v3-compatibility document had an invalid
directive name.
This is a recommended update for all users of the v3-stable branch.

Change Log:
http://www.rsyslog.com/Article339.phtml

Download:
http://www.rsyslog.com/Downloads-req-viewdownloaddetails-lid-146.phtml

I hope this release is useful. Feedback is appreciated.

Best regards,
Rainer Gerhards

From patrick.shen at net-m.de Mon Jan 19 15:21:26 2009
From: patrick.shen at net-m.de (Patrick Shen)
Date: Mon, 19 Jan 2009 22:21:26 +0800
Subject: [rsyslog] A weird issue
In-Reply-To: <1232359227.2536.6.camel@rf10up.intern.adiscon.com>
References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com>
Message-ID: <49748C66.7070102@net-m.de>

Rainer Gerhards wrote:
> On Mon, 2009-01-19 at 17:21 +0800, Patrick Shen wrote:
>> Hi all,
>>
>> Recently I encountered a weird problem. Let me explain below:
>>
>> I've a client which is using traditional syslog (NOT rsyslog) app for storing and forwarding
>> logs to loghost.
>>
>> Here are some "snmpd" logs for example:
>> ##########################################################################################
>> Jan 19 10:03:09 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:34289
>> Jan 19 10:03:09 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:34289
>> Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181
>> Jan 19 10:04:10 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:58181
>> Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181
>> *Jan 19 10:04:10 athos last message repeated 25 times*
>> ##########################################################################################
>>
>> Please take into account the last line.
>> >> And I've a loghost host for receiving by using rsyslog v3.20.2 and used following dynamic templates to >> store logs >> ########################################################################################## >> $template d_hosts,"/var/rsyslog/HOSTS/%hostname%/%$year%/%$month%/%syslogfacility-text%_%hostname%_%$year%_%$month%_%$day%.log" >> ########################################################################################## >> >> and also opened debug template by following >> configures in rsyslog.conf. >> ########################################################################################## >> $template DEBUG,"Debug line with all properties:\nFROMHOST: '%FROMHOST%', HOSTNAME: '%HOSTNAME%', PRI: %PRI%,\nsyslogtag '%syslogtag%', programname: '%programname%', APP-NAME: '%APP-NAME%', PROCID: >> '%PROCID%', MSGID: '%MSGID%', FACILITY-TEXT: '%syslogfacility-text%'\nTIMESTAMP: '%TIMESTAMP%', STRUCTURED-DATA: '%STRUCTURED-DATA%',\nmsg: '%msg%'\nrawmsg: '%rawmsg%'\n\n" >> *.* -/var/rsyslog/debug;DEBUG # or whatever file you like >> ########################################################################################## >> >> I'm monitoring on the server-side now, and checking the last line by raw message. >> ########################################################################################## >> Debug line with all properties: >> FROMHOST: 'athos', HOSTNAME: '*last*', PRI: 30, >> syslogtag 'message', programname: 'message', APP-NAME: 'message', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'daemon' >> TIMESTAMP: 'Jan 19 09:59:09', STRUCTURED-DATA: '-', >> msg: ' repeated 25 times' >> rawmsg: '<30>last message repeated 25 times' >> ########################################################################################## >> >> Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often). 
> > Yes, unfortunately ;) The reason simply is that sysklogd does emit > malformed messages with the "last message repeated..." line. If you look > at a packet capture, you'll see that they do not contain a hostname. > What you see in your sysklogd log is a hostname that is locally > appended. Ah, so simple. I'm surprised. Could you please recommend which app for packet capture? And I'd like to share another 2 log examples. ###################################################################################### Debug line with all properties: FROMHOST: 'helios', HOSTNAME: 'helios', PRI: 171, syslogtag '', programname: '', APP-NAME: '', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', msg: ' at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' ###################################################################################### You could see some *spaces* between '<171>' and 'at net ...'. And HOSTNAME propety is "helios". 
###################################################################################### Debug line with all properties: FROMHOST: 'helios', HOSTNAME: 'Caused', PRI: 171, syslogtag 'by:', programname: 'by', APP-NAME: 'by', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', msg: ' java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorderid=0, refOrderId= 0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/APP/ME-utf8/content/ import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-OK][CONTENT-320-OK] nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnulln ullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnu ll[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume=NULL, track=0, it m_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemidprovider=NULL, itm _viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEvaluation=0, numEval uations=0, licenseprovider_id=2131264, importSt' rawmsg: '<171>Caused by: java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorde rid=0, refOrderId=0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/AP P/ME-utf8/content/import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-O K][CONTENT-320-OK]nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnul lnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnull 
nullnullnullnullnull[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume =NULL, track=0, itm_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemid provider=NULL, itm_viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEva luation=0, numEvaluations=0, licenseprovider_id=2131264, importSt' ###################################################################################### But in above example: Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'. I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided, then HOSTNAME will be set correctly? > You can do a similar thing in rsyslog with the fromhost property - it > does not contain the hostname but rather the system that send the > message. In non-relay cases that should be the same, but in relay > scenarios you see only the last hop (thus rsyslog by default uses RFC > 3164 format). And I thought I could use 'FROMHOST' property, but I have another scenario. ###################################################################################### Debug line with all properties: FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. 
HTTP/1.1" 200 87#012' rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. HTTP/1.1" 200 87#012' ###################################################################################### You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. And I do have reverse zone for that ip in dns setting. Any ideas? > If you need the relay scenario, there is no way around putting rsyslog > on the sending systems, too (or fixing sysklogd, which I guess you need > to do yourself or it won't happen...). > > Rainer Thanks a lot for your information. Best regards, Patrick From jules at visionintel.com Mon Jan 19 15:23:27 2009 From: jules at visionintel.com (Jules Pagna Disso) Date: Mon, 19 Jan 2009 14:23:27 +0000 Subject: [rsyslog] client Message-ID: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> hi there, Is there an example of client sending alert to syslog? is it possible to create and send an alert from the command prompt to syslog? thanks, Jules From patrick.shen at net-m.de Mon Jan 19 15:48:11 2009 From: patrick.shen at net-m.de (Patrick Shen) Date: Mon, 19 Jan 2009 22:48:11 +0800 Subject: [rsyslog] client In-Reply-To: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> Message-ID: <497492AB.5030901@net-m.de> Jules Pagna Disso wrote: > hi there, > > Is there an example of client sending alert to syslog? 
>
> is it possible to create and send an alert from the command prompt to
> syslog?
>
> thanks,
> Jules

Do you mean 'logger' ? Try 'man logger'.

Best regards,
Patrick

From lists at luigirosa.com Mon Jan 19 15:45:46 2009
From: lists at luigirosa.com (Luigi Rosa)
Date: Mon, 19 Jan 2009 15:45:46 +0100
Subject: [rsyslog] client
In-Reply-To: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com>
References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com>
Message-ID: <4974921A.5040108@luigirosa.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jules Pagna Disso said the following on 19/01/09 15:23:

> Is there an example of client sending alert to syslog?

You mean something like the logger utility?

http://linux.about.com/library/cmd/blcmdl1_logger.htm

Ciao,
luigi

- --
/
+--[Luigi Rosa]--
\

She was a lovely girl. Our courtship was fast and furious.
I was fast and she was furious.
--Max Kauffmann
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkl0khQACgkQ3kWu7Tfl6ZTTtwCgrgL4RTPoLiZoKaa0uw2mz9y/
KAYAnj/1BMfinxINNSgttd9TIOGfi/z4
=LxGV
-----END PGP SIGNATURE-----

From mrdemeanour at jackpot.uk.net Mon Jan 19 15:46:14 2009
From: mrdemeanour at jackpot.uk.net (Mr. Demeanour)
Date: Mon, 19 Jan 2009 14:46:14 +0000
Subject: [rsyslog] client
In-Reply-To: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com>
References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com>
Message-ID: <49749236.6060108@jackpot.uk.net>

Jules Pagna Disso wrote:
> hi there,
>
> Is there an example of client sending alert to syslog?
>
> is it possible to create and send an alert from the command prompt to
> syslog?

Try:

$ logger "Test log message"

Regards,
Jack.
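
[Editorial sketch, not part of the original thread.] For readers who would rather send the alert from a script than from the shell, the following minimal Python sketch does roughly what the logger command above does, but over UDP. Everything named here is an assumption made for illustration: the helper names build_rfc3164 and send_alert, the tag "alert", the default facility and severity, and a syslog receiver listening on UDP port 514. The message is built with a complete RFC 3164-style header (PRI, timestamp, hostname, tag), which is the part the "last message repeated" lines discussed in the "A weird issue" thread lack.

```python
import socket
from datetime import datetime

# Numeric codes from RFC 3164: facility user=1, severity notice=5.
FACILITY_USER = 1
SEVERITY_NOTICE = 5

# Month names are spelled out so the result does not depend on the locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_rfc3164(msg, tag="alert", hostname=None,
                  facility=FACILITY_USER, severity=SEVERITY_NOTICE,
                  when=None):
    """Return '<PRI>Mmm dd hh:mm:ss HOSTNAME TAG: MSG' (hypothetical helper)."""
    pri = facility * 8 + severity
    when = when or datetime.now()
    # RFC 3164 expects the bare hostname, so drop any domain part.
    hostname = hostname or socket.gethostname().split(".")[0]
    # Single-digit days are padded with a space, not a zero.
    timestamp = "%s %2d %s" % (MONTHS[when.month - 1], when.day,
                               when.strftime("%H:%M:%S"))
    return "<%d>%s %s %s: %s" % (pri, timestamp, hostname, tag, msg)


def send_alert(msg, server="127.0.0.1", port=514, **kwargs):
    """Send one message to a syslog receiver over UDP (hypothetical helper)."""
    data = build_rfc3164(msg, **kwargs).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(data, (server, port))
```

Calling send_alert("Test log message") would then emit a single datagram to the local receiver, provided it accepts UDP input. For purely local logging, Python's standard syslog module or logging.handlers.SysLogHandler is probably the simpler choice; the hand-built header is shown only to make the wire format visible.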
From rgerhards at hq.adiscon.com Mon Jan 19 14:45:41 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Mon, 19 Jan 2009 14:45:41 +0100
Subject: [rsyslog] A weird issue
In-Reply-To: <49748C66.7070102@net-m.de>
References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com> <49748C66.7070102@net-m.de>
Message-ID: <1232372741.2536.15.camel@rf10up.intern.adiscon.com>

On Mon, 2009-01-19 at 22:21 +0800, Patrick Shen wrote:
> >> Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often).
> >
> > Yes, unfortunately ;) The reason simply is that sysklogd does emit
> > malformed messages with the "last message repeated..." line. If you look
> > at a packet capture, you'll see that they do not contain a hostname.
> > What you see in your sysklogd log is a hostname that is locally
> > appended.
>
> Ah, so simple. I'm surprised. Could you please recommend which app for packet capture?

Actually, I should have read your mail more carefully. You already use
rawmsg, which is the second best thing after the packet capture. In this
case, a capture will show you exactly the same thing (if you don't trust
me, use Wireshark, an excellent open source capture app).

Look at this:

rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)'

Compare that to the header that is described in RFC 3164 and you will see
that there is nothing close to a real header inside that message. As the
message is malformed, funny things can happen. In other words, results are
unpredictable, and this is what you are seeing.

>
> And I'd like to share another 2 log examples.
> > ###################################################################################### > Debug line with all properties: > FROMHOST: 'helios', HOSTNAME: 'helios', PRI: 171, > syslogtag '', programname: '', APP-NAME: '', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', > msg: ' at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' > rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' > ###################################################################################### > > You could see some *spaces* between '<171>' and 'at net ...'. And HOSTNAME propety is "helios". > > > ###################################################################################### > Debug line with all properties: > FROMHOST: 'helios', HOSTNAME: 'Caused', PRI: 171, > syslogtag 'by:', programname: 'by', APP-NAME: 'by', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', > msg: ' java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorderid=0, refOrderId= > 0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/APP/ME-utf8/content/ > import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-OK][CONTENT-320-OK] > nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnulln > ullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnu > ll[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume=NULL, track=0, it > m_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemidprovider=NULL, itm > _viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, 
meanEvaluation=0, numEval > uations=0, licenseprovider_id=2131264, importSt' > rawmsg: '<171>Caused by: java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorde > rid=0, refOrderId=0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/AP > P/ME-utf8/content/import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-O > K][CONTENT-320-OK]nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnul > lnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnull > nullnullnullnullnull[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume > =NULL, track=0, itm_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemid > provider=NULL, itm_viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEva > luation=0, numEvaluations=0, licenseprovider_id=2131264, importSt' > ###################################################################################### > > But in above example: > Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'. > > I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided, > then HOSTNAME will be set correctly? that's probably the case with current code, but I don't guarantee that will stay. Again: invalid format => unpredictable results on all header fields > > > > You can do a similar thing in rsyslog with the fromhost property - it > > does not contain the hostname but rather the system that send the > > message. In non-relay cases that should be the same, but in relay > > scenarios you see only the last hop (thus rsyslog by default uses RFC > > 3164 format). 
> > And I thought I could use 'FROMHOST' property, but I have another scenario. > > ###################################################################################### > Debug line with all properties: > FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, > syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', > msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > HTTP/1.1" 200 87#012' > rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > HTTP/1.1" 200 87#012' > ###################################################################################### > that's a correctly formatted message > You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. > And I do have reverse zone for that ip in dns setting. Any ideas? To get the name, you indeed need to enable remote lookups. One solution would be to permit different settings for different remote hosts, but that would be a feature request. 
Would make sense, but I am currently rather busy. If you add it to the bugzilla http://bugzilla.adiscon.com I'll see that I implement it when nothing of higher priority is in front of it. Rainer > > > If you need the relay scenario, there is no way around putting rsyslog > > on the sending systems, too (or fixing sysklogd, which I guess you need > > to do yourself or it won't happen...). > > > > Rainer > > Thanks a lot for your information. > > Best regards, > Patrick > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From patrick.shen at net-m.de Tue Jan 20 04:05:20 2009 From: patrick.shen at net-m.de (Patrick Shen) Date: Tue, 20 Jan 2009 11:05:20 +0800 Subject: [rsyslog] A weird issue In-Reply-To: <1232372741.2536.15.camel@rf10up.intern.adiscon.com> References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com> <49748C66.7070102@net-m.de> <1232372741.2536.15.camel@rf10up.intern.adiscon.com> Message-ID: <49753F70.5050601@net-m.de> Rainer Gerhards wrote: > On Mon, 2009-01-19 at 22:21 +0800, Patrick Shen wrote: >> Ah, so simple. I'm surprised. Could you please recommend which app for packet capture? > > Actually, I should have read your mail more careful. You already use > rawmsg, which is the second best thing after the packet capture. But in > this case, you'll see exactly the same thing (if you don't trust me, use > WireShark, an excellent open source capture app). > > Look at this: > > rawmsg: '<171> at > net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' > > Compare that the the header that is describe in RFC 3164 and you will > see that there is nothing close to a real header inside that message. As > the message is malformed, funny things can happen. In other words, > results are unpredictable, and this is what you are seeing. > >> And I'd like to share another 2 log examples. 
>> >> ###################################################################################### >> Debug line with all properties: >> FROMHOST: 'helios', HOSTNAME: 'helios', PRI: 171, >> syslogtag '', programname: '', APP-NAME: '', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' >> TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', >> msg: ' at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' >> rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' >> ###################################################################################### >> >> You could see some *spaces* between '<171>' and 'at net ...'. And HOSTNAME propety is "helios". >> >> >> ###################################################################################### >> Debug line with all properties: >> FROMHOST: 'helios', HOSTNAME: 'Caused', PRI: 171, >> syslogtag 'by:', programname: 'by', APP-NAME: 'by', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' >> TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', >> msg: ' java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorderid=0, refOrderId= >> 0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/APP/ME-utf8/content/ >> import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-OK][CONTENT-320-OK] >> nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnulln >> ullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnu >> ll[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume=NULL, track=0, it >> m_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemidprovider=NULL, itm >> _viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, 
externalinfo=NULL, authorizedAge=0, meanEvaluation=0, numEval >> uations=0, licenseprovider_id=2131264, importSt' >> rawmsg: '<171>Caused by: java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorde >> rid=0, refOrderId=0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/AP >> P/ME-utf8/content/import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-O >> K][CONTENT-320-OK]nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnul >> lnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnull >> nullnullnullnullnull[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume >> =NULL, track=0, itm_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemid >> provider=NULL, itm_viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEva >> luation=0, numEvaluations=0, licenseprovider_id=2131264, importSt' >> ###################################################################################### >> >> But in above example: >> Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'. >> >> I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided, >> then HOSTNAME will be set correctly? > > that's probably the case with current code, but I don't guarantee that > will stay. Again: invalid format => unpredictable results on all header > fields OK, now I see the malformed format messages will cause unpredictable results in rsyslog. That's quite helpful. >> >> And I thought I could use 'FROMHOST' property, but I have another scenario. 
>> >> ###################################################################################### >> Debug line with all properties: >> FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, >> syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' >> TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', >> msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. >> HTTP/1.1" 200 87#012' >> rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. >> HTTP/1.1" 200 87#012' >> ###################################################################################### >> > that's a correctly formatted message > >> You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. >> And I do have reverse zone for that ip in dns setting. Any ideas? > > To get the name, you indeed need to enable remote lookups. One solution > would be to permit different settings for different remote hosts, but > that would be a feature request. Would make sense, but I am currently > rather busy. 
If you add it to the bugzilla http://bugzilla.adiscon.com > I'll see that I implement it when nothing of higher priority is in front > of it. I've filed a bugzilla report [1] for your information. Anyway, one more question, if I use rsyslog at the client side, will it avoid malformed/invalid format message sending out? [1]: http://bugzilla.adiscon.com/show_bug.cgi?id=116 Thanks a lot for your help, Patrick From rgerhards at hq.adiscon.com Mon Jan 19 18:16:08 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 19 Jan 2009 18:16:08 +0100 Subject: [rsyslog] A weird issue In-Reply-To: <49753F70.5050601@net-m.de> References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com> <49748C66.7070102@net-m.de> <1232372741.2536.15.camel@rf10up.intern.adiscon.com> <49753F70.5050601@net-m.de> Message-ID: <1232385368.2536.22.camel@rf10up.intern.adiscon.com> On Tue, 2009-01-20 at 11:05 +0800, Patrick Shen wrote: > >> But in above example: > >> Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'. > >> > >> I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided, > >> then HOSTNAME will be set correctly? > > > > that's probably the case with current code, but I don't guarantee that > > will stay. Again: invalid format => unpredictable results on all header > > fields > > OK, now I see the malformed format messages will cause unpredictable results in rsyslog. > That's quite helpful. > > >> > >> And I thought I could use 'FROMHOST' property, but I have another scenario. 
> >> > >> ###################################################################################### > >> Debug line with all properties: > >> FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, > >> syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > >> TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', > >> msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > >> HTTP/1.1" 200 87#012' > >> rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > >> HTTP/1.1" 200 87#012' > >> ###################################################################################### > >> > > that's a correctly formatted message > > > >> You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. > >> And I do have reverse zone for that ip in dns setting. Any ideas? > > > > To get the name, you indeed need to enable remote lookups. One solution > > would be to permit different settings for different remote hosts, but > > that would be a feature request. 
Would make sense, but I am currently > > rather busy. If you add it to the bugzilla http://bugzilla.adiscon.com > > I'll see that I implement it when nothing of higher priority is in front > > of it. > > I've filed a bugzilla report [1] for your information. Anyway, one more question, if I use rsyslog at > the client side, will it avoid malformed/invalid format message sending out? I have tweaked the feature request a bit so that it matches the actual request ;) As far as rsyslog on the client side is concerned, you need to do nothing. If you use the default templates, it emits correctly formatted messages. Rainer From rgerhards at hq.adiscon.com Tue Jan 20 14:00:00 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 20 Jan 2009 14:00:00 +0100 Subject: [rsyslog] Anyone in Computer Forensics? Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> Hi all, are there some folks on this list who are working in the computer forensics space? I wonder how syslog, and rsyslog in specific, works in forensics. Most importantly, I am interested in what stops acceptance in the forensics field (or what nurtures it). I am interested in feedback to help shape the medium to long term schedule for rsyslog (including those initiatives that I should learn more about). Any feedback is appreciated. Thanks, Rainer From rgerhards at hq.adiscon.com Tue Jan 20 15:27:57 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 20 Jan 2009 15:27:57 +0100 Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: References: Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9F4@grfint2.intern.adiscon.com> FYI: Based on a forum thread, I just created this page: http://wiki.rsyslog.com/index.php/Reducing_memory_usage I think it actually describes the source of the 8MB memory blocks. 
Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Michael Biebl > Sent: Saturday, January 17, 2009 11:11 AM > To: rsyslog-users > Subject: [rsyslog] Is rsyslog leaking memory? > > Hi, > > I'm running rsyslog 3.20.2 > > I noticed the following: > # /etc/init.d/rsyslog restart > VSZ RSS (as reported by ps) > 27100 1184 > # logger foo > 27100 1196 > # logger foo (1000x) > 27100 1200 > # logger foo (1000x) > 27100 1204 > # logger foo (1000x) > 27100 1208 > > and so on. > > > This made me wonder, if rsyslog is leaking memory somewhere. > > I also noticed, that for each loaded module, rsyslog resevers exactly > 8 Mb of anoymous memory (pmap -d `pgrep rsyslog`) > With a couple of loaded modules you easily get over 50Mb VSZ. > > > Michael > -- > Why is it that all of the instruments seeking intelligent life in the > universe are pointed away from Earth? > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From aoz.syn at gmail.com Tue Jan 20 16:39:34 2009 From: aoz.syn at gmail.com (RB) Date: Tue, 20 Jan 2009 08:39:34 -0700 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> Message-ID: <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> On Tue, Jan 20, 2009 at 06:00, Rainer Gerhards wrote: > are there some folks on this list who are working in the computer > forensics space? I wonder how syslog, and rsyslog in specific, works in > forensics. Could you clarify what you're asking here? There are two clearly delineated portions of the computer forensics space: that which is analyzed and that which performs the analysis. Are you looking more to improve analysis of rsyslog instances or to integrate into back-end tools? 
> Most importantly, I am interested in what stops acceptance in > the forensics field (or what nurtures it). I am interested in feedback > to help shape the medium to long term schedule for rsyslog (including > those initiatives that I should learn more about). Law Enforcement. LE is by far the biggest driver in industry acceptance, nearly regardless of technology. The "primary" forensics tool, EnCase, is a perfect example: there are many arguably better products on the market, but because huge numbers of extremely non-technical police officers are comfortable with it (since Guidance gives steep LE discounts), it is by far the biggest player. There isn't a huge amount of logging to be done in the analysis space. Although centralized solutions are becoming more prevalent, most of the critical logs are being (or will be) stored with the encrypted/signed forensic data for non-repudiation. Even so, there is more effort going into improving analysis (carvers, documenting formats, etc.) than building up proper logging and storage. From david at lang.hm Tue Jan 20 20:54:13 2009 From: david at lang.hm (david at lang.hm) Date: Tue, 20 Jan 2009 11:54:13 -0800 (PST) Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> Message-ID: On Tue, 20 Jan 2009, RB wrote: > On Tue, Jan 20, 2009 at 06:00, Rainer Gerhards wrote: >> are there some folks on this list who are working in the computer >> forensics space? I wonder how syslog, and rsyslog in specific, works in >> forensics. > > Could you clarify what you're asking here? There are two clearly > delineated portions of the computer forensics space: that which is > analyzed and that which performs the analysis. Are you looking more > to improve analysis of rsyslog instances or to integrate into back-end > tools? 
> >> Most importantly, I am interested in what stops acceptance in >> the forensics field (or what nurtures it). I am interested in feedback >> to help shape the medium to long term schedule for rsyslog (including >> those initiatives that I should learn more about). I think that what he is asking about is what makes logs acceptable or not acceptable when doing forensics, and what configurations of rsyslog would be acceptable. for example, rsyslog can be configured to use disk-based queues on redundant drives and RELP for network communication, and the result will be that rsyslog is _very_ reliable in terms of preserving messages that get to it (at the cost of performance, but you can throw hardware at it to deal with that) this is probably acceptable as a log for forensics type work. but what about the more normal settings? (tcp or udp network communications with memory-based queues). those settings can loose data, but won't under normal conditions (assuming the network isn't so busy that it drops UDP packets) David Lang From jules at visionintel.com Tue Jan 20 20:14:58 2009 From: jules at visionintel.com (Jules Pagna Disso) Date: Tue, 20 Jan 2009 19:14:58 +0000 Subject: [rsyslog] client In-Reply-To: <497492AB.5030901@net-m.de> References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> <497492AB.5030901@net-m.de> Message-ID: <69544300901201114r7671a2f9t3b8b8f27797b1cd7@mail.gmail.com> hi there, thanks for the answer it helped and does what I wanted. Now, I am wonder if there is a sample code how to send log file from a c/c++ code to syslog deamon. thanks Jules 2009/1/19 Patrick Shen > Jules Pagna Disso wrote: > > hi there, > > > > Is there an example of client sending alert to syslog? > > > > is it possible to create and send an alert from the command prompt to > > syslog? > > > > thanks, > > Jules > > Do you mean 'logger' ? > > Try 'man logger'. 
> > Best regards, > Patrick > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From danson at rackspace.com Tue Jan 20 20:27:45 2009 From: danson at rackspace.com (Daniel Anson) Date: Tue, 20 Jan 2009 13:27:45 -0600 Subject: [rsyslog] client In-Reply-To: <69544300901201114r7671a2f9t3b8b8f27797b1cd7@mail.gmail.com> References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com><497492AB.5030901@net-m.de> <69544300901201114r7671a2f9t3b8b8f27797b1cd7@mail.gmail.com> Message-ID: <3435_1232479837_n0KJUY8N004510_96AF20FDF4301D419B33CCE8E3A0132B0ACED7E8@SAT4MX07.RACKSPACE.CORP> I use this:

>gcc -o syslog_writer syslog_writer.c
>./syslog_writer 300   <-- This is the number of messages it will write

#include <stdio.h>
#include <stdlib.h>
#include <syslog.h>

int main(int argc, char **argv)
{
    int num_syslogs, i;

    /* the message count is required as the first argument */
    if (argc < 2) {
        fprintf(stderr, "usage: %s <number of messages>\n", argv[0]);
        return 1;
    }
    num_syslogs = atoi(argv[1]);

    /* log with tag "syslog_writer" and the PID, on facility local1 */
    openlog("syslog_writer", LOG_CONS | LOG_PID, LOG_LOCAL1);
    for (i = 0; i < num_syslogs; i++) {
        syslog(LOG_NOTICE, "syslog_writer: log number %d", i);
    }
    closelog();
    return 0; /* report success to the shell */
}

-----Original Message----- From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Jules Pagna Disso Sent: Tuesday, January 20, 2009 1:15 PM To: rsyslog-users Subject: Re: [rsyslog] client hi there, thanks for the answer it helped and does what I wanted. Now, I am wondering if there is sample code showing how to send a log file from C/C++ code to the syslog daemon. thanks Jules 2009/1/19 Patrick Shen > Jules Pagna Disso wrote: > > hi there, > > > > Is there an example of client sending alert to syslog? > > > > is it possible to create and send an alert from the command prompt to > > syslog? > > > > thanks, > > Jules > > Do you mean 'logger' ? > > Try 'man logger'.
> > Best regards, > Patrick > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated. From aoz.syn at gmail.com Wed Jan 21 18:59:42 2009 From: aoz.syn at gmail.com (RB) Date: Wed, 21 Jan 2009 10:59:42 -0700 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> Message-ID: <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> On Tue, Jan 20, 2009 at 12:54, wrote: > I think that what he is asking about is what makes logs acceptable or not > acceptable when doing forensics, and what configurations of rsyslog would > be acceptable. That's still unclear as to whether the logging instances are being analyzed or they are part of the analysis process (i.e. logging investigator actions, "interesting" items, etc.). 
> for example, rsyslog can be configured to use disk-based queues on > redundant drives and RELP for network communication, and the result will > be that rsyslog is _very_ reliable in terms of preserving messages that > get to it (at the cost of performance, but you can throw hardware at it to > deal with that) > > this is probably acceptable as a log for forensics type work. > > but what about the more normal settings? (tcp or udp network > communications with memory-based queues). those settings can loose data, > but won't under normal conditions (assuming the network isn't so busy that > it drops UDP packets) Generally speaking, forensics prefers the "save everything, impossible to lose" approach. A single lost message probably won't break a given case, but the possibility is definitely there. RELP with disk queues on hardware-redundant drives would probably be a good start if you're looking to ease future analysis, but it is my opinion that networked logging of the forensic process is both unlikely and overkill, as most analysis processes want their logs integrated instead of held as a separate source. One item I have had on my wish-list for quite some time is the ability to log directly to a UDF VAT filesystem (incremental writes on write-once optical media). Poor man's WORM, if you will. It would enable physical assurance that log data is unmodified up to the point of compromise. Add in the idea of incremental checksums or signing, and you have an extremely controlled, verifiable log source. Of course, it doesn't have to be solved in rsyslog-space, but it'd definitely be useful. RB From david at lang.hm Wed Jan 21 20:55:25 2009 From: david at lang.hm (david at lang.hm) Date: Wed, 21 Jan 2009 11:55:25 -0800 (PST) Subject: [rsyslog] Anyone in Computer Forensics? 
In-Reply-To: <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> Message-ID: On Wed, 21 Jan 2009, RB wrote: > On Tue, Jan 20, 2009 at 12:54, wrote: >> I think that what he is asking about is what makes logs acceptable or not >> acceptable when doing forensics, and what configurations of rsyslog would >> be acceptable. > > That's still unclear as to whether the logging instances are being > analyzed or they are part of the analysis process (i.e. logging > investigator actions, "interesting" items, etc.). I think it's the logs being analysed, not logging investigator actions (other than the extent that things the investigators do would be logged if anyone did them) >> for example, rsyslog can be configured to use disk-based queues on >> redundant drives and RELP for network communication, and the result will >> be that rsyslog is _very_ reliable in terms of preserving messages that >> get to it (at the cost of performance, but you can throw hardware at it to >> deal with that) >> >> this is probably acceptable as a log for forensics type work. >> >> but what about the more normal settings? (tcp or udp network >> communications with memory-based queues). those settings can loose data, >> but won't under normal conditions (assuming the network isn't so busy that >> it drops UDP packets) > > Generally speaking, forensics prefers the "save everything, impossible > to lose" approach. A single lost message probably won't break a given > case, but the possibility is definitely there. 
this is the most paranoid/conservative view, and by this definition there are basically no logs in existence that meet the forensics requirements > RELP with disk queues > on hardware-redundant drives would probably be a good start if you're > looking to ease future analysis, but it is my opinion that networked > logging of the forensic process is both unlikely and overkill, as most > analysis processes want their logs integrated instead of held as a > separate source. > > One item I have had on my wish-list for quite some time is the ability > to log directly to a UDF VAT filesystem (incremental writes on > write-once optical media). Poor man's WORM, if you will. It would > enable physical assurance that log data is unmodified up to the point > of compromise. Add in the idea of incremental checksums or signing, > and you have an extremely controlled, verifiable log source. Of > course, it doesn't have to be solved in rsyslog-space, but it'd > definitely be useful. frankly, if you really need write-only media, the best thing to do (volume permitting) is to dump to a printer. David Lang From aoz.syn at gmail.com Wed Jan 21 21:59:28 2009 From: aoz.syn at gmail.com (RB) Date: Wed, 21 Jan 2009 13:59:28 -0700 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> Message-ID: <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> On Wed, Jan 21, 2009 at 12:55, wrote: > this is the most paranoid/conservative view, and by this definition there > are basically no logs in existence that meet the forensics requirements Rather than set an unattainable standard, my intent was to communicate the conservative approach forensics would rather take. Edge cases and mitigating controls are acceptable as long as they are well-documented - that's basic security practice.
I would rather see a solution that has 100 well-documented lossy edge cases than one that claims to be lossless with no proofs to back it. > frankly, if you really need write-only media, the best thing to do (volume > permitting) is to dump to a printer. You may want to recalculate; even 6-point font on large (14.875x11.5") tractor-feed paper only fits ~80MB per 3500-sheet box. Or, put another way, 2 512-byte events per second will burn through a $70 case per day. Or 6.5 reams of US Letter per day. Extremely limited volume. From david at lang.hm Wed Jan 21 23:19:01 2009 From: david at lang.hm (david at lang.hm) Date: Wed, 21 Jan 2009 14:19:01 -0800 (PST) Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> Message-ID: On Wed, 21 Jan 2009, RB wrote: > On Wed, Jan 21, 2009 at 12:55, wrote: >> this is the most paranoid/conservative view, and by this definition there >> are basicly no logs in existance that meet the forensics requirements > > Rather than set an unattainable standard, my intent was to communicate > the conservative approach forensics would rather take. Edge cases and > mitigating controls are acceptable as long as they are well-documented > - that's basic security practice. I would rather see a solution that > has 100 well-documented lossy edge cases than one that claims to be > lossless with no proofs to back it. the problem is that so many forensics people list the perfect situation and tell people that anything less won't stand up in court. 
like everything else, it's a reliability/performance/cost trade-off but we really aren't answering the initial question here (or rather we are demonstrating that there isn't a clear answer to the question) >> frankly, if you really need write-only media, the best thing to do (volume >> permitting) is to dump to a printer. > > You may want to recalculate; even 6-point font on large (14.875x11.5") > tractor-feed paper only fits ~80MB per 3500-sheet box. Or, put > another way, 2 512-byte events per second will burn through a $70 case > per day. Or 6.5 reams of US Letter per day. Extremely limited > volume. that's why I said volume permitting (and for your most critical logs the volume is probably fairly low) David Lang From rgerhards at hq.adiscon.com Wed Jan 21 22:21:08 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Wed, 21 Jan 2009 22:21:08 +0100 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com><4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com><4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA0B@grfint2.intern.adiscon.com> Hi all, Sorry for posting the question and then being offline. I had a meeting and was after that a bit more swamped than I expected ;) Thanks for the good answers so far. My question was vague, but that reflected that I actually do not exactly know what to ask for. While I took a look at forensics every now and then, this is not an area where I really have any deep expertise. However, I should have stated that I am primarily interested in the event detection/gathering, transmission and storage part of the picture. That's where rsyslog can play a role (that limits the "event detection" process to listening to whoever wants to talk to it).
The analysis part is beyond my scope right now (and probably will be for quite some time). As I said, I do not have an immediate need, but would like to understand the needs a bit better (and you have already provided good advice so far :)). The root cause of my question is that I would like to refine my medium, maybe long-term, vision. While I think I cannot implement any of the outcome, it helps me tune the implementation of things I do in a way that facilitates forensic needs (at least in cases where I have a choice). Without that information, I would probably do things in ways that will require much more effort once I get to "forensics-readiness". I hope this clarifies and sorry for not replying sooner. I will probably be a bit swamped 'til the end of the week, but will try to be more responsive now :) Thanks again for all that fine information, please keep it flowing. It is very useful. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com > [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of RB > Sent: Wednesday, January 21, 2009 9:59 PM > To: rsyslog-users > Subject: Re: [rsyslog] Anyone in Computer Forensics? > > On Wed, Jan 21, 2009 at 12:55, wrote: > > this is the most paranoid/conservative view, and by this > definition there > > are basically no logs in existence that meet the forensics > requirements > > Rather than set an unattainable standard, my intent was to communicate > the conservative approach forensics would rather take. Edge cases and > mitigating controls are acceptable as long as they are well-documented > - that's basic security practice. I would rather see a solution that > has 100 well-documented lossy edge cases than one that claims to be > lossless with no proofs to back it. > > > frankly, if you really need write-only media, the best > thing to do (volume > > permitting) is to dump to a printer.
> > You may want to recalculate; even 6-point font on large (14.875x11.5") > tractor-feed paper only fits ~80MB per 3500-sheet box. Or, put > another way, 2 512-byte events per second will burn through a $70 case > per day. Or 6.5 reams of US Letter per day. Extremely limited > volume. > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From milton at calnek.com Thu Jan 22 02:24:48 2009 From: milton at calnek.com (Milton Calnek) Date: Wed, 21 Jan 2009 19:24:48 -0600 Subject: [rsyslog] Multiple devices with same ip address. Message-ID: <4977CAE0.1040403@calnek.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, I'm running a test lab with gear where every piece of gear under test has the same ip address. I have separated them via vlans, but I want to be able to send syslog from these devices to a central host... but with everything having the same ip address, there doesn't seem to be a way easily separate the logs. I see how to log based on ip, but not MAC nor interface. Before I invest in the development time, I was wondering if you folks have any suggestions? Thanks. - -- Milton Calnek BSc, A/Slt(Ret.) milton at calnek.com 306-717-8737 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org iD8DBQFJd8rgHgnbf2T2QqMRArhdAKCCisNIrs+ohNoq2AUiaaiZJdT6SwCfSS3u 4r5JOPJn6SBPWlzMXUBjfQE= =eVoR -----END PGP SIGNATURE----- -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From rsyslog at lists.bod.org Thu Jan 22 03:31:39 2009 From: rsyslog at lists.bod.org (Paul Chambers) Date: Wed, 21 Jan 2009 18:31:39 -0800 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977CAE0.1040403@calnek.com> References: <4977CAE0.1040403@calnek.com> Message-ID: <4977DA8B.3010309@lists.bod.org> Couldn't you use NAT on the vlan interfaces? 
that way traffic on each interface could be mapped to a different IP address as seen by the logging machine. -- Paul Milton Calnek wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi, > > I'm running a test lab with gear where every piece of gear > under test has the same ip address. > > I have separated them via vlans, but I want to be able to send syslog > from these devices to a central host... but with everything having the > same ip address, there doesn't seem to be a way easily separate the logs. > I see how to log based on ip, but not MAC nor interface. > > Before I invest in the development time, I was wondering if you folks > have any suggestions? > > Thanks. > - -- > Milton Calnek BSc, A/Slt(Ret.) > milton at calnek.com > 306-717-8737 > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.5 (GNU/Linux) > Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org > > iD8DBQFJd8rgHgnbf2T2QqMRArhdAKCCisNIrs+ohNoq2AUiaaiZJdT6SwCfSS3u > 4r5JOPJn6SBPWlzMXUBjfQE= > =eVoR > -----END PGP SIGNATURE----- > > From milton at calnek.com Thu Jan 22 04:26:25 2009 From: milton at calnek.com (Milton Calnek) Date: Wed, 21 Jan 2009 21:26:25 -0600 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977DA8B.3010309@lists.bod.org> References: <4977CAE0.1040403@calnek.com> <4977DA8B.3010309@lists.bod.org> Message-ID: <4977E761.7070903@calnek.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Paul Chambers wrote: > Couldn't you use NAT on the vlan interfaces? that way traffic on each > interface could be mapped to a different IP address as seen by the > logging machine. I tried that. It didn't work for me. I don't remember the details just now, but it had something to do with the order things happen on the linux IP stack. If you can suggest a set of commands, I'll try it out. Thanks. - -- Milton Calnek BSc, A/Slt(Ret.) 
milton at calnek.com 306-717-8737 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org iD8DBQFJd+dhHgnbf2T2QqMRArc9AKCf1tk2gW5XGOM4cCNevVj8QKwV5gCdHKAT 8OETLsF4Csv6d4/gFVlLtjU= =23Dv -----END PGP SIGNATURE----- -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From rsyslog at lists.bod.org Thu Jan 22 05:19:07 2009 From: rsyslog at lists.bod.org (Paul Chambers) Date: Wed, 21 Jan 2009 20:19:07 -0800 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977E761.7070903@calnek.com> References: <4977CAE0.1040403@calnek.com> <4977DA8B.3010309@lists.bod.org> <4977E761.7070903@calnek.com> Message-ID: <4977F3BB.6080205@lists.bod.org> Hard to give you specifics without a lot more information (and time's scarce, sorry). Something that helped me understand how netfilter handles packets, and the order the various tables/chains happen, is the documentation for ebtables, specifically: http://ebtables.sourceforge.net/br_fw_ia/br_fw_ia.html I'd be amazed if it's not possible to masquerade/source-NAT each vlan interface to a unique IP addresses. Between netfilter and ebtables, there's an enormous amount of flexibility. -- Paul Milton Calnek wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > > > Paul Chambers wrote: > >> Couldn't you use NAT on the vlan interfaces? that way traffic on each >> interface could be mapped to a different IP address as seen by the >> logging machine. >> > > I tried that. It didn't work for me. I don't remember the details just now, > but it had something to do with the order things happen on the linux IP stack. > > If you can suggest a set of commands, I'll try it out. > > Thanks. > - -- > Milton Calnek BSc, A/Slt(Ret.) 
> milton at calnek.com > 306-717-8737 > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.5 (GNU/Linux) > Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org > > iD8DBQFJd+dhHgnbf2T2QqMRArc9AKCf1tk2gW5XGOM4cCNevVj8QKwV5gCdHKAT > 8OETLsF4Csv6d4/gFVlLtjU= > =23Dv > -----END PGP SIGNATURE----- > > From david at lang.hm Thu Jan 22 07:48:49 2009 From: david at lang.hm (david at lang.hm) Date: Wed, 21 Jan 2009 22:48:49 -0800 (PST) Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977CAE0.1040403@calnek.com> References: <4977CAE0.1040403@calnek.com> Message-ID: On Wed, 21 Jan 2009, Milton Calnek wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi, > > I'm running a test lab with gear where every piece of gear > under test has the same ip address. > > I have separated them via vlans, but I want to be able to send syslog > from these devices to a central host... but with everything having the > same ip address, there doesn't seem to be a way easily separate the logs. > I see how to log based on ip, but not MAC nor interface. > > Before I invest in the development time, I was wondering if you folks > have any suggestions? if you are running rsyslog on the systems under test, try changing the template that rsyslog uses to send the messages out so that each system puts something unique in its logs. David Lang From rgerhards at hq.adiscon.com Thu Jan 22 08:46:48 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 22 Jan 2009 08:46:48 +0100 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: References: <4977CAE0.1040403@calnek.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA0D@grfint2.intern.adiscon.com> David is right, this is probably the best way to do it. Even if the senders in question are not powered by rsyslog, it is most often possible to put something unique into the messages.
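As a minimal sketch of what "something unique" could look like on a sender that logs through the standard C syslog API (this is an illustration, not code from the thread; the `dut-vlan` prefix and the function names `make_device_tag` and `log_from_device` are hypothetical, and it assumes each device knows its own vlan number):

```c
#include <stdio.h>
#include <syslog.h>

/* openlog() may keep the ident pointer instead of copying the string,
 * so the tag is kept in static storage. */
static char device_tag[64];

/* Build a per-device tag such as "dut-vlan12".
 * The "dut-vlan" naming scheme is a made-up example.
 * Returns 0 on success, -1 if the tag does not fit into buf. */
int make_device_tag(char *buf, size_t len, int vlan_id)
{
    int n = snprintf(buf, len, "dut-vlan%d", vlan_id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Send one message whose syslog tag identifies the device. */
int log_from_device(int vlan_id, const char *text)
{
    if (make_device_tag(device_tag, sizeof(device_tag), vlan_id) != 0)
        return -1;
    openlog(device_tag, LOG_PID, LOG_USER);
    syslog(LOG_NOTICE, "%s", text);
    closelog();
    return 0;
}
```

A device on vlan 12 would call `log_from_device(12, "link up")`, and the receiver could then sort on the tag rather than on the (identical) source address.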
If there are few devices (<= 8), you can also use the local syslog facilities to identify the instances (almost all senders allow to configure that). In any case, you can then use the unique identifier to sort out messages to different bins on the receiver. HTH Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > Sent: Thursday, January 22, 2009 7:49 AM > To: rsyslog-users > Subject: Re: [rsyslog] Multiple devices with same ip address. > > On Wed, 21 Jan 2009, Milton Calnek wrote: > > > -----BEGIN PGP SIGNED MESSAGE----- > > Hash: SHA1 > > > > Hi, > > > > I'm running a test lab with gear where every piece of gear > > under test has the same ip address. > > > > I have separated them via vlans, but I want to be able to send syslog > > from these devices to a central host... but with everything having > the > > same ip address, there doesn't seem to be a way easily separate the > logs. > > I see how to log based on ip, but not MAC nor interface. > > > > Before I invest in the development time, I was wondering if you folks > > have any suggestions? > > if you are running rsyslog on the systems under test, try changing the > template that rsyslog uses to sent the messages out from > each system puts something unique in it's logs. 
> > David Lang > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Thu Jan 22 16:58:24 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 22 Jan 2009 16:58:24 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> Hi folks, just an update on this matter. Lorenzo needed to change his system setup after some problems. We are in contact and expect to conduct further testing soon (hopefully the bug will reappear). Even better news is that I have been able to reproduce the bug 4 times in my lab today. It's not as easy as I would hope, but at least I can get results with some patience. I am also experimenting a bit with Twitter and actually found it useful to keep track of the troubleshooting process. Those of your interested can follow it at http://twitter.com/rgerhards I don't promise (yet) to keep it current at all times, but I will use it during the troubleshooting effort. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 6:29 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> Ah, ok. Side-note: I got my machine up and it is running some test. 
> RG> Unfortunately no aborts so far, but is has only 4 cores... I hope > RG> something turns out... > RG> > > I think the real problem is in keeping those cores very busy... I'd try > to > spawn something like 20 loggers each spawning a couple "workers" per > second and logging startup/shutdown of any child. Maybe make each > worker > sleep for a random time before exiting. > > I don't have any Fedora/RedHat system; if nothing else, I'd suggest > doing > your tests on a debian/testing system too. > > Yours, > > lorenzo > > PS still running... > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From lorenzo at sancho.ccd.uniroma2.it Thu Jan 22 17:19:15 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Thu, 22 Jan 2009 17:19:15 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> Message-ID: On Thu, 22 Jan 2009, Rainer Gerhards wrote: RG> Hi folks, RG> RG> just an update on this matter. Lorenzo needed to change his system RG> setup after some problems. We are in contact and expect to conduct RG> further testing soon (hopefully the bug will reappear). RG> Some administration chores the last couple of days; almost finished, big hopes for the week-end!!! RG> RG> Even better news is that I have been able to reproduce the bug 4 times RG> in my lab today. It's not as easy as I would hope, but at least I can RG> get results with some patience. 
I am also experimenting a bit with RG> Twitter and actually found it useful to keep track of the RG> troubleshooting process. Those of you interested can follow it at RG> This is really great news! Really, since rsyslog has been running this well for a long time on "normal" systems, and I've been (almost) alone in experiencing the crashes, the critters must have been hiding very well! See you soon, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Thu Jan 22 18:53:44 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 22 Jan 2009 18:53:44 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com> OK, an update, full history at http://twitter.com/rgerhards It looks like there is some trouble with GCC atomic operation support. Has anyone seen this race on a non-Debian platform? I am asking because that may narrow down (or not ;)) the issue. Of course, I am not sure if atomic operations are really the root cause. However, replacing them is not very practical in some places and definitely time-consuming. So I'd like to have some feedback before I take that route. Does anyone know if there is a problem with atomic operation support in Debian (no bashing, honest question ;))? Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M.
Catucci > Sent: Thursday, January 22, 2009 5:19 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Thu, 22 Jan 2009, Rainer Gerhards wrote: > > RG> Hi folks, > RG> > RG> just an update on this matter. Lorenzo needed to change his system > RG> setup after some problems. We are in contact and expect to conduct > RG> further testing soon (hopefully the bug will reappear). > RG> > > Some administration chores the last couple of days; almost finished, > big hopes for the week-end!!! > > RG> > RG> Even better news is that I have been able to reproduce the bug 4 > times > RG> in my lab today. It's not as easy as I would hope, but at least I > can > RG> get results with some patience. I am also experimenting a bit with > RG> Twitter and actually found it useful to keep track of the > RG> troubleshooting process. Those of your interested can follow it at > RG> > > This is really great news! Really, since rsyslog is been running this > well > since a long time on "normal" systems, and I've been (almost) alone in > experiencing the crashes, the critters should have been hiding very > well! > > See you soon, > > lorenzo > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. 
+39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From mbiebl at gmail.com Thu Jan 22 19:46:30 2009 From: mbiebl at gmail.com (Michael Biebl) Date: Thu, 22 Jan 2009 19:46:30 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com> Message-ID: 2009/1/22 Rainer Gerhards : > OK, an update, full history at http://twitter.com/rgerhards > > It looks like there is some trouble with GCC atomic operation support. Has anyone seen this race on a non-Debian platform? I am asking because that may narrow down (or not ;)) the issue. Of course, I am not sure if atomic operations are really the root cause. However, replacing them is not very practical at some places and definitely time-consuming. So I'd like to have some feedback before I take that route. > > Does anyone know if there is a problem with atomic operation support in Debian (no bashing, honest question ;))? This would be a compiler (GCC) problem then, right? I'm not aware of any such problem. FWIW Debian is using GCC 4.3 in lenny/sid I've checked the bugs reported against the Debian gcc package [1] and the Debian specific patches on top of gcc [2], but I didn't find anything obvious. Rainer, if you have a more specific question, I could forward that question to the Debian GCC maintainers. Cheers, Michael [1] http://bugs.debian.org/cgi-bin/pkgreport.cgi?src=gcc-4.3&repeatmerged=no [2] http://patch-tracking.debian.net/package/gcc-4.3/4.3.2-1.1 -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? 
From rgerhards at hq.adiscon.com Thu Jan 22 21:18:19 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 22 Jan 2009 21:18:19 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA1B@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com > [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Michael Biebl > Sent: Thursday, January 22, 2009 7:47 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > 2009/1/22 Rainer Gerhards : > > OK, an update, full history at http://twitter.com/rgerhards > > > > It looks like there is some trouble with GCC atomic > operation support. Has anyone seen this race on a non-Debian > platform? I am asking because that may narrow down (or not > ;)) the issue. Of course, I am not sure if atomic operations > are really the root cause. However, replacing them is not > very practical at some places and definitely time-consuming. > So I'd like to have some feedback before I take that route. > > > > Does anyone know if there is a problem with atomic > operation support in Debian (no bashing, honest question ;))? > > This would be a compiler (GCC) problem then, right? Excatly > > I'm not aware of any such problem. FWIW Debian is using GCC > 4.3 in lenny/sid > I've checked the bugs reported against the Debian gcc package [1] and > the Debian specific patches on top of gcc [2], > but I didn't find anything obvious. > > Rainer, if you have a more specific question, I could forward that > question to the Debian GCC maintainers. Thanks, Michael. But I think before we ask other's for their time, I'll try to do my homework. So far, I am just guessing. As I now seem to be able to repro the problem, I can look further into it. 
Tomorrow, I'll first check what it takes to replace the atomic operations by mutex calls. I think that's quite some work, but hopefully I am wrong. Thanks to the info you provided, this seems to be useful work. I keep you posted. Rainer > > Cheers, > Michael > > [1] > http://bugs.debian.org/cgi-bin/pkgreport.cgi?src=gcc-4.3&repea > tmerged=no > [2] http://patch-tracking.debian.net/package/gcc-4.3/4.3.2-1.1 > > -- > Why is it that all of the instruments seeking intelligent life in the > universe are pointed away from Earth? > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From danson at rackspace.com Mon Jan 26 21:51:13 2009 From: danson at rackspace.com (Daniel Anson) Date: Mon, 26 Jan 2009 14:51:13 -0600 Subject: [rsyslog] UNIX timestamp Message-ID: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP> Is there a convention in rsyslog whereby I can get a UNIX timestamp instead of the other RFC time standards? Daniel M. Anson Linux Systems Engineer Rackspace danson at rackspace.com Office: (210)312-5114 Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated. 
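For readers who need epoch seconds downstream of rsyslog, the conversion from an RFC 3339 style UTC timestamp can be done outside the daemon. The following is only a minimal sketch under stated assumptions: it is not an rsyslog feature, the helper names `days_from_civil` and `rfc3339_utc_to_epoch` are hypothetical, and it accepts only the `YYYY-MM-DDTHH:MM:SSZ` form (no fractional seconds, no numeric UTC offsets).

```c
#include <assert.h>
#include <stdio.h>

/* Days between 1970-01-01 and the given proleptic Gregorian date.
 * Hypothetical helper; the year is shifted to start in March so that
 * the leap day falls at the end of the (shifted) year. */
static long long days_from_civil(int y, int m, int d)
{
    long long era, yoe, doy, doe;

    y -= (m <= 2);
    era = (y >= 0 ? y : y - 399) / 400;
    yoe = y - era * 400;                                  /* 0..399 */
    doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1; /* 0..365 */
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;          /* 0..146096 */
    return era * 146097 + doe - 719468;
}

/* Convert "YYYY-MM-DDTHH:MM:SSZ" to seconds since the UNIX epoch.
 * Returns -1 if the string does not match that exact shape
 * (assumption: timestamps before 1970 are not of interest here). */
static long long rfc3339_utc_to_epoch(const char *ts)
{
    int y, mo, d, h, mi, s, n = 0;

    assert(ts != NULL);
    /* %n records how many characters were consumed, so trailing
     * garbage or a missing 'Z' can be rejected. */
    if (sscanf(ts, "%4d-%2d-%2dT%2d:%2d:%2dZ%n",
               &y, &mo, &d, &h, &mi, &s, &n) != 6
        || n == 0 || ts[n] != '\0')
        return -1;
    if (mo < 1 || mo > 12 || d < 1 || d > 31 ||
        h < 0 || h > 23 || mi < 0 || mi > 59 || s < 0 || s > 60)
        return -1;
    return days_from_civil(y, mo, d) * 86400LL + h * 3600 + mi * 60 + s;
}
```

The date arithmetic is done by hand rather than with `strptime()` and `timegm()` so that the sketch depends only on standard C; on systems where those functions are available they would be the more natural choice.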
From hks.private at gmail.com Mon Jan 26 22:10:06 2009 From: hks.private at gmail.com ((private) HKS) Date: Mon, 26 Jan 2009 16:10:06 -0500 Subject: [rsyslog] UNIX timestamp In-Reply-To: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP> References: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP> Message-ID: On Mon, Jan 26, 2009 at 3:51 PM, Daniel Anson wrote: > Is there a convention in rsyslog whereby I can get a UNIX timestamp > instead of the other RFC time standards? > > > > Daniel M. Anson > Linux Systems Engineer > Rackspace > danson at rackspace.com > Office: (210)312-5114 Unfortunately, no. You can find a serious discussion about it at http://kb.monitorware.com/post14653.html, but in a word, it's complicated. -HKS > > > > > Confidentiality Notice: This e-mail message (including any attached or > embedded documents) is intended for the exclusive and confidential use of the > individual or entity to which this message is addressed, and unless otherwise > expressly indicated, is confidential and privileged information of Rackspace. > Any dissemination, distribution or copying of the enclosed material is prohibited. > If you receive this transmission in error, please notify us immediately by e-mail > at abuse at rackspace.com, and delete the original message. > Your cooperation is appreciated. 
> > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From danson at rackspace.com Mon Jan 26 22:16:18 2009 From: danson at rackspace.com (Daniel Anson) Date: Mon, 26 Jan 2009 15:16:18 -0600 Subject: [rsyslog] UNIX timestamp In-Reply-To: References: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP> Message-ID: <15897_1233004899_n0QLLcFR018661_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFEB@SAT4MX07.RACKSPACE.CORP> I figured as much but I thought I would ask. In essence, writing a UNIX timestamp would go against the RFC standard especially if an rsyslog server were set up as a relay. I am using MySQL UNIX_TIMESTAMP() function to get what I need but thought this may be available locally in rsyslog. Thx for the reply, Daniel -----Original Message----- From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of (private) HKS Sent: Monday, January 26, 2009 3:10 PM To: rsyslog-users Subject: Re: [rsyslog] UNIX timestamp On Mon, Jan 26, 2009 at 3:51 PM, Daniel Anson wrote: > Is there a convention in rsyslog whereby I can get a UNIX timestamp > instead of the other RFC time standards? > > > > Daniel M. Anson > Linux Systems Engineer > Rackspace > danson at rackspace.com > Office: (210)312-5114 Unfortunately, no. You can find a serious discussion about it at http://kb.monitorware.com/post14653.html, but in a word, it's complicated. -HKS > > > > > Confidentiality Notice: This e-mail message (including any attached or > embedded documents) is intended for the exclusive and confidential use of the > individual or entity to which this message is addressed, and unless otherwise > expressly indicated, is confidential and privileged information of Rackspace. > Any dissemination, distribution or copying of the enclosed material is prohibited. 
> If you receive this transmission in error, please notify us immediately by e-mail > at abuse at rackspace.com, and delete the original message. > Your cooperation is appreciated. > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated. From sur5r at sur5r.net Tue Jan 27 19:07:09 2009 From: sur5r at sur5r.net (Jakob Haufe) Date: Tue, 27 Jan 2009 19:07:09 +0100 Subject: [rsyslog] Is rsyslog leaking memory? References: <1232276513.22744.45.camel@localhost.localdomain> Message-ID: <20090127190709.40a2b81b@mp-atlantis3.ziti.uni-heidelberg.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sun, 18 Jan 2009 12:01:53 +0100 Rainer Gerhards wrote: > From what I have seen so far, I, too, doubt there is a leak. However, > there are various levels of testing. For example, the postgres output > module and the GSSAPI code is contributed and I do not even have a > test environment. So these are not checked using that procedure. The > libdbi code is only checked every now and then and not with all > backends (e.g. no Oracle at hand ... and so on...). 
If I ever get > over to a full testing suite (no collaborators found so far...), I'll > probably be able to do more consitent testing of all modules. As I'm the one who wrote (or rather ported) the postgres module, I would be willing to help debugging/valgrinding it. Unfortunately, I have not yet completely understood how the files tests/ work. To be honest, I have just started looking at it. What would you suggest as a way to test ompgqsl in particular? Simply run rsyslogd with valgrind and throw messages against it? Regarding GSSAPI: As I'm a big fan of Kerberos I will definitely give it a try as soon as I have some spare time, maybe I can help in valgrinding it, too. Regards, Jakob (aka sur5r) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAkl/TU0ACgkQ1YAhDic+ada31QCgu1f54fx4XMNpLjrASZ2fGIJ8 V8sAoKD8hRx7tuRzpwkajg5PPCDkwnLY =luw3 -----END PGP SIGNATURE----- From rsyslog at clark-communications.com Wed Jan 28 02:19:45 2009 From: rsyslog at clark-communications.com (Don Jackson) Date: Tue, 27 Jan 2009 17:19:45 -0800 Subject: [rsyslog] UPDATE: sysutils/rsyslog-3.20.3 Message-ID: Port updated to the recent 3.20.3 release of rsyslog. Tested on OpenBSD 4.4, amd64 and i386. It would be great if someone would commit this to the OpenBSD ports tree. $ cat ./pkg/DESCR A syslogd replacement -------------- next part -------------- From rgerhards at hq.adiscon.com Wed Jan 28 18:32:04 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Wed, 28 Jan 2009 18:32:04 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Hi all, thanks to Lorenzo's help, we made good progress. 
It is too much to post inside a mail, please have a look at my analysis of the bug: http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html The short story is that we have at least improved the situation very much and I hope to have fixes for all branches within the next couple of days. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > Sent: Friday, January 16, 2009 3:22 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > Lorenzo, > > I have created a new branch "raceDebug" and done a first commit to it. > The change is very lightweight. Please pull, compile as usual and give > it a try. It spits out some info to stdout from time to time > (hopefully). I am not sure if it aborts, depending on the output it may > or may not. Even if we get messages, they are probably not enough to > pinpoint the bug, but I wanted to do something very light to see if the > bug stays. > > Feedback appreciated. > > Rainer From david at lang.hm Thu Jan 29 09:36:41 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 00:36:41 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Message-ID: On Wed, 28 Jan 2009, Rainer Gerhards wrote: > Hi all, > > thanks to Lorenzo's help, we made good progress. 
It is too much to post > inside a mail, please have a look at my analysis of the bug: > > http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > > The short story is that we have at least improved the situation very > much and I hope to have fixes for all branches within the next couple of > days. I just finished reading through this excellent write-up. One small thing: you quote the spec, "Accesses to cacheable memory that are split across bus widths, cache lines, and page boundaries are not guaranteed to be atomic", and then conclude that "So aligned word-access does not guarantee (not even enhance the chance) of atomicity." I read that to mean that the alignment requirements are more complicated, not that alignment is useless. You should also look at the code that's generated by -Os; with the heavily cached systems that we have nowadays, it's common that the code being smaller (and therefore more of the code fitting into the L1 cache) is more of an advantage than the optimizations that -O3 provides. Congratulations on tracking down a nasty and subtle issue. David Lang > Rainer > >> -----Original Message----- >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >> Sent: Friday, January 16, 2009 3:22 PM >> To: rsyslog-users >> Subject: Re: [rsyslog] rsyslog still crashes >> >> Lorenzo, >> >> I have created a new branch "raceDebug" and done a first commit to it. >> The change is very lightweight. Please pull, compile as usual and give >> it a try. It spits out some info to stdout from time to time >> (hopefully). I am not sure if it aborts, depending on the output it > may >> or may not. Even if we get messages, they are probably not enough to >> pinpoint the bug, but I wanted to do something very light to see if > the >> bug stays. >> >> Feedback appreciated.
>> >> Rainer > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From rgerhards at hq.adiscon.com Thu Jan 29 10:42:48 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 10:42:48 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com> Hi all, I had another interesting discussion with Lorenzo today. Those of you interested in details my find the chatlog interesting: http://blog.gerhards.net/2009/01/some-more-on-rsyslog-data-race.html Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > Sent: Wednesday, January 28, 2009 6:32 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > Hi all, > > thanks to Lorenzo's help, we made good progress. It is too much to post > inside a mail, please have a look at my analysis of the bug: > > http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > > The short story is that we have at least improved the situation very > much and I hope to have fixes for all branches within the next couple > of > days. 
> > Rainer > > > -----Original Message----- > > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > > Sent: Friday, January 16, 2009 3:22 PM > > To: rsyslog-users > > Subject: Re: [rsyslog] rsyslog still crashes > > > > Lorenzo, > > > > I have created a new branch "raceDebug" and done a first commit to > it. > > The change is very lightweight. Please pull, compile as usual and > give > > it a try. It spits out some info to stdout from time to time > > (hopefully). I am not sure if it aborts, depending on the output it > may > > or may not. Even if we get messages, they are probably not enough to > > pinpoint the bug, but I wanted to do something very light to see if > the > > bug stays. > > > > Feedback appreciated. > > > > Rainer > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From david at lang.hm Thu Jan 29 12:06:03 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 03:06:03 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, Rainer Gerhards wrote: > Hi all, > > I had another interesting discussion with Lorenzo today. Those of you > interested in details my find the chatlog interesting: > > http://blog.gerhards.net/2009/01/some-more-on-rsyslog-data-race.html so, distilling this down I think I am reading the following. 1. mixing mutex and atomic operations is a problem, one or the other is safe 2. 
reliable duplication of the problem requires: a fast machine; multiple cores _not_ sharing L1 cache (early Intel 4-core machines or multi-socket machines); a complex rsyslog config that uses multiple threads heavily; and a high-traffic log volume that heavily loads rsyslog. High system load external to rsyslog increases the chances of the race. Question: have you tried enabling/disabling preemption in the kernel on these systems to see if that affects the probability of having a problem? I'm eagerly waiting for the fixes to appear in the 4.1 branch to test them out. David Lang > Rainer > >> -----Original Message----- >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >> Sent: Wednesday, January 28, 2009 6:32 PM >> To: rsyslog-users >> Subject: Re: [rsyslog] rsyslog still crashes >> >> Hi all, >> >> thanks to Lorenzo's help, we made good progress. It is too much to > post >> inside a mail, please have a look at my analysis of the bug: >> >> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html >> >> The short story is that we have at least improved the situation very >> much and I hope to have fixes for all branches within the next couple >> of >> days. >> >> Rainer >> >>> -----Original Message----- >>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >>> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >>> Sent: Friday, January 16, 2009 3:22 PM >>> To: rsyslog-users >>> Subject: Re: [rsyslog] rsyslog still crashes >>> >>> Lorenzo, >>> >>> I have created a new branch "raceDebug" and done a first commit to >> it. >>> The change is very lightweight. Please pull, compile as usual and >> give >>> it a try. It spits out some info to stdout from time to time >>> (hopefully). I am not sure if it aborts, depending on the output it >> may >>> or may not.
Even if we get messages, they are probably not enough to >>> pinpoint the bug, but I wanted to do something very light to see if >> the >>> bug stays. >>> >>> Feedback appreciated. >>> >>> Rainer >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From rgerhards at hq.adiscon.com Thu Jan 29 11:08:04 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 11:08:04 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA9B@grfint2.intern.adiscon.com> A full answer follows soon, but in essence you got it :) I will be working on the 4.1 version today, thus the brief reply ;) > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > Sent: Thursday, January 29, 2009 12:06 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Thu, 29 Jan 2009, Rainer Gerhards wrote: > > > Hi all, > > > > I had another interesting discussion with Lorenzo today. Those of you > > interested in details my find the chatlog interesting: > > > > http://blog.gerhards.net/2009/01/some-more-on-rsyslog-data-race.html > > so, distilling this down I think I am reading the following. > > 1. mixing mutex and atomic operations is a problem, one or the other is > safe > > 2. 
reliable duplication of the problem requires > > fast machine > multiple cores _not_ sharing L1 cache (early Intel 4-core machines or > multi-socket machines) > a complex rsyslog config that uses multiple thread heavily > high traffic log volume to heavily load rsyslog > high system load external to rsyslog increases the chancesof the race > > question, have you tried enabling/disabling preemption in the kernel on > these systems to see if that affects the probability of having a > problem? > > I'm eagerly waiting for the fixes to appear in the 4.1 branch to test > them > out. > > David Lang > > > > Rainer > > > >> -----Original Message----- > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > >> Sent: Wednesday, January 28, 2009 6:32 PM > >> To: rsyslog-users > >> Subject: Re: [rsyslog] rsyslog still crashes > >> > >> Hi all, > >> > >> thanks to Lorenzo's help, we made good progress. It is too much to > > post > >> inside a mail, please have a look at my analysis of the bug: > >> > >> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > >> > >> The short story is that we have at least improved the situation very > >> much and I hope to have fixes for all branches within the next > couple > >> of > >> days. > >> > >> Rainer > >> > >>> -----Original Message----- > >>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >>> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > >>> Sent: Friday, January 16, 2009 3:22 PM > >>> To: rsyslog-users > >>> Subject: Re: [rsyslog] rsyslog still crashes > >>> > >>> Lorenzo, > >>> > >>> I have created a new branch "raceDebug" and done a first commit to > >> it. > >>> The change is very lightweight. Please pull, compile as usual and > >> give > >>> it a try. It spits out some info to stdout from time to time > >>> (hopefully). I am not sure if it aborts, depending on the output it > >> may > >>> or may not. 
Even if we get messages, they are probably not enough > to > >>> pinpoint the bug, but I wanted to do something very light to see if > >> the > >>> bug stays. > >>> > >>> Feedback appreciated. > >>> > >>> Rainer > >> _______________________________________________ > >> rsyslog mailing list > >> http://lists.adiscon.net/mailman/listinfo/rsyslog > >> http://www.rsyslog.com > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From mrdemeanour at jackpot.uk.net Thu Jan 29 12:12:41 2009 From: mrdemeanour at jackpot.uk.net (Mr. Demeanour) Date: Thu, 29 Jan 2009 11:12:41 +0000 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Message-ID: <49818F29.7070000@jackpot.uk.net> Rainer Gerhards wrote: > Hi all, > > thanks to Lorenzo's help, we made good progress. It is too much to post > inside a mail, please have a look at my analysis of the bug: > > http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > > The short story is that we have at least improved the situation very > much and I hope to have fixes for all branches within the next couple of > days. Bravo, Rainer! That is the most challenging and tricky to nail of all kinds of bug, and I'm very impressed. -- Jack. 
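The first point distilled in this thread (a given shared variable should be protected either by atomic operations or by a mutex, but not by a mix of the two) can be illustrated with a small C sketch. It assumes GCC's `__sync` builtins and POSIX mutexes; the types and function names below are hypothetical and are not taken from the rsyslog sources.

```c
#include <assert.h>
#include <pthread.h>

/* Variant A: the counter is touched ONLY through atomic builtins.
 * No code path may read or write 'refs' directly or under a mutex. */
typedef struct {
    int refs;
} atomic_refcount_t;

static int atomic_ref_inc(atomic_refcount_t *c)
{
    assert(c != NULL);
    return __sync_add_and_fetch(&c->refs, 1);
}

static int atomic_ref_dec(atomic_refcount_t *c)
{
    assert(c != NULL);
    return __sync_sub_and_fetch(&c->refs, 1);
}

static int atomic_ref_get(atomic_refcount_t *c)
{
    assert(c != NULL);
    /* Adding zero atomically yields a consistent read. */
    return __sync_fetch_and_add(&c->refs, 0);
}

/* Variant B: the counter is touched ONLY while 'mut' is held.
 * No code path may use atomic builtins on 'refs'. */
typedef struct {
    pthread_mutex_t mut;
    int refs;
} locked_refcount_t;

static int locked_ref_add(locked_refcount_t *c, int delta)
{
    int v;

    assert(c != NULL);
    pthread_mutex_lock(&c->mut);
    c->refs += delta;
    v = c->refs;
    pthread_mutex_unlock(&c->mut);
    return v;
}
```

The problematic pattern would be, for example, incrementing `refs` with `__sync_add_and_fetch()` in one thread while another thread decrements it with a plain `refs--` under a mutex: the mutex does not exclude the atomic increment, so the plain read-modify-write can still lose an update.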
From friedl at hq.adiscon.com Thu Jan 29 17:16:57 2009
From: friedl at hq.adiscon.com (Florian Riedl)
Date: Thu, 29 Jan 2009 17:16:57 +0100
Subject: [rsyslog] rsyslog 4.1.4 (devel) released
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FABC@grfint2.intern.adiscon.com>

Hi all,

rsyslog 4.1.4, a member of the development branch, has been released today.
It is primarily a stability update. Most importantly, this version addresses
a potential segfault which occurred only rarely, and primarily on very fast
and busy systems. The only other change is a fix for the $PreserveFQDN config
directive, which did not properly affect locally emitted messages.

This is a recommended update for all users of the development branch.

Download
http://www.rsyslog.com/Downloads-req-viewdownloaddetails-lid-147.phtml

Changelog
http://www.rsyslog.com/Article341.phtml

As always, feedback is appreciated.

Florian Riedl

--
Support
=======
Improving rsyslog is costly, but you can help! We are looking for
organizations that find rsyslog useful and wish to contribute back. You can
contribute by reporting bugs, improving the software, or donating money or
equipment.

Commercial support contracts for rsyslog are available, and they help finance
continued maintenance. Adiscon GmbH, a privately held German company, is
currently funding rsyslog development. We are always looking for interesting
development projects. For details on how to help, please see
http://www.rsyslog.com/doc-how2help.html .
From rgerhards at hq.adiscon.com Thu Jan 29 17:36:41 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 17:36:41 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Message-ID: <1233247001.19733.14.camel@rf10up.intern.adiscon.com> On Thu, 2009-01-29 at 00:36 -0800, david at lang.hm wrote: > On Wed, 28 Jan 2009, Rainer Gerhards wrote: > > > Hi all, > > > > thanks to Lorenzo's help, we made good progress. It is too much to post > > inside a mail, please have a look at my analysis of the bug: > > > > http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > > > > The short story is that we have at least improved the situation very > > much and I hope to have fixes for all branches within the next couple of > > days. > > I just finished reading through this excellant write-up > > one small thing. > > you quote the spec > > Accesses to cacheable memory that are split across bus widths, cache > lines, and page boundaries are not guaranteed to be atomic > > and then conclude that > > So aligned word-access does not guarantee (not even enhance the chance) of > atomicity. > > I read that to mean that the alignment requirements are more complicated, > not that alignment is useless. I should probably have quoted more of Intel's manual. But in essence you need to read at least the first full two pages to get the in-depth idea. The issue is not alignment requirements. As hardware gets more and more parallel, and caches get to more and more levels, and on-chip cores coexist with those from other sockets ... keeping memory coherent is a costly job. In early CPUs, Intel made memory access atomic if some alignment requirements were met. That was cheap. 
In new CPUs that atomicity is expensive. On the other hand, most data
accesses do not need atomicity. So why incur the cost for many operations
when only a few need it? As the end result, Intel has removed guaranteed
atomicity from those memory accesses. In order to get atomicity, the program
must tell the CPU *explicitly* that it wants that feature. To do so, a "LOCK"
prefix (opcode) must be placed before the actual opcode (note that this is
only supported for some operations). So you get the best of both worlds: fast
execution time for the majority of code and atomicity where you need it (but
it then incurs the cost).

The bottom line is that what was an atomic operation on an old CPU is no
longer an atomic operation on a new CPU. If you need that, you need to
include that extra "LOCK" opcode.

As I briefly said in the blog post, I have not checked old Intel manuals.
So I do not know if they formerly guaranteed, as part of the instruction
set architecture, that these operations were atomic. I guess they did
not. If so, I as a programmer made some assumptions about the
micro-architecture that no longer hold true. My fault... But even if it
is Intel's fault, the C programming language does not guarantee
atomicity nor does the compiler guarantee a specific translation to
machine code. So I, working on the C level, used assumptions that were
not valid (and as I said I knew it was dangerous, but it worked too well
for too long... ;))

>
> you should also look at the code that's generated by -Os, with the heavily
> cached systems that we have nowdays it's common that the code being
> smaller (and therefor more of the code fitting into the L1 cache) is more
> of an advantage than the optimizations that -O3 provides.

That's a good reminder. I've just checked the gcc docs. There are some
things that I do not like about -Os, especially as it disables proper
alignment of many structures, including code. That can lead to
sub-optimal cache performance.
On the other hand -O3 does things like loop unrolling, which definitely is a bad idea with modern cache systems. My preliminarily conclusion is that -O2 is probably best, and may be tuned by turning on and off specific optimizations via their specific compiler switches. > > congradulations on tracking down a nasty and subtle issue. Thanks - but let's first see if this was the only issue and if things run smooth everywhere. But it looks very promising. Rainer > > David Lang > > > > Rainer > > > >> -----Original Message----- > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > >> Sent: Friday, January 16, 2009 3:22 PM > >> To: rsyslog-users > >> Subject: Re: [rsyslog] rsyslog still crashes > >> > >> Lorenzo, > >> > >> I have created a new branch "raceDebug" and done a first commit to it. > >> The change is very lightweight. Please pull, compile as usual and give > >> it a try. It spits out some info to stdout from time to time > >> (hopefully). I am not sure if it aborts, depending on the output it > > may > >> or may not. Even if we get messages, they are probably not enough to > >> pinpoint the bug, but I wanted to do something very light to see if > > the > >> bug stays. > >> > >> Feedback appreciated. 
> >> > >> Rainer > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From david at lang.hm Fri Jan 30 04:51:28 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 19:51:28 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1233247001.19733.14.camel@rf10up.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, Rainer Gerhards wrote: > On Thu, 2009-01-29 at 00:36 -0800, david at lang.hm wrote: >> On Wed, 28 Jan 2009, Rainer Gerhards wrote: >> >>> Hi all, >>> >>> thanks to Lorenzo's help, we made good progress. It is too much to post >>> inside a mail, please have a look at my analysis of the bug: >>> >>> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html >>> >>> The short story is that we have at least improved the situation very >>> much and I hope to have fixes for all branches within the next couple of >>> days. >> >> I just finished reading through this excellant write-up >> >> one small thing. >> >> you quote the spec >> >> Accesses to cacheable memory that are split across bus widths, cache >> lines, and page boundaries are not guaranteed to be atomic >> >> and then conclude that >> >> So aligned word-access does not guarantee (not even enhance the chance) of >> atomicity. >> >> I read that to mean that the alignment requirements are more complicated, >> not that alignment is useless. > > I should probably have quoted more of Intel's manual. 
But in essence you > need to read at least the first full two pages to get the in-depth idea. > The issue is not alignment requirements. As hardware gets more and more > parallel, and caches get to more and more levels, and on-chip cores > coexist with those from other sockets ... keeping memory coherent is a > costly job. > > In early CPUs, Intel made memory access atomic if some alignment > requirements were met. That was cheap. In new CPUs that atomicity is > expensive. On the other hand, most data access do not need atomicity. So > why incur the cost for many operations when only few need it? In the end > result, Intel has remove guaranteed atomicity from those memory > accesses. In order to get atomicity, the program must tell the CPU > *explicitly* that it wants that feature. To do so, a "LOCK" prefix > (opcode) must be placed before the actual opcode (note that this is only > supported for some operations). So you get the best of two world: fast > execution time for the majority of code and atomicity where you need it > (but it then incurs the cost). > > The bottom line is that what was an atomic operation on an old CPU is no > longer an atomic operation on a new CPU. If you need that, you need to > include that extra "LOCK" opcode. > > As I briefly said in the blogpost, I have not check old Intel manuals. > So I do not know if they formerly guaranteed, as part of the instruction > set architecture, that these operations were atomic. I guess they did > not. If so, I as a programmer made some assumptions about the > micro-architecture that no longer hold true. My fault... But even if it > is Intel's fault, the C programming language does not guarantee > atomicity nor does the compiler guarantee a specific translation to > machine code. So I, working on the C level, used assumptions that were > not valid (and as I said I knew it was dangerous, but it worked too well > for too long... 
;)) the new C0x standard will add atomic ops and guarentees (some of which are not nessasarily provided by the chip, but have to be provided by the compiler/library instead), so watch for it, but test the performance of them before you trust them >> >> you should also look at the code that's generated by -Os, with the heavily >> cached systems that we have nowdays it's common that the code being >> smaller (and therefor more of the code fitting into the L1 cache) is more >> of an advantage than the optimizations that -O3 provides. > > That's a good reminder. I've just checked the gcc docs. There are some > things that I do not like about -Os, especially as it disables proper > alignment of many structures, including code. That can lead to > sub-optimal cache performance. I know the linux kernel has many things where the alignment is critical for proper functioning, but they are still able to support -Os, so there is some way to specify alignment even for -Os > On the other hand -O3 does things like loop unrolling, which definitely > is a bad idea with modern cache systems. > > My preliminarily conclusion is that -O2 is probably best, and may be > tuned by turning on and off specific optimizations via their specific > compiler switches. this has been the prevailing wisdom for many years, but I've seen myself many cases where -Os has ended up being faster in the real world, in spite of the various things that -O2 does 'better' is it the case that -Os would break things? or just that you think it's alignment may not be as good? David Lang >> congradulations on tracking down a nasty and subtle issue. > > Thanks - but let's first see if this was the only issue and if things > run smooth everywhere. But it looks very promising. 
> > Rainer >> >> David Lang >> >> >>> Rainer >>> >>>> -----Original Message----- >>>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >>>> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >>>> Sent: Friday, January 16, 2009 3:22 PM >>>> To: rsyslog-users >>>> Subject: Re: [rsyslog] rsyslog still crashes >>>> >>>> Lorenzo, >>>> >>>> I have created a new branch "raceDebug" and done a first commit to it. >>>> The change is very lightweight. Please pull, compile as usual and give >>>> it a try. It spits out some info to stdout from time to time >>>> (hopefully). I am not sure if it aborts, depending on the output it >>> may >>>> or may not. Even if we get messages, they are probably not enough to >>>> pinpoint the bug, but I wanted to do something very light to see if >>> the >>>> bug stays. >>>> >>>> Feedback appreciated. >>>> >>>> Rainer >>> _______________________________________________ >>> rsyslog mailing list >>> http://lists.adiscon.net/mailman/listinfo/rsyslog >>> http://www.rsyslog.com >>> >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From david at lang.hm Fri Jan 30 05:56:55 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 20:56:55 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1233247001.19733.14.camel@rf10up.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, Rainer Gerhards wrote: >> >> congradulations on tracking down a 
nasty and subtle issue. > > Thanks - but let's first see if this was the only issue and if things > run smooth everywhere. But it looks very promising. > bad news, on my system the HUP doesn't always reopen the files now. high speed box receiving messages via UDP, idle except for a gzip compressing the files (which are rotated once a min), the system runs fine for a few min (higher performance than before, it's now writing ~93,000 messages/sec instead of ~78,000 messages/sec), but it sometimes mangles handling a HUP and gets stuck. I have to do a kill -9 to kill and restart it. this is with the new HUP behavior. David Lang From david at lang.hm Fri Jan 30 06:13:07 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 21:13:07 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, david at lang.hm wrote: > On Thu, 29 Jan 2009, Rainer Gerhards wrote: > >>> >>> congradulations on tracking down a nasty and subtle issue. >> >> Thanks - but let's first see if this was the only issue and if things >> run smooth everywhere. But it looks very promising. >> > > bad news, on my system the HUP doesn't always reopen the files now. > > high speed box receiving messages via UDP, idle except for a gzip > compressing the files (which are rotated once a min), the system runs fine > for a few min (higher performance than before, it's now writing ~93,000 > messages/sec instead of ~78,000 messages/sec), but it sometimes mangles > handling a HUP and gets stuck. I have to do a kill -9 to kill and restart > it. > > this is with the new HUP behavior. interesting note on memory useage. 
I'm using the default fixed array queue type on this box with a 1K max message length. if I hammer the box with a steady ~120K messages/sec (while it can write 93K/sec) the queue builds up to where it takes ~12G of ram. at this point the throughput takes a nose dive (not just dropping inbound packets, but also the number of packets written is much less) if I kill the sender, it starts emptying it's queue (interestingly, not quite as fast as if it is also recieving some messages), but the memory isn't freed up until I start sending it messages again. David Lang From rgerhards at hq.adiscon.com Thu Jan 29 19:34:50 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 19:34:50 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: <1233254090.19733.22.camel@rf10up.intern.adiscon.com> On Thu, 2009-01-29 at 21:13 -0800, david at lang.hm wrote: > On Thu, 29 Jan 2009, david at lang.hm wrote: > interesting note on memory useage. > > I'm using the default fixed array queue type on this box with a 1K max > message length. if I hammer the box with a steady ~120K messages/sec > (while it can write 93K/sec) the queue builds up to where it takes ~12G of > ram. at this point the throughput takes a nose dive (not just dropping > inbound packets, but also the number of packets written is much less) > > if I kill the sender, it starts emptying it's queue (interestingly, not > quite as fast as if it is also recieving some messages), but the memory > isn't freed up until I start sending it messages again. This actually is expected behavior - and it has lots to do with "last message repeated n time". 
In order to implement that functionality, I need to hold on to the last
message until a new one comes in (so that I can compare new to old). As
such, a message that is fully processed cannot be freed immediately. It is
freed when the next message comes in, whenever that may be. Note that each
output has separate "last message..." status, so each action keeps a copy of
the previous message until a new one arrives.

What now happens is that when the queue builds up, malloc extends the data
segment size. It is fair to assume that the last message received on a very
busy system will probably end up at a high location in the data segment (but
note it is just a probability - it may even receive a very low location, if
that was freed immediately before). When the queue is now drained, we free
everything but this message. As the message is still referenced for "last
m...", it cannot be freed. As it has a high address, the data segment size
cannot be reduced. As such, rsyslog still holds the whole data segment, even
though it contains almost no actually allocated memory. I do not know if the
runtime system has a way to tell the OS it now uses a "sparse data segment",
but I guess it doesn't do that.

When the next message comes in (hours later?), the previous message can be
freed, and the runtime can then reduce the data segment size (which should
show up as a sharp decrease in memory usage).

This is one of the reasons I don't like "last message...".

I hope this clarifies.
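The holding behaviour described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not rsyslog's actual code: the type `action_state` and the function `action_submit` are hypothetical names, and the summary text merely reuses the "message repeated %d times: [%.800s]" format quoted earlier in this archive. The point it shows is that the previous message stays allocated until a different message arrives.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-action state; names are illustrative, not rsyslog's. */
typedef struct {
    char *prev_msg;  /* last message seen, kept alive for comparison */
    int   repeats;   /* how often prev_msg was repeated since it was written */
} action_state;

/* Duplicate a string with malloc (avoids relying on POSIX strdup). */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/*
 * Feed one message to the action. Output text is written to `out`.
 * Returns the number of lines emitted (0 = suppressed duplicate,
 * 1 = the message itself, 2 = repeat summary plus the message),
 * or -1 if memory could not be allocated.
 */
int action_submit(action_state *st, const char *msg, char *out, size_t out_len)
{
    assert(st != NULL && msg != NULL && out != NULL && out_len > 0);
    out[0] = '\0';

    if (st->prev_msg != NULL && strcmp(st->prev_msg, msg) == 0) {
        st->repeats++;   /* duplicate: suppress it and keep holding prev_msg */
        return 0;
    }

    char *copy = copy_string(msg);
    if (copy == NULL)
        return -1;

    int lines = 1;
    if (st->repeats > 0) {
        /* Flush the pending repeat count before the new message. */
        snprintf(out, out_len, "message repeated %d times: [%.800s]\n%s",
                 st->repeats, st->prev_msg, msg);
        lines = 2;
    } else {
        snprintf(out, out_len, "%s", msg);
    }

    free(st->prev_msg);  /* only now can the old message be released */
    st->prev_msg = copy;
    st->repeats = 0;
    return lines;
}
```

In this sketch one copy of the most recent message is always live per action, no matter how long the input stays idle. Whether that single allocation keeps the process from shrinking its data segment depends on the allocator and on where the block happens to sit, as the message above notes.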
Rainer From rgerhards at hq.adiscon.com Thu Jan 29 20:40:33 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 20:40:33 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: <1233258033.19733.27.camel@rf10up.intern.adiscon.com> Hi David, thanks for this note, but I think it is not related to the fix (I'll think a bit harder about that, but so far I can not find any connection between the two). The way the HUP is done is sub-optimal. Under typical load (one hup a day), you don't see any issue. If you hup very frequently (like the once a min you do) and have heavy traffic, that's another story. To solve that case, some rework on the hup internals, actually even on the interface definition, is needed. I'd hold all such work unless I found a solution to the race bug - because it would have made the environment even more different. Now that I have at least one issue, I think I can go ahead and begin to introduce more intrusive changes again. In any case, I'll have a more in-depth look at the hup handlers. The new non-restart type of hup should be almost resistant against the issue you report. Rainer On Thu, 2009-01-29 at 20:56 -0800, david at lang.hm wrote: > On Thu, 29 Jan 2009, Rainer Gerhards wrote: > > >> > >> congradulations on tracking down a nasty and subtle issue. > > > > Thanks - but let's first see if this was the only issue and if things > > run smooth everywhere. But it looks very promising. > > > > bad news, on my system the HUP doesn't always reopen the files now. 
> > high speed box receiving messages via UDP, idle except for a gzip > compressing the files (which are rotated once a min), the system runs fine > for a few min (higher performance than before, it's now writing ~93,000 > messages/sec instead of ~78,000 messages/sec), but it sometimes mangles > handling a HUP and gets stuck. I have to do a kill -9 to kill and restart > it. > > this is with the new HUP behavior. > > David Lang > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Thu Jan 29 21:25:27 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 21:25:27 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: <1233260727.19733.71.camel@rf10up.intern.adiscon.com> On Thu, 2009-01-29 at 19:51 -0800, david at lang.hm wrote: > the new C0x standard will add atomic ops and guarentees (some of which are > not nessasarily provided by the chip, but have to be provided by the > compiler/library instead), so watch for it, but test the performance of > them before you trust them This is very important work, especially if you think about future advances in hardware design. However, I think we will be years away from the point where one can actually use this and hope to be somewhat portable. Same for performance: early implementation will probably be sub-optimal (though it should be fairly simple to map current compiler-specific options for atomic ops to the new standard once... but we know what happens when new standards come out...). 
> > On the other hand -O3 does things like loop unrolling, which definitely
> > is a bad idea with modern cache systems.
> >
> > My preliminarily conclusion is that -O2 is probably best, and may be
> > tuned by turning on and off specific optimizations via their specific
> > compiler switches.
>
> this has been the prevailing wisdom for many years, but I've seen myself
> many cases where -Os has ended up being faster in the real world, in spite
> of the various things that -O2 does 'better'

I think the phrase "it depends on the scenario" is very important here.

> is it the case that -Os would break things? or just that you think it's
> alignment may not be as good?

It does not break things. The alignment for any structures that are
passed as part of the API should be properly contained in the header
files. However, I have not specifically tested this. The point is just
that, at least on some machines, non-aligned addresses severely hurt cache
performance. So optimizing for size, and as a side-effect generating
unaligned data accesses, can be a real performance drawback. It may well
cost more performance than the improved L1 (or trace cache) performance
offers.

In any case, if we go down to that level, I think there are better
places to test and optimize - not to mention that on the upper layer (OS
calls!) there is still room for improvement.

One of my favorite CPU-level optimizations is the "exception system" that
is currently in use in rsyslog. Thanks to your message, I've finally
written down some information on it. I've done that on the forum, so
that I can easily keep a permanent record of the discussion (and in an
easier-to-follow form than with the mail archive):

http://kb.monitorware.com/optimizing-exception-handling-t8911.html

Feedback is appreciated.
Rainer

From rgerhards at hq.adiscon.com Fri Jan 30 14:34:07 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Fri, 30 Jan 2009 14:34:07 +0100
Subject: [rsyslog] rsyslog 4.1.4 - one (small) bug left
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com>

Hi all,

I have now basically ported the race bugfix to all branches
(verification and double-check still in the works). While doing this, I
noticed that one small issue slipped my attention with yesterday's
4.1.4 version. If compiled with atomics, I unlock an already unlocked
mutex (which is destroyed with the very next statement) in msgDestruct.
That should not have any really bad effects (but you never know...). The
master branch is now updated, so you may want to pull a fixed version
from there. I will not do a new release just for this reason - it'll be
included in the next version.

Please note that git as of now already contains the race fixes for all
branches, though they are mostly untested, just in case you'd like to get
them quickly.

I will keep you posted.
Rainer From rgerhards at hq.adiscon.com Fri Jan 30 16:47:55 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 30 Jan 2009 16:47:55 +0100 Subject: [rsyslog] hang on HUP - was: rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: <1233330475.19733.88.camel@rf10up.intern.adiscon.com> On Thu, 2009-01-29 at 20:56 -0800, david at lang.hm wrote: > high speed box receiving messages via UDP, idle except for a gzip > compressing the files (which are rotated once a min), the system runs fine > for a few min (higher performance than before, it's now writing ~93,000 > messages/sec instead of ~78,000 messages/sec), but it sometimes mangles > handling a HUP and gets stuck. I have to do a kill -9 to kill and restart > it. > > this is with the new HUP behavior. I cross-checked the HUP processing. So far, I do not see why it hangs (and if it is related to the HUP processing). Can you reproduce it with debug log running. I guess no, but if so, could you provide me a log with ~1000 log lines before the hang? If debug log is no option, a stack trace from the abort would be great. 
Rainer From david at lang.hm Fri Jan 30 18:19:21 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 30 Jan 2009 09:19:21 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1233258033.19733.27.camel@rf10up.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> <1233258033.19733.27.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, Rainer Gerhards wrote: > Hi David, > > thanks for this note, but I think it is not related to the fix (I'll > think a bit harder about that, but so far I can not find any connection > between the two). > > The way the HUP is done is sub-optimal. Under typical load (one hup a > day), you don't see any issue. If you hup very frequently (like the once > a min you do) and have heavy traffic, that's another story. To solve > that case, some rework on the hup internals, actually even on the > interface definition, is needed. I'd hold all such work unless I found a > solution to the race bug - because it would have made the environment > even more different. Now that I have at least one issue, I think I can > go ahead and begin to introduce more intrusive changes again. > > In any case, I'll have a more in-depth look at the hup handlers. The new > non-restart type of hup should be almost resistant against the issue you > report. I was using the new non-restart type. I'll be doing more testing today and over the weekend. 
it's posible that I ended up with mixed versions with the modules again (just before going home last night I deleted them all and then did the install to make sure) David Lang > Rainer > > On Thu, 2009-01-29 at 20:56 -0800, david at lang.hm wrote: >> On Thu, 29 Jan 2009, Rainer Gerhards wrote: >> >>>> >>>> congradulations on tracking down a nasty and subtle issue. >>> >>> Thanks - but let's first see if this was the only issue and if things >>> run smooth everywhere. But it looks very promising. >>> >> >> bad news, on my system the HUP doesn't always reopen the files now. >> >> high speed box receiving messages via UDP, idle except for a gzip >> compressing the files (which are rotated once a min), the system runs fine >> for a few min (higher performance than before, it's now writing ~93,000 >> messages/sec instead of ~78,000 messages/sec), but it sometimes mangles >> handling a HUP and gets stuck. I have to do a kill -9 to kill and restart >> it. >> >> this is with the new HUP behavior. >> >> David Lang >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From david at lang.hm Fri Jan 30 18:28:56 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 30 Jan 2009 09:28:56 -0800 (PST) Subject: [rsyslog] rsyslog 4.1.4 - one (small) bug left In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com> Message-ID: On Fri, 30 Jan 2009, Rainer Gerhards wrote: > Hi all, > > I have now basically ported the race bugfix to all branches > (verification and double-check still in the works). While doing this, I > noticed that one small issues slipped my attention with yesterday's > 4.1.4 version. 
If compiled with atomics, I unlock an already unlocked > mutex (which is destroyed with the very next statement) in msgDestruct. > That should not have any really bad effects (but you never know...). The > master branch is now updated, so you may want to pull a fixed version > from there. I will not do a new release just for this reason - it'll be > included in the next version. so 4.1.4 should be using the atomics for queue management not mutexes? David Lang > Please note that git as of now already contains all the race fix for all > branches, but mostly untested. Just in case if you'd like to get them > quickly. > > I will keep you posted. > > Rainer > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From rgerhards at hq.adiscon.com Fri Jan 30 17:28:47 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 30 Jan 2009 17:28:47 +0100 Subject: [rsyslog] rsyslog 4.1.4 - one (small) bug left In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FAE1@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > Sent: Friday, January 30, 2009 6:29 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog 4.1.4 - one (small) bug left > > On Fri, 30 Jan 2009, Rainer Gerhards wrote: > > > Hi all, > > > > I have now basically ported the race bugfix to all branches > > (verification and double-check still in the works). While doing this, > I > > noticed that one small issues slipped my attention with yesterday's > > 4.1.4 version. If compiled with atomics, I unlock an already unlocked > > mutex (which is destroyed with the very next statement) in > msgDestruct. > > That should not have any really bad effects (but you never know...). 
> The > > master branch is now updated, so you may want to pull a fixed version > > from there. I will not do a new release just for this reason - it'll > be > > included in the next version. > > so 4.1.4 should be using the atomics for queue management not mutexes? It depends... If atomics are available, they are the preferred method. If not available, the code falls back to mutexes. Rainer
From pieter.thysebaert at intec.ugent.be Wed Jan 14 13:37:31 2009 From: pieter.thysebaert at intec.ugent.be (pieter.thysebaert at intec.ugent.be) Date: Wed, 14 Jan 2009 13:37:31 +0100 (CET) Subject: [rsyslog] Property filter - output formatting Message-ID: <29691.212.190.198.36.1231936651.squirrel@webserver6.intec.ugent.be> Hello, I've started exploring rsyslog 3.20.2 As I have been toying around and looking at the example configurations, I have not been able to solve the following problem: how can I use a property filter to select an output file AND format the output using a defined template For instance: $template testtemplate,"%msg%" :syslogtag, contains, "test" /tmp/test.log;testtemplate Doesn't seem to be a supported syntax (it works when I leave off the ;testtemplate). I'm sorry if this is obvious, but how can I filter based on properties AND specify output formatting at the same time?
Thanks, Pieter From rgerhards at hq.adiscon.com Wed Jan 14 00:08:27 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Wed, 14 Jan 2009 00:08:27 +0100 Subject: [rsyslog] Property filter - output formatting In-Reply-To: <29691.212.190.198.36.1231936651.squirrel@webserver6.intec.ugent.be> References: <29691.212.190.198.36.1231936651.squirrel@webserver6.intec.ugent.be> Message-ID: <1231888107.22744.19.camel@localhost.localdomain> Hi Pieter, I just tried this out in lab. For me, it works. If I generate a message with logger -t test my message the message is properly dispatched. I guess that the problem actually is the tag, which I guess does not contain what you think it does (a frequent problem with many senders). Try this template $template testtemplate,"tag: '%syslogtag%', rawmsg: '%rawmsg%'\n" *.* /some/file;testtemplate and let us know the result. HTH Rainer On Wed, 2009-01-14 at 13:37 +0100, pieter.thysebaert at intec.ugent.be wrote: > Hello, > > I've started exploring rsyslog 3.20.2 > > As I have been toying around and looking at the example configurations, I > have not been able to solve the following problem: > > how can I use a property filter to select an output file AND format the > output using a defined template > > For instance: > > $template testtemplate,"%msg%" > > :syslogtag, contains, "test" /tmp/test.log;testtemplate > > Doesn't seem to be a supported syntax (it works when I leave off the > ;testtemplate). > > I'm sorry if this is obvious, but how can I filter based on properties AND > specify output formatting at the same time? 
> > Thanks, > Pieter > > > > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Wed Jan 14 17:14:45 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Wed, 14 Jan 2009 17:14:45 +0100 Subject: [rsyslog] rsyslog on LinkedIn Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F996@grfint2.intern.adiscon.com> Hi all, please pardon the shameless self-promotion. I have just created a rsyslog group on LinkedIn: http://www.linkedin.com/e/gis/1761607 It is an experiment. I've seen so many projects creating groups on that platform that I wonder if it would make sense to create one for rsyslog. My intent is not to replace any of our technical and discussion forums, but to open a new networking opportunity for those that are interested. I do not yet know if that's a good idea or not, but why not give it a try? ;) Back to our regular programming... Rainer From ray at jhax.net Wed Jan 14 12:50:48 2009 From: ray at jhax.net (Ray Whitmer) Date: Wed, 14 Jan 2009 04:50:48 -0700 Subject: [rsyslog] Use of application-level acks in RELP. Message-ID: <20090114045048.o2wpiannk4okcgw4@webmail.xmission.com> In my research of rsyslog to determine its suitability for a particular situation I have some questions left unanswered. I need relatively-guaranteed delivery. I will continue to review the available info including source code to see if I can answer the questions, but I hope it may be productive to ask questions here. In the documentation, you describe the situation where syslog silently loses TCP messages, not because the TCP protocol permits it but because the send function returns after delivering the message to a local buffer before it is actually delivered. But there is a more-fundamental reason an application-level ack is required.
An application can fail (someone trips over the power cord) between when the application receives the data and when it records it. 1. Does rsyslog send the ack in the RELP protocol after the message has been safely recorded in whatever queue has been configured, or forwarded on, so that its delivery status is as safe as it will get (of course how safe depends upon options chosen), or was it only intended to solve the case of TCP buffering-based unreliability? 2. Presumably there is a client API that speaks RELP. Can it be configured to return an error to the client if there is no ACK (i.e. if the log it sent did not make it into the configured safe location which could be on a disk-based queue), or does it only retry? Where is this API? Certainly the TCP caching case you mention in your pages is one a user is more likely to be able to reproduce, but that is all the more reason for me to be concerned that the less-reproducible situations that could cause a message to occasionally become lost are handled correctly. From rgerhards at hq.adiscon.com Thu Jan 15 09:16:36 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 09:16:36 +0100 Subject: [rsyslog] Use of application-level acks in RELP. In-Reply-To: <20090114045048.o2wpiannk4okcgw4@webmail.xmission.com> References: <20090114045048.o2wpiannk4okcgw4@webmail.xmission.com> Message-ID: <1232007397.22744.27.camel@localhost.localdomain> Hi Ray, thanks for your excellent questions. I've also made a blog post out of them, as I think this needs some better visibility (and can be used for future reference). Just in case you are curious: http://blog.gerhards.net/2009/01/use-of-application-level-acks-in-relp.html (no need to read, all answers are inline below) On Wed, 2009-01-14 at 04:50 -0700, Ray Whitmer wrote: > In my research of rsyslog to determine its suitability for a > particular situation I have some questions left unanswered. I need > relatively-guaranteed delivery.
I will continue to review the > available info including source code to see if I can answer the > questions, but I hope it may be productive to ask questions here. > > In the documentation, you describe the situation where syslog silently > loses tcp messages, not because the tcp protocol permits it but > because the send function returns after delivering the message to a > local buffer before it is actually delivered. > > But there is a more-fundamental reason an application-level ack is > required. An application can fail (someone trips over the power cord) > between when the application receives the data and when it records it. > > 1. Does rsyslog send the ack in the RELP protocol occur after the > message has been safely recorded in whatever queue has been configured > or forwarded on so its delivery status is as safe as it will get (of > course how safe depends upon options chosen), or was it only intended > to solve the case of TCP buffering-based unreliability? RELP is designed to provide end-to-end reliability. The TCP buffering issue is just highlighted because it is so subtle that most people tend to overlook it. An application abort seems to be more obvious and RELP handles that. HOWEVER, that does not mean messages are necessarily recorded when the ACK is sent. It depends on the configuration. In RELP, the acknowledgment is sent after the reception callback has been called. This can be seen in the relevant RELP module. For rsyslog's imrelp, this means the callback returns after the message has been enqueued in the main message queue. It now depends on how that queue is configured. By default, messages are buffered in main memory. So when rsyslog aborts for some reason (or is terminated by user request) before this message is being processed, it is lost - while the sender still got a positive ACK. This is how things are done by default, and it is useful for many scenarios. Of course, it does not provide the audit-grade reliability that RELP aims for. 
But the default config needs to take care of the usual use case and this is not audit-grade reliability (just think of the numerous home systems that run rsyslog and should do so in the least intrusive way). If you are serious about your logs, you need to configure the engine to be fully reliable. The most important thing is a good understanding of the queue engine. You need to read and understand the rsyslog queue ( http://www.rsyslog.com/doc-queues.html ) docs, as they form the basis on which reliability can be built. The other thing you need to know is your exact requirements. Asking for reliability is easy, implementing it is not. The nearer you get to 100% reliability (which you will never reach for one reason or the other) the more complex scenarios get. I am sure the original poster knows quite well what he wants, but I am often approached by people who just want to have it "totally reliable" ... but don't want to spend the fortune it requires (really - ever thought about the redundant data centers, power plants, satellite and sea links et al. you need for that?). So it is absolutely vital to have good requirements, which also include when loss is acceptable, and at what cost this comes. Once you have these requirements, a rsyslog configuration that matches them can be designed. At this point, I'd like to note that it may also be useful to consider rsyslog professional services ( http://www.rsyslog.com/doc-professional_support.html ) as it provides valuable aid during design and probably deployment of a solution (I can't go into the full depth of enterprise requirements here). To go back to the original question: RELP has almost everything that is needed, but configuring the whole system in an audit-grade way requires (ample) work. > 2. Presumably there is a client API that speaks RELP. Can it be > configured to return an error to the client if there is no ACK (i.e.
> if the log it sent did not make it into the configured safe location > which could be on a disk-based queue), or does it only retry? Where is > this API? The API is in librelp ( http://www.librelp.com/ ). But actually this is not what you are looking for. In rsyslog, an output module (here: omrelp) provides the status back to the caller. Then, configuration decides what happens. Messages may be discarded, sent to a different destination or retried. With omrelp, I think we have some hardcoded ways to preserve the message, but I have not yet had time to look this up in detail. In any case, RELP will not lose messages but may duplicate a few of them (within the current unacked window) if the remote peer simply dies. Again, this requires proper configuration of the rsyslog components. Even with that, you may lose messages if the local rsyslogd dies (not terminates, but dies for some unexpected reason, e.g. a segfault, kill -9 or whatever) but still has messages in a non-persisted queue. Again, this can be mitigated by proper configuration, but that must be designed. Also, it is very costly in terms of performance. A good reading on the subtleties can be found in the rsyslog mailing list archive (http://lists.adiscon.net/pipermail/rsyslog/2008-October/001224.html ). I suggest having a look at it. > > Certainly the TCP caching case you mention in your pages is one a user > is more likely to be able to reproduce, but that is all the more > reason for me to be concerned that the less-reproducible situations > that could cause a message to occasionally become lost are handled > correctly. I don't think app-abort is less reproducible: kill -9 `cat /var/run/rsyslog.pid` will do nicely. Actually, from feedback I received, many users seem to understand the implications of a program/system abort. But far fewer understand the issues inherent in TCP. Thus I am focusing so much on the latter. But of course, everything needs to be considered.
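As a rough illustration of the "proper configuration" mentioned above, here is a minimal sketch of a receiver whose main message queue lives on disk, written in the legacy directive syntax used elsewhere on this list. The port number, spool directory, queue file name and checkpoint interval are assumptions chosen for illustration only; none of them come from this thread, and the queue documentation linked above should be checked for the exact behaviour of each directive in your version.

```
# Sketch only: every value below is an assumption for illustration.
$ModLoad imrelp                      # RELP input module
$InputRELPServerRun 2514             # assumed listening port

$WorkDirectory /var/spool/rsyslog    # assumed directory for queue files
$MainMsgQueueType Disk               # keep queued messages on disk instead of in memory
$MainMsgQueueFileName mainq          # assumed base name for the queue files
$MainMsgQueueCheckpointInterval 1    # assumed: update on-disk queue state after every message
$MainMsgQueueSaveOnShutdown on       # keep pending messages across an orderly shutdown
```

With a setup along these lines the RELP ACK is sent after the message has been handed to a queue that is backed by disk rather than by main memory. It narrows the window for loss rather than closing it, and it costs considerable throughput.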
Read the thread about the reliable queue (really!). It goes to great lengths, but still does not offer a full solution. Getting things reliable (or secure) is very, very challenging and requires in-depth knowledge. So I am glad you asked and provided an opportunity for this to be written :) Rainer From rgerhards at hq.adiscon.com Thu Jan 15 13:00:37 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 13:00:37 +0100 Subject: [rsyslog] redundant message in log files In-Reply-To: <49638EAD.5080104@redhat.com> References: <49638EAD.5080104@redhat.com> Message-ID: <1232020837.22744.28.camel@localhost.localdomain> Thanks, this one is now finally corrected, too (still catching up with vacation mail ;)). Will release it as part of 3.21.10. Rainer On Tue, 2009-01-06 at 18:02 +0100, Tomas Heinrich wrote: > Hi, > > we've received a bug report [1] regarding a message that started to > appear in the log files. The bug first appeared in version 3.21.5. > This patch [2] should fix it.
> > Tomas > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=478612 > [2] http://pastebin.ca/1301001 > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From fjianella at gmail.com Thu Jan 15 15:45:53 2009 From: fjianella at gmail.com (Frank Ianella) Date: Thu, 15 Jan 2009 09:45:53 -0500 Subject: [rsyslog] uclibc compile failure Message-ID: <9f1ad2df0901150645u5cd90986k6b92a473beb73257@mail.gmail.com> hello all compiling stable and dev versions of rsyslog against uclibc-0.9.30 results in the following error: /home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:2995: undefined reference to `rpl_malloc' rsyslogd-syslogd.o: In function `legacyOptsEnq': /home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:1742: undefined reference to `rpl_malloc' rsyslogd-syslogd.o: In function `crunch_list': /home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:490: undefined reference to `rpl_malloc' /home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:502: undefined reference to `rpl_malloc' /home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:512: undefined reference to `rpl_malloc' rsyslogd-syslogd.o:/home/build/project/sources/rsyslog-3.21.9/tools/syslogd.c:1319: more undefined references to `rpl_malloc' follow ../runtime/.libs/librsyslog.a(librsyslog_la-wtp.o): In function `wtpStartWrkr': /home/build/project/sources/rsyslog-3.21.9/runtime/wtp.c:487: undefined reference to `pthread_yield' ../runtime/.libs/librsyslog.a(librsyslog_la-wtp.o): In function `wtpConstructFinalize': /home/build/project/sources/rsyslog-3.21.9/runtime/wtp.c:109: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-wti.o): In function `wtiSetDbgHdr': /home/build/project/sources/rsyslog-3.21.9/runtime/wti.c:456: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-wti.o): In function `wtiWorker': 
/home/build/project/sources/rsyslog-3.21.9/runtime/wti.c:370: undefined reference to `pthread_yield' ../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `queueAddLinkedList': /home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:528: undefined reference to `rpl_malloc' /home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:528: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `qConstructFixedArray': /home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:459: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `queueSetFilePrefix': /home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:2081: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-queue.o): In function `queueStart': /home/build/project/sources/rsyslog-3.21.9/runtime/queue.c:1794: undefined reference to `rpl_malloc' ../runtime/.libs/librsyslog.a(librsyslog_la-threads.o):/home/build/project/sources/rsyslog-3.21.9/runtime/../threads.c:60: more undefined references to `rpl_malloc' follow I recompiled uclibc with MALLOC_GLIBC_COMPAT=y but the result was the same. The only reference to this that I can find is in the rsyslog bug tracker but the patch listed there does not allow it to compile. Just wondering if anybody has a working patch or suggestion. TIA -Frank From danson at rackspace.com Thu Jan 15 18:28:57 2009 From: danson at rackspace.com (Daniel Anson) Date: Thu, 15 Jan 2009 11:28:57 -0600 Subject: [rsyslog] Baclogged files to disk are pretty slow Message-ID: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> I have been dealing with this problem for a few days now and perhaps I will be able to solicit some advice or help. Here is the issue. I have an rsyslog relay writing to a remote database server and caching to disk. 
The write to the database uses a MySQL stored procedure that can write about 4000 records per second. The rsyslog.conf parts are set up like so: $ModLoad immark $ModLoadd imudp $UDPServerAddress 172.16.12.138 $UDPServerRun 514 $ModLoad imtcp $ModLoad imuxsock $ModLoad imklog $ModLoad ommysql.so $template template1,"CALL SAT2_RSYSLOG_EVENT_INSERT('%timestamp:::date-mysql%', '%timegenerated:::date-mysql%', '%syslogfacility%', '%syslogpriority%', '%hostname%', '%syslogtag%', '%msg%')", sql $WorkDirectory /rsyslog/work $ActionQueueType LinkedList # use asynchronous processing $ActionQueueFileName dbq # set file name, also enables disk mode $ActionResumeRetryCount -1 # infinite retries on insert failure *.* >172.16.2.238,rsyslog,syslogwriter,topsecret;template1 If I turn off the database, in this case I turned it off for almost a day, it backlogs nearly a 1 GB worth of information. The problem is that it takes nearly 6 hours to catch back up from this. While catching up, it only uses about 1% of the proc. Bandwidth is not an issue as the fibre link is only about 50% saturated. Is there a way to force rsyslogd to consume more of the proc and move faster. I have placed a -20 nice value on the process in hopes that would help but it really has not. Is there a way to force rsyslogd to use a pool of MySQL connections or intiate a new connection each time a record is written? Daniel M. Anson Linux Systems Engineer Rackspace Managed Hosting danson at rackspace.com Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. 
If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated. From rgerhards at hq.adiscon.com Thu Jan 15 18:45:09 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 18:45:09 +0100 Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9A8@grfint2.intern.adiscon.com> Mhhh... with the current design, it submits messages individually to the database. I think what you experience is simply the turn-around from the database call (no other idea what it could be). It doesn't use more CPU because the database layer seems not to return any faster. There has been some discussion on batching multiple statements together, but this is non-trivial. I lost funding and things like this need a corporate sponsor now (they are not of importance for the non-commercial user field...). You could try to run the action on its own queue and with multiple workers. That could (could!) improve performance. But it is just a guess. Do you have any chance to see how long the query takes inside the SQL engine? Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Daniel Anson > Sent: Thursday, January 15, 2009 6:29 PM > To: rsyslog at lists.adiscon.com > Subject: [rsyslog] Baclogged files to disk are pretty slow > > I have been dealing with this problem for a few days now and perhaps I > will be able to solicit some advice or help. Here is the issue. I > have > an rsyslog relay writing to a remote database server and caching to > disk. 
The write to the database uses a MySQL stored procedure that can > write about 4000 records per second. The rsyslog.conf parts are set up > like so: > > $ModLoad immark > $ModLoadd imudp > $UDPServerAddress 172.16.12.138 > $UDPServerRun 514 > $ModLoad imtcp > $ModLoad imuxsock > $ModLoad imklog > $ModLoad ommysql.so > > $template template1,"CALL > SAT2_RSYSLOG_EVENT_INSERT('%timestamp:::date-mysql%', > '%timegenerated:::date-mysql%', '%syslogfacility%', '%syslogpriority%', > '%hostname%', '%syslogtag%', '%msg%')", sql > > $WorkDirectory /rsyslog/work > $ActionQueueType LinkedList # use asynchronous processing > $ActionQueueFileName dbq # set file name, also enables disk mode > $ActionResumeRetryCount -1 # infinite retries on insert failure > > *.* >172.16.2.238,rsyslog,syslogwriter,topsecret;template1 > > If I turn off the database, in this case I turned it off for almost a > day, it backlogs nearly a 1 GB worth of information. The problem is > that it takes nearly 6 hours to catch back up from this. While > catching > up, it only uses about 1% of the proc. Bandwidth is not an issue as > the > fibre link is only about 50% saturated. Is there a way to force > rsyslogd to consume more of the proc and move faster. I have placed a > -20 nice value on the process in hopes that would help but it really > has > not. Is there a way to force rsyslogd to use a pool of MySQL > connections or intiate a new connection each time a record is written? > > > Daniel M. Anson > Linux Systems Engineer > Rackspace Managed Hosting > danson at rackspace.com > > > > > Confidentiality Notice: This e-mail message (including any attached or > embedded documents) is intended for the exclusive and confidential use > of the > individual or entity to which this message is addressed, and unless > otherwise > expressly indicated, is confidential and privileged information of > Rackspace. > Any dissemination, distribution or copying of the enclosed material is > prohibited. 
> If you receive this transmission in error, please notify us immediately > by e-mail > at abuse at rackspace.com, and delete the original message. > Your cooperation is appreciated. > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From lorenzo at sancho.ccd.uniroma2.it Thu Jan 15 18:58:37 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Thu, 15 Jan 2009 18:58:37 +0100 (CET) Subject: [rsyslog] rsyslog still crashes Message-ID: I've just tried rsyslog again on my 8 core mail server, and got the very same crash from September/October. I've restarted the server under valgrind control, and all seems to be running well... A good 2009 to all! Yours, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done.
Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done. Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so Core was generated by `rsyslogd -c4'. Program terminated with signal 6, Aborted. 
[New process 22774] [New process 22776] [New process 22775] [New process 22773] [New process 22772] #0 0x00002b6037978ed5 in raise () from /lib/libc.so.6 (gdb) Thread 5 (process 22772): #0 0x00002b6037a0fce2 in select () from /lib/libc.so.6 #1 0x000000000040db53 in mainThread () at syslogd.c:2704 #2 0x000000000040ee56 in realMain (argc=, argv=) at syslogd.c:3631 #3 0x00002b60379651a6 in __libc_start_main () from /lib/libc.so.6 #4 0x000000000040a219 in _start () Thread 4 (process 22773): #0 0x00002b6037327fad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0 #1 0x0000000000432f5f in wtiWorker (pThis=0x685140) at wti.c:406 #2 0x000000000043172a in wtpWorker (arg=0x685140) at wtp.c:425 #3 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002b6037a165ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 3 (process 22775): #0 0x00002b6037a0fce2 in select () from /lib/libc.so.6 #1 0x00002b60380b59fd in runInput (pThrd=) at imuxsock.c:280 #2 0x00000000004436ff in thrdStarter (arg=0x6a5c80) at ../threads.c:139 #3 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002b6037a165ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 22776): #0 0x00002b603732a7db in read () from /lib/libpthread.so.0 #1 0x00002b60382ba1ef in klogLogKMsg () at linux.c:449 #2 0x00002b60382b9594 in runInput (pThrd=0x6aafc0) at imklog.c:224 #3 0x00000000004436ff in thrdStarter (arg=0x6aafc0) at ../threads.c:139 #4 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002b6037a165ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 22774): #0 0x00002b6037978ed5 in raise () from /lib/libc.so.6 #1 0x00002b603797a3f3 in abort () from /lib/libc.so.6 #2 0x0000000000423657 in sigsegvHdlr (signum=6) at debug.c:759 #3 #4 0x00002b6037978ed5 in raise () from /lib/libc.so.6 #5 0x00002b603797a3f3 in abort () from /lib/libc.so.6 #6 0x00002b6037971dc9 in __assert_fail () from /lib/libc.so.6 #7 0x000000000041ce78 in msgDestruct (ppThis=0x68ace8) at msg.c:330 #8 0x0000000000443036 in actionCallAction (pAction=0x68ac70, pMsg=0x6b2010) at ../action.c:774 #9 0x000000000040b2c7 in processMsgDoActions (pData=0x68ac70, pParam=0x41000e90) at syslogd.c:1140 #10 0x000000000041de78 in llExecFunc (pThis=0x68aae0, pFunc=0x40b270 , pParam=0x41000e90) at linkedlist.c:391 #11 0x000000000040add9 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183 #12 0x000000000043c4f7 in queueConsumerReg (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=) at queue.c:1598 #13 0x0000000000432fd0 in wtiWorker (pThis=0x6a3bb0) at wti.c:416 #14 0x000000000043172a in wtpWorker (arg=0x6a3bb0) at wtp.c:425 #15 0x00002b6037323fc7 in start_thread () from /lib/libpthread.so.0 #16 0x00002b6037a165ad in clone () from /lib/libc.so.6 #17 0x0000000000000000 in ?? () (gdb) quit From hks.private at gmail.com Thu Jan 15 19:44:45 2009 From: hks.private at gmail.com ((private) HKS) Date: Thu, 15 Jan 2009 13:44:45 -0500 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: On Thu, Jan 15, 2009 at 12:58 PM, Lorenzo M. Catucci wrote: > I've just tried again rsyslog on my 8 core mail server, and got the very > same crash from september/october. I've restarted the server under valgrind > control, and all seems to be running well... > > A good 2009 to all! > > Yours, > > lorenzo Version you're using? 
-HKS From aoz.syn at gmail.com Thu Jan 15 20:11:09 2009 From: aoz.syn at gmail.com (RB) Date: Thu, 15 Jan 2009 12:11:09 -0700 Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> Message-ID: <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> Many more measurements are needed before declaring a conclusive cause, but on the surface it seems that your bottleneck is not rsyslog or the sending server but the database itself. Comments below. On Thu, Jan 15, 2009 at 10:28, Daniel Anson wrote: > I have been dealing with this problem for a few days now and perhaps I > will be able to solicit some advice or help. Here is the issue. I have > an rsyslog relay writing to a remote database server and caching to > disk. The write to the database uses a MySQL stored procedure that can > write about 4000 records per second. The rsyslog.conf parts are set up Is that 4000 TPS burst or sustained speed? > If I turn off the database, in this case I turned it off for almost a > day, it backlogs nearly a 1 GB worth of information. The problem is Roughly how many records? > that it takes nearly 6 hours to catch back up from this. While catching > up, it only uses about 1% of the proc. Bandwidth is not an issue as the What's the processor and disk load look like on your MySQL server? > fibre link is only about 50% saturated. Is there a way to force Presuming 50% is your bps, what was your PPS? Depending on how large your average event/transaction are, you may never see 100% due to small packets. > not. Is there a way to force rsyslogd to use a pool of MySQL > connections or intiate a new connection each time a record is written? 
Rainer confirmed my suspicion that rsyslog executes a single transaction per event, which is (as he also notes) sub-optimal for performance. Batching really should be about the same logic as the MARK functionality: every N foo, output "bar". Multiple actions per transaction (batching) is a classic query tuning technique and can be approached in many ways, but you probably need to verify that your database I/O is indeed the bottleneck. From danson at rackspace.com Fri Jan 16 00:01:19 2009 From: danson at rackspace.com (Daniel Anson) Date: Thu, 15 Jan 2009 17:01:19 -0600 Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> Message-ID: <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> A few things about the MySQL server itself: I have eliminated bandwidth, processor speed, and disk I/O as potential bottlenecks. The obvious bottleneck is the MySQL server. For a temporary solution, I have placed an rsyslog relay on the MySQL server. So: Client_message -> local_datacenter_relay -> remote_datacenter_relay -> MySQL_server The messages are traveling much faster (kudos to the socket programming there) as the remote relay writes to a local MySQL server. I do not believe this to be an optimal solution. In an earlier email, Rainer wrote, and I quote: "You could try to run the action on its own queue and with multiple workers. That could (could!) improve performance. But it is just a guess. Do you have any chance to see how long the query takes inside the SQL engine?" MySQL will run about 4000 inserts per second (constant speed). I am willing to try what Rainer suggests; however, I am unsure how to direct specific actions to act on a queue. Any help is appreciated.
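As a rough, untested sketch of what Rainer's suggestion could look like in the legacy directive syntax used in this thread: as far as the rsyslog queue documentation describes it, the $ActionQueue... directives are placed immediately before the action they should apply to, which gives that one action its own queue, and they are reset once the action has been defined. The worker count of 4 below is an arbitrary assumption, template1 is assumed to be defined earlier in the file, and whether ommysql gains anything from several workers is, in Rainer's words, just a guess.

```
$ModLoad ommysql
$WorkDirectory /rsyslog/work

# The queue settings below apply only to the next action defined.
$ActionQueueType LinkedList      # asynchronous in-memory queue for this action
$ActionQueueFileName dbq         # file name prefix, also enables disk-assisted mode
$ActionQueueWorkerThreads 4      # assumed value: several workers draining this queue
$ActionResumeRetryCount -1       # retry indefinitely when the insert fails
*.* >172.16.2.238,rsyslog,syslogwriter,topsecret;template1
```

Measuring the insert rate on the MySQL side before and after such a change would show whether the extra workers help or merely add contention on the table.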
I know I could add the following two lines to create worker threads: $ActionQueueWorkerThreads 20 $MainMsgQueueWorkerThreads 20 Would I have to add additional lines to the config? My config once again looks like this: $ModLoad immark $ModLoad imudp $UDPServerAddress 172.16.12.138 $UDPServerRun 514 $ModLoad imtcp $ModLoad imuxsock $ModLoad imklog $ModLoad ommysql.so $template template1,"CALL SAT2_RSYSLOG_EVENT_INSERT('%timestamp:::date-mysql%', '%timegenerated:::date-mysql%', '%syslogfacility%', '%syslogpriority%', '%hostname%', '%syslogtag%', '%msg%')", sql $WorkDirectory /rsyslog/work $ActionQueueType LinkedList # use asynchronous processing $ActionQueueFileName dbq # set file name, also enables disk mode $ActionResumeRetryCount -1 # infinite retries on insert failure *.* >172.16.2.238,rsyslog,syslogwriter,topsecret;template1 I would hope that there is an easy solution as my next idea is to write some type of daemonized process that can insert messages from a pool of MySQL connections. I can achieve this in C but would rather hopefully find a solution inside of the configuration. Daniel M. Anson Linux Systems Engineer Rackspace Managed Hosting danson at rackspace.com
From mbiebl at gmail.com Fri Jan 16 01:20:22 2009 From: mbiebl at gmail.com (Michael Biebl) Date: Fri, 16 Jan 2009 01:20:22 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: 2009/1/15 (private) HKS : > On Thu, Jan 15, 2009 at 12:58 PM, Lorenzo M. Catucci > wrote: >> I've just tried again rsyslog on my 8 core mail server, and got the very >> same crash from september/october. I've restarted the server under valgrind >> control, and all seems to be running well... >> >> A good 2009 to all! >> >> Yours, >> >> lorenzo > > > Version you're using? Given the -c4 command line argument, I'd expect it to be 4.1.3. Sounds similar to http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=509292 (which is 3.18.6). It seems to be a more general problem with multi-core (= very fast??) systems. Cheers, Michael -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 01:37:14 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 01:37:14 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: On Thu, 15 Jan 2009, (private) HKS wrote: pH> pH> Version you're using? pH> git origin/master branch as of today. Sorry for forgetting to mention! +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Thu Jan 15 20:06:06 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 20:06:06 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: <1232046366.22744.34.camel@localhost.localdomain> On Fri, 2009-01-16 at 01:20 +0100, Michael Biebl wrote: > Given the -c4 command line argument, I'd expect it to be 4.1.3. > > Sounds familiar to > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=509292 (which is > 3.18.6). > > It seems to be a more general problem with multi core (= very fast??) systems. Yes, that is what my analysis so far points to. It's also part of the problem, because I do not have very fast hardware to reproduce the issue (and it is also not easy to reliably reproduce if you have...). I've gotten a couple of reports (I think most on the mailing list) on such problems and all they have in common is 4+ core machines. I'll try to get hold based on what Lorenzo submits. In his environment, the problem seems to occur most reliably (he probably has the fastest machine...). Lorenzo: details follow soon. Rainer From rgerhards at hq.adiscon.com Thu Jan 15 20:14:18 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 15 Jan 2009 20:14:18 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: Message-ID: <1232046859.22744.39.camel@localhost.localdomain> On Thu, 2009-01-15 at 18:58 +0100, Lorenzo M. Catucci wrote: > I've just tried again rsyslog on my 8 core mail server, and got the very > same crash from september/october. So, without valgrind, can you reproduce the issue each time you start it? That would be very useful. > I've restarted the server under > valgrind control, and all seems to be running well... I guess the issue here is that valgrind slows down things and also simulates (I think) 2 CPUs only. > A good 2009 to all! same to you! 
Thanks for being persistent with this issue (it is beginning to drive me crazy). From what I have learned so far we seem to have a race condition that causes memory corruption. The backtrace you include also points in that direction. Those few cases where I got a usable backtrace all point to the very same location. However, that does not mean this location has the bug. It seems to occur some time earlier, and manifests when the message is destructed. It could be a double-free or even some wild memory access that accidentally overwrites some structures. If we are able to get a stable repro, and we are able to run with at least some minimal diagnostics, we may be much better off tackling that beast. First step is to see that we get a stable repro. If we do, I need to think about minimal debug. The full debugging system makes the bug disappear, I think because it changes the timing. Rainer From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 12:28:59 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 12:28:59 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1232046859.22744.39.camel@localhost.localdomain> References: <1232046859.22744.39.camel@localhost.localdomain> Message-ID: On Thu, 15 Jan 2009, Rainer Gerhards wrote: RG> On Thu, 2009-01-15 at 18:58 +0100, Lorenzo M. Catucci wrote: RG> > I've just tried again rsyslog on my 8 core mail server, and got the very RG> > same crash from september/october. RG> RG> So, without valgrind, can you reproduce the issue each time you start RG> it? That would be very useful. RG> Yes: any time I start a free-running instance, I get the very same segmentation fault and core-file to backtrace. RG> RG> > I've restarted the server under RG> > valgrind control, and all seems to be running well... RG> RG> I guess the issue here is that valgrind slows down things and also RG> simulates (I think) 2 CPUs only.
RG> Right, I didn't know valgrind limited both the CPU bandwidth and the (v)CPU number, but either of them would hide the existing race condition. RG> RG> From what I have learned so far we seem to have a race condition that RG> causes memory corruption. The backtrace you include also points in that RG> direction. Those few cases where I got a usable backtrace all point to RG> the very same location. However, that does not mean this location has RG> the bug. It seems to occur some time earlier, and manifests when the RG> message is destructed. It could be a double-free or even some wild RG> memory access that accidentally overwrites some structures. RG> RG> If we are able to get a stable repro, and we are able to run with at RG> least some minimal diagnostics, we may be much better off tackling that RG> beast. RG> RG> First step is to see that we get a stable repro. If we do, I need to RG> think about minimal debug. The full debugging system makes the bug RG> disappear, I think because it changes the timing. RG> I don't think we could hope for a stable reproducer for a heisenbug... all I can provide is a very high throughput system generating a very high local message rate. As a matter of fact, this rsyslog instance is acting as a forwarder to a remote instance that didn't suffer any crash. The only differences between the engines' configurations are: 1. the remote logs to a postgres instance instead of spool files, 2. the remote runs only the postgresql instance and the logger My gut feeling is that the different behaviour doesn't come from any of these differences, but from the different memory path taken by the messages, which in the remote case are serialised from the underlying network transport. We'll see! Yours, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O.
Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 12:44:53 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 12:44:53 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 12:29 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Thu, 15 Jan 2009, Rainer Gerhards wrote: > > RG> On Thu, 2009-01-15 at 18:58 +0100, Lorenzo M. Catucci wrote: > RG> > I've just tried again rsyslog on my 8 core mail server, and got > the very > RG> > same crash from september/october. > RG> > RG> So, without valgrind, can you reproduce the issue each time you > start > RG> it? That would be very useful. > RG> > > Yes: any time I start a free-running instance, I get the very same > segmentation fault and core-file to backtrace. > > RG> > RG> > I've restarted the server under > RG> > valgrind control, and all seems to be running well... > RG> > RG> I guess the issue here is that valgrind slows down things and also > RG> simulates (I think) 2 CPUs only. > RG> > > Right, I didn't know valgrind both limited the CPU bandwidth and the > (v)CPU number, but any of them would hide the existing race condition Actually, valgrind executes the app in a virtual CPU/Memory environment. So this is *quite different* from the real machine, but nevertheless extremely useful in most cases. While in theory so the actual hardware should not affect the valgrind outcome, my former debugging has shown it does. Thus my first try is always valgrind. 
But it seems not to help here as we have seen... > RG> > RG> From what I have learned so far we seem to have a race condition > that > RG> causes memory corrupt. The backtrace you include also points into > that > RG> direction. Those few cases where I got a usable backtrace all point > to > RG> the very same location. However, that does not mean this location > has > RG> the bug. It seems to occur some time earlier, and manifests when > the > RG> message is destructed. It could be a double-free or even some wild > RG> memory access that accidently overwrites some structures. > RG> > RG> If we are able to get a stable repro, and we are able to run with > at > RG> least some minimal diagnostics, we may be much better of tackeling > that > RG> beast. > RG> > RG> First step is to see that we get a stable repro. If we do, I need > to > RG> think about minimal debug. The full debugging system makes the bug > RG> disappear, I think because it changes the timing. > RG> > > I don't think we could hope for a stable reproducer for an heisen- > bug... Of course not 100%. But what you have sounds good enough. I must now see that/how I can change the system so that we have some additional instrumentation while the bug is still there. I'll first look at some compile options. Is it OK for you if I just send some messages to stdout? > all I can provide is a very high throughput system generating a very > high > local message rate. As a matter of facts, this rsyslog instance is > acting as a forwader to a remote instance that didn't suffer any crash. > > The only differences between the engines' configurations are: > 1. the remote logs to a postgres instance instead of spool files, > 2. 
the remote does just run the postgresql instance and the logger > > My gut feeling is that the different behaviour doesn't come from any of > these differences, but from the different memory-path taken from the > messages, which in the remote case are serialised from the underlying > network transport. This may be... Rainer From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 13:01:47 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 13:01:47 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> RG> Of course not 100%. But what you have sounds good enough. I must now see RG> that/how I can change the system so that we have some additional RG> instrumentation while the bug is still there. I'll first look at some RG> compile options. Is it OK for you if I just send some messages to RG> stdout? RG> Yes, be it stdout... I'm eager to have an rsyslog instance running well, since I've really liked what I've seen (with the small exception of the crashes!) See you soon, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. 
+39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 15:22:02 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 15:22:02 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> Lorenzo, I have created a new branch "raceDebug" and done a first commit to it. The change is very lightweight. Please pull, compile as usual and give it a try. It spits out some info to stdout from time to time (hopefully). I am not sure if it aborts, depending on the output it may or may not. Even if we get messages, they are probably not enough to pinpoint the bug, but I wanted to do something very light to see if the bug stays. Feedback appreciated. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 1:02 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > RG> > RG> Of course not 100%. But what you have sounds good enough. I must > now see > RG> that/how I can change the system so that we have some additional > RG> instrumentation while the bug is still there. I'll first look at > some > RG> compile options. Is it OK for you if I just send some messages to > RG> stdout? > RG> > > Yes, be it stdout... I'm eager to have an rsyslog instance running > well, > since I've really liked what I've seen (with the small exception of the > crashes!) > > See you soon, > > lorenzo > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Universit? 
degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From pieter.thysebaert at intec.ugent.be Fri Jan 16 15:07:19 2009 From: pieter.thysebaert at intec.ugent.be (pieter.thysebaert at intec.ugent.be) Date: Fri, 16 Jan 2009 15:07:19 +0100 (CET) Subject: [rsyslog] (no subject) Message-ID: <56908.212.190.198.36.1232114839.squirrel@webserver6.intec.ugent.be> Hello, I've found on-line claims that rsyslog can be compiled (and maybe even runs ok?) on HP-UX. However, I've not found too much information about this, so I'd like to ask: has anyone been able to compile (and run) rsyslog 3.20.2 on HP-UX 11? If so, does it need patching? What packages are required to build it successfully? (only HP software or gcc + gnu tools?) I'm asking because a colleague briefly attempted to configure the package on hpux UX11.11, and configure ended with > checking for pthread.h... yes > checking for pthread_create in -lpthread... no Any success stories out there? Thanks! Pieter From aoz.syn at gmail.com Fri Jan 16 16:19:39 2009 From: aoz.syn at gmail.com (RB) Date: Fri, 16 Jan 2009 08:19:39 -0700 Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> Message-ID: <4255c2570901160719o4aa3bc6bk9813225374bfc53c@mail.gmail.com> On Thu, Jan 15, 2009 at 16:01, Daniel Anson wrote: > I would hope that there is an easy solution as my next idea is to write > some type of daemonized process that can insert messages from a pool of > MySQL connections. 
I can achieve this in C but would rather hopefully > find a solution inside of the configuration. Short of implementing the queue/worker configuration (no idea how), it seems the only current option would be to implement something of the sort, either by an update to the ommysql module (optimal, as it gets your code supported by someone else for its lifetime) or by some external program. I'd think an optimal external solution would be some sort of relp2mysql bridge, but suspect that would end up reimplementing a good chunk of rsyslog. From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 16:22:45 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 16:22:45 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> Lorenzo, RG> RG> I have created a new branch "raceDebug" and done a first commit to it. RG> The change is very lightweight. Please pull, compile as usual and give RG> it a try. It spits out some info to stdout from time to time RG> (hopefully). I am not sure if it aborts, depending on the output it RG> may or may not. Even if we get messages, they are probably not enough RG> to pinpoint the bug, but I wanted to do something very light to see if RG> the bug stays. RG> RG> Feedback appreciated. RG> Rainer, I've just checked out the branch; I've run configure with the following command line: ./configure --prefix=/usr --enable-mysql --enable-pgsql --enable-mail --enable-imfile --enable-debug --enable-rtinst --enable-valgrind --no-create --no-recursion From "git diff -r HEAD^ HEAD" I've seen an #if 0 section in the commit. Let me know if you'd prefer that I change it to #if 1.
I've just started rsyslogd with rsyslogd -c4 -n on a screen session, with the same configuration files I'm using since september. Since both the "rsyslogd -c4 -n" and the later "rsyslogd -c4 -d" invocation crashed very quickly, I've restarted it once more with stdout redirected to a a logfile, and now it's running. Will let you know if it crashes once more. Yours, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 16:33:04 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 16:33:04 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 4:23 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> Lorenzo, > RG> > RG> I have created a new branch "raceDebug" and done a first commit to > it. > RG> The change is very lightweight. Please pull, compile as usual and > give > RG> it a try. It spits out some info to stdout from time to time > RG> (hopefully). I am not sure if it aborts, depending on the output it > RG> may or may not. 
Even if we get messages, they are probably not > enough > RG> to pinpoint the bug, but I wanted to do something very light to see > if > RG> the bug stays. > RG> > RG> Feedback appreciated. > RG> > > Rainer, I've just checked-out the branch; I've run configure with the > following command line: > > ./configure --prefix=/usr --enable-mysql --enable-pgsql --enable-mail > --enable-imfile --enable-debug --enable-rtinst --enable-valgrind > --no-create --no-recursion > > From "git diff -r HEAD^ HEAD" I've seen an #if 0 section in the > commit. > Let me know if you'd prefer if I change it to #if 1. Mmmhh... you can use debug. Yes, please then change it to 1. > > I've just started rsyslogd with rsyslogd -c4 -n on a screen session, > with > the same configuration files I'm using since september. > > Since both the "rsyslogd -c4 -n" and the later "rsyslogd -c4 -d" > invocation crashed very quickly, I've restarted it once more with > stdout > redirected to a a logfile, and now it's running. Will let you know if > it > crashes once more. That sounds good. Do you happen to have the output from those crashes? Anyway, I will be interested in what it now comes up with. As a side-note, I have introduced another race by calling the library functions. There is always some good and bad. The regular debugging system prevents this problem by protecting the writes with mutexes. That, however, affects the timing and thus we do not see the real issue. So what I have done is bad, but may be useful. I forgot to mention that with my last post... Rainer > > Yours, > > lorenzo > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. 
+39 06 7259 2125 |
> +-------------------------+----------------------------------------------+

From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 17:07:30 2009
From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci)
Date: Fri, 16 Jan 2009 17:07:30 +0100 (CET)
Subject: [rsyslog] rsyslog still crashes
In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com>
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com>
Message-ID:

On Fri, 16 Jan 2009, Rainer Gerhards wrote:

RG>
RG> That sounds good. Do you happen to have the output from those crashes?
RG>

The -n crash was completely silent; the -d run was chatty (as expected); with stdout redirected, it took a lot more time to crash, but here are both the logfile and the gdb backtrace.

Yours,

lorenzo

+-------------------------+----------------------------------------------+
| Lorenzo M. Catucci      | Centro di Calcolo e Documentazione           |
| catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
|                         | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY  |
| Tel. +39 06 7259 2255   | Fax. +39 06 7259 2125                        |
+-------------------------+----------------------------------------------+

-------------- next part --------------
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done.
done.
Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done. Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so Core was generated by `rsyslogd -c4 -n'. Program terminated with signal 11, Segmentation fault. 
[New process 19309]
[New process 19311]
[New process 19310]
[New process 19308]
[New process 19307]
#0  0x000000000041cb79 in msgDestruct (ppThis=0x68ae18) at msg.c:354
354     if(strcmp((char*)(((obj_t*)pThis)->pObjInfo->pszID), "msg")) {
(gdb)

Thread 5 (process 19307):
#0  0x00002af4d1020ce2 in select () from /lib/libc.so.6
#1  0x000000000040db93 in mainThread () at syslogd.c:2704
#2  0x000000000040ee96 in realMain (argc=<value optimized out>, argv=<value optimized out>) at syslogd.c:3631
#3  0x00002af4d0f761a6 in __libc_start_main () from /lib/libc.so.6
#4  0x000000000040a259 in _start ()

Thread 4 (process 19308):
#0  0x00002af4d0938fad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x0000000000432f9f in wtiWorker (pThis=0x685270) at wti.c:406
#2  0x000000000043176a in wtpWorker (arg=0x685270) at wtp.c:425
#3  0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0
#4  0x00002af4d10275ad in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 3 (process 19310):
#0  0x00002af4d1020ce2 in select () from /lib/libc.so.6
#1  0x00002af4d16c69fd in runInput (pThrd=<value optimized out>) at imuxsock.c:280
#2  0x000000000044373f in thrdStarter (arg=0x6a5db0) at ../threads.c:139
#3  0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0
#4  0x00002af4d10275ad in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 2 (process 19311):
#0  0x00002af4d093b7db in read () from /lib/libpthread.so.0
#1  0x00002af4d18cb1ef in klogLogKMsg () at linux.c:449
#2  0x00002af4d18ca594 in runInput (pThrd=0x6a9020) at imklog.c:224
#3  0x000000000044373f in thrdStarter (arg=0x6a9020) at ../threads.c:139
#4  0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0
#5  0x00002af4d10275ad in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 1 (process 19309):
#0  0x000000000041cb79 in msgDestruct (ppThis=0x68ae18) at msg.c:354
#1  0x0000000000443076 in actionCallAction (pAction=0x68ada0, pMsg=0x2aaaac0008c0) at ../action.c:774
#2  0x000000000040b307 in processMsgDoActions (pData=0x68ada0, pParam=0x41000e90) at syslogd.c:1140
#3  0x000000000041deb8 in llExecFunc (pThis=0x68ac10, pFunc=0x40b2b0 <processMsgDoActions>, pParam=0x41000e90) at linkedlist.c:391
#4  0x000000000040ae19 in msgConsumer (notNeeded=<value optimized out>, pUsr=<value optimized out>) at syslogd.c:1183
#5  0x000000000043c537 in queueConsumerReg (pThis=0x690050, pWti=0x6a3ce0, iCancelStateSave=<value optimized out>) at queue.c:1598
#6  0x0000000000433010 in wtiWorker (pThis=0x6a3ce0) at wti.c:416
#7  0x000000000043176a in wtpWorker (arg=0x6a3ce0) at wtp.c:425
#8  0x00002af4d0934fc7 in start_thread () from /lib/libpthread.so.0
#9  0x00002af4d10275ad in clone () from /lib/libc.so.6
#10 0x0000000000000000 in ?? ()
(gdb) quit

-------------- next part --------------
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done.
done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done.
done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done.
done.
Loaded symbols for /lib/librt.so.1
Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done.
done.
Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done. Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so Core was generated by `rsyslogd -c4 -d'. Program terminated with signal 11, Segmentation fault. 
[New process 20676]
[New process 20678]
[New process 20677]
[New process 20675]
[New process 20674]
#0  0x000000000041cb79 in msgDestruct (ppThis=0x68ace8) at msg.c:354
354     if(strcmp((char*)(((obj_t*)pThis)->pObjInfo->pszID), "msg")) {
(gdb)

Thread 5 (process 20674):
#0  0x00002ab1af573ce2 in select () from /lib/libc.so.6
#1  0x000000000040db93 in mainThread () at syslogd.c:2704
#2  0x000000000040ee96 in realMain (argc=<value optimized out>, argv=<value optimized out>) at syslogd.c:3631
#3  0x00002ab1af4c91a6 in __libc_start_main () from /lib/libc.so.6
#4  0x000000000040a259 in _start ()

Thread 4 (process 20675):
#0  0x00002ab1aee8bfad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x0000000000432f9f in wtiWorker (pThis=0x685140) at wti.c:406
#2  0x000000000043176a in wtpWorker (arg=0x685140) at wtp.c:425
#3  0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0
#4  0x00002ab1af57a5ad in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 3 (process 20677):
#0  0x00002ab1af573ce2 in select () from /lib/libc.so.6
#1  0x00002ab1afc199fd in runInput (pThrd=<value optimized out>) at imuxsock.c:280
#2  0x000000000044373f in thrdStarter (arg=0x6a5c80) at ../threads.c:139
#3  0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0
#4  0x00002ab1af57a5ad in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 2 (process 20678):
#0  0x00002ab1aee8e7db in read () from /lib/libpthread.so.0
#1  0x00002ab1afe1e1ef in klogLogKMsg () at linux.c:449
#2  0x00002ab1afe1d594 in runInput (pThrd=0x6a8ef0) at imklog.c:224
#3  0x000000000044373f in thrdStarter (arg=0x6a8ef0) at ../threads.c:139
#4  0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0
#5  0x00002ab1af57a5ad in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 1 (process 20676):
#0  0x000000000041cb79 in msgDestruct (ppThis=0x68ace8) at msg.c:354
#1  0x0000000000443076 in actionCallAction (pAction=0x68ac70, pMsg=0x6aee30) at ../action.c:774
#2  0x000000000040b307 in processMsgDoActions (pData=0x68ac70, pParam=0x41000e90) at syslogd.c:1140
#3  0x000000000041deb8 in llExecFunc (pThis=0x68aae0, pFunc=0x40b2b0 <processMsgDoActions>, pParam=0x41000e90) at linkedlist.c:391
#4  0x000000000040ae19 in msgConsumer (notNeeded=<value optimized out>, pUsr=<value optimized out>) at syslogd.c:1183
#5  0x000000000043c537 in queueConsumerReg (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=<value optimized out>) at queue.c:1598
#6  0x0000000000433010 in wtiWorker (pThis=0x6a3bb0) at wti.c:416
#7  0x000000000043176a in wtpWorker (arg=0x6a3bb0) at wtp.c:425
#8  0x00002ab1aee87fc7 in start_thread () from /lib/libpthread.so.0
#9  0x00002ab1af57a5ad in clone () from /lib/libc.so.6
#10 0x0000000000000000 in ?? ()
(gdb) quit

-------------- next part --------------
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done.
done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done.
done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done.
done.
Loaded symbols for /lib/librt.so.1
Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done.
done.
Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Reading symbols from /usr/lib/rsyslog/lmnsd_ptcp.so...done. Loaded symbols for /usr/lib/rsyslog/lmnsd_ptcp.so Core was generated by `rsyslogd -c4 -d'. Program terminated with signal 6, Aborted. 
[New process 21096]
[New process 21098]
[New process 21097]
[New process 21095]
[New process 21094]
#0  0x00002ac0a65dded5 in raise () from /lib/libc.so.6
(gdb)

Thread 5 (process 21094):
#0  0x00002ac0a6674ce2 in select () from /lib/libc.so.6
#1  0x000000000040db93 in mainThread () at syslogd.c:2704
#2  0x000000000040ee96 in realMain (argc=<value optimized out>, argv=<value optimized out>) at syslogd.c:3631
#3  0x00002ac0a65ca1a6 in __libc_start_main () from /lib/libc.so.6
#4  0x000000000040a259 in _start ()

Thread 4 (process 21095):
#0  0x00002ac0a5f8cfad in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x0000000000432f9f in wtiWorker (pThis=0x685140) at wti.c:406
#2  0x000000000043176a in wtpWorker (arg=0x685140) at wtp.c:425
#3  0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0
#4  0x00002ac0a667b5ad in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 3 (process 21097):
#0  0x00002ac0a6674ce2 in select () from /lib/libc.so.6
#1  0x00002ac0a6d1a9fd in runInput (pThrd=<value optimized out>) at imuxsock.c:280
#2  0x000000000044373f in thrdStarter (arg=0x6a5c80) at ../threads.c:139
#3  0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0
#4  0x00002ac0a667b5ad in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 2 (process 21098):
#0  0x00002ac0a5f8f7db in read () from /lib/libpthread.so.0
#1  0x00002ac0a6f1f1ef in klogLogKMsg () at linux.c:449
#2  0x00002ac0a6f1e594 in runInput (pThrd=0x6a8ef0) at imklog.c:224
#3  0x000000000044373f in thrdStarter (arg=0x6a8ef0) at ../threads.c:139
#4  0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0
#5  0x00002ac0a667b5ad in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 1 (process 21096):
#0  0x00002ac0a65dded5 in raise () from /lib/libc.so.6
#1  0x00002ac0a65df3f3 in abort () from /lib/libc.so.6
#2  0x0000000000423697 in sigsegvHdlr (signum=6) at debug.c:759
#3  <signal handler called>
#4  0x00002ac0a65dded5 in raise () from /lib/libc.so.6
#5  0x00002ac0a65df3f3 in abort () from /lib/libc.so.6
#6  0x00002ac0a65d6dc9 in __assert_fail () from /lib/libc.so.6
#7  0x000000000043a4be in queueChkDiscardMsg (pThis=0x68ff20, iQueueSize=0, bRunsDA=0, pUsr=0x2aaaac002e30) at queue.c:1393
#8  0x000000000043bde3 in queueDequeueConsumable (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=0) at queue.c:1478
#9  0x000000000043c4f1 in queueConsumerReg (pThis=0x68ff20, pWti=0x6a3bb0, iCancelStateSave=0) at queue.c:1597
#10 0x0000000000433010 in wtiWorker (pThis=0x6a3bb0) at wti.c:416
#11 0x000000000043176a in wtpWorker (arg=0x6a3bb0) at wtp.c:425
#12 0x00002ac0a5f88fc7 in start_thread () from /lib/libpthread.so.0
#13 0x00002ac0a667b5ad in clone () from /lib/libc.so.6
#14 0x0000000000000000 in ?? ()
(gdb) quit

From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 17:10:29 2009
From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci)
Date: Fri, 16 Jan 2009 17:10:29 +0100 (CET)
Subject: [rsyslog] rsyslog still crashes
In-Reply-To:
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com>
Message-ID:

On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote:

LMC>
LMC> The -n crash was completely silent; the -d run was chatty (as expected);
LMC> with stdout redirected, it took a lot more time to crash, but here are
LMC> both the logfile and the gdb backtrace.
LMC>

As for the last crash, I found this line in the screen session:

rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed.
since I forgot to redirect stderr too.

Yours,

lorenzo

+-------------------------+----------------------------------------------+
| Lorenzo M. Catucci      | Centro di Calcolo e Documentazione           |
| catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
|                         | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY  |
| Tel. +39 06 7259 2255   | Fax. +39 06 7259 2125                        |
+-------------------------+----------------------------------------------+

From rgerhards at hq.adiscon.com Fri Jan 16 17:17:25 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Fri, 16 Jan 2009 17:17:25 +0100
Subject: [rsyslog] rsyslog still crashes
In-Reply-To:
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C7@grfint2.intern.adiscon.com>

OK, this, together with the other crashes, is evidence that something runs really wild and overwrites memory blocks. The reason this message did not appear earlier is that I disabled the check in DestroyMsg() and let it return even though I then know memory is corrupted. So what you see here is a follow-up error.

The good news, I think, is that it looks (though I may be fooled) as if the issue occurs in close temporal proximity to the abort. That would be really good news.

Let me think a bit about the situation; I'll probably come up with another instrumentation. The issue is that I'd potentially need to output one or even two log lines per message, and that creates other sync issues. Plus, I don't know whether that would overrun your disk (depending on the workload, which seems to be quite high).

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Lorenzo M.
Catucci
> Sent: Friday, January 16, 2009 5:10 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslog still crashes
>
> On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote:
>
> LMC>
> LMC> The -n crash was completely silent; the -d run was chatty (as expected);
> LMC> with stdout redirected, it took a lot more time to crash, but here are
> LMC> both the logfile and the gdb backtrace.
> LMC>
>
> As for the last crash, I found on the screen session the line:
>
> rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed.
>
> since I forgot redirecting stderr too.
>
> Yours,
>
> lorenzo
>
> +-------------------------+----------------------------------------------+
> | Lorenzo M. Catucci      | Centro di Calcolo e Documentazione           |
> | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
> |                         | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY  |
> | Tel. +39 06 7259 2255   | Fax. +39 06 7259 2125                        |
> +-------------------------+----------------------------------------------+

From rgerhards at hq.adiscon.com Fri Jan 16 17:19:34 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Fri, 16 Jan 2009 17:19:34 +0100
Subject: [rsyslog] rsyslog still crashes
In-Reply-To:
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com>

Lorenzo,

one thing: can you change the action queue mode ($ActionQueueType) to "direct", just for a short period? I would be very interested to see what happens.

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Lorenzo M.
Catucci
> Sent: Friday, January 16, 2009 5:10 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslog still crashes
>
> On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote:
>
> LMC>
> LMC> The -n crash was completely silent; the -d run was chatty (as expected);
> LMC> with stdout redirected, it took a lot more time to crash, but here are
> LMC> both the logfile and the gdb backtrace.
> LMC>
>
> As for the last crash, I found on the screen session the line:
>
> rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed.
>
> since I forgot redirecting stderr too.
>
> Yours,
>
> lorenzo
>
> +-------------------------+----------------------------------------------+
> | Lorenzo M. Catucci      | Centro di Calcolo e Documentazione           |
> | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
> |                         | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY  |
> | Tel. +39 06 7259 2255   | Fax. +39 06 7259 2125                        |
> +-------------------------+----------------------------------------------+

From rgerhards at hq.adiscon.com Fri Jan 16 17:47:02 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Fri, 16 Jan 2009 17:47:02 +0100
Subject: [rsyslog] rsyslog still crashes
In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com>
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com>

Lorenzo and others:

I got a system today on which I can hopefully reproduce the problem. I am setting it up right now.
I have also written a stub wiki page with information useful for hunting this bug:

http://wiki.rsyslog.com/index.php/V3_Race_Condition_Hunt_Page

Lorenzo, can you please double-check that I have indeed used the right config?

All others: if you can add scenarios or information, please do.

I'll try to reproduce the problem as soon as the system is ready. Hope it will work...

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
> Sent: Friday, January 16, 2009 5:20 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslog still crashes
>
> Lorenzo,
>
> one thing: can you change the actionqueuemode to "direct" just for a
> short period. I would be very interested to see what happens.
>
> Rainer
>
> > -----Original Message-----
> > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci
> > Sent: Friday, January 16, 2009 5:10 PM
> > To: rsyslog-users
> > Subject: Re: [rsyslog] rsyslog still crashes
> >
> > On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote:
> >
> > LMC>
> > LMC> The -n crash was completely silent; the -d run was chatty (as expected);
> > LMC> with stdout redirected, it took a lot more time to crash, but here are
> > LMC> both the logfile and the gdb backtrace.
> > LMC>
> >
> > As for the last crash, I found on the screen session the line:
> >
> > rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed.
> >
> > since I forgot redirecting stderr too.
> >
> > Yours,
> >
> > lorenzo
> >
> > +-------------------------+----------------------------------------------+
> > | Lorenzo M. Catucci      | Centro di Calcolo e Documentazione           |
> > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
> > |                         | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY  |
> > | Tel. +39 06 7259 2255   | Fax.
+39 06 7259 2125 |
> > +-------------------------+----------------------------------------------+
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com

From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 17:52:28 2009
From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci)
Date: Fri, 16 Jan 2009 17:52:28 +0100 (CET)
Subject: [rsyslog] rsyslog still crashes
In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com>
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com>
Message-ID:

On Fri, 16 Jan 2009, Rainer Gerhards wrote:

RG> Lorenzo,
RG>
RG> one thing: can you change the actionqueuemode to "direct" just for a
RG> short period. I would be very interested to see what happens.
RG>

Very short period... it crashed almost as soon as it started... I'm enclosing both the log and the backtrace.

See you soon,

lorenzo

+-------------------------+----------------------------------------------+
| Lorenzo M. Catucci      | Centro di Calcolo e Documentazione           |
| catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
|                         | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY  |
| Tel. +39 06 7259 2255   | Fax. +39 06 7259 2125                        |
+-------------------------+----------------------------------------------+

-------------- next part --------------
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. 
Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so
Core was generated by `rsyslogd -c4 -d'.
Program terminated with signal 11, Segmentation fault.
[New process 27339]
[New process 27341]
[New process 27340]
[New process 27338]
#0  0x00002b034900f030 in strlen () from /lib/libc.so.6
(gdb)

Thread 4 (process 27338):
#0  0x00002b03489774c5 in __lll_unlock_wake () from /lib/libpthread.so.0
#1  0x00002b0348973ff9 in _L_unlock_56 () from /lib/libpthread.so.0
#2  0x00002b0348973c56 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0
#3  0x0000000000422a09 in dbgprint (pObj=<value optimized out>, pszMsg=0x7fff6256dea0 " X ", lenMsg=3) at debug.c:157
#4  0x0000000000422c33 in dbgprintf (fmt=<value optimized out>) at debug.c:892
#5  0x000000000040d522 in init () at syslogd.c:2207
#6  0x000000000040da79 in mainThread () at syslogd.c:2954
#7  0x000000000040ee96 in realMain (argc=<value optimized out>, argv=<value optimized out>) at syslogd.c:3631
#8  0x00002b0348fb21a6 in __libc_start_main () from /lib/libc.so.6
#9  0x000000000040a259 in _start ()

Thread 3 (process 27340):
#0  0x00002b034905cce2 in select () from /lib/libc.so.6
#1  0x00002b03497029fd in runInput (pThrd=<value optimized out>) at imuxsock.c:280
#2  0x000000000044377f in thrdStarter (arg=0x6a3b90) at ../threads.c:139
#3  0x00002b0348970fc7 in start_thread () from /lib/libpthread.so.0
#4  0x00002b03490635ad in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 2 (process 27341):
#0  0x00002b03489777db in read () from /lib/libpthread.so.0
#1  0x00002b03499071ef in klogLogKMsg () at linux.c:449
#2  0x00002b0349906594 in runInput (pThrd=0x6a6b90) at imklog.c:224
#3  0x000000000044377f in thrdStarter (arg=0x6a6b90) at ../threads.c:139
#4  0x00002b0348970fc7 in start_thread () from /lib/libpthread.so.0
#5  0x00002b03490635ad in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 1 (process 27339):
#0  0x00002b034900f030 in strlen () from /lib/libc.so.6
#1  0x00002b0348fdbcb1 in vfprintf () from /lib/libc.so.6
#2  0x00002b0348fe1c08 in fprintf () from /lib/libc.so.6
#3  0x000000000041ce7d in msgDestruct (ppThis=<value optimized out>) at msg.c:350
#4  0x000000000044283a in actionCallDoAction (pAction=0x6856d0, pMsg=0x6a41f0) at ../action.c:495
#5  0x0000000000439c9c in qAddDirect (pThis=0x6857e0, pUsr=0x6a41f0) at queue.c:939
#6  0x000000000043dd83 in queueEnqObj (pThis=0x6857e0, flowCtlType=<value optimized out>, pUsr=0x6a41f0) at queue.c:1016
#7  0x0000000000442d63 in actionWriteToAction (pAction=0x6856d0) at ../action.c:672
#8  0x00000000004430d0 in actionCallAction (pAction=0x6856d0, pMsg=0x6a41f0) at ../action.c:778
#9  0x000000000040b307 in processMsgDoActions (pData=0x6856d0, pParam=0x407ffe90) at syslogd.c:1140
#10 0x000000000041def8 in llExecFunc (pThis=0x685540, pFunc=0x40b2b0 <processMsgDoActions>, pParam=0x407ffe90) at linkedlist.c:391
#11 0x000000000040ae19 in msgConsumer (notNeeded=<value optimized out>, pUsr=<value optimized out>) at syslogd.c:1183
#12 0x000000000043c577 in queueConsumerReg (pThis=0x68cc80, pWti=0x6a1030, iCancelStateSave=<value optimized out>) at queue.c:1598
#13 0x0000000000433050 in wtiWorker (pThis=0x6a1030) at wti.c:416
#14 0x00000000004317aa in wtpWorker (arg=0x6a1030) at wtp.c:425
#15 0x00002b0348970fc7 in start_thread () from /lib/libpthread.so.0
#16 0x00002b03490635ad in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()
(gdb) quit

-------------- next part --------------
4037.620068405:main thread: Writing pidfile /var/run/rsyslogd.pid.
4037.620491470:main thread: rsyslog 4.1.3 - called init()
4037.620502795:main thread: Unloading non-static modules.
4037.620513481:main thread: module lmnet NOT unloaded because it still has a refcount of 3
4037.620522445:main thread: Clearing templates.
4037.620569724:main thread: cfline: '$ModLoad imuxsock # provides support for local system logging' 4037.620585477:main thread: Requested to load module 'imuxsock' 4037.620596298:main thread: loading module '/usr/lib/rsyslog/imuxsock.so' 4037.620662954:main thread: imuxsock version 4.1.3 initializing 4037.620699263:main thread: module of type 0 being loaded. 4037.620712772:main thread: cfline: '$ModLoad imklog # provides kernel logging support (previously done by rklogd)' 4037.620724718:main thread: Requested to load module 'imklog' 4037.620733972:main thread: loading module '/usr/lib/rsyslog/imklog.so' 4037.620847557:main thread: module of type 0 being loaded. 4037.620864846:main thread: cfline: '$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat' 4037.620884928:main thread: cfline: '$FileOwner root' 4037.621151637:main thread: uid 0 obtained for user 'root' 4037.621164483:main thread: cfline: '$FileGroup adm' 4037.621221737:main thread: gid 4 obtained for group 'adm' 4037.621233731:main thread: cfline: '$FileCreateMode 0640' 4037.621247204:main thread: cfline: '$IncludeConfig /etc/rsyslog.d/*.conf' 4037.621306972:main thread: requested to include config file '/etc/rsyslog.d/remote.conf' 4037.621334470:main thread: cfline: '$WorkDirectory /var/log/rsyslog' 4037.621352254:main thread: cfline: '$ActionQueueType Direct # use synchronous processing' 4037.621692792:main thread: action queue type set to DIRECT (no queueing at all) 4037.621705098:main thread: cfline: '$ActionQueueFileName srvrfwd # set file name, also enables disk mode' 4037.621720665:main thread: cfline: '$ActionResumeRetryCount -1 # infinite retries on insert failure' 4037.621734291:main thread: cfline: '$ActionQueueSaveOnShutdown on # save in-memory data if rsyslog shuts down' 4037.621748715:main thread: cfline: 'mail.* @@xx.yy.zz.tt:514' 4037.621761573:main thread: - traditional PRI filter 4037.621771329:main thread: symbolic name: * ==> 255 4037.621783748:main thread: symbolic name: mail ==> 
16 4037.621800473:main thread: tried selector action for builtin-file: -2001 4037.621816553:main thread: caller requested object 'netstrms', not found (iRet -3003) 4037.621829132:main thread: Requested to load module 'lmnetstrms' 4037.621839089:main thread: loading module '/usr/lib/rsyslog/lmnetstrms.so' 4037.621919155:main thread: module of type 2 being loaded. 4037.621932301:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 1 4037.621945375:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 2 4037.621960807:main thread: caller requested object 'tcpclt', not found (iRet -3003) 4037.621970535:main thread: Requested to load module 'lmtcpclt' 4037.621979727:main thread: loading module '/usr/lib/rsyslog/lmtcpclt.so' 4037.622039220:main thread: module of type 2 being loaded. 4037.622051937:main thread: source file omfwd.c requested reference for module 'lmtcpclt', reference count now 1 4037.622064386:main thread: hostname 'xx.yy.zz.tt', port '514' 4037.622084093:main thread: tried selector action for builtin-fwd: 0 4037.622095973:main thread: Module builtin-fwd processed this config line. 4037.622111045:main thread: template: 'RSYSLOG_TraditionalForwardFormat' assigned 4037.622134550:main thread: action 1 queue: save on shutdown 1, max disk space allowed 0 4037.622153957:main thread: action 1 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622166394:main thread: Action 0x6838c0: queue 0x683d60 created 4037.622179432:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended on' 4037.622192407:main thread: cfline: '& /data/var_syslog/failover.log' 4037.622218048:main thread: tried selector action for builtin-file: 0 4037.622239084:main thread: Module builtin-file processed this config line. 
4037.622249944:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622264904:main thread: action 2 queue: save on shutdown 1, max disk space allowed 0 4037.622278185:main thread: action 2 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622289525:main thread: Action 0x684b30: queue 0x684d70 created 4037.622300676:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended off' 4037.622315313:main thread: selector line successfully processed 4037.622335353:main thread: cfline: 'auth,authpriv.* /var/log/auth.log' 4037.622346713:main thread: - traditional PRI filter 4037.622355695:main thread: symbolic name: * ==> 255 4037.622367074:main thread: symbolic name: auth ==> 32 4037.622378090:main thread: symbolic name: authpriv ==> 80 4037.622399801:main thread: tried selector action for builtin-file: 0 4037.622409569:main thread: Module builtin-file processed this config line. 4037.622419853:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622431973:main thread: action 3 queue: save on shutdown 1, max disk space allowed 0 4037.622445019:main thread: action 3 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622456983:main thread: Action 0x685160: queue 0x685220 created 4037.622467966:main thread: cfline: '*.*;auth,authpriv.none -/var/log/syslog' 4037.622477221:main thread: selector line successfully processed 4037.622486077:main thread: - traditional PRI filter 4037.622494606:main thread: symbolic name: * ==> 255 4037.622506225:main thread: symbolic name: none ==> 16 4037.622517007:main thread: symbolic name: auth ==> 32 4037.622527927:main thread: symbolic name: authpriv ==> 80 4037.622547618:main thread: tried selector action for builtin-file: 0 4037.622557092:main thread: Module builtin-file processed this config line. 
4037.622567055:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622578953:main thread: action 4 queue: save on shutdown 1, max disk space allowed 0 4037.622591601:main thread: action 4 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622603373:main thread: Action 0x6856d0: queue 0x6857e0 created 4037.622614425:main thread: cfline: 'daemon.* -/var/log/daemon.log' 4037.622623611:main thread: selector line successfully processed 4037.622632946:main thread: - traditional PRI filter 4037.622641538:main thread: symbolic name: * ==> 255 4037.622652635:main thread: symbolic name: daemon ==> 24 4037.622672048:main thread: tried selector action for builtin-file: 0 4037.622681333:main thread: Module builtin-file processed this config line. 4037.622690864:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622704736:main thread: action 5 queue: save on shutdown 1, max disk space allowed 0 4037.622718299:main thread: action 5 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622730175:main thread: Action 0x685cb0: queue 0x685dc0 created 4037.622740990:main thread: cfline: 'kern.* -/var/log/kern.log' 4037.622749924:main thread: selector line successfully processed 4037.622759053:main thread: - traditional PRI filter 4037.622767804:main thread: symbolic name: * ==> 255 4037.622779282:main thread: symbolic name: kern ==> 0 4037.622799130:main thread: tried selector action for builtin-file: 0 4037.622808619:main thread: Module builtin-file processed this config line. 
4037.622818753:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622830206:main thread: action 6 queue: save on shutdown 1, max disk space allowed 0 4037.622842911:main thread: action 6 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622854803:main thread: Action 0x686290: queue 0x6863a0 created 4037.622865624:main thread: cfline: 'lpr.* -/var/log/lpr.log' 4037.622874702:main thread: selector line successfully processed 4037.622883912:main thread: - traditional PRI filter 4037.622904459:main thread: symbolic name: * ==> 255 4037.622915496:main thread: symbolic name: lpr ==> 48 4037.622935076:main thread: tried selector action for builtin-file: 0 4037.622944394:main thread: Module builtin-file processed this config line. 4037.622953982:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.622965406:main thread: action 7 queue: save on shutdown 1, max disk space allowed 0 4037.622978123:main thread: action 7 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.622989985:main thread: Action 0x686870: queue 0x686980 created 4037.623000683:main thread: cfline: 'mail.* -/var/log/mail.log' 4037.623009707:main thread: selector line successfully processed 4037.623018565:main thread: - traditional PRI filter 4037.623027088:main thread: symbolic name: * ==> 255 4037.623038884:main thread: symbolic name: mail ==> 16 4037.623058105:main thread: tried selector action for builtin-file: 0 4037.623067588:main thread: Module builtin-file processed this config line. 
4037.623077685:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623093423:main thread: action 8 queue: save on shutdown 1, max disk space allowed 0 4037.623107052:main thread: action 8 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623118908:main thread: Action 0x686e50: queue 0x686f60 created 4037.623129726:main thread: cfline: 'user.* -/var/log/user.log' 4037.623138774:main thread: selector line successfully processed 4037.623147684:main thread: - traditional PRI filter 4037.623156198:main thread: symbolic name: * ==> 255 4037.623167187:main thread: symbolic name: user ==> 8 4037.623186686:main thread: tried selector action for builtin-file: 0 4037.623196019:main thread: Module builtin-file processed this config line. 4037.623205766:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623217211:main thread: action 9 queue: save on shutdown 1, max disk space allowed 0 4037.623229541:main thread: action 9 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623240500:main thread: Action 0x6873f0: queue 0x687500 created 4037.623252272:main thread: cfline: 'mail.info -/var/log/mail.info' 4037.623261136:main thread: selector line successfully processed 4037.623269866:main thread: - traditional PRI filter 4037.623278671:main thread: symbolic name: info ==> 6 4037.623289546:main thread: symbolic name: mail ==> 16 4037.623308401:main thread: tried selector action for builtin-file: 0 4037.623317689:main thread: Module builtin-file processed this config line. 
4037.623327277:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623338569:main thread: action 10 queue: save on shutdown 1, max disk space allowed 0 4037.623351333:main thread: action 10 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623362865:main thread: Action 0x6879d0: queue 0x687ae0 created 4037.623373563:main thread: cfline: 'mail.warn -/var/log/mail.warn' 4037.623382608:main thread: selector line successfully processed 4037.623391311:main thread: - traditional PRI filter 4037.623399873:main thread: symbolic name: warn ==> 4 4037.623410589:main thread: symbolic name: mail ==> 16 4037.623429414:main thread: tried selector action for builtin-file: 0 4037.623438681:main thread: Module builtin-file processed this config line. 4037.623451643:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623463664:main thread: action 11 queue: save on shutdown 1, max disk space allowed 0 4037.623476036:main thread: action 11 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623486893:main thread: Action 0x687fb0: queue 0x6880c0 created 4037.623497465:main thread: cfline: 'mail.err /var/log/mail.err' 4037.623506468:main thread: selector line successfully processed 4037.623515453:main thread: - traditional PRI filter 4037.623523865:main thread: symbolic name: err ==> 3 4037.623545812:main thread: symbolic name: mail ==> 16 4037.623566230:main thread: tried selector action for builtin-file: 0 4037.623575947:main thread: Module builtin-file processed this config line. 
4037.623585871:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623597019:main thread: action 12 queue: save on shutdown 1, max disk space allowed 0 4037.623609634:main thread: action 12 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623621292:main thread: Action 0x688590: queue 0x6886a0 created 4037.623632775:main thread: cfline: 'news.crit /var/log/news/news.crit' 4037.623642228:main thread: selector line successfully processed 4037.623651312:main thread: - traditional PRI filter 4037.623660168:main thread: symbolic name: crit ==> 2 4037.623671004:main thread: symbolic name: news ==> 56 4037.623692517:main thread: tried selector action for builtin-file: 0 4037.623701901:main thread: Module builtin-file processed this config line. 4037.623711765:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623723191:main thread: action 13 queue: save on shutdown 1, max disk space allowed 0 4037.623735872:main thread: action 13 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623747536:main thread: Action 0x688b70: queue 0x688c80 created 4037.623758651:main thread: cfline: 'news.err /var/log/news/news.err' 4037.623767741:main thread: selector line successfully processed 4037.623776690:main thread: - traditional PRI filter 4037.623785240:main thread: symbolic name: err ==> 3 4037.623796478:main thread: symbolic name: news ==> 56 4037.623819517:main thread: tried selector action for builtin-file: 0 4037.623829048:main thread: Module builtin-file processed this config line. 
4037.623838879:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623850438:main thread: action 14 queue: save on shutdown 1, max disk space allowed 0 4037.623862924:main thread: action 14 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623873871:main thread: Action 0x689150: queue 0x689260 created 4037.623884569:main thread: cfline: 'news.notice -/var/log/news/news.notice' 4037.623893560:main thread: selector line successfully processed 4037.623902664:main thread: - traditional PRI filter 4037.623911415:main thread: symbolic name: notice ==> 5 4037.623922467:main thread: symbolic name: news ==> 56 4037.623942264:main thread: tried selector action for builtin-file: 0 4037.623951402:main thread: Module builtin-file processed this config line. 4037.623961122:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.623972360:main thread: action 15 queue: save on shutdown 1, max disk space allowed 0 4037.623985014:main thread: action 15 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.623996926:main thread: Action 0x689730: queue 0x689840 created 4037.624009085:main thread: cfline: '*.=debug;auth,authpriv.none;news.none;mail.none -/var/log/debug' 4037.624018460:main thread: selector line successfully processed 4037.624027550:main thread: - traditional PRI filter 4037.624036394:main thread: symbolic name: debug ==> 7 4037.624047617:main thread: symbolic name: none ==> 16 4037.624058183:main thread: symbolic name: auth ==> 32 4037.624069187:main thread: symbolic name: authpriv ==> 80 4037.624080178:main thread: symbolic name: none ==> 16 4037.624090699:main thread: symbolic name: news ==> 56 4037.624101499:main thread: symbolic name: none ==> 16 4037.624112416:main thread: symbolic name: mail ==> 16 4037.624131976:main thread: tried selector action for builtin-file: 0 4037.624141360:main thread: Module builtin-file processed this config line. 
4037.624151527:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.624166254:main thread: action 16 queue: save on shutdown 1, max disk space allowed 0 4037.624179048:main thread: action 16 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624203996:main thread: Action 0x689d10: queue 0x689e20 created 4037.624216560:main thread: cfline: '*.=info;*.=notice;*.=warn;auth,authpriv.none;cron,daemon.none;mail,news.none -/var/log/messages' 4037.624225710:main thread: selector line successfully processed 4037.624234992:main thread: - traditional PRI filter 4037.624243941:main thread: symbolic name: info ==> 6 4037.624255317:main thread: symbolic name: notice ==> 5 4037.624266620:main thread: symbolic name: warn ==> 4 4037.624277663:main thread: symbolic name: none ==> 16 4037.624288730:main thread: symbolic name: auth ==> 32 4037.624299497:main thread: symbolic name: authpriv ==> 80 4037.624310429:main thread: symbolic name: none ==> 16 4037.624321088:main thread: symbolic name: cron ==> 72 4037.624331828:main thread: symbolic name: daemon ==> 24 4037.624342664:main thread: symbolic name: none ==> 16 4037.624353199:main thread: symbolic name: mail ==> 16 4037.624363960:main thread: symbolic name: news ==> 56 4037.624383361:main thread: tried selector action for builtin-file: 0 4037.624392931:main thread: Module builtin-file processed this config line. 
4037.624402870:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.624414390:main thread: action 17 queue: save on shutdown 1, max disk space allowed 0 4037.624427209:main thread: action 17 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624438942:main thread: Action 0x68a2f0: queue 0x68a400 created 4037.624450554:main thread: cfline: '*.emerg *' 4037.624459350:main thread: selector line successfully processed 4037.624468485:main thread: - traditional PRI filter 4037.624477275:main thread: symbolic name: emerg ==> 0 4037.624489113:main thread: tried selector action for builtin-file: -2001 4037.624498587:main thread: tried selector action for builtin-fwd: -2001 4037.624509258:main thread: tried selector action for builtin-shell: -2001 4037.624519854:main thread: tried selector action for builtin-discard: -2001 4037.624531161:main thread: write-alltried selector action for builtin-usrmsg: 0 4037.624543715:main thread: Module builtin-usrmsg processed this config line. 
4037.624553426:main thread: template: ' WallFmt' assigned 4037.624568261:main thread: action 18 queue: save on shutdown 1, max disk space allowed 0 4037.624581266:main thread: action 18 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624592975:main thread: Action 0x68ad40: queue 0x68af50 created 4037.624608143:main thread: cfline: 'daemon.*;mail.*;news.err;*.=debug;*.=info;*.=notice;*.=warn |/dev/xconsole' 4037.624617917:main thread: selector line successfully processed 4037.624627063:main thread: - traditional PRI filter 4037.624635829:main thread: symbolic name: * ==> 255 4037.624646719:main thread: symbolic name: daemon ==> 24 4037.624657687:main thread: symbolic name: * ==> 255 4037.624668442:main thread: symbolic name: mail ==> 16 4037.624679359:main thread: symbolic name: err ==> 3 4037.624689994:main thread: symbolic name: news ==> 56 4037.624700698:main thread: symbolic name: debug ==> 7 4037.624711852:main thread: symbolic name: info ==> 6 4037.624722777:main thread: symbolic name: notice ==> 5 4037.624733886:main thread: symbolic name: warn ==> 4 4037.624753131:main thread: Error opening log file: /dev/xconsole 4037.624764081:main thread: Called LogError, msg: /dev/xconsole rsyslogd: /dev/xconsole 4037.624834841:main thread: tried selector action for builtin-file: 0 4037.624844138:main thread: Module builtin-file processed this config line. 
4037.624854248:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 4037.624866050:main thread: action 19 queue: save on shutdown 1, max disk space allowed 0 4037.624878512:main thread: action 19 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624889537:main thread: Action 0x68c870: queue 0x68c980 created 4037.624901089:main thread: selector line successfully processed 4037.624925545:main thread: main queue: is NOT disk-assisted 4037.624949380:main thread: main queue: type 0, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 4037.624967146:main thread: main queue:Reg: finalizing construction of worker thread pool 4037.624985371:main thread: main queue:Reg/w0: finalizing construction of worker instance data 4037.624994322:main thread: main queue: queue starts up without (loading) any DA disk state (this is normal for the DA queue itself!) 4037.625008485:main thread: main queue:Reg: high activity - starting 1 additional worker thread(s). 4037.625021502:main thread: main queue:Reg/w0: receiving command 2 4037.625062410:main thread: main queue:Reg: started with state 0, num workers now 1 4037.625097359:main thread: Main processing queue is initialized and running 4037.625132246:main thread: Opened UNIX socket '/dev/log' (fd 3). 
4037.625198155:main thread: main queue: entry added, size now 1 entries 4037.625212867:main thread: wtpAdviseMaxWorkers signals busy 4037.625224705:main thread: main queue: EnqueueMsg advised worker start 4037.625241685:40800950: main queue:Reg/w0: receiving command 4 4037.625272671:imuxsock.c: --------imuxsock calling select, active file descriptors (max 3): 3 4037.625309667:main thread: Active selectors: 4037.625319477:main thread: Selector 1: 4037.625327307:main thread: X X FF X X X X X X X X X X X X X X X X X X X X X X Actions: 4037.625400575:main thread: builtin-fwd: Instance data: 0x680d20 4037.625426870:main thread: RepeatedMsgReduction: 0 4037.625435459:main thread: Resume Interval: 30 4037.625443472:main thread: Suspended: 0 4037.625454034:main thread: Disabled: 0 4037.625462161:main thread: Exec only when previous is suspended: 0 4037.625470180:main thread: 4037.625477854:main thread: 4037.625486236:main thread: builtin-file: Instance data: 0x684870 4037.625499685:main thread: RepeatedMsgReduction: 0 4037.625508049:main thread: Resume Interval: 30 4037.625516113:main thread: Suspended: 0 4037.625526223:main thread: Disabled: 0 4037.625534110:main thread: Exec only when previous is suspended: 1 4037.625542227:main thread: 4037.625549973:main thread: 4037.625558091:main thread: 4037.625565903:main thread: Selector 2: 4037.625573421:main thread: X X X X FF X X X X X FF X X X 4037.625647001:main queue:Reg/w0: main queue: entry deleted, state 0, size now 0 entries 4037.625668214:main queue:Reg/w0: Called action, logging to builtin-file 4037.625702210:main queue:Reg/w0: (/var/log/syslog) From rgerhards at hq.adiscon.com Fri Jan 16 17:54:29 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 17:54:29 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: 
<1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> OK, maybe we can simplify the config, that would remove code paths from the potential bug candidate list. Could you comment out all the $ActionQueue* settings? Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 5:52 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> Lorenzo, > RG> > RG> one thing: can you change the actionqueuemode to "direct" just for > a > RG> short period. I would be very interested to see what happens. > RG> > > Very short period... it crashed about as soon as started... I'm > enclosing > both the log and the backtrace. > > See you soon, > > lorenzo > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 18:07:50 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M.
Catucci) Date: Fri, 16 Jan 2009 18:07:50 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> OK, maybe we can simplify the config, that would remove code paths RG> from the potential bug candidate list. Could you comment out all the RG> $ActionQueue* settings? RG> Done, it's still crashing immediately! Here are the logs. lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | +-------------------------+----------------------------------------------+ -------------- next part -------------- GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/libpthread-2.7.so...done. done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libdl.so.2...Reading symbols from /usr/lib/debug/lib/libdl-2.7.so...done. done.
Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/librt.so.1...Reading symbols from /usr/lib/debug/lib/librt-2.7.so...done. done. Loaded symbols for /lib/librt.so.1 Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.7.so...done. done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib/rsyslog/lmnet.so...done. Loaded symbols for /usr/lib/rsyslog/lmnet.so Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.7.so...done. done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /usr/lib/rsyslog/imuxsock.so...done. Loaded symbols for /usr/lib/rsyslog/imuxsock.so Reading symbols from /usr/lib/rsyslog/imklog.so...done. Loaded symbols for /usr/lib/rsyslog/imklog.so Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.7.so...done. done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.7.so...done. done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.7.so...done. done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /usr/lib/rsyslog/lmnetstrms.so...done. Loaded symbols for /usr/lib/rsyslog/lmnetstrms.so Reading symbols from /usr/lib/rsyslog/lmtcpclt.so...done. Loaded symbols for /usr/lib/rsyslog/lmtcpclt.so Core was generated by `rsyslogd -c4 -d'. Program terminated with signal 11, Segmentation fault. 
[New process 4397] [New process 4399] [New process 4398] [New process 4396] #0 0x00002ac177165030 in strlen () from /lib/libc.so.6 (gdb) Thread 4 (process 4396): #0 0x00002ac176acd4c5 in __lll_unlock_wake () from /lib/libpthread.so.0 #1 0x00002ac176ac9ff9 in _L_unlock_56 () from /lib/libpthread.so.0 #2 0x00002ac176ac9c56 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0 #3 0x0000000000422a09 in dbgprint (pObj=, pszMsg=0x7fff34417d50 "FF ", lenMsg=3) at debug.c:157 #4 0x0000000000422c33 in dbgprintf (fmt=) at debug.c:892 #5 0x000000000040d549 in init () at syslogd.c:2209 #6 0x000000000040da79 in mainThread () at syslogd.c:2954 #7 0x000000000040ee96 in realMain (argc=, argv=) at syslogd.c:3631 #8 0x00002ac1771081a6 in __libc_start_main () from /lib/libc.so.6 #9 0x000000000040a259 in _start () Thread 3 (process 4398): #0 0x00002ac1771b2ce2 in select () from /lib/libc.so.6 #1 0x00002ac1778589fd in runInput (pThrd=) at imuxsock.c:280 #2 0x000000000044377f in thrdStarter (arg=0x6a3a30) at ../threads.c:139 #3 0x00002ac176ac6fc7 in start_thread () from /lib/libpthread.so.0 #4 0x00002ac1771b95ad in clone () from /lib/libc.so.6 #5 0x0000000000000000 in ?? () Thread 2 (process 4399): #0 0x00002ac176acd7db in read () from /lib/libpthread.so.0 #1 0x00002ac177a5d1ef in klogLogKMsg () at linux.c:449 #2 0x00002ac177a5c594 in runInput (pThrd=0x6a6a30) at imklog.c:224 #3 0x000000000044377f in thrdStarter (arg=0x6a6a30) at ../threads.c:139 #4 0x00002ac176ac6fc7 in start_thread () from /lib/libpthread.so.0 #5 0x00002ac1771b95ad in clone () from /lib/libc.so.6 #6 0x0000000000000000 in ?? 
() Thread 1 (process 4397): #0 0x00002ac177165030 in strlen () from /lib/libc.so.6 #1 0x00002ac177131cb1 in vfprintf () from /lib/libc.so.6 #2 0x00002ac177137c08 in fprintf () from /lib/libc.so.6 #3 0x000000000041ce7d in msgDestruct (ppThis=) at msg.c:350 #4 0x000000000044283a in actionCallDoAction (pAction=0x685570, pMsg=0x6a4090) at ../action.c:495 #5 0x0000000000439c9c in qAddDirect (pThis=0x685680, pUsr=0x6a4090) at queue.c:939 #6 0x000000000043dd83 in queueEnqObj (pThis=0x685680, flowCtlType=, pUsr=0x6a4090) at queue.c:1016 #7 0x0000000000442d63 in actionWriteToAction (pAction=0x685570) at ../action.c:672 #8 0x00000000004430d0 in actionCallAction (pAction=0x685570, pMsg=0x6a4090) at ../action.c:778 #9 0x000000000040b307 in processMsgDoActions (pData=0x685570, pParam=0x407ffe90) at syslogd.c:1140 #10 0x000000000041def8 in llExecFunc (pThis=0x6853e0, pFunc=0x40b2b0 , pParam=0x407ffe90) at linkedlist.c:391 #11 0x000000000040ae19 in msgConsumer (notNeeded=, pUsr=) at syslogd.c:1183 #12 0x000000000043c577 in queueConsumerReg (pThis=0x68cb20, pWti=0x6a0ed0, iCancelStateSave=) at queue.c:1598 #13 0x0000000000433050 in wtiWorker (pThis=0x6a0ed0) at wti.c:416 #14 0x00000000004317aa in wtpWorker (arg=0x6a0ed0) at wtp.c:425 #15 0x00002ac176ac6fc7 in start_thread () from /lib/libpthread.so.0 #16 0x00002ac1771b95ad in clone () from /lib/libc.so.6 #17 0x0000000000000000 in ?? () (gdb) quit -------------- next part -------------- 5437.595245610:main thread: Writing pidfile /var/run/rsyslogd.pid. 5437.595686368:main thread: rsyslog 4.1.3 - called init() 5437.595698050:main thread: Unloading non-static modules. 5437.595709554:main thread: module lmnet NOT unloaded because it still has a refcount of 3 5437.595719067:main thread: Clearing templates. 
5437.595771624:main thread: cfline: '$ModLoad imuxsock # provides support for local system logging' 5437.595788522:main thread: Requested to load module 'imuxsock' 5437.595799718:main thread: loading module '/usr/lib/rsyslog/imuxsock.so' 5437.595870056:main thread: imuxsock version 4.1.3 initializing 5437.595908971:main thread: module of type 0 being loaded. 5437.595923470:main thread: cfline: '$ModLoad imklog # provides kernel logging support (previously done by rklogd)' 5437.595935908:main thread: Requested to load module 'imklog' 5437.595945421:main thread: loading module '/usr/lib/rsyslog/imklog.so' 5437.596063430:main thread: module of type 0 being loaded. 5437.596081982:main thread: cfline: '$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat' 5437.596103387:main thread: cfline: '$FileOwner root' 5437.596366370:main thread: uid 0 obtained for user 'root' 5437.596380167:main thread: cfline: '$FileGroup adm' 5437.596439758:main thread: gid 4 obtained for group 'adm' 5437.596452445:main thread: cfline: '$FileCreateMode 0640' 5437.596466524:main thread: cfline: '$IncludeConfig /etc/rsyslog.d/*.conf' 5437.596530495:main thread: requested to include config file '/etc/rsyslog.d/remote.conf' 5437.596560987:main thread: cfline: '$WorkDirectory /var/log/rsyslog' 5437.596580414:main thread: cfline: '$ActionResumeRetryCount -1 # infinite retries on insert failure' 5437.596596212:main thread: cfline: 'mail.* @@xx.yy.zz.tt:514' 5437.596612292:main thread: - traditional PRI filter 5437.596622579:main thread: symbolic name: * ==> 255 5437.596635854:main thread: symbolic name: mail ==> 16 5437.596652432:main thread: tried selector action for builtin-file: -2001 5437.596668871:main thread: caller requested object 'netstrms', not found (iRet -3003) 5437.596678996:main thread: Requested to load module 'lmnetstrms' 5437.596688740:main thread: loading module '/usr/lib/rsyslog/lmnetstrms.so' 5437.596773657:main thread: module of type 2 being loaded. 
5437.596787910:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 1 5437.596801209:main thread: source file omfwd.c requested reference for module 'lmnetstrms', reference count now 2 5437.596819848:main thread: caller requested object 'tcpclt', not found (iRet -3003) 5437.596830324:main thread: Requested to load module 'lmtcpclt' 5437.596839704:main thread: loading module '/usr/lib/rsyslog/lmtcpclt.so' 5437.596905755:main thread: module of type 2 being loaded. 5437.596919522:main thread: source file omfwd.c requested reference for module 'lmtcpclt', reference count now 1 5437.596932436:main thread: hostname 'xx.yy.zz.tt', port '514' 5437.596953352:main thread: tried selector action for builtin-fwd: 0 5437.596966354:main thread: Module builtin-fwd processed this config line. 5437.596982080:main thread: template: 'RSYSLOG_TraditionalForwardFormat' assigned 5437.597007211:main thread: action 1 queue: save on shutdown 1, max disk space allowed 0 5437.597027685:main thread: action 1 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597040903:main thread: Action 0x683630: queue 0x683ad0 created 5437.597054232:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended on' 5437.597069310:main thread: cfline: '& /data/var_syslog/failover.log' 5437.597096292:main thread: tried selector action for builtin-file: 0 5437.597106887:main thread: Module builtin-file processed this config line. 
5437.597117030:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597129750:main thread: action 2 queue: save on shutdown 1, max disk space allowed 0 5437.597143076:main thread: action 2 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597154860:main thread: Action 0x6849d0: queue 0x684c10 created 5437.597184145:main thread: cfline: '$ActionExecOnlyWhenPreviousIsSuspended off' 5437.597199760:main thread: selector line successfully processed 5437.597220784:main thread: cfline: 'auth,authpriv.* /var/log/auth.log' 5437.597232670:main thread: - traditional PRI filter 5437.597241793:main thread: symbolic name: * ==> 255 5437.597253664:main thread: symbolic name: auth ==> 32 5437.597265178:main thread: symbolic name: authpriv ==> 80 5437.597288145:main thread: tried selector action for builtin-file: 0 5437.597298717:main thread: Module builtin-file processed this config line. 5437.597311914:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597324910:main thread: action 3 queue: save on shutdown 1, max disk space allowed 0 5437.597338377:main thread: action 3 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597350305:main thread: Action 0x685000: queue 0x6850c0 created 5437.597361812:main thread: cfline: '*.*;auth,authpriv.none -/var/log/syslog' 5437.597371760:main thread: selector line successfully processed 5437.597381303:main thread: - traditional PRI filter 5437.597390459:main thread: symbolic name: * ==> 255 5437.597402201:main thread: symbolic name: none ==> 16 5437.597413358:main thread: symbolic name: auth ==> 32 5437.597424743:main thread: symbolic name: authpriv ==> 80 5437.597445059:main thread: tried selector action for builtin-file: 0 5437.597455309:main thread: Module builtin-file processed this config line. 
5437.597465506:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597477596:main thread: action 4 queue: save on shutdown 1, max disk space allowed 0 5437.597490499:main thread: action 4 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597501863:main thread: Action 0x685570: queue 0x685680 created 5437.597513116:main thread: cfline: 'daemon.* -/var/log/daemon.log' 5437.597522704:main thread: selector line successfully processed 5437.597532007:main thread: - traditional PRI filter 5437.597540904:main thread: symbolic name: * ==> 255 5437.597552373:main thread: symbolic name: daemon ==> 24 5437.597573067:main thread: tried selector action for builtin-file: 0 5437.597583540:main thread: Module builtin-file processed this config line. 5437.597593506:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597605173:main thread: action 5 queue: save on shutdown 1, max disk space allowed 0 5437.597618478:main thread: action 5 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597630567:main thread: Action 0x685b50: queue 0x685c60 created 5437.597641754:main thread: cfline: 'kern.* -/var/log/kern.log' 5437.597651414:main thread: selector line successfully processed 5437.597660795:main thread: - traditional PRI filter 5437.597669852:main thread: symbolic name: * ==> 255 5437.597681123:main thread: symbolic name: kern ==> 0 5437.597705051:main thread: tried selector action for builtin-file: 0 5437.597715490:main thread: Module builtin-file processed this config line. 
5437.597725735:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597737798:main thread: action 6 queue: save on shutdown 1, max disk space allowed 0 5437.597751004:main thread: action 6 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597762995:main thread: Action 0x686130: queue 0x686240 created 5437.597774134:main thread: cfline: 'lpr.* -/var/log/lpr.log' 5437.597783830:main thread: selector line successfully processed 5437.597793046:main thread: - traditional PRI filter 5437.597801811:main thread: symbolic name: * ==> 255 5437.597813298:main thread: symbolic name: lpr ==> 48 5437.597833524:main thread: tried selector action for builtin-file: 0 5437.597843772:main thread: Module builtin-file processed this config line. 5437.597853705:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.597865321:main thread: action 7 queue: save on shutdown 1, max disk space allowed 0 5437.597890979:main thread: action 7 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.597903456:main thread: Action 0x686710: queue 0x686820 created 5437.597914697:main thread: cfline: 'mail.* -/var/log/mail.log' 5437.597924591:main thread: selector line successfully processed 5437.597934092:main thread: - traditional PRI filter 5437.597943242:main thread: symbolic name: * ==> 255 5437.597954096:main thread: symbolic name: mail ==> 16 5437.597974738:main thread: tried selector action for builtin-file: 0 5437.597985043:main thread: Module builtin-file processed this config line. 
5437.597995450:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598012859:main thread: action 8 queue: save on shutdown 1, max disk space allowed 0 5437.598027103:main thread: action 8 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598039193:main thread: Action 0x686cf0: queue 0x686e00 created 5437.598050464:main thread: cfline: 'user.* -/var/log/user.log' 5437.598059877:main thread: selector line successfully processed 5437.598069162:main thread: - traditional PRI filter 5437.598078222:main thread: symbolic name: * ==> 255 5437.598089760:main thread: symbolic name: user ==> 8 5437.598110994:main thread: tried selector action for builtin-file: 0 5437.598121194:main thread: Module builtin-file processed this config line. 5437.598130959:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598142863:main thread: action 9 queue: save on shutdown 1, max disk space allowed 0 5437.598156515:main thread: action 9 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598168587:main thread: Action 0x687290: queue 0x6873a0 created 5437.598180692:main thread: cfline: 'mail.info -/var/log/mail.info' 5437.598190523:main thread: selector line successfully processed 5437.598199946:main thread: - traditional PRI filter 5437.598208868:main thread: symbolic name: info ==> 6 5437.598220223:main thread: symbolic name: mail ==> 16 5437.598240955:main thread: tried selector action for builtin-file: 0 5437.598251116:main thread: Module builtin-file processed this config line. 
5437.598261157:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598272995:main thread: action 10 queue: save on shutdown 1, max disk space allowed 0 5437.598286279:main thread: action 10 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598298537:main thread: Action 0x687870: queue 0x687980 created 5437.598309727:main thread: cfline: 'mail.warn -/var/log/mail.warn' 5437.598319450:main thread: selector line successfully processed 5437.598329097:main thread: - traditional PRI filter 5437.598338166:main thread: symbolic name: warn ==> 4 5437.598349602:main thread: symbolic name: mail ==> 16 5437.598369906:main thread: tried selector action for builtin-file: 0 5437.598379983:main thread: Module builtin-file processed this config line. 5437.598389949:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598405150:main thread: action 11 queue: save on shutdown 1, max disk space allowed 0 5437.598419093:main thread: action 11 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598430433:main thread: Action 0x687e50: queue 0x687f60 created 5437.598441455:main thread: cfline: 'mail.err /var/log/mail.err' 5437.598450704:main thread: selector line successfully processed 5437.598459923:main thread: - traditional PRI filter 5437.598468857:main thread: symbolic name: err ==> 3 5437.598480887:main thread: symbolic name: mail ==> 16 5437.598501595:main thread: tried selector action for builtin-file: 0 5437.598515449:main thread: Module builtin-file processed this config line. 
5437.598525751:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598537496:main thread: action 12 queue: save on shutdown 1, max disk space allowed 0 5437.598550707:main thread: action 12 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598573321:main thread: Action 0x688430: queue 0x688540 created 5437.598585363:main thread: cfline: 'news.crit /var/log/news/news.crit' 5437.598595176:main thread: selector line successfully processed 5437.598604833:main thread: - traditional PRI filter 5437.598613572:main thread: symbolic name: crit ==> 2 5437.598624768:main thread: symbolic name: news ==> 56 5437.598647705:main thread: tried selector action for builtin-file: 0 5437.598657971:main thread: Module builtin-file processed this config line. 5437.598668150:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598680177:main thread: action 13 queue: save on shutdown 1, max disk space allowed 0 5437.598693176:main thread: action 13 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598705332:main thread: Action 0x688a10: queue 0x688b20 created 5437.598716744:main thread: cfline: 'news.err /var/log/news/news.err' 5437.598726596:main thread: selector line successfully processed 5437.598736043:main thread: - traditional PRI filter 5437.598744979:main thread: symbolic name: err ==> 3 5437.598756160:main thread: symbolic name: news ==> 56 5437.598777286:main thread: tried selector action for builtin-file: 0 5437.598787129:main thread: Module builtin-file processed this config line. 
5437.598800314:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598812803:main thread: action 14 queue: save on shutdown 1, max disk space allowed 0 5437.598826177:main thread: action 14 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598837804:main thread: Action 0x688ff0: queue 0x689100 created 5437.598849081:main thread: cfline: 'news.notice -/var/log/news/news.notice' 5437.598858618:main thread: selector line successfully processed 5437.598867741:main thread: - traditional PRI filter 5437.598876750:main thread: symbolic name: notice ==> 5 5437.598888111:main thread: symbolic name: news ==> 56 5437.598908859:main thread: tried selector action for builtin-file: 0 5437.598919188:main thread: Module builtin-file processed this config line. 5437.598929240:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.598940865:main thread: action 15 queue: save on shutdown 1, max disk space allowed 0 5437.598953981:main thread: action 15 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.598966119:main thread: Action 0x6895d0: queue 0x6896e0 created 5437.598978587:main thread: cfline: '*.=debug;auth,authpriv.none;news.none;mail.none -/var/log/debug' 5437.598988430:main thread: selector line successfully processed 5437.598997799:main thread: - traditional PRI filter 5437.599006913:main thread: symbolic name: debug ==> 7 5437.599018781:main thread: symbolic name: none ==> 16 5437.599030057:main thread: symbolic name: auth ==> 32 5437.599041136:main thread: symbolic name: authpriv ==> 80 5437.599052494:main thread: symbolic name: none ==> 16 5437.599063705:main thread: symbolic name: news ==> 56 5437.599075069:main thread: symbolic name: none ==> 16 5437.599086205:main thread: symbolic name: mail ==> 16 5437.599107133:main thread: tried selector action for builtin-file: 0 5437.599117174:main thread: Module builtin-file processed this config line. 
5437.599127409:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.599139610:main thread: action 16 queue: save on shutdown 1, max disk space allowed 0 5437.599152729:main thread: action 16 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599164081:main thread: Action 0x689bb0: queue 0x689cc0 created 5437.599176261:main thread: cfline: '*.=info;*.=notice;*.=warn;auth,authpriv.none;cron,daemon.none;mail,news.none -/var/log/messages' 5437.599185849:main thread: selector line successfully processed 5437.599198395:main thread: - traditional PRI filter 5437.599207875:main thread: symbolic name: info ==> 6 5437.599231109:main thread: symbolic name: notice ==> 5 5437.599243598:main thread: symbolic name: warn ==> 4 5437.599255067:main thread: symbolic name: none ==> 16 5437.599266446:main thread: symbolic name: auth ==> 32 5437.599277561:main thread: symbolic name: authpriv ==> 80 5437.599294223:main thread: symbolic name: none ==> 16 5437.599305491:main thread: symbolic name: cron ==> 72 5437.599316587:main thread: symbolic name: daemon ==> 24 5437.599327972:main thread: symbolic name: none ==> 16 5437.599338829:main thread: symbolic name: mail ==> 16 5437.599349656:main thread: symbolic name: news ==> 56 5437.599370203:main thread: tried selector action for builtin-file: 0 5437.599380253:main thread: Module builtin-file processed this config line. 
5437.599390312:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.599402081:main thread: action 17 queue: save on shutdown 1, max disk space allowed 0 5437.599414977:main thread: action 17 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599426824:main thread: Action 0x68a190: queue 0x68a2a0 created 5437.599438617:main thread: cfline: '*.emerg *' 5437.599447848:main thread: selector line successfully processed 5437.599457043:main thread: - traditional PRI filter 5437.599465704:main thread: symbolic name: emerg ==> 0 5437.599477968:main thread: tried selector action for builtin-file: -2001 5437.599487949:main thread: tried selector action for builtin-fwd: -2001 5437.599498509:main thread: tried selector action for builtin-shell: -2001 5437.599509125:main thread: tried selector action for builtin-discard: -2001 5437.599520609:main thread: write-alltried selector action for builtin-usrmsg: 0 5437.599533671:main thread: Module builtin-usrmsg processed this config line. 
5437.599543706:main thread: template: ' WallFmt' assigned 5437.599558706:main thread: action 18 queue: save on shutdown 1, max disk space allowed 0 5437.599572392:main thread: action 18 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599584044:main thread: Action 0x68abe0: queue 0x68adf0 created 5437.599599727:main thread: cfline: 'daemon.*;mail.*;news.err;*.=debug;*.=info;*.=notice;*.=warn |/dev/xconsole' 5437.599609933:main thread: selector line successfully processed 5437.599619050:main thread: - traditional PRI filter 5437.599627543:main thread: symbolic name: * ==> 255 5437.599639018:main thread: symbolic name: daemon ==> 24 5437.599650199:main thread: symbolic name: * ==> 255 5437.599661098:main thread: symbolic name: mail ==> 16 5437.599672207:main thread: symbolic name: err ==> 3 5437.599683163:main thread: symbolic name: news ==> 56 5437.599694127:main thread: symbolic name: debug ==> 7 5437.599705530:main thread: symbolic name: info ==> 6 5437.599716852:main thread: symbolic name: notice ==> 5 5437.599728234:main thread: symbolic name: warn ==> 4 5437.599747710:main thread: Error opening log file: /dev/xconsole 5437.599759170:main thread: Called LogError, msg: /dev/xconsole rsyslogd: /dev/xconsole 5437.599828730:main thread: tried selector action for builtin-file: 0 5437.599838531:main thread: Module builtin-file processed this config line. 
5437.599848509:main thread: template: 'RSYSLOG_TraditionalFileFormat' assigned 5437.599860758:main thread: action 19 queue: save on shutdown 1, max disk space allowed 0 5437.599874021:main thread: action 19 queue: type 3, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599885441:main thread: Action 0x68c710: queue 0x68c820 created 5437.599897609:main thread: selector line successfully processed 5437.599922620:main thread: main queue: is NOT disk-assisted 5437.599936522:main thread: main queue: type 0, enq-only 0, disk assisted 0, maxFileSz 1048576, qsize 0, child 0 starting 5437.599955023:main thread: main queue:Reg: finalizing construction of worker thread pool 5437.599972840:main thread: main queue:Reg/w0: finalizing construction of worker instance data 5437.599983459:main thread: main queue: queue starts up without (loading) any DA disk state (this is normal for the DA queue itself!) 5437.600011680:main thread: main queue:Reg: high activity - starting 1 additional worker thread(s). 5437.600025603:main thread: main queue:Reg/w0: receiving command 2 5437.600068026:main thread: main queue:Reg: started with state 0, num workers now 1 5437.600104178:main thread: Main processing queue is initialized and running 5437.600141621:main thread: Opened UNIX socket '/dev/log' (fd 3). 
5437.600209693:main thread: main queue: entry added, size now 1 entries 5437.600224753:main thread: wtpAdviseMaxWorkers signals busy 5437.600237254:main thread: main queue: EnqueueMsg advised worker start 5437.600255410:40800950: main queue:Reg/w0: receiving command 4 5437.600288919:imuxsock.c: --------imuxsock calling select, active file descriptors (max 3): 3 5437.600327454:main thread: Active selectors: 5437.600338062:main thread: Selector 1: 5437.600345985:main thread: X X FF X X X X X X X X X X X X X X X X X X X X X X Actions: 5437.600417150:main thread: builtin-fwd: Instance data: 0x680a90 5437.600444615:main thread: RepeatedMsgReduction: 0 5437.600453239:main thread: Resume Interval: 30 5437.600461504:main thread: Suspended: 0 5437.600472064:main thread: Disabled: 0 5437.600480533:main thread: Exec only when previous is suspended: 0 5437.600489317:main thread: 5437.600497369:main thread: 5437.600506120:main thread: builtin-file: Instance data: 0x684710 5437.600520046:main thread: RepeatedMsgReduction: 0 5437.600528425:main thread: Resume Interval: 30 5437.600536885:main thread: Suspended: 0 5437.600547397:main thread: Disabled: 0 5437.600555448:main thread: Exec only when previous is suspended: 1 5437.600563851:main thread: 5437.600571822:main thread: 5437.600579955:main thread: 5437.600587890:main thread: Selector 2: 5437.600595939:main thread: X X X X FF X X X X X FF X X X X X X X X X X X X X X Actions: 5437.600664965:main thread: builtin-file: Instance data: 0x67f920 5437.600677232:main thread: RepeatedMsgReduction: 0 5437.600685740:main thread: Resume Interval: 30 5437.600694011:main thread: Suspended: 0 5437.600704478:main thread: Disabled: 0 5437.600712497:main thread: Exec only when previous is suspended: 0 5437.600720972:main thread: 5437.600728721:main thread: 5437.600736893:main thread: 5437.600744893:main thread: Selector 3: 5437.600752783:main thread: FF FF FF FF X FF FF FF FF FF X FF FF FF FF FF FF FF FF FF FF FF FF 5437.600852964:main 
queue:Reg/w0: main queue: entry deleted, state 0, size now 0 entries 5437.600874750:main queue:Reg/w0: Called action, logging to builtin-file 5437.600907327:main queue:Reg/w0: (/var/log/syslog) From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 18:17:30 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 18:17:30 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> OK, maybe we can simplify the config, that would remove code paths RG> from the potential bug candidate list. Could you comment out all the RG> $ActionQueue* settings? RG> I've just restored the #if 0 in runtime/msg.c; it seems the immediate crashes came from those two lines. Now logging. Servus, lorenzo +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 18:21:34 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 18:21:34 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com> Ah, ok. Side-note: I got my machine up and it is running some tests. Unfortunately no aborts so far, but it has only 4 cores... I hope something turns out... Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 6:18 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> OK, maybe we can simplify the config, that would remove code paths > RG> from the potential bug candidate list. Could you comment out all > the > RG> $ActionQueue* settings? > RG> > > I've just restored the #if 0 in runtime/msg.c; it seems the immediate > crashes came from those two lines. Now logging. > > Servus, > > lorenzo > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From lorenzo at sancho.ccd.uniroma2.it Fri Jan 16 18:29:26 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Fri, 16 Jan 2009 18:29:26 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: RG> Ah, ok. Side-note: I got my machine up and it is running some tests. RG> Unfortunately no aborts so far, but it has only 4 cores... I hope RG> something turns out... RG> I think the real problem is in keeping those cores very busy... I'd try to spawn something like 20 loggers, each spawning a couple of "workers" per second and logging startup/shutdown of any child. Maybe make each worker sleep for a random time before exiting. I don't have any Fedora/RedHat system; if nothing else, I'd suggest doing your tests on a Debian/testing system too. Yours, lorenzo PS still running... +-------------------------+----------------------------------------------+ | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 | +-------------------------+----------------------------------------------+ From rgerhards at hq.adiscon.com Fri Jan 16 18:30:51 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 16 Jan 2009 18:30:51 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9CD@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 6:29 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> Ah, ok. Side-note: I got my machine up and it is running some tests. > RG> Unfortunately no aborts so far, but it has only 4 cores... I hope > RG> something turns out... > RG> > > I think the real problem is in keeping those cores very busy... I'd try > to > spawn something like 20 loggers, each spawning a couple of "workers" per > second and logging startup/shutdown of any child. Maybe make each > worker > sleep for a random time before exiting. Good suggestion, thanks. > > I don't have any Fedora/RedHat system; if nothing else, I'd suggest > doing > your tests on a Debian/testing system too. That's what I am running on that machine - with components downloaded today.
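The load generator Lorenzo describes above (about 20 loggers, each starting a couple of workers per second that log their startup and shutdown around a random sleep) could be sketched roughly as follows. This is only a sketch under stated assumptions: it uses Python's standard `syslog` and `multiprocessing` modules on a POSIX system, and every function name, parameter and message text here is made up for illustration. Nothing in it comes from the rsyslog source or from a test that was actually run in this thread.

```python
import multiprocessing
import random
import syslog
import time

# Assumption: the "fork" start method keeps the sketch simple; POSIX only.
CTX = multiprocessing.get_context("fork")


def worker_messages(logger_id, worker_id):
    """Return the (startup, shutdown) lines one worker sends to syslog."""
    tag = f"stress logger {logger_id} worker {worker_id}"
    return f"{tag}: starting", f"{tag}: exiting"


def worker(logger_id, worker_id, max_sleep):
    """Log startup, sleep for a random time, then log shutdown."""
    start_msg, stop_msg = worker_messages(logger_id, worker_id)
    syslog.syslog(syslog.LOG_INFO, start_msg)
    # SystemRandom avoids forked children sharing one PRNG state.
    time.sleep(random.SystemRandom().uniform(0.0, max_sleep))
    syslog.syslog(syslog.LOG_INFO, stop_msg)


def logger(logger_id, ticks, workers_per_tick, tick, max_sleep):
    """Start a few workers per tick, then wait for all of them."""
    children = []
    for t in range(ticks):
        for n in range(workers_per_tick):
            child = CTX.Process(
                target=worker,
                args=(logger_id, t * workers_per_tick + n, max_sleep),
            )
            child.start()
            children.append(child)
        time.sleep(tick)
    for child in children:
        child.join()


def run_stress(loggers=20, ticks=60, workers_per_tick=2, tick=1.0, max_sleep=3.0):
    """Run all loggers in parallel and return their exit codes."""
    procs = [
        CTX.Process(target=logger, args=(i, ticks, workers_per_tick, tick, max_sleep))
        for i in range(loggers)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return [proc.exitcode for proc in procs]
```

With the defaults, `run_stress()` would keep roughly 40 short-lived processes per second writing to the local log socket for a minute. Whether that is enough to trigger the crash on a given machine is untested here.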
Rainer From david at lang.hm Sat Jan 17 00:26:04 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 16 Jan 2009 15:26:04 -0800 (PST) Subject: [rsyslog] Baclogged files to disk are pretty slow In-Reply-To: <4255c2570901160719o4aa3bc6bk9813225374bfc53c@mail.gmail.com> References: <21149_1232040593_n0FHSsbU031125_96AF20FDF4301D419B33CCE8E3A0132B0AB699D2@SAT4MX07.RACKSPACE.CORP> <4255c2570901151111n6696fbc9md66a30c9bc9b4a10@mail.gmail.com> <19646_1232060570_n0FN2mSc020768_96AF20FDF4301D419B33CCE8E3A0132B0ACECB55@SAT4MX07.RACKSPACE.CORP> <4255c2570901160719o4aa3bc6bk9813225374bfc53c@mail.gmail.com> Message-ID: On Fri, 16 Jan 2009, RB wrote: > On Thu, Jan 15, 2009 at 16:01, Daniel Anson wrote: >> I would hope that there is an easy solution as my next idea is to write >> some type of daemonized process that can insert messages from a pool of >> MySQL connections. I can achieve this in C but would rather hopefully >> find a solution inside of the configuration. > > Short of implementing the queue/worker configuration (no idea how), it > seems the only current option would be to implement something of the > sort, either by an update to the ommysql module (optimal, as it gets > your code supported by someone else for its lifetime) or by some > external program. Multiple workers will help if MySQL can handle more transactions at a time when they hit in parallel. The fact that you are doing 4000/sec indicates that you are not doing an fsync for each insert, so it is unlikely to help (if you are fsync-limited, the data rates are probably going to be closer to 100-200/sec, depending on your drives). > I'd think an optimal external solution would be some sort of > relp2mysql bridge, but suspect that would end up reimplementing a good > chunk of rsyslog. Actually, the optimal solution is to modify rsyslog to be able to handle multiple messages at once in the output queues. That is a major effort (2-4 man weeks) that will require a sponsor.
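To illustrate what handling multiple messages at once means for a database output, here is a rough sketch that contrasts one transaction per message with one transaction per batch. It assumes SQLite from the Python standard library as a stand-in for MySQL, and the table layout and function names are hypothetical. It is not rsyslog or ommysql code and it makes no claim about the speedup on a real server.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a minimal, made-up log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, msg TEXT)")
    return conn


def insert_one_by_one(conn, messages):
    """One transaction per message, as a per-message output action does."""
    for msg in messages:
        conn.execute("INSERT INTO log (msg) VALUES (?)", (msg,))
        conn.commit()


def insert_batched(conn, messages, batch_size=1000):
    """Group messages so each transaction carries up to batch_size rows."""
    for start in range(0, len(messages), batch_size):
        batch = [(m,) for m in messages[start:start + batch_size]]
        conn.executemany("INSERT INTO log (msg) VALUES (?)", batch)
        conn.commit()


def count_rows(conn):
    """Return how many messages ended up in the table."""
    return conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
```

Both functions store the same rows; the batched one simply commits far less often. On an in-memory database the difference is small, since nothing is synced to disk. The gain discussed in this thread depends on how expensive each commit is on the real database.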
Once batch processing in the output queues is implemented, I would expect the throughput to go up by 2-3 orders of magnitude for database inserts. David Lang From david at lang.hm Sat Jan 17 03:31:24 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 16 Jan 2009 18:31:24 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1232046366.22744.34.camel@localhost.localdomain> References: <1232046366.22744.34.camel@localhost.localdomain> Message-ID: On Thu, 15 Jan 2009, Rainer Gerhards wrote: > On Fri, 2009-01-16 at 01:20 +0100, Michael Biebl wrote: >> Given the -c4 command line argument, I'd expect it to be 4.1.3. >> >> Sounds familiar to >> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=509292 (which is >> 3.18.6). >> >> It seems to be a more general problem with multi core (= very fast??) systems. > > Yes, that is what my analysis so far points to. It's also part of the > problem, because I do not have very fast hardware to reproduce the issue > (and it is also not easy to reliably reproduce if you have...). > > I've gotten a couple of reports (I think most on the mailing list) on > such problems and all they have in common is 4+ core machines. > > I'll try to get hold based on what Lorenzo submits. In his environment, > the problem seems to occur most reliably (he probably has the fastest > machine...). > > Lorenzo: details follow soon. I just got some time to work on this sort of thing again. My test system is a 4-socket (dual-core) Opteron system with 16 GB of RAM. I've done a fair amount of stress testing of the system without lockups (around the time the 4.1 branch started). If you can describe a test setup, I can see about reproducing it.
David Lang From david at lang.hm Sat Jan 17 03:40:22 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 16 Jan 2009 18:40:22 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com> Message-ID: On Fri, 16 Jan 2009, Rainer Gerhards wrote: > Lorenzo and others: > > I hopefully got a system today where I can reproduce. I am setting it up right now. I also have written a stub wiki page with information useful to hunt this bug: One other thing that you can do for this sort of thing is to use the Amazon cloud. To quote a message from Rob Landley to the linux-kernel mailing list: > My friend Mark's been experimenting with the amazon "cloud" thing, > feeding in an image with a qemu instance and distcc+cross-compiler, and > running builds under that. Renting an 8-way ~2.5 ghz server with 7 > gigabytes of ram and 1.6 terabytes of disk is 80 cents/hour through them > plus another few cents/day for bandwidth and persistent storage and > such. That's likely to get cheaper as time goes on. > > We're still planning to buy a build server of our own to have something > in-house, but for running nightly builds it's almost to the point where > depreciation on the hardware is more than buying time from a server > farm. Just _one_ of those 8-way servers is enough hardware to build an > entire distro in an hour or so. > > What this really allows us to do is experiment with "how parallel can we > get our build"?
Because renting ten 8-way servers in a cluster is > $8/hour, and distcc already scales trivially over that. Down the road > what Firmware Linux is working towards is multiple qemu instances > running in parallel with a central instance distributing builds to each > one, so each can do its own ./configure in parallel, distribute > compilation to the distccd instances as it has stuff to compile, and > then package up the resulting binary into one of those portage tarballs > and send it back to the central node to install on a network mount that > the lot of 'em can mount as build context, so the packages can get their > dependencies right. (You don't want your build taking place in a > network mount, but your OS being on one you never write to isn't so bad > as long as you have local storage to build in.) > > We'll probably leverage the heck out of Portage for this, and might wind > up modifying it heavily. Dunno yet. (We can even force dependencies on > portage so it doesn't need to calculate 'em, the central node can do > that and then say "you have these packages, _build_"...) > > But yeah, hobbyists with a laptop, network access, and a monthly budget > of $20 can do cluster builds these days. would it make sense to start a fund to pay for some time for you to use like this? David Lang > http://wiki.rsyslog.com/index.php/V3_Race_Condition_Hunt_Page > > Lorenzo, can you please double-check I have used the right config indeed. > > All others: if you can add scenarios/information, please do. I'll try to repro the problem as soon as the system is ready. Hope it will work... > > Rainer > >> -----Original Message----- >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >> Sent: Friday, January 16, 2009 5:20 PM >> To: rsyslog-users >> Subject: Re: [rsyslog] rsyslog still crashes >> >> Lorenzo, >> >> one thing: can you change the actionqueuemode to "direct" just for a >> short period. 
I would be very interested to see what happens. >> >> Rainer >> >>> -----Original Message----- >>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >>> bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci >>> Sent: Friday, January 16, 2009 5:10 PM >>> To: rsyslog-users >>> Subject: Re: [rsyslog] rsyslog still crashes >>> >>> On Fri, 16 Jan 2009, Lorenzo M. Catucci wrote: >>> >>> LMC> >>> LMC> The -n crash was completely silent; the -d run was chatty (as >>> expected); >>> LMC> with stdout redirected, it took a lot more time to crash, but >> here >>> are >>> LMC> both the logfile and the gdb backtrace. >>> LMC> >>> >>> As for the last crash, I found on the screen session the line: >>> >>> rsyslogd: queue.c:1393: queueChkDiscardMsg: Assertion `(unsigned) >>> ((obj_t*)(pUsr))->iObjCooCKiE == (unsigned) 0xBADEFEE' failed. >>> >>> since I forgot redirecting stderr too. >>> >>> Yours, >>> >>> lorenzo >>> >>> +-------------------------+------------------------------------------ >> -- >>> --+ >>> | Lorenzo M. Catucci | Centro di Calcolo e Documentazione >>> | >>> | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor >>> Vergata" | >>> | | Via O. Raimondo 18 ** I-00173 ROMA ** >>> ITALY | >>> | Tel. +39 06 7259 2255 | Fax.
+39 06 7259 2125 >>> | >>> +-------------------------+------------------------------------------ >> -- >>> --+ >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From david at lang.hm Sat Jan 17 03:42:09 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 16 Jan 2009 18:42:09 -0800 (PST) Subject: [rsyslog] rsyslog on AIX Message-ID: We are looking at using rsyslog on AIX and the sysadmins are reporting 'problems getting it to compile' (unfortunately no details yet). Has anyone tried this? David Lang From mbiebl at gmail.com Sat Jan 17 11:10:39 2009 From: mbiebl at gmail.com (Michael Biebl) Date: Sat, 17 Jan 2009 11:10:39 +0100 Subject: [rsyslog] Is rsyslog leaking memory? Message-ID: Hi, I'm running rsyslog 3.20.2 I noticed the following: # /etc/init.d/rsyslog restart VSZ RSS (as reported by ps) 27100 1184 # logger foo 27100 1196 # logger foo (1000x) 27100 1200 # logger foo (1000x) 27100 1204 # logger foo (1000x) 27100 1208 and so on. This made me wonder if rsyslog is leaking memory somewhere. I also noticed that for each loaded module, rsyslog reserves exactly 8 MB of anonymous memory (pmap -d `pgrep rsyslog`) With a couple of loaded modules you easily get over 50 MB VSZ. Michael -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? From david at lang.hm Sun Jan 18 01:55:56 2009 From: david at lang.hm (david at lang.hm) Date: Sat, 17 Jan 2009 16:55:56 -0800 (PST) Subject: [rsyslog] Is rsyslog leaking memory?
In-Reply-To: References: Message-ID: On Sat, 17 Jan 2009, Michael Biebl wrote: > Hi, > > I'm running rsyslog 3.20.2 > > I noticed the following: > # /etc/init.d/rsyslog restart > VSZ RSS (as reported by ps) > 27100 1184 > # logger foo > 27100 1196 > # logger foo (1000x) > 27100 1200 > # logger foo (1000x) > 27100 1204 > # logger foo (1000x) > 27100 1208 > > and so on. > > > This made me wonder, if rsyslog is leaking memory somewhere. I have run rsyslog through stress tests where I have sent it 1B log messages and do not think that there is a memory leak. what I think that you are seeing is that the default rsyslog memory queue only uses as much ram as it needs to hold the data (even though it's described as a array it seems to grow dynamicly, I'm not sure about it shrinking) when you log a bunch of messages via logger you push data into the array faster then it gets extracted, so it takes more memory (up until you hit the max size of the array, which I think is 1000 entries) > I also noticed, that for each loaded module, rsyslog resevers exactly > 8 Mb of anoymous memory (pmap -d `pgrep rsyslog`) > With a couple of loaded modules you easily get over 50Mb VSZ. I haven't tried doing stuff with different modules, so I don't know about this. David Lang From rgerhards at hq.adiscon.com Sun Jan 18 12:01:53 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 12:01:53 +0100 Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: References: Message-ID: <1232276513.22744.45.camel@localhost.localdomain> On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: > On Sat, 17 Jan 2009, Michael Biebl wrote: > > > Hi, > > > > I'm running rsyslog 3.20.2 > > > > I noticed the following: > > # /etc/init.d/rsyslog restart > > VSZ RSS (as reported by ps) > > 27100 1184 > > # logger foo > > 27100 1196 > > # logger foo (1000x) > > 27100 1200 > > # logger foo (1000x) > > 27100 1204 > > # logger foo (1000x) > > 27100 1208 > > > > and so on. 
> > > > > > This made me wonder, if rsyslog is leaking memory somewhere. > > I have run rsyslog through stress tests where I have sent it 1B log > messages and do not think that there is a memory leak. I am using valgrind's excellent memory debugger routinely during development. That brings up leaks and invalid memory access rather quickly. In fact, code quality has improved a lot since I started to use valgrind routinely roughly a year ago. From time to time I also do specific tests for leaks, both using valgrind and the traditional analysis techniques. From what I have seen so far, I, too, doubt there is a leak. However, there are various levels of testing. For example, the postgres output module and the GSSAPI code are contributed and I do not even have a test environment. So these are not checked using that procedure. The libdbi code is only checked every now and then and not with all backends (e.g. no Oracle at hand ... and so on...). If I ever get over to a full testing suite (no collaborators found so far...), I'll probably be able to do more consistent testing of all modules. > > what I think that you are seeing is that the default rsyslog memory queue > only uses as much ram as it needs to hold the data (even though it's > described as a array it seems to grow dynamicly, I'm not sure about it > shrinking) If you use "fixedarray" mode, the pointer array is allocated statically, no matter how many messages are in the queue. HOWEVER, this is only the pointers, so quite little memory. Actual messages are dynamically allocated and freed when processed - in any mode. > > when you log a bunch of messages via logger you push data into the array > faster then it gets extracted, so it takes more memory (up until you hit > the max size of the array, which I think is 1000 entries) > > > I also noticed, that for each loaded module, rsyslog resevers exactly > > 8 Mb of anoymous memory (pmap -d `pgrep rsyslog`) > > With a couple of loaded modules you easily get over 50Mb VSZ.
> > I haven't tried doing stuff with different modules, so I don't know about > this. I am not sure where it comes from, but I'd think into the dlload direction. Could also very well be the runtime stack for each thread (not dug into the details). Rainer From david at lang.hm Sun Jan 18 13:33:16 2009 From: david at lang.hm (david at lang.hm) Date: Sun, 18 Jan 2009 04:33:16 -0800 (PST) Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: <1232276513.22744.45.camel@localhost.localdomain> References: <1232276513.22744.45.camel@localhost.localdomain> Message-ID: On Sun, 18 Jan 2009, Rainer Gerhards wrote: > On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: > >> >> what I think that you are seeing is that the default rsyslog memory queue >> only uses as much ram as it needs to hold the data (even though it's >> described as a array it seems to grow dynamicly, I'm not sure about it >> shrinking) > > If you use "fixedarray" mode, the pointer array is allocated statically, > no matter how many messages are in the queue. HOWEVER, this is only the > pointers, so quite few memory. Actual messages are dynamically allocated > and freed when processed - in any mode. That makes sense. It would be interesting to see what would happen to the enqueue/dequeue timings if the message memory was statically allocated. From what I remember seeing of the memory footprint, it does appear as if you allocate the max size for the message each time, not the minimum size needed to hold the message. If that shows a noticeable difference, it may be worth allocating the memory in chunks substantially larger than a single message. David Lang From rgerhards at hq.adiscon.com Sun Jan 18 12:21:24 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 12:21:24 +0100 Subject: [rsyslog] Is rsyslog leaking memory?
In-Reply-To: References: <1232276513.22744.45.camel@localhost.localdomain> Message-ID: <1232277684.22744.48.camel@localhost.localdomain> On Sun, 2009-01-18 at 04:33 -0800, david at lang.hm wrote: > On Sun, 18 Jan 2009, Rainer Gerhards wrote: > > > On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: > > > >> > >> what I think that you are seeing is that the default rsyslog memory queue > >> only uses as much ram as it needs to hold the data (even though it's > >> described as a array it seems to grow dynamicly, I'm not sure about it > >> shrinking) > > > > If you use "fixedarray" mode, the pointer array is allocated statically, > > no matter how many messages are in the queue. HOWEVER, this is only the > > pointers, so quite few memory. Actual messages are dynamically allocated > > and freed when processed - in any mode. > > that makes sense. It would be interesting to see what would happen to the > enqueue/dequeue timings if the message memory was staticly allocated > > from what I remember seeing of the memory footprint it does appear as if > you allocate the max size for the message each time, not the minimum sized > needed to hold the message > Yes, that's right. This is done to prevent an additional copy to clean things up (realloc might work, too) and memory fragmentation. The latter is really nasty; I've seen that some memory areas remain allocated for quite a while due to fragmentation. > if that shows a noticable difference it may be worth allocating the memory > in chunks substantially larger than a single message That's a good suggestion. The basic classes are able to trim strings. It may be worth putting a config option into it. The current approach works well for small queues, but obviously provides sub-optimal performance as soon as the queues grow considerably. So it may even make sense to start trimming messages only after a certain number of messages are in-queue.
Rainer From rgerhards at hq.adiscon.com Sun Jan 18 12:26:51 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 12:26:51 +0100 Subject: [rsyslog] rsyslog on AIX In-Reply-To: References: Message-ID: <1232278011.22744.50.camel@localhost.localdomain> On Fri, 2009-01-16 at 18:42 -0800, david at lang.hm wrote: > we are looking at using rsyslog on AIX and the sysadmins are reporting > 'problems getting it to compile' (unfortunantly no details yet) > > has anyone tried this? All I know is that it doesn't work. No idea on how hard it is to get this right. Some time ago I was interested in porting (and had the time...) but found neither (virtual) hardware/software nor anyone interested in it. So I dropped the idea. I'd still be very interested in a port, but now unfortunately have much less time... Rainer From david at lang.hm Mon Jan 19 02:30:58 2009 From: david at lang.hm (david at lang.hm) Date: Sun, 18 Jan 2009 17:30:58 -0800 (PST) Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: <1232277684.22744.48.camel@localhost.localdomain> References: <1232276513.22744.45.camel@localhost.localdomain> <1232277684.22744.48.camel@localhost.localdomain> Message-ID: On Sun, 18 Jan 2009, Rainer Gerhards wrote: > On Sun, 2009-01-18 at 04:33 -0800, david at lang.hm wrote: >> On Sun, 18 Jan 2009, Rainer Gerhards wrote: >> >>> On Sat, 2009-01-17 at 16:55 -0800, david at lang.hm wrote: >>> >>>> >>>> what I think that you are seeing is that the default rsyslog memory queue >>>> only uses as much ram as it needs to hold the data (even though it's >>>> described as a array it seems to grow dynamicly, I'm not sure about it >>>> shrinking) >>> >>> If you use "fixedarray" mode, the pointer array is allocated statically, >>> no matter how many messages are in the queue. HOWEVER, this is only the >>> pointers, so quite few memory. Actual messages are dynamically allocated >>> and freed when processed - in any mode. >> >> that makes sense. 
It would be interesting to see what would happen to the >> enqueue/dequeue timings if the message memory was staticly allocated >> >> from what I remember seeing of the memory footprint it does appear as if >> you allocate the max size for the message each time, not the minimum sized >> needed to hold the message >> > yes, that's right. This is done to prevent an additional copy to clean > things up (realloc might work, too) and memory fragmentation. The later > is really nasty, I've seen that some memory areas remain allocated for > quite some while due to fragmentation. > >> if that shows a noticable difference it may be worth allocating the memory >> in chunks substantially larger than a single message > > That's a good suggestion. The basic classes are able to trim strings. It > may be worth putting a config option into it. The current approach works > well for small queues, but obviously does provide sub-optimal > performance as soon as the queues grow considerably. So it may even make > sense to start trimming messages only after a certain amount of messages > are in-queue. I'm not sure that we're saying the same thing. Let me try again. What I was thinking was that instead of allocating memory for one message at a time, initially allocate memory for 100 messages, then if this needs to be extended increase the allocation by 50-100%. This minimizes the number of allocations needed and the fragmentation of system memory. Just like the fixed-array queue option is significantly faster than the linked list queue option (I assume from a combination of having to chase pointers and allocate/deallocate memory), there may be similar benefits from doing the same thing for the message content itself.
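A minimal sketch of the growth policy described above, in C, assuming fixed-size message buffers (which real messages do not have). All names here (msg_pool_t, pool_get, SLOT_SIZE and so on) are hypothetical and not taken from the rsyslog source. The sketch adds a new chunk when the pool runs dry rather than calling realloc(), so buffers that were already handed out never move.

```c
#include <stddef.h>
#include <stdlib.h>

#define SLOT_SIZE     2048  /* assumed upper bound for one message buffer */
#define INITIAL_SLOTS 100   /* start with room for 100 messages */
#define MAX_CHUNKS    32    /* arbitrary cap to keep the sketch simple */

/* A slot is either a message buffer or, while unused, a free-list link. */
typedef union slot {
    union slot *next;
    char data[SLOT_SIZE];
} slot_t;

typedef struct msg_pool {
    slot_t *chunks[MAX_CHUNKS]; /* every block obtained from malloc() */
    size_t  nchunks;
    size_t  capacity;           /* total number of slots in all chunks */
    slot_t *free_list;          /* singly linked list of unused slots */
} msg_pool_t;

/* Add one chunk of nslots buffers; existing buffers stay where they are. */
static int pool_grow(msg_pool_t *p, size_t nslots)
{
    slot_t *chunk;
    size_t i;

    if (p->nchunks == MAX_CHUNKS || nslots == 0)
        return -1;
    chunk = malloc(nslots * sizeof(*chunk));
    if (chunk == NULL)
        return -1;
    for (i = 0; i < nslots; i++) {
        chunk[i].next = p->free_list;
        p->free_list = &chunk[i];
    }
    p->chunks[p->nchunks++] = chunk;
    p->capacity += nslots;
    return 0;
}

static int pool_init(msg_pool_t *p)
{
    p->nchunks = 0;
    p->capacity = 0;
    p->free_list = NULL;
    return pool_grow(p, INITIAL_SLOTS);
}

/* Hand out one buffer, growing the pool by 50% when it is exhausted. */
static char *pool_get(msg_pool_t *p)
{
    slot_t *s;

    if (p->free_list == NULL && pool_grow(p, p->capacity / 2) != 0)
        return NULL;
    s = p->free_list;
    p->free_list = s->next;
    return s->data;
}

/* Return a buffer previously obtained from pool_get(). */
static void pool_put(msg_pool_t *p, char *buf)
{
    slot_t *s = (slot_t *)buf;  /* data sits at offset 0 of the union */

    s->next = p->free_list;
    p->free_list = s;
}

static void pool_destroy(msg_pool_t *p)
{
    size_t i;

    for (i = 0; i < p->nchunks; i++)
        free(p->chunks[i]);
    p->nchunks = 0;
    p->capacity = 0;
    p->free_list = NULL;
}
```

With this policy the first 100 pool_get() calls cost a single malloc(), and the 101st triggers one more that adds 50 slots. Whether that beats the C runtime's own allocator for rsyslog's variable-sized message fields is exactly the open question in this thread, so it would need measuring.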
David Lang From rgerhards at hq.adiscon.com Sun Jan 18 16:45:56 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 16:45:56 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com> Message-ID: <1232293556.2453.3.camel@rf10up.intern.adiscon.com> Hi Lorenzo, I've gone through the material once more. Indeed, it looks like the previous tests (with the #if 1) were not really useful. Sorry for that. Please let me know the outcome of this run here. Also, I thought about one shot we may give it at reducing complexity. I am not sure if it works out, but if it does, that would be a big benefit. Could you please try the following: Use the master branch (the one you previously used). Reduce rsyslog.conf to just the necessary inputs (ideally only imuxsock) and a SINGLE file writer, no further actions. Let that run and tell us if it aborts, too. If it does, we have outruled a lot of code and we can focus much better in our troubleshooting. On my box, I unfortunately had no success yet in reproducing the issue - even though I put a lot of stress on the machine. Will be trying more today, hopefully that brings up some results... Rainer On Fri, 2009-01-16 at 18:17 +0100, Lorenzo M. Catucci wrote: > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> OK, maybe we can simplify the config, that would remove code pathes > RG> from the potential bug candidate list. Could you comment out all the > RG> $ActionQueue* settings? > RG> > > I've just restored the #if 0 in runtime/msg.c; it seems the immediate > crashes came from those two lines. Now logging. 
> > Servus, > > lorenzo > > > +-------------------------+----------------------------------------------+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione | > | catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 | > +-------------------------+----------------------------------------------+ > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com From rgerhards at hq.adiscon.com Sun Jan 18 16:57:03 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Sun, 18 Jan 2009 16:57:03 +0100 Subject: [rsyslog] increasing alloc performance - was: Is rsyslog leaking memory? In-Reply-To: References: <1232276513.22744.45.camel@localhost.localdomain> <1232277684.22744.48.camel@localhost.localdomain> Message-ID: <1232294223.2453.9.camel@rf10up.intern.adiscon.com> On Sun, 2009-01-18 at 17:30 -0800, david at lang.hm wrote: > I'm not sure that we're saying the same thing. let me try again. You are right, we weren't... > > what I was thinking was that instead of allocating memory for one message > at a time, initially allocate memory for 100 messages, then if this needs > to be extended increase the allocation by 50-100%. this minimizes the > number of allocations needed and the fragmentation of system memory. > > just like the fixed-array queue option is significantly faster than the > linked list queue option (I assume from a combination of having to chase > pointers and allocate/deallocate memory), there may be similar benifits > from doing the same thing for the message content itself. I have to admit I am skeptical about this. The reason is that there are many non-fixed fields within the message and they are allocated as needed (with some initial size that fits most messages, but it may extend on an as-needed basis).
So there is no real fixed size for any message. It also depends on template formatting and other factors. I think if I'd try to prealloc at least the initial chunks, I'd probably do pretty much the same that the malloc()/free() runtime does. That, however, will probably be less performant than the runtime is (at least I hope so, these parts of the code should be heavily tweaked). This is also an error-prone task. There may be a compromise in between (e.g. allocating a fixed chunk of message text together with the message blobs), but I still think the necessary complexity is not outweighed by the benefits. All in all, I think, we have seen that the in-user-space computing needs (and malloc counts as such) are not really the bottlenecks. Implementing e.g. a "bunch writer" (which enables submission of multiple messages at once to an action) seems to be (just) equally complex but promises far better results. In any case, I'd finally like to track down that dangling race before I do any further optimization. It looks like Lorenzo has a relatively stable environment for reproduction and I'd like to take advantage of that. Rainer From david at lang.hm Mon Jan 19 09:29:35 2009 From: david at lang.hm (david at lang.hm) Date: Mon, 19 Jan 2009 00:29:35 -0800 (PST) Subject: [rsyslog] increasing alloc performance - was: Is rsyslog leaking memory? In-Reply-To: <1232294223.2453.9.camel@rf10up.intern.adiscon.com> References: <1232276513.22744.45.camel@localhost.localdomain> <1232277684.22744.48.camel@localhost.localdomain> <1232294223.2453.9.camel@rf10up.intern.adiscon.com> Message-ID: On Sun, 18 Jan 2009, Rainer Gerhards wrote: > On Sun, 2009-01-18 at 17:30 -0800, david at lang.hm wrote: >> >> what I was thinking was that instead of allocating memory for one message >> at a time, initially allocate memory for 100 messages, then if this needs >> to be extended increase the allocation by 50-100%.
this minimizes the >> number of allocations needed and the fragmentation of system memory. >> >> just like the fixed-array queue option is significantly faster than the >> linked list queue option (I assume from a combination of having to chase >> pointers and allocate/deallocate memory), there may be similar benifits >> from doing the same thing for the message content itself. > > I have to admit I am skeptic about this. The reason is that there are > many non-fixed fields within the message and they are allocated as > needed (with some initial size that fits most messages, but it may > extend on an as-needed basis). So there is no real fixed size for any > message. It also depends on template formatting and other factors. > > I think if I'd try to prealloc at least the initial chunks, I'd probably > do pretty much the same that the malloc()/free() runtime does. That, > however, will probably be less performant than the runtime is (at least > I hope so, these parts of the code should be heavily tweaked). This is > also an error-prone task. > > There may be a compromise in between (e.g. allocating a fixed chunk of > message text together with the message blobs), but I still think the > necessary complexity is not outweight by similar benefits. > > All in all, I think, we have seen that the in-user-space computing needs > (and malloc counts as such) are not really the bottlenecks. Implementing > e.g. a "bunch writer" (which enables submission of multiple messages at > once to an action) seems to be (just) equally complex but promises far > better results. Always possible. > In any case, I'd finally like to track down that dangling race before I > do any further optimization. It looks like Lorenzo seems to have a > relatively stable environment for reproduction and I'd like to take > advantage of that. Agreed, tracking down a reproducible problem takes precedence over new improvements/tweaks any day.
David Lang From rgerhards at hq.adiscon.com Mon Jan 19 10:17:18 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 19 Jan 2009 10:17:18 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9C9@grfint2.intern.adiscon.com> Message-ID: <1232356638.2536.3.camel@rf10up.intern.adiscon.com> Hi David, On Fri, 2009-01-16 at 18:40 -0800, david at lang.hm wrote: > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > > Lorenzo and others: > > > > I hopefully got a system today where I can reproduce. I am setting it up right now. I also have written a stub wiki page with information useful to hunt this bug: > > one other thing that you can do for this sort of thing is to use the > amazon cloud. > > to quote a message from Rob Landley to the linux-kernel mailing list > > > My friend Mark's been experimenting with the amazon "cloud" thing, > > feeding in an image with a qemu instance and distcc+cross-compiler, and > > running builds under that. Renting an 8-way ~2.5 ghz server with 7 > > gigabytes of ram and 1.6 terabytes of disk is 80 cents/hour through them > > plus another few cents/day for bandwidth and persistent storage and > > such. That's likely to get cheaper as time goes on. > > > > We're still planning to buy a build server of our own to have something > > in- house, but for running nightly builds it's almost to the point where > > depreciation on the hardware is more than buying time from a server > > farm. Just _one_ of those 8-way servers is enough hardware to build an > > entire distro in an hour or so. 
> > > > What this really allows us to do is experiment with "how parallel can we > > get our build"? Because renting ten 8-way servers in a cluster is > > $8/hour, and distcc already scales trivially over that. Down the road > > what Firmware Linux is working towards is multiple qemu instances > > running in parallel with a central instance distributing builds to each > > one, so each can do its own ./configure in parallel, distribute > > compilation to the distccd instances as it has stuff to compile, and > > then package up the resulting binary into one of those portage tarballs > > and send it back to the central node to install on a network mount that > > the lot of 'em can mount as build context, so the packages can get their > > dependencies right. (You don't want your build taking place in a > > network mount, but your OS being on one you never write to isn't so bad > > as long as you have local storage to build in.) > > > > We'll probably leverage the heck out of Portage for this, and might wind > > up modifying it heavily. Dunno yet. (We can even force dependencies on > > portage so it doesn't need to calculate 'em, the central node can do > > that and then say "you have these packages, _build_"...) > > > > But yeah, hobbyists with a laptop, network access, and a monthly budget > > of $20 can do cluster builds these days. > > would it make sense to start a fund to pay for some time for you to use > like this? That's a very interesting idea, thanks for sharing. At present, however, I think I'll try to stick with Lorenzo's system, because it seems to be able to reproduce the issue somewhat reliably. My 4 core machine unfortunately runs flawlessly, so I suspect that it really depends on the mix of components, where a fast machine is a necessary prerequisite, but not a sufficient one. Some other things seem to need to go into the mix and I've unfortunately not yet identified them...
But the cloud sounds like an interesting long-term idea, it would definitely be useful to be able to conduct some testing on high-end machines. Rainer From patrick.shen at net-m.de Mon Jan 19 10:21:19 2009 From: patrick.shen at net-m.de (Patrick Shen) Date: Mon, 19 Jan 2009 17:21:19 +0800 Subject: [rsyslog] A weird issue Message-ID: <4974460F.2040903@net-m.de> Hi all, Recently I encountered a weird problem. Let me explain below: I've a client which is using traditional syslog (NOT rsyslog) app for storing and forwarding logs to loghost. Here are some "snmpd" logs for example: ########################################################################################## Jan 19 10:03:09 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:34289 Jan 19 10:03:09 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:34289 Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 Jan 19 10:04:10 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:58181 Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 *Jan 19 10:04:10 athos last message repeated 25 times* ########################################################################################## Please take into account the last line. And I've a loghost host for receiving by using rsyslog v3.20.2 and used following dynamic templates to store logs ########################################################################################## $template d_hosts,"/var/rsyslog/HOSTS/%hostname%/%$year%/%$month%/%syslogfacility-text%_%hostname%_%$year%_%$month%_%$day%.log" ########################################################################################## and also opened debug template by following configures in rsyslog.conf.
########################################################################################## $template DEBUG,"Debug line with all properties:\nFROMHOST: '%FROMHOST%', HOSTNAME: '%HOSTNAME%', PRI: %PRI%,\nsyslogtag '%syslogtag%', programname: '%programname%', APP-NAME: '%APP-NAME%', PROCID: '%PROCID%', MSGID: '%MSGID%', FACILITY-TEXT: '%syslogfacility-text%'\nTIMESTAMP: '%TIMESTAMP%', STRUCTURED-DATA: '%STRUCTURED-DATA%',\nmsg: '%msg%'\nrawmsg: '%rawmsg%'\n\n" *.* -/var/rsyslog/debug;DEBUG # or whatever file you like ########################################################################################## I'm monitoring on the server-side now, and checking the last line by raw message. ########################################################################################## Debug line with all properties: FROMHOST: 'athos', HOSTNAME: '*last*', PRI: 30, syslogtag 'message', programname: 'message', APP-NAME: 'message', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'daemon' TIMESTAMP: 'Jan 19 09:59:09', STRUCTURED-DATA: '-', msg: ' repeated 25 times' rawmsg: '<30>last message repeated 25 times' ########################################################################################## Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often). Thanks, Patrick From rgerhards at hq.adiscon.com Mon Jan 19 11:00:27 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 19 Jan 2009 11:00:27 +0100 Subject: [rsyslog] A weird issue In-Reply-To: <4974460F.2040903@net-m.de> References: <4974460F.2040903@net-m.de> Message-ID: <1232359227.2536.6.camel@rf10up.intern.adiscon.com> On Mon, 2009-01-19 at 17:21 +0800, Patrick Shen wrote: > Hi all, > > Recently I encountered a weird problem. Let me explain below: > > I've a client which is using traditional syslog (NOT rsyslog) app for storing and forwarding > logs to loghost. 
> > Here are some "snmpd" logs for example: > ########################################################################################## > Jan 19 10:03:09 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:34289 > Jan 19 10:03:09 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:34289 > Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 > Jan 19 10:04:10 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:58181 > Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 > *Jan 19 10:04:10 athos last message repeated 25 times* > ########################################################################################## > > Please take into account the last line. > > And I've a loghost host for receiving by using rsyslog v3.20.2 and used following dynamic templates to > store logs > ########################################################################################## > $template d_hosts,"/var/rsyslog/HOSTS/%hostname%/%$year%/%$month%/%syslogfacility-text%_%hostname%_%$year%_%$month%_%$day%.log" > ########################################################################################## > > and also opened debug template by following > configures in rsyslog.conf. > ########################################################################################## > $template DEBUG,"Debug line with all properties:\nFROMHOST: '%FROMHOST%', HOSTNAME: '%HOSTNAME%', PRI: %PRI%,\nsyslogtag '%syslogtag%', programname: '%programname%', APP-NAME: '%APP-NAME%', PROCID: > '%PROCID%', MSGID: '%MSGID%', FACILITY-TEXT: '%syslogfacility-text%'\nTIMESTAMP: '%TIMESTAMP%', STRUCTURED-DATA: '%STRUCTURED-DATA%',\nmsg: '%msg%'\nrawmsg: '%rawmsg%'\n\n" > *.* -/var/rsyslog/debug;DEBUG # or whatever file you like > ########################################################################################## > > I'm monitoring on the server-side now, and checking the last line by raw message. 
> ##########################################################################################
> Debug line with all properties:
> FROMHOST: 'athos', HOSTNAME: '*last*', PRI: 30,
> syslogtag 'message', programname: 'message', APP-NAME: 'message', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'daemon'
> TIMESTAMP: 'Jan 19 09:59:09', STRUCTURED-DATA: '-',
> msg: ' repeated 25 times'
> rawmsg: '<30>last message repeated 25 times'
> ##########################################################################################
>
> Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often).

Yes, unfortunately ;) The reason simply is that sysklogd does emit
malformed messages with the "last message repeated..." line. If you look
at a packet capture, you'll see that they do not contain a hostname.
What you see in your sysklogd log is a hostname that is locally
appended.

You can do a similar thing in rsyslog with the fromhost property - it
does not contain the hostname but rather the system that sent the
message. In non-relay cases that should be the same, but in relay
scenarios you see only the last hop (thus rsyslog by default uses RFC
3164 format).

If you need the relay scenario, there is no way around putting rsyslog
on the sending systems, too (or fixing sysklogd, which I guess you need
to do yourself or it won't happen...).

Rainer

From rgerhards at hq.adiscon.com Mon Jan 19 11:10:50 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Mon, 19 Jan 2009 11:10:50 +0100
Subject: [rsyslog] rsyslog 3.20.3 released
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9D6@grfint2.intern.adiscon.com>

Hi all,

Rsyslog 3.20.3, a member of the v3-stable branch, has been released today. It is a bug-fixing release that addresses a potential segfault that could happen if the $AllowedSenders configuration directive is used. It also addresses a doc bug, where the v3-compatibility document had an invalid directive name.
This is a recommended update for all users of the v3-stable branch. Change Log: http://www.rsyslog.com/Article339.phtml Download: http://www.rsyslog.com/Downloads-req-viewdownloaddetails-lid-146.phtml I hope this release is useful. Feedback is appreciated. Best regards, Rainer Gerhards From patrick.shen at net-m.de Mon Jan 19 15:21:26 2009 From: patrick.shen at net-m.de (Patrick Shen) Date: Mon, 19 Jan 2009 22:21:26 +0800 Subject: [rsyslog] A weird issue In-Reply-To: <1232359227.2536.6.camel@rf10up.intern.adiscon.com> References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com> Message-ID: <49748C66.7070102@net-m.de> Rainer Gerhards wrote: > On Mon, 2009-01-19 at 17:21 +0800, Patrick Shen wrote: >> Hi all, >> >> Recently I encountered a weird problem. Let me explain below: >> >> I've a client which is using traditional syslog (NOT rsyslog) app for storing and forwarding >> logs to loghost. >> >> Here are some "snmpd" logs for example: >> ########################################################################################## >> Jan 19 10:03:09 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:34289 >> Jan 19 10:03:09 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:34289 >> Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 >> Jan 19 10:04:10 athos snmpd[1104]: Received SNMP packet(s) from UDP: [192.168.23.7]:58181 >> Jan 19 10:04:10 athos snmpd[1104]: Connection from UDP: [192.168.23.7]:58181 >> *Jan 19 10:04:10 athos last message repeated 25 times* >> ########################################################################################## >> >> Please take into account the last line. 
>> >> And I've a loghost host for receiving by using rsyslog v3.20.2 and used following dynamic templates to >> store logs >> ########################################################################################## >> $template d_hosts,"/var/rsyslog/HOSTS/%hostname%/%$year%/%$month%/%syslogfacility-text%_%hostname%_%$year%_%$month%_%$day%.log" >> ########################################################################################## >> >> and also opened debug template by following >> configures in rsyslog.conf. >> ########################################################################################## >> $template DEBUG,"Debug line with all properties:\nFROMHOST: '%FROMHOST%', HOSTNAME: '%HOSTNAME%', PRI: %PRI%,\nsyslogtag '%syslogtag%', programname: '%programname%', APP-NAME: '%APP-NAME%', PROCID: >> '%PROCID%', MSGID: '%MSGID%', FACILITY-TEXT: '%syslogfacility-text%'\nTIMESTAMP: '%TIMESTAMP%', STRUCTURED-DATA: '%STRUCTURED-DATA%',\nmsg: '%msg%'\nrawmsg: '%rawmsg%'\n\n" >> *.* -/var/rsyslog/debug;DEBUG # or whatever file you like >> ########################################################################################## >> >> I'm monitoring on the server-side now, and checking the last line by raw message. >> ########################################################################################## >> Debug line with all properties: >> FROMHOST: 'athos', HOSTNAME: '*last*', PRI: 30, >> syslogtag 'message', programname: 'message', APP-NAME: 'message', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'daemon' >> TIMESTAMP: 'Jan 19 09:59:09', STRUCTURED-DATA: '-', >> msg: ' repeated 25 times' >> rawmsg: '<30>last message repeated 25 times' >> ########################################################################################## >> >> Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often). 
>
> Yes, unfortunately ;) The reason simply is that sysklogd does emit
> malformed messages with the "last message repeated..." line. If you look
> at a packet capture, you'll see that they do not contain a hostname.
> What you see in your sysklogd log is a hostname that is locally
> appended.

Ah, so simple. I'm surprised. Could you please recommend an app for packet capture?

I'd also like to share two more log examples.

######################################################################################
Debug line with all properties:
FROMHOST: 'helios', HOSTNAME: 'helios', PRI: 171,
syslogtag '', programname: '', APP-NAME: '', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5'
TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-',
msg: ' at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)'
rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)'
######################################################################################

You can see some *spaces* between '<171>' and 'at net ...', and the HOSTNAME property is "helios".
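
To make the effect of those leading spaces concrete, here is a minimal Python sketch of a naive RFC 3164-style header split. It is not rsyslog's actual parser: the function name naive_split, the fallback to the sender's name, and the simplified timestamp pattern are all assumptions, chosen only to mirror the property values that appear in the debug output quoted in this thread.

```python
import re

# "<PRI>" at the very start of the raw message
PRI_RE = re.compile(r"^<(\d{1,3})>")
# Simplified RFC 3164 timestamp such as "Jan 19 10:04:10 " (assumption: English month names)
TS_RE = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")


def naive_split(raw, sender):
    """Split a raw syslog line into pri/hostname/tag/msg the naive way.

    `sender` is the name of the system the packet came from and is used
    as the hostname when the message offers no word in that position.
    Hypothetical helper for illustration only.
    """
    m = PRI_RE.match(raw)
    pri = int(m.group(1)) if m else 13  # 13 = user.notice when PRI is absent
    rest = raw[m.end():] if m else raw

    ts = TS_RE.match(rest)
    if ts:
        rest = rest[ts.end():]

    # The next word is taken as the hostname, whatever it is.
    host, _, tail = rest.partition(" ")
    if not host:
        # Message starts with a space: no word to mistake for a hostname.
        return {"pri": pri, "hostname": sender, "tag": "", "msg": rest}

    # The word after that is taken as the tag; the remainder is the message.
    tag, sep, msg = tail.partition(" ")
    return {"pri": pri, "hostname": host, "tag": tag, "msg": sep + msg}
```

Under these assumptions, naive_split("<30>last message repeated 25 times", "athos") yields hostname 'last' and tag 'message', while a raw message that begins with a space after '<171>' falls back to the sender name. With a header that does not follow the expected layout, whichever word happens to sit in the hostname position is what gets picked up.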
###################################################################################### Debug line with all properties: FROMHOST: 'helios', HOSTNAME: 'Caused', PRI: 171, syslogtag 'by:', programname: 'by', APP-NAME: 'by', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', msg: ' java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorderid=0, refOrderId= 0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/APP/ME-utf8/content/ import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-OK][CONTENT-320-OK] nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnulln ullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnu ll[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume=NULL, track=0, it m_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemidprovider=NULL, itm _viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEvaluation=0, numEval uations=0, licenseprovider_id=2131264, importSt' rawmsg: '<171>Caused by: java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorde rid=0, refOrderId=0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/AP P/ME-utf8/content/import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-O K][CONTENT-320-OK]nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnul lnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnull 
nullnullnullnullnull[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume =NULL, track=0, itm_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemid provider=NULL, itm_viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEva luation=0, numEvaluations=0, licenseprovider_id=2131264, importSt' ###################################################################################### But in above example: Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'. I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided, then HOSTNAME will be set correctly? > You can do a similar thing in rsyslog with the fromhost property - it > does not contain the hostname but rather the system that send the > message. In non-relay cases that should be the same, but in relay > scenarios you see only the last hop (thus rsyslog by default uses RFC > 3164 format). And I thought I could use 'FROMHOST' property, but I have another scenario. ###################################################################################### Debug line with all properties: FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. 
HTTP/1.1" 200 87#012' rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. HTTP/1.1" 200 87#012' ###################################################################################### You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. And I do have reverse zone for that ip in dns setting. Any ideas? > If you need the relay scenario, there is no way around putting rsyslog > on the sending systems, too (or fixing sysklogd, which I guess you need > to do yourself or it won't happen...). > > Rainer Thanks a lot for your information. Best regards, Patrick From jules at visionintel.com Mon Jan 19 15:23:27 2009 From: jules at visionintel.com (Jules Pagna Disso) Date: Mon, 19 Jan 2009 14:23:27 +0000 Subject: [rsyslog] client Message-ID: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> hi there, Is there an example of client sending alert to syslog? is it possible to create and send an alert from the command prompt to syslog? thanks, Jules From patrick.shen at net-m.de Mon Jan 19 15:48:11 2009 From: patrick.shen at net-m.de (Patrick Shen) Date: Mon, 19 Jan 2009 22:48:11 +0800 Subject: [rsyslog] client In-Reply-To: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> Message-ID: <497492AB.5030901@net-m.de> Jules Pagna Disso wrote: > hi there, > > Is there an example of client sending alert to syslog? 
>
> is it possible to create and send an alert from the command prompt to
> syslog?
>
> thanks,
> Jules

Do you mean 'logger' ? Try 'man logger'.

Best regards,
Patrick

From lists at luigirosa.com Mon Jan 19 15:45:46 2009
From: lists at luigirosa.com (Luigi Rosa)
Date: Mon, 19 Jan 2009 15:45:46 +0100
Subject: [rsyslog] client
In-Reply-To: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com>
References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com>
Message-ID: <4974921A.5040108@luigirosa.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jules Pagna Disso said the following on 19/01/09 15:23:

> Is there an example of client sending alert to syslog?

You mean something like the logger utility?

http://linux.about.com/library/cmd/blcmdl1_logger.htm

Ciao,
luigi

- --
/
+--[Luigi Rosa]--
\

She was a lovely girl. Our courtship was fast and furious.
I was fast and she was furious.
--Max Kauffmann
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkl0khQACgkQ3kWu7Tfl6ZTTtwCgrgL4RTPoLiZoKaa0uw2mz9y/
KAYAnj/1BMfinxINNSgttd9TIOGfi/z4
=LxGV
-----END PGP SIGNATURE-----

From mrdemeanour at jackpot.uk.net Mon Jan 19 15:46:14 2009
From: mrdemeanour at jackpot.uk.net (Mr. Demeanour)
Date: Mon, 19 Jan 2009 14:46:14 +0000
Subject: [rsyslog] client
In-Reply-To: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com>
References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com>
Message-ID: <49749236.6060108@jackpot.uk.net>

Jules Pagna Disso wrote:
> hi there,
>
> Is there an example of client sending alert to syslog?
>
> is it possible to create and send an alert from the command prompt to
> syslog?

Try:

$ logger "Test log message"

Regards,
Jack.
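
For anyone who wants to do roughly the same thing from a script rather than with logger, the following Python sketch builds an RFC 3164-style line (PRI, timestamp, hostname, tag, text) and can send it over UDP. The helper names build_message and send_udp, the abbreviated facility and severity tables, and the server name loghost.example.com are assumptions made for illustration; only the numeric codes and the conventional UDP port 514 come from the syslog conventions themselves.

```python
import socket
import time

# Subset of the standard facility and severity codes (RFC 3164)
FACILITY = {"kern": 0, "user": 1, "daemon": 3, "local5": 21}
SEVERITY = {"alert": 1, "err": 3, "info": 6}
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_message(facility, severity, hostname, tag, text, when=None):
    """Return '<PRI>Mmm dd hh:mm:ss hostname tag: text'.

    PRI is facility * 8 + severity; the day of month is space-padded,
    as the traditional timestamp format expects.
    """
    t = time.localtime() if when is None else when
    pri = FACILITY[facility] * 8 + SEVERITY[severity]
    stamp = "%s %2d %02d:%02d:%02d" % (
        MONTHS[t.tm_mon - 1], t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec)
    return "<%d>%s %s %s: %s" % (pri, stamp, hostname, tag, text)


def send_udp(message, server, port=514):
    """Send one message as a single UDP datagram to the given server."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (server, port))


# Example use (hypothetical host names):
#   msg = build_message("daemon", "alert", "athos", "mytest", "Test log message")
#   send_udp(msg, "loghost.example.com")
```

Because the line carries a timestamp and a hostname in the expected positions, a receiver has a proper header to parse. The PRI arithmetic also explains the numbers seen elsewhere in this thread: daemon.info is 3 * 8 + 6 = 30, and local5.err is 21 * 8 + 3 = 171.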
From rgerhards at hq.adiscon.com Mon Jan 19 14:45:41 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Mon, 19 Jan 2009 14:45:41 +0100
Subject: [rsyslog] A weird issue
In-Reply-To: <49748C66.7070102@net-m.de>
References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com> <49748C66.7070102@net-m.de>
Message-ID: <1232372741.2536.15.camel@rf10up.intern.adiscon.com>

On Mon, 2009-01-19 at 22:21 +0800, Patrick Shen wrote:
> >> Does anyone has any idea why HOSTNAME property is 'last'? (The timestamp is not important, because these messages occur often).
> >
> > Yes, unfortunately ;) The reason simply is that sysklogd does emit
> > malformed messages with the "last message repeated..." line. If you look
> > at a packet capture, you'll see that they do not contain a hostname.
> > What you see in your sysklogd log is a hostname that is locally
> > appended.
>
> Ah, so simple. I'm surprised. Could you please recommend which app for packet capture?

Actually, I should have read your mail more carefully. You already use
rawmsg, which is the second best thing after the packet capture. But in
this case, you'll see exactly the same thing (if you don't trust me, use
WireShark, an excellent open source capture app).

Look at this:

rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)'

Compare that to the header that is described in RFC 3164 and you will
see that there is nothing close to a real header inside that message. As
the message is malformed, funny things can happen. In other words,
results are unpredictable, and this is what you are seeing.

>
> And I'd like to share another 2 log examples.
> > ###################################################################################### > Debug line with all properties: > FROMHOST: 'helios', HOSTNAME: 'helios', PRI: 171, > syslogtag '', programname: '', APP-NAME: '', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', > msg: ' at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' > rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' > ###################################################################################### > > You could see some *spaces* between '<171>' and 'at net ...'. And HOSTNAME propety is "helios". > > > ###################################################################################### > Debug line with all properties: > FROMHOST: 'helios', HOSTNAME: 'Caused', PRI: 171, > syslogtag 'by:', programname: 'by', APP-NAME: 'by', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', > msg: ' java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorderid=0, refOrderId= > 0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/APP/ME-utf8/content/ > import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-OK][CONTENT-320-OK] > nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnulln > ullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnu > ll[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume=NULL, track=0, it > m_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemidprovider=NULL, itm > _viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, 
meanEvaluation=0, numEval > uations=0, licenseprovider_id=2131264, importSt' > rawmsg: '<171>Caused by: java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorde > rid=0, refOrderId=0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/AP > P/ME-utf8/content/import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-O > K][CONTENT-320-OK]nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnul > lnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnull > nullnullnullnullnull[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume > =NULL, track=0, itm_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemid > provider=NULL, itm_viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEva > luation=0, numEvaluations=0, licenseprovider_id=2131264, importSt' > ###################################################################################### > > But in above example: > Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'. > > I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided, > then HOSTNAME will be set correctly? that's probably the case with current code, but I don't guarantee that will stay. Again: invalid format => unpredictable results on all header fields > > > > You can do a similar thing in rsyslog with the fromhost property - it > > does not contain the hostname but rather the system that send the > > message. In non-relay cases that should be the same, but in relay > > scenarios you see only the last hop (thus rsyslog by default uses RFC > > 3164 format). 
> > And I thought I could use 'FROMHOST' property, but I have another scenario. > > ###################################################################################### > Debug line with all properties: > FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, > syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', > msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > HTTP/1.1" 200 87#012' > rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > HTTP/1.1" 200 87#012' > ###################################################################################### > that's a correctly formatted message > You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. > And I do have reverse zone for that ip in dns setting. Any ideas? To get the name, you indeed need to enable remote lookups. One solution would be to permit different settings for different remote hosts, but that would be a feature request. 
Would make sense, but I am currently rather busy. If you add it to the bugzilla http://bugzilla.adiscon.com I'll see that I implement it when nothing of higher priority is in front of it. Rainer > > > If you need the relay scenario, there is no way around putting rsyslog > > on the sending systems, too (or fixing sysklogd, which I guess you need > > to do yourself or it won't happen...). > > > > Rainer > > Thanks a lot for your information. > > Best regards, > Patrick > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From patrick.shen at net-m.de Tue Jan 20 04:05:20 2009 From: patrick.shen at net-m.de (Patrick Shen) Date: Tue, 20 Jan 2009 11:05:20 +0800 Subject: [rsyslog] A weird issue In-Reply-To: <1232372741.2536.15.camel@rf10up.intern.adiscon.com> References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com> <49748C66.7070102@net-m.de> <1232372741.2536.15.camel@rf10up.intern.adiscon.com> Message-ID: <49753F70.5050601@net-m.de> Rainer Gerhards wrote: > On Mon, 2009-01-19 at 22:21 +0800, Patrick Shen wrote: >> Ah, so simple. I'm surprised. Could you please recommend which app for packet capture? > > Actually, I should have read your mail more careful. You already use > rawmsg, which is the second best thing after the packet capture. But in > this case, you'll see exactly the same thing (if you don't trust me, use > WireShark, an excellent open source capture app). > > Look at this: > > rawmsg: '<171> at > net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' > > Compare that the the header that is describe in RFC 3164 and you will > see that there is nothing close to a real header inside that message. As > the message is malformed, funny things can happen. In other words, > results are unpredictable, and this is what you are seeing. > >> And I'd like to share another 2 log examples. 
>> >> ###################################################################################### >> Debug line with all properties: >> FROMHOST: 'helios', HOSTNAME: 'helios', PRI: 171, >> syslogtag '', programname: '', APP-NAME: '', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' >> TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', >> msg: ' at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' >> rawmsg: '<171> at net.netm.me.coim.GenericImportWorker.run(GenericImportWorker.java:47)' >> ###################################################################################### >> >> You could see some *spaces* between '<171>' and 'at net ...'. And HOSTNAME propety is "helios". >> >> >> ###################################################################################### >> Debug line with all properties: >> FROMHOST: 'helios', HOSTNAME: 'Caused', PRI: 171, >> syslogtag 'by:', programname: 'by', APP-NAME: 'by', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' >> TIMESTAMP: 'Jan 19 10:13:13', STRUCTURED-DATA: '-', >> msg: ' java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorderid=0, refOrderId= >> 0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/APP/ME-utf8/content/ >> import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-OK][CONTENT-320-OK] >> nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnulln >> ullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnu >> ll[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume=NULL, track=0, it >> m_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemidprovider=NULL, itm >> _viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, 
externalinfo=NULL, authorizedAge=0, meanEvaluation=0, numEval >> uations=0, licenseprovider_id=2131264, importSt' >> rawmsg: '<171>Caused by: java.sql.BatchUpdateException: Batch entry 0 update item set itm_orderid=3722338, itm_masterorde >> rid=0, refOrderId=0, itm_name1=Bach: Weihnachtsoratorium, itm_name2=New London Consort, itm_author=NULL, itm_info=/var/AP >> P/ME-utf8/content/import/Universal-ClassicJazz/MusicDataInProgress/2000000338428, itm_info2=[NEW][ClassicJazz] [CONTENT-O >> K][CONTENT-320-OK]nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnul >> lnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnull >> nullnullnullnullnull[CHECK-MMC][CHECK-AGAIN], itm_lang=NULL, itm_isrc=NULL, itm_grid=NULL, itm_icpn=0028948002795, volume >> =NULL, track=0, itm_pricegroup=1880, itm_providerid=30000, itm_orderidprovider=0, itm_pricegroupprovider=1363, itm_itemid >> provider=NULL, itm_viewable=1, itm_copyrightfree=F, itm_withdrmforwardlock=T, externalinfo=NULL, authorizedAge=0, meanEva >> luation=0, numEvaluations=0, licenseprovider_id=2131264, importSt' >> ###################################################################################### >> >> But in above example: >> Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'. >> >> I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided, >> then HOSTNAME will be set correctly? > > that's probably the case with current code, but I don't guarantee that > will stay. Again: invalid format => unpredictable results on all header > fields OK, now I see the malformed format messages will cause unpredictable results in rsyslog. That's quite helpful. >> >> And I thought I could use 'FROMHOST' property, but I have another scenario. 
>> >> ###################################################################################### >> Debug line with all properties: >> FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, >> syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' >> TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', >> msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. >> HTTP/1.1" 200 87#012' >> rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. >> HTTP/1.1" 200 87#012' >> ###################################################################################### >> > that's a correctly formatted message > >> You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. >> And I do have reverse zone for that ip in dns setting. Any ideas? > > To get the name, you indeed need to enable remote lookups. One solution > would be to permit different settings for different remote hosts, but > that would be a feature request. Would make sense, but I am currently > rather busy. 
If you add it to the bugzilla http://bugzilla.adiscon.com > I'll see that I implement it when nothing of higher priority is in front > of it. I've filed a bugzilla report [1] for your information. Anyway, one more question, if I use rsyslog at the client side, will it avoid malformed/invalid format message sending out? [1]: http://bugzilla.adiscon.com/show_bug.cgi?id=116 Thanks a lot for your help, Patrick From rgerhards at hq.adiscon.com Mon Jan 19 18:16:08 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Mon, 19 Jan 2009 18:16:08 +0100 Subject: [rsyslog] A weird issue In-Reply-To: <49753F70.5050601@net-m.de> References: <4974460F.2040903@net-m.de> <1232359227.2536.6.camel@rf10up.intern.adiscon.com> <49748C66.7070102@net-m.de> <1232372741.2536.15.camel@rf10up.intern.adiscon.com> <49753F70.5050601@net-m.de> Message-ID: <1232385368.2536.22.camel@rf10up.intern.adiscon.com> On Tue, 2009-01-20 at 11:05 +0800, Patrick Shen wrote: > >> But in above example: > >> Word 'Caused' is between '<171>' and 'by ...'. So the HOSTNAME is accidentally set to 'Caused'. > >> > >> I'm wondering if it's a coincidence that if spaces exist between and messages in rawmsg and hostname is not provided, > >> then HOSTNAME will be set correctly? > > > > that's probably the case with current code, but I don't guarantee that > > will stay. Again: invalid format => unpredictable results on all header > > fields > > OK, now I see the malformed format messages will cause unpredictable results in rsyslog. > That's quite helpful. > > >> > >> And I thought I could use 'FROMHOST' property, but I have another scenario. 
> >> > >> ###################################################################################### > >> Debug line with all properties: > >> FROMHOST: '172.20.101.6', HOSTNAME: 'icarus', PRI: 174, > >> syslogtag 'httpd8330.sms:', programname: 'httpd8330.sms', APP-NAME: 'httpd8330.sms', PROCID: '-', MSGID: '-', FACILITY-TEXT: 'local5' > >> TIMESTAMP: 'Jan 19 15:14:50', STRUCTURED-DATA: '-', > >> msg: ' xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > >> HTTP/1.1" 200 87#012' > >> rawmsg: '<174>2009-01-19T15:14:50.923441+01:00 icarus httpd8330.sms: xxx.xxx.internal - - [19/Jan/2009:15:14:50 +0100] "GET > >> /itransport/mbg/mbg/io/mbg?provider=TMOBILE_XTC_3ABO_LIVE&request-type=chargeSubscription&critialdata&originator-id=PGW_6686//S0002865748&service-type=web&payment-type=subscr&amount=299&subscription-amount=299&item-amount=299&msisdn=00491704127650&subscription-id=1662126457&subscription-type=2&reply-path=http://pgw:8330/sms/pgw/intern/ReportReception&sms-text=Dein+Abo+wurde+mit+2.99+Euro+gebucht. > >> HTTP/1.1" 200 87#012' > >> ###################################################################################### > >> > > that's a correctly formatted message > > > >> You could see in HOSTNAME field, it's correct set to 'icarus'. But in FROMHOST field is ip address. > >> And I do have reverse zone for that ip in dns setting. Any ideas? > > > > To get the name, you indeed need to enable remote lookups. One solution > > would be to permit different settings for different remote hosts, but > > that would be a feature request. 
Would make sense, but I am currently > > rather busy. If you add it to the bugzilla http://bugzilla.adiscon.com > > I'll see that I implement it when nothing of higher priority is in front > > of it. > > I've filed a bugzilla report [1] for your information. Anyway, one more question, if I use rsyslog at > the client side, will it avoid malformed/invalid format message sending out? I have tweaked the feature request a bit so that it matches the actual request ;) As far as rsyslog on the client side is concerned, you need to do nothing. If you use the default templates, it emits correctly formatted messages. Rainer From rgerhards at hq.adiscon.com Tue Jan 20 14:00:00 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 20 Jan 2009 14:00:00 +0100 Subject: [rsyslog] Anyone in Computer Forensics? Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> Hi all, are there some folks on this list who are working in the computer forensics space? I wonder how syslog, and rsyslog in specific, works in forensics. Most importantly, I am interested in what stops acceptance in the forensics field (or what nurtures it). I am interested in feedback to help shape the medium to long term schedule for rsyslog (including those initiatives that I should learn more about). Any feedback is appreciated. Thanks, Rainer From rgerhards at hq.adiscon.com Tue Jan 20 15:27:57 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Tue, 20 Jan 2009 15:27:57 +0100 Subject: [rsyslog] Is rsyslog leaking memory? In-Reply-To: References: Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44F9F4@grfint2.intern.adiscon.com> FYI: Based on a forum thread, I just created this page: http://wiki.rsyslog.com/index.php/Reducing_memory_usage I think it actually describes the source of the 8MB memory blocks. 
Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Michael Biebl > Sent: Saturday, January 17, 2009 11:11 AM > To: rsyslog-users > Subject: [rsyslog] Is rsyslog leaking memory? > > Hi, > > I'm running rsyslog 3.20.2 > > I noticed the following: > # /etc/init.d/rsyslog restart > VSZ RSS (as reported by ps) > 27100 1184 > # logger foo > 27100 1196 > # logger foo (1000x) > 27100 1200 > # logger foo (1000x) > 27100 1204 > # logger foo (1000x) > 27100 1208 > > and so on. > > > This made me wonder, if rsyslog is leaking memory somewhere. > > I also noticed, that for each loaded module, rsyslog resevers exactly > 8 Mb of anoymous memory (pmap -d `pgrep rsyslog`) > With a couple of loaded modules you easily get over 50Mb VSZ. > > > Michael > -- > Why is it that all of the instruments seeking intelligent life in the > universe are pointed away from Earth? > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From aoz.syn at gmail.com Tue Jan 20 16:39:34 2009 From: aoz.syn at gmail.com (RB) Date: Tue, 20 Jan 2009 08:39:34 -0700 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> Message-ID: <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> On Tue, Jan 20, 2009 at 06:00, Rainer Gerhards wrote: > are there some folks on this list who are working in the computer > forensics space? I wonder how syslog, and rsyslog in specific, works in > forensics. Could you clarify what you're asking here? There are two clearly delineated portions of the computer forensics space: that which is analyzed and that which performs the analysis. Are you looking more to improve analysis of rsyslog instances or to integrate into back-end tools? 
> Most importantly, I am interested in what stops acceptance in > the forensics field (or what nurtures it). I am interested in feedback > to help shape the medium to long term schedule for rsyslog (including > those initiatives that I should learn more about). Law Enforcement. LE is by far the biggest driver in industry acceptance, nearly regardless of technology. The "primary" forensics tool, EnCase, is a perfect example: there are many arguably better products on the market, but because huge numbers of extremely non-technical police officers are comfortable with it (since Guidance gives steep LE discounts), it is by far the biggest player. There isn't a huge amount of logging to be done in the analysis space. Although centralized solutions are becoming more prevalent, most of the critical logs are being (or will be) stored with the encrypted/signed forensic data for non-repudiation. Even so, there is more effort going into improving analysis (carvers, documenting formats, etc.) than building up proper logging and storage. From david at lang.hm Tue Jan 20 20:54:13 2009 From: david at lang.hm (david at lang.hm) Date: Tue, 20 Jan 2009 11:54:13 -0800 (PST) Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> Message-ID: On Tue, 20 Jan 2009, RB wrote: > On Tue, Jan 20, 2009 at 06:00, Rainer Gerhards wrote: >> are there some folks on this list who are working in the computer >> forensics space? I wonder how syslog, and rsyslog in specific, works in >> forensics. > > Could you clarify what you're asking here? There are two clearly > delineated portions of the computer forensics space: that which is > analyzed and that which performs the analysis. Are you looking more > to improve analysis of rsyslog instances or to integrate into back-end > tools? 
> >> Most importantly, I am interested in what stops acceptance in >> the forensics field (or what nurtures it). I am interested in feedback >> to help shape the medium to long term schedule for rsyslog (including >> those initiatives that I should learn more about). I think that what he is asking about is what makes logs acceptable or not acceptable when doing forensics, and what configurations of rsyslog would be acceptable. for example, rsyslog can be configured to use disk-based queues on redundant drives and RELP for network communication, and the result will be that rsyslog is _very_ reliable in terms of preserving messages that get to it (at the cost of performance, but you can throw hardware at it to deal with that) this is probably acceptable as a log for forensics type work. but what about the more normal settings? (tcp or udp network communications with memory-based queues). those settings can loose data, but won't under normal conditions (assuming the network isn't so busy that it drops UDP packets) David Lang From jules at visionintel.com Tue Jan 20 20:14:58 2009 From: jules at visionintel.com (Jules Pagna Disso) Date: Tue, 20 Jan 2009 19:14:58 +0000 Subject: [rsyslog] client In-Reply-To: <497492AB.5030901@net-m.de> References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com> <497492AB.5030901@net-m.de> Message-ID: <69544300901201114r7671a2f9t3b8b8f27797b1cd7@mail.gmail.com> hi there, thanks for the answer it helped and does what I wanted. Now, I am wonder if there is a sample code how to send log file from a c/c++ code to syslog deamon. thanks Jules 2009/1/19 Patrick Shen > Jules Pagna Disso wrote: > > hi there, > > > > Is there an example of client sending alert to syslog? > > > > is it possible to create and send an alert from the command prompt to > > syslog? > > > > thanks, > > Jules > > Do you mean 'logger' ? > > Try 'man logger'. 
> > Best regards, > Patrick > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com >
From danson at rackspace.com Tue Jan 20 20:27:45 2009 From: danson at rackspace.com (Daniel Anson) Date: Tue, 20 Jan 2009 13:27:45 -0600 Subject: [rsyslog] client In-Reply-To: <69544300901201114r7671a2f9t3b8b8f27797b1cd7@mail.gmail.com> References: <69544300901190623g77d5f317n885a1e923f134f9f@mail.gmail.com><497492AB.5030901@net-m.de> <69544300901201114r7671a2f9t3b8b8f27797b1cd7@mail.gmail.com> Message-ID: <3435_1232479837_n0KJUY8N004510_96AF20FDF4301D419B33CCE8E3A0132B0ACED7E8@SAT4MX07.RACKSPACE.CORP>

I use this:

>gcc -o syslog_writer syslog_writer.c
>./syslog_writer 300 <-- This is the number of messages it will write

#include <stdio.h>
#include <stdlib.h>
#include <syslog.h>

int main(int argc, char **argv)
{
    int num_syslogs, i;

    /* the message count is required as the first argument */
    if (argc < 2) {
        fprintf(stderr, "usage: %s <number of messages>\n", argv[0]);
        return 1;
    }
    num_syslogs = atoi(argv[1]);

    /* tag every message with the program name and PID, facility local1 */
    openlog("syslog_writer", LOG_CONS | LOG_PID, LOG_LOCAL1);
    for (i = 0; i < num_syslogs; i++) {
        syslog(LOG_NOTICE, "syslog_writer: log number %d", i);
    }
    closelog();
    return 0;
}

-----Original Message----- From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Jules Pagna Disso Sent: Tuesday, January 20, 2009 1:15 PM To: rsyslog-users Subject: Re: [rsyslog] client hi there, thanks for the answer it helped and does what I wanted. Now, I am wonder if there is a sample code how to send log file from a c/c++ code to syslog deamon. thanks Jules 2009/1/19 Patrick Shen > Jules Pagna Disso wrote: > > hi there, > > > > Is there an example of client sending alert to syslog? > > > > is it possible to create and send an alert from the command prompt to > > syslog? > > > > thanks, > > Jules > > Do you mean 'logger' ? > > Try 'man logger'.
> > Best regards, > Patrick > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From aoz.syn at gmail.com Wed Jan 21 18:59:42 2009 From: aoz.syn at gmail.com (RB) Date: Wed, 21 Jan 2009 10:59:42 -0700 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> Message-ID: <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> On Tue, Jan 20, 2009 at 12:54, wrote: > I think that what he is asking about is what makes logs acceptable or not > acceptable when doing forensics, and what configurations of rsyslog would > be acceptable. That's still unclear as to whether the logging instances are being analyzed or they are part of the analysis process (i.e. logging investigator actions, "interesting" items, etc.).
> for example, rsyslog can be configured to use disk-based queues on > redundant drives and RELP for network communication, and the result will > be that rsyslog is _very_ reliable in terms of preserving messages that > get to it (at the cost of performance, but you can throw hardware at it to > deal with that) > > this is probably acceptable as a log for forensics type work. > > but what about the more normal settings? (tcp or udp network > communications with memory-based queues). those settings can loose data, > but won't under normal conditions (assuming the network isn't so busy that > it drops UDP packets) Generally speaking, forensics prefers the "save everything, impossible to lose" approach. A single lost message probably won't break a given case, but the possibility is definitely there. RELP with disk queues on hardware-redundant drives would probably be a good start if you're looking to ease future analysis, but it is my opinion that networked logging of the forensic process is both unlikely and overkill, as most analysis processes want their logs integrated instead of held as a separate source. One item I have had on my wish-list for quite some time is the ability to log directly to a UDF VAT filesystem (incremental writes on write-once optical media). Poor man's WORM, if you will. It would enable physical assurance that log data is unmodified up to the point of compromise. Add in the idea of incremental checksums or signing, and you have an extremely controlled, verifiable log source. Of course, it doesn't have to be solved in rsyslog-space, but it'd definitely be useful. RB From david at lang.hm Wed Jan 21 20:55:25 2009 From: david at lang.hm (david at lang.hm) Date: Wed, 21 Jan 2009 11:55:25 -0800 (PST) Subject: [rsyslog] Anyone in Computer Forensics? 
In-Reply-To: <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> Message-ID: On Wed, 21 Jan 2009, RB wrote: > On Tue, Jan 20, 2009 at 12:54, wrote: >> I think that what he is asking about is what makes logs acceptable or not >> acceptable when doing forensics, and what configurations of rsyslog would >> be acceptable. > > That's still unclear as to whether the logging instances are being > analyzed or they are part of the analysis process (i.e. logging > investigator actions, "interesting" items, etc.). I think it's the logs being analysed, not logging investigator actions (other than the extent that things the investigators do would be logged if anyone did them) >> for example, rsyslog can be configured to use disk-based queues on >> redundant drives and RELP for network communication, and the result will >> be that rsyslog is _very_ reliable in terms of preserving messages that >> get to it (at the cost of performance, but you can throw hardware at it to >> deal with that) >> >> this is probably acceptable as a log for forensics type work. >> >> but what about the more normal settings? (tcp or udp network >> communications with memory-based queues). those settings can loose data, >> but won't under normal conditions (assuming the network isn't so busy that >> it drops UDP packets) > > Generally speaking, forensics prefers the "save everything, impossible > to lose" approach. A single lost message probably won't break a given > case, but the possibility is definitely there. 
this is the most paranoid/conservative view, and by this definition there are basicly no logs in existance that meet the forensics requirements > RELP with disk queues > on hardware-redundant drives would probably be a good start if you're > looking to ease future analysis, but it is my opinion that networked > logging of the forensic process is both unlikely and overkill, as most > analysis processes want their logs integrated instead of held as a > separate source. > > One item I have had on my wish-list for quite some time is the ability > to log directly to a UDF VAT filesystem (incremental writes on > write-once optical media). Poor man's WORM, if you will. It would > enable physical assurance that log data is unmodified up to the point > of compromise. Add in the idea of incremental checksums or signing, > and you have an extremely controlled, verifiable log source. Of > course, it doesn't have to be solved in rsyslog-space, but it'd > definitely be useful. frankly, if you really need write-only media, the best thing to do (volume permitting) is to dump to a printer. David Lang From aoz.syn at gmail.com Wed Jan 21 21:59:28 2009 From: aoz.syn at gmail.com (RB) Date: Wed, 21 Jan 2009 13:59:28 -0700 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> Message-ID: <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> On Wed, Jan 21, 2009 at 12:55, wrote: > this is the most paranoid/conservative view, and by this definition there > are basicly no logs in existance that meet the forensics requirements Rather than set an unattainable standard, my intent was to communicate the conservative approach forensics would rather take. Edge cases and mitigating controls are acceptable as long as they are well-documented - that's basic security practice. 
I would rather see a solution that has 100 well-documented lossy edge cases than one that claims to be lossless with no proofs to back it. > frankly, if you really need write-only media, the best thing to do (volume > permitting) is to dump to a printer. You may want to recalculate; even 6-point font on large (14.875x11.5") tractor-feed paper only fits ~80MB per 3500-sheet box. Or, put another way, 2 512-byte events per second will burn through a $70 case per day. Or 6.5 reams of US Letter per day. Extremely limited volume. From david at lang.hm Wed Jan 21 23:19:01 2009 From: david at lang.hm (david at lang.hm) Date: Wed, 21 Jan 2009 14:19:01 -0800 (PST) Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com> <4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com> <4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> Message-ID: On Wed, 21 Jan 2009, RB wrote: > On Wed, Jan 21, 2009 at 12:55, wrote: >> this is the most paranoid/conservative view, and by this definition there >> are basicly no logs in existance that meet the forensics requirements > > Rather than set an unattainable standard, my intent was to communicate > the conservative approach forensics would rather take. Edge cases and > mitigating controls are acceptable as long as they are well-documented > - that's basic security practice. I would rather see a solution that > has 100 well-documented lossy edge cases than one that claims to be > lossless with no proofs to back it. the problem is that so many forensics people list the perfect situation and tell people that anything less won't stand up in court. 
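RB's paper-volume estimate above is easy to check: the daily volume is events per second times bytes per event times 86400 seconds. A small sketch of that arithmetic in C follows. The function names are made up for illustration, and the 80 MB box capacity is RB's own rough figure, not a measured one.

```c
/* Bytes produced in one day by a steady stream of fixed-size events. */
unsigned long long bytes_per_day(unsigned long long events_per_sec,
                                 unsigned long long bytes_per_event)
{
    const unsigned long long seconds_per_day = 24ULL * 60ULL * 60ULL; /* 86400 */
    return events_per_sec * bytes_per_event * seconds_per_day;
}

/* Days one box of paper lasts, given its capacity in bytes.
 * The capacity is an input; RB's ~80 MB per box is only an estimate. */
double days_per_box(unsigned long long box_capacity_bytes,
                    unsigned long long events_per_sec,
                    unsigned long long bytes_per_event)
{
    return (double)box_capacity_bytes /
           (double)bytes_per_day(events_per_sec, bytes_per_event);
}
```

For 2 events per second at 512 bytes each, this gives 2 * 512 * 86400 = 88,473,600 bytes per day. That is slightly more than the ~80 MB quoted for one box, so the "one case per day" figure holds up.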
like everything else, it's a reliability/performance/cost trade-off but we really aren't answering the initial question here (or rather we are demonstrating that there isn't a clear answer to the question) >> frankly, if you really need write-only media, the best thing to do (volume >> permitting) is to dump to a printer. > > You may want to recalculate; even 6-point font on large (14.875x11.5") > tractor-feed paper only fits ~80MB per 3500-sheet box. Or, put > another way, 2 512-byte events per second will burn through a $70 case > per day. Or 6.5 reams of US Letter per day. Extremely limited > volume. that's why I said volume permitting (and for your most critical logs the volume is probably fairly low) David Lang From rgerhards at hq.adiscon.com Wed Jan 21 22:21:08 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Wed, 21 Jan 2009 22:21:08 +0100 Subject: [rsyslog] Anyone in Computer Forensics? In-Reply-To: <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> References: <577465F99B41C842AAFBE9ED71E70ABA44F9EF@grfint2.intern.adiscon.com><4255c2570901200739v7b60da13qf8fbf425c83503f8@mail.gmail.com><4255c2570901210959t130dd49oc0402ebe7d8c2b69@mail.gmail.com> <4255c2570901211259o3b54573dp746a4d8f41efee24@mail.gmail.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA0B@grfint2.intern.adiscon.com> Hi all, Sorry for posting the question and then being offline. I had a meeting and afterwards was a bit more swamped than I expected ;) Thanks for the good answers so far. My question was vague, but that reflected that I actually do not exactly know what to ask for. While I took a look at forensics every now and then, this is not an area where I really have any deep expertise. However, I should have stated that I am primarily interested in the event detection/gathering, transmission and storage part of the picture. That's where rsyslog can play a role (that limits the "event detection" process to listening to whoever wants to talk to it).
The analysis part is beyond my scope right now (and probably will be for quite some time). As I said, I do not have an immediate need, but would like to understand the needs a bit better (and you have already provided good advice so far :)). The root cause of my question is that I would like to refine my medium-term, maybe long-term, vision. While I think I cannot implement any of the outcome, it helps me tune the implementation of things I do in a way that facilitates forensic needs (at least in cases where I have a choice). Without that information, I would probably do things in ways that will require much more effort once I get to "forensics-readiness". I hope this clarifies things, and sorry for not replying sooner. I will probably be a bit swamped 'til the end of the week, but will try to be more responsive now :) Thanks again for all that fine information, please keep it flowing. It is very useful. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com > [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of RB > Sent: Wednesday, January 21, 2009 9:59 PM > To: rsyslog-users > Subject: Re: [rsyslog] Anyone in Computer Forensics? > > On Wed, Jan 21, 2009 at 12:55, wrote: > > this is the most paranoid/conservative view, and by this > definition there > > are basicly no logs in existance that meet the forensics > requirements > > Rather than set an unattainable standard, my intent was to communicate > the conservative approach forensics would rather take. Edge cases and > mitigating controls are acceptable as long as they are well-documented > - that's basic security practice. I would rather see a solution that > has 100 well-documented lossy edge cases than one that claims to be > lossless with no proofs to back it. > > > frankly, if you really need write-only media, the best > thing to do (volume > > permitting) is to dump to a printer.
> > You may want to recalculate; even 6-point font on large (14.875x11.5") > tractor-feed paper only fits ~80MB per 3500-sheet box. Or, put > another way, 2 512-byte events per second will burn through a $70 case > per day. Or 6.5 reams of US Letter per day. Extremely limited > volume. > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From milton at calnek.com Thu Jan 22 02:24:48 2009 From: milton at calnek.com (Milton Calnek) Date: Wed, 21 Jan 2009 19:24:48 -0600 Subject: [rsyslog] Multiple devices with same ip address. Message-ID: <4977CAE0.1040403@calnek.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, I'm running a test lab with gear where every piece of gear under test has the same ip address. I have separated them via vlans, but I want to be able to send syslog from these devices to a central host... but with everything having the same ip address, there doesn't seem to be a way easily separate the logs. I see how to log based on ip, but not MAC nor interface. Before I invest in the development time, I was wondering if you folks have any suggestions? Thanks. - -- Milton Calnek BSc, A/Slt(Ret.) milton at calnek.com 306-717-8737 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org iD8DBQFJd8rgHgnbf2T2QqMRArhdAKCCisNIrs+ohNoq2AUiaaiZJdT6SwCfSS3u 4r5JOPJn6SBPWlzMXUBjfQE= =eVoR -----END PGP SIGNATURE----- -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From rsyslog at lists.bod.org Thu Jan 22 03:31:39 2009 From: rsyslog at lists.bod.org (Paul Chambers) Date: Wed, 21 Jan 2009 18:31:39 -0800 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977CAE0.1040403@calnek.com> References: <4977CAE0.1040403@calnek.com> Message-ID: <4977DA8B.3010309@lists.bod.org> Couldn't you use NAT on the vlan interfaces? 
that way traffic on each interface could be mapped to a different IP address as seen by the logging machine. -- Paul Milton Calnek wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi, > > I'm running a test lab with gear where every piece of gear > under test has the same ip address. > > I have separated them via vlans, but I want to be able to send syslog > from these devices to a central host... but with everything having the > same ip address, there doesn't seem to be a way easily separate the logs. > I see how to log based on ip, but not MAC nor interface. > > Before I invest in the development time, I was wondering if you folks > have any suggestions? > > Thanks. > - -- > Milton Calnek BSc, A/Slt(Ret.) > milton at calnek.com > 306-717-8737 > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.5 (GNU/Linux) > Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org > > iD8DBQFJd8rgHgnbf2T2QqMRArhdAKCCisNIrs+ohNoq2AUiaaiZJdT6SwCfSS3u > 4r5JOPJn6SBPWlzMXUBjfQE= > =eVoR > -----END PGP SIGNATURE----- > > From milton at calnek.com Thu Jan 22 04:26:25 2009 From: milton at calnek.com (Milton Calnek) Date: Wed, 21 Jan 2009 21:26:25 -0600 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977DA8B.3010309@lists.bod.org> References: <4977CAE0.1040403@calnek.com> <4977DA8B.3010309@lists.bod.org> Message-ID: <4977E761.7070903@calnek.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Paul Chambers wrote: > Couldn't you use NAT on the vlan interfaces? that way traffic on each > interface could be mapped to a different IP address as seen by the > logging machine. I tried that. It didn't work for me. I don't remember the details just now, but it had something to do with the order things happen on the linux IP stack. If you can suggest a set of commands, I'll try it out. Thanks. - -- Milton Calnek BSc, A/Slt(Ret.) 
milton at calnek.com 306-717-8737 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org iD8DBQFJd+dhHgnbf2T2QqMRArc9AKCf1tk2gW5XGOM4cCNevVj8QKwV5gCdHKAT 8OETLsF4Csv6d4/gFVlLtjU= =23Dv -----END PGP SIGNATURE----- -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From rsyslog at lists.bod.org Thu Jan 22 05:19:07 2009 From: rsyslog at lists.bod.org (Paul Chambers) Date: Wed, 21 Jan 2009 20:19:07 -0800 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977E761.7070903@calnek.com> References: <4977CAE0.1040403@calnek.com> <4977DA8B.3010309@lists.bod.org> <4977E761.7070903@calnek.com> Message-ID: <4977F3BB.6080205@lists.bod.org> Hard to give you specifics without a lot more information (and time's scarce, sorry). Something that helped me understand how netfilter handles packets, and the order the various tables/chains happen, is the documentation for ebtables, specifically: http://ebtables.sourceforge.net/br_fw_ia/br_fw_ia.html I'd be amazed if it's not possible to masquerade/source-NAT each vlan interface to a unique IP addresses. Between netfilter and ebtables, there's an enormous amount of flexibility. -- Paul Milton Calnek wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > > > Paul Chambers wrote: > >> Couldn't you use NAT on the vlan interfaces? that way traffic on each >> interface could be mapped to a different IP address as seen by the >> logging machine. >> > > I tried that. It didn't work for me. I don't remember the details just now, > but it had something to do with the order things happen on the linux IP stack. > > If you can suggest a set of commands, I'll try it out. > > Thanks. > - -- > Milton Calnek BSc, A/Slt(Ret.) 
> milton at calnek.com > 306-717-8737 > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.5 (GNU/Linux) > Comment: Using GnuPG with CentOS - http://enigmail.mozdev.org > > iD8DBQFJd+dhHgnbf2T2QqMRArc9AKCf1tk2gW5XGOM4cCNevVj8QKwV5gCdHKAT > 8OETLsF4Csv6d4/gFVlLtjU= > =23Dv > -----END PGP SIGNATURE----- > > From david at lang.hm Thu Jan 22 07:48:49 2009 From: david at lang.hm (david at lang.hm) Date: Wed, 21 Jan 2009 22:48:49 -0800 (PST) Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: <4977CAE0.1040403@calnek.com> References: <4977CAE0.1040403@calnek.com> Message-ID: On Wed, 21 Jan 2009, Milton Calnek wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi, > > I'm running a test lab with gear where every piece of gear > under test has the same ip address. > > I have separated them via vlans, but I want to be able to send syslog > from these devices to a central host... but with everything having the > same ip address, there doesn't seem to be a way easily separate the logs. > I see how to log based on ip, but not MAC nor interface. > > Before I invest in the development time, I was wondering if you folks > have any suggestions? if you are running rsyslog on the systems under test, try changing the template that rsyslog uses to sent the messages out from each system puts something unique in it's logs. David Lang From rgerhards at hq.adiscon.com Thu Jan 22 08:46:48 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 22 Jan 2009 08:46:48 +0100 Subject: [rsyslog] Multiple devices with same ip address. In-Reply-To: References: <4977CAE0.1040403@calnek.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA0D@grfint2.intern.adiscon.com> David is right, this is probably the best way to do it. Even if the sender's in question are not powered by rsyslog, it most often is possible to put something unique into the messages. 
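For senders that log through the C syslog(3) interface, one way to put something unique into each message is the ident string passed to openlog(). The following is only a sketch under assumptions: it presumes the software on each device can be changed and knows which VLAN it sits on, and the names build_ident, log_from_device and the "dut-vlan" prefix are hypothetical.

```c
#include <stdio.h>
#include <syslog.h>

/* The buffer must outlive openlog(): implementations may keep the ident
 * pointer instead of copying the string. */
static char ident_buf[32];

/* Build a per-device tag such as "dut-vlan12" (naming scheme is hypothetical). */
const char *build_ident(int vlan_id)
{
    snprintf(ident_buf, sizeof(ident_buf), "dut-vlan%d", vlan_id);
    return ident_buf;
}

/* Send one message tagged with the device's ident on facility local1. */
void log_from_device(int vlan_id, const char *text)
{
    openlog(build_ident(vlan_id), LOG_PID, LOG_LOCAL1);
    syslog(LOG_NOTICE, "%s", text);
    closelog();
}
```

On the receiver the ident arrives as the message tag, so messages can be sorted by tag even though every sender shares one IP address.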
If there are few devices (<= 8), you can also use the local syslog facilities to identify the instances (almost all senders allow to configure that). In any case, you can then use the unique identifier to sort out messages to different bins on the receiver. HTH Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > Sent: Thursday, January 22, 2009 7:49 AM > To: rsyslog-users > Subject: Re: [rsyslog] Multiple devices with same ip address. > > On Wed, 21 Jan 2009, Milton Calnek wrote: > > > -----BEGIN PGP SIGNED MESSAGE----- > > Hash: SHA1 > > > > Hi, > > > > I'm running a test lab with gear where every piece of gear > > under test has the same ip address. > > > > I have separated them via vlans, but I want to be able to send syslog > > from these devices to a central host... but with everything having > the > > same ip address, there doesn't seem to be a way easily separate the > logs. > > I see how to log based on ip, but not MAC nor interface. > > > > Before I invest in the development time, I was wondering if you folks > > have any suggestions? > > if you are running rsyslog on the systems under test, try changing the > template that rsyslog uses to sent the messages out from > each system puts something unique in it's logs. 
> > David Lang > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Thu Jan 22 16:58:24 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 22 Jan 2009 16:58:24 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C2@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9C8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CA@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9CB@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> Hi folks, just an update on this matter. Lorenzo needed to change his system setup after some problems. We are in contact and expect to conduct further testing soon (hopefully the bug will reappear). Even better news is that I have been able to reproduce the bug 4 times in my lab today. It's not as easy as I would hope, but at least I can get results with some patience. I am also experimenting a bit with Twitter and actually found it useful to keep track of the troubleshooting process. Those of your interested can follow it at http://twitter.com/rgerhards I don't promise (yet) to keep it current at all times, but I will use it during the troubleshooting effort. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Lorenzo M. Catucci > Sent: Friday, January 16, 2009 6:29 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Fri, 16 Jan 2009, Rainer Gerhards wrote: > > RG> Ah, ok. Side-note: I got my machine up and it is running some test. 
> RG> Unfortunately no aborts so far, but is has only 4 cores... I hope > RG> something turns out... > RG> > > I think the real problem is in keeping those cores very busy... I'd try > to > spawn something like 20 loggers each spawning a couple "workers" per > second and logging startup/shutdown of any child. Maybe make each > worker > sleep for a random time before exiting. > > I don't have any Fedora/RedHat system; if nothing else, I'd suggest > doing > your tests on a debian/testing system too. > > Yours, > > lorenzo > > PS still running... > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. +39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From lorenzo at sancho.ccd.uniroma2.it Thu Jan 22 17:19:15 2009 From: lorenzo at sancho.ccd.uniroma2.it (Lorenzo M. Catucci) Date: Thu, 22 Jan 2009 17:19:15 +0100 (CET) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> Message-ID: On Thu, 22 Jan 2009, Rainer Gerhards wrote: RG> Hi folks, RG> RG> just an update on this matter. Lorenzo needed to change his system RG> setup after some problems. We are in contact and expect to conduct RG> further testing soon (hopefully the bug will reappear). RG> Some administration chores the last couple of days; almost finished, big hopes for the week-end!!! RG> RG> Even better news is that I have been able to reproduce the bug 4 times RG> in my lab today. It's not as easy as I would hope, but at least I can RG> get results with some patience. 
I am also experimenting a bit with
RG> Twitter and actually found it useful to keep track of the
RG> troubleshooting process. Those of you interested can follow it at
RG>

This is really great news! Really, since rsyslog has been running this well for a long time on "normal" systems, and I've been (almost) alone in experiencing the crashes, the critters must have been hiding very well!

See you soon,

lorenzo

+-------------------------+----------------------------------------------+
| Lorenzo M. Catucci      | Centro di Calcolo e Documentazione           |
| catucci at ccd.uniroma2.it | Università degli Studi di Roma "Tor Vergata" |
|                         | Via O. Raimondo 18 ** I-00173 ROMA ** ITALY  |
| Tel. +39 06 7259 2255   | Fax. +39 06 7259 2125                        |
+-------------------------+----------------------------------------------+

From rgerhards at hq.adiscon.com Thu Jan 22 18:53:44 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Thu, 22 Jan 2009 18:53:44 +0100
Subject: [rsyslog] rsyslog still crashes
In-Reply-To:
References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com>

OK, an update, full history at http://twitter.com/rgerhards

It looks like there is some trouble with GCC atomic operation support. Has anyone seen this race on a non-Debian platform? I am asking because that may narrow down (or not ;)) the issue. Of course, I am not sure whether atomic operations are really the root cause. However, replacing them is not very practical in some places and is definitely time-consuming. So I'd like to have some feedback before I take that route.

Does anyone know if there is a problem with atomic operation support in Debian (no bashing, honest question ;))?

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Lorenzo M.
Catucci > Sent: Thursday, January 22, 2009 5:19 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Thu, 22 Jan 2009, Rainer Gerhards wrote: > > RG> Hi folks, > RG> > RG> just an update on this matter. Lorenzo needed to change his system > RG> setup after some problems. We are in contact and expect to conduct > RG> further testing soon (hopefully the bug will reappear). > RG> > > Some administration chores the last couple of days; almost finished, > big hopes for the week-end!!! > > RG> > RG> Even better news is that I have been able to reproduce the bug 4 > times > RG> in my lab today. It's not as easy as I would hope, but at least I > can > RG> get results with some patience. I am also experimenting a bit with > RG> Twitter and actually found it useful to keep track of the > RG> troubleshooting process. Those of your interested can follow it at > RG> > > This is really great news! Really, since rsyslog is been running this > well > since a long time on "normal" systems, and I've been (almost) alone in > experiencing the crashes, the critters should have been hiding very > well! > > See you soon, > > lorenzo > > > +-------------------------+-------------------------------------------- > --+ > | Lorenzo M. Catucci | Centro di Calcolo e Documentazione > | > | catucci at ccd.uniroma2.it | Universit? degli Studi di Roma "Tor > Vergata" | > | | Via O. Raimondo 18 ** I-00173 ROMA ** > ITALY | > | Tel. +39 06 7259 2255 | Fax. 
+39 06 7259 2125 > | > +-------------------------+-------------------------------------------- > --+ From mbiebl at gmail.com Thu Jan 22 19:46:30 2009 From: mbiebl at gmail.com (Michael Biebl) Date: Thu, 22 Jan 2009 19:46:30 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com> Message-ID: 2009/1/22 Rainer Gerhards : > OK, an update, full history at http://twitter.com/rgerhards > > It looks like there is some trouble with GCC atomic operation support. Has anyone seen this race on a non-Debian platform? I am asking because that may narrow down (or not ;)) the issue. Of course, I am not sure if atomic operations are really the root cause. However, replacing them is not very practical at some places and definitely time-consuming. So I'd like to have some feedback before I take that route. > > Does anyone know if there is a problem with atomic operation support in Debian (no bashing, honest question ;))? This would be a compiler (GCC) problem then, right? I'm not aware of any such problem. FWIW Debian is using GCC 4.3 in lenny/sid I've checked the bugs reported against the Debian gcc package [1] and the Debian specific patches on top of gcc [2], but I didn't find anything obvious. Rainer, if you have a more specific question, I could forward that question to the Debian GCC maintainers. Cheers, Michael [1] http://bugs.debian.org/cgi-bin/pkgreport.cgi?src=gcc-4.3&repeatmerged=no [2] http://patch-tracking.debian.net/package/gcc-4.3/4.3.2-1.1 -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? 
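
For readers following the atomic-operations question in this thread, here is a minimal sketch of what such an operation looks like with GCC's `__sync` builtins (available since GCC 4.1), next to a mutex-protected equivalent. The struct and function names are illustrative assumptions only; this is not code from the rsyslog sources.

```c
#include <pthread.h>

/* Hypothetical counter used for illustration; not an rsyslog type. */
struct counter {
    int value;
    pthread_mutex_t lock;   /* used only by the mutex variant */
};

/* Atomic variant: the GCC builtin performs the read-modify-write as one
 * indivisible step and returns the new value. */
static int counter_inc_atomic(struct counter *c)
{
    return __sync_add_and_fetch(&c->value, 1);
}

/* Mutex variant: the same update, serialized by a lock. */
static int counter_inc_mutex(struct counter *c)
{
    int v;

    pthread_mutex_lock(&c->lock);
    v = ++c->value;
    pthread_mutex_unlock(&c->lock);
    return v;
}
```

A given counter would normally be updated through only one of the two functions; the mutex variant costs a lock and unlock per update, which is presumably part of why replacing atomics everywhere is described above as time-consuming.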
From rgerhards at hq.adiscon.com Thu Jan 22 21:18:19 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Thu, 22 Jan 2009 21:18:19 +0100
Subject: [rsyslog] rsyslog still crashes
In-Reply-To:
References: <577465F99B41C842AAFBE9ED71E70ABA44FA18@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44FA1A@grfint2.intern.adiscon.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA1B@grfint2.intern.adiscon.com>

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Michael Biebl
> Sent: Thursday, January 22, 2009 7:47 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslog still crashes
>
> 2009/1/22 Rainer Gerhards :
> > OK, an update, full history at http://twitter.com/rgerhards
> >
> > It looks like there is some trouble with GCC atomic operation support.
> > Has anyone seen this race on a non-Debian platform? I am asking because
> > that may narrow down (or not ;)) the issue. Of course, I am not sure if
> > atomic operations are really the root cause. However, replacing them is
> > not very practical at some places and definitely time-consuming.
> > So I'd like to have some feedback before I take that route.
> >
> > Does anyone know if there is a problem with atomic operation support
> > in Debian (no bashing, honest question ;))?
>
> This would be a compiler (GCC) problem then, right?

Exactly

>
> I'm not aware of any such problem. FWIW Debian is using GCC 4.3 in lenny/sid
> I've checked the bugs reported against the Debian gcc package [1] and
> the Debian specific patches on top of gcc [2],
> but I didn't find anything obvious.
>
> Rainer, if you have a more specific question, I could forward that
> question to the Debian GCC maintainers.

Thanks, Michael. But I think before we ask others for their time, I'll try to do my homework. So far, I am just guessing. As I now seem to be able to repro the problem, I can look further into it. Tomorrow, I'll first check what it takes to replace the atomic operations with mutex calls. I think that's quite a bit of work, but hopefully I am wrong. Thanks to the info you provided, this seems to be useful work. I'll keep you posted.

Rainer

>
> Cheers,
> Michael
>
> [1] http://bugs.debian.org/cgi-bin/pkgreport.cgi?src=gcc-4.3&repeatmerged=no
> [2] http://patch-tracking.debian.net/package/gcc-4.3/4.3.2-1.1
>
> --
> Why is it that all of the instruments seeking intelligent life in the
> universe are pointed away from Earth?

From danson at rackspace.com Mon Jan 26 21:51:13 2009
From: danson at rackspace.com (Daniel Anson)
Date: Mon, 26 Jan 2009 14:51:13 -0600
Subject: [rsyslog] UNIX timestamp
Message-ID: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP>

Is there a convention in rsyslog whereby I can get a UNIX timestamp instead of the other RFC time standards?

Daniel M. Anson
Linux Systems Engineer
Rackspace
danson at rackspace.com
Office: (210)312-5114

Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated.
From hks.private at gmail.com Mon Jan 26 22:10:06 2009 From: hks.private at gmail.com ((private) HKS) Date: Mon, 26 Jan 2009 16:10:06 -0500 Subject: [rsyslog] UNIX timestamp In-Reply-To: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP> References: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP> Message-ID: On Mon, Jan 26, 2009 at 3:51 PM, Daniel Anson wrote: > Is there a convention in rsyslog whereby I can get a UNIX timestamp > instead of the other RFC time standards? > > > > Daniel M. Anson > Linux Systems Engineer > Rackspace > danson at rackspace.com > Office: (210)312-5114 Unfortunately, no. You can find a serious discussion about it at http://kb.monitorware.com/post14653.html, but in a word, it's complicated. -HKS > > > > > Confidentiality Notice: This e-mail message (including any attached or > embedded documents) is intended for the exclusive and confidential use of the > individual or entity to which this message is addressed, and unless otherwise > expressly indicated, is confidential and privileged information of Rackspace. > Any dissemination, distribution or copying of the enclosed material is prohibited. > If you receive this transmission in error, please notify us immediately by e-mail > at abuse at rackspace.com, and delete the original message. > Your cooperation is appreciated. 
> > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From danson at rackspace.com Mon Jan 26 22:16:18 2009 From: danson at rackspace.com (Daniel Anson) Date: Mon, 26 Jan 2009 15:16:18 -0600 Subject: [rsyslog] UNIX timestamp In-Reply-To: References: <7161_1233003195_n0QKr81o012376_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFD1@SAT4MX07.RACKSPACE.CORP> Message-ID: <15897_1233004899_n0QLLcFR018661_96AF20FDF4301D419B33CCE8E3A0132B0AE1CFEB@SAT4MX07.RACKSPACE.CORP> I figured as much but I thought I would ask. In essence, writing a UNIX timestamp would go against the RFC standard especially if an rsyslog server were set up as a relay. I am using MySQL UNIX_TIMESTAMP() function to get what I need but thought this may be available locally in rsyslog. Thx for the reply, Daniel -----Original Message----- From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of (private) HKS Sent: Monday, January 26, 2009 3:10 PM To: rsyslog-users Subject: Re: [rsyslog] UNIX timestamp On Mon, Jan 26, 2009 at 3:51 PM, Daniel Anson wrote: > Is there a convention in rsyslog whereby I can get a UNIX timestamp > instead of the other RFC time standards? > > > > Daniel M. Anson > Linux Systems Engineer > Rackspace > danson at rackspace.com > Office: (210)312-5114 Unfortunately, no. You can find a serious discussion about it at http://kb.monitorware.com/post14653.html, but in a word, it's complicated. -HKS > > > > > Confidentiality Notice: This e-mail message (including any attached or > embedded documents) is intended for the exclusive and confidential use of the > individual or entity to which this message is addressed, and unless otherwise > expressly indicated, is confidential and privileged information of Rackspace. > Any dissemination, distribution or copying of the enclosed material is prohibited. 
> If you receive this transmission in error, please notify us immediately by e-mail > at abuse at rackspace.com, and delete the original message. > Your cooperation is appreciated. > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated. From sur5r at sur5r.net Tue Jan 27 19:07:09 2009 From: sur5r at sur5r.net (Jakob Haufe) Date: Tue, 27 Jan 2009 19:07:09 +0100 Subject: [rsyslog] Is rsyslog leaking memory? References: <1232276513.22744.45.camel@localhost.localdomain> Message-ID: <20090127190709.40a2b81b@mp-atlantis3.ziti.uni-heidelberg.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sun, 18 Jan 2009 12:01:53 +0100 Rainer Gerhards wrote: > From what I have seen so far, I, too, doubt there is a leak. However, > there are various levels of testing. For example, the postgres output > module and the GSSAPI code is contributed and I do not even have a > test environment. So these are not checked using that procedure. The > libdbi code is only checked every now and then and not with all > backends (e.g. no Oracle at hand ... and so on...). 
If I ever get
> over to a full testing suite (no collaborators found so far...), I'll
> probably be able to do more consistent testing of all modules.

As I'm the one who wrote (or rather ported) the postgres module, I would be willing to help debug/valgrind it. Unfortunately, I have not yet completely understood how the files in tests/ work. To be honest, I have just started looking at it. What would you suggest as a way to test ompgsql in particular? Simply run rsyslogd with valgrind and throw messages against it?

Regarding GSSAPI: As I'm a big fan of Kerberos, I will definitely give it a try as soon as I have some spare time; maybe I can help in valgrinding it, too.

Regards,
Jakob (aka sur5r)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAkl/TU0ACgkQ1YAhDic+ada31QCgu1f54fx4XMNpLjrASZ2fGIJ8
V8sAoKD8hRx7tuRzpwkajg5PPCDkwnLY
=luw3
-----END PGP SIGNATURE-----

From rsyslog at clark-communications.com Wed Jan 28 02:19:45 2009
From: rsyslog at clark-communications.com (Don Jackson)
Date: Tue, 27 Jan 2009 17:19:45 -0800
Subject: [rsyslog] UPDATE: sysutils/rsyslog-3.20.3
Message-ID:

Port updated to the recent 3.20.3 release of rsyslog.

Tested on OpenBSD 4.4, amd64 and i386.

It would be great if someone would commit this to the OpenBSD ports tree.

$ cat ./pkg/DESCR
A syslogd replacement

From rgerhards at hq.adiscon.com Wed Jan 28 18:32:04 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Wed, 28 Jan 2009 18:32:04 +0100
Subject: [rsyslog] rsyslog still crashes
In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com>
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com>

Hi all,

thanks to Lorenzo's help, we made good progress.
It is too much to post inside a mail, please have a look at my analysis of the bug: http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html The short story is that we have at least improved the situation very much and I hope to have fixes for all branches within the next couple of days. Rainer > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > Sent: Friday, January 16, 2009 3:22 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > Lorenzo, > > I have created a new branch "raceDebug" and done a first commit to it. > The change is very lightweight. Please pull, compile as usual and give > it a try. It spits out some info to stdout from time to time > (hopefully). I am not sure if it aborts, depending on the output it may > or may not. Even if we get messages, they are probably not enough to > pinpoint the bug, but I wanted to do something very light to see if the > bug stays. > > Feedback appreciated. > > Rainer From david at lang.hm Thu Jan 29 09:36:41 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 00:36:41 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Message-ID: On Wed, 28 Jan 2009, Rainer Gerhards wrote: > Hi all, > > thanks to Lorenzo's help, we made good progress. 
It is too much to post
> inside a mail, please have a look at my analysis of the bug:
>
> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html
>
> The short story is that we have at least improved the situation very
> much and I hope to have fixes for all branches within the next couple of
> days.

I just finished reading through this excellent write-up

one small thing.

you quote the spec

    Accesses to cacheable memory that are split across bus widths, cache
    lines, and page boundaries are not guaranteed to be atomic

and then conclude that

    So aligned word-access does not guarantee (not even enhance the chance)
    of atomicity.

I read that to mean that the alignment requirements are more complicated, not that alignment is useless.

you should also look at the code that's generated by -Os. with the heavily cached systems that we have nowadays, it's common that the code being smaller (and therefore more of the code fitting into the L1 cache) is more of an advantage than the optimizations that -O3 provides.

congratulations on tracking down a nasty and subtle issue.

David Lang

> Rainer
>
>> -----Original Message-----
>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
>> Sent: Friday, January 16, 2009 3:22 PM
>> To: rsyslog-users
>> Subject: Re: [rsyslog] rsyslog still crashes
>>
>> Lorenzo,
>>
>> I have created a new branch "raceDebug" and done a first commit to it.
>> The change is very lightweight. Please pull, compile as usual and give
>> it a try. It spits out some info to stdout from time to time
>> (hopefully). I am not sure if it aborts, depending on the output it may
>> or may not. Even if we get messages, they are probably not enough to
>> pinpoint the bug, but I wanted to do something very light to see if the
>> bug stays.
>>
>> Feedback appreciated.
>>
>> Rainer

From rgerhards at hq.adiscon.com Thu Jan 29 10:42:48 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Thu, 29 Jan 2009 10:42:48 +0100
Subject: [rsyslog] rsyslog still crashes
In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com>
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com>
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com>

Hi all,

I had another interesting discussion with Lorenzo today. Those of you interested in the details may find the chat log interesting:

http://blog.gerhards.net/2009/01/some-more-on-rsyslog-data-race.html

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
> Sent: Wednesday, January 28, 2009 6:32 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslog still crashes
>
> Hi all,
>
> thanks to Lorenzo's help, we made good progress. It is too much to post
> inside a mail, please have a look at my analysis of the bug:
>
> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html
>
> The short story is that we have at least improved the situation very
> much and I hope to have fixes for all branches within the next couple of
> days.
>
> Rainer

From david at lang.hm Thu Jan 29 12:06:03 2009
From: david at lang.hm (david at lang.hm)
Date: Thu, 29 Jan 2009 03:06:03 -0800 (PST)
Subject: [rsyslog] rsyslog still crashes
In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com>
References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com>
Message-ID:

On Thu, 29 Jan 2009, Rainer Gerhards wrote:

> Hi all,
>
> I had another interesting discussion with Lorenzo today. Those of you
> interested in details may find the chatlog interesting:
>
> http://blog.gerhards.net/2009/01/some-more-on-rsyslog-data-race.html

so, distilling this down I think I am reading the following.

1. mixing mutex and atomic operations is a problem; using one or the other
   consistently is safe

2. reliable duplication of the problem requires

    a fast machine
    multiple cores _not_ sharing L1 cache (early Intel 4-core machines or
      multi-socket machines)
    a complex rsyslog config that uses multiple threads heavily
    high-traffic log volume to heavily load rsyslog

   high system load external to rsyslog increases the chances of the race

question, have you tried enabling/disabling preemption in the kernel on these systems to see if that affects the probability of having a problem?

I'm eagerly waiting for the fixes to appear in the 4.1 branch to test them out.

David Lang

> Rainer
>
>> -----Original Message-----
>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
>> Sent: Wednesday, January 28, 2009 6:32 PM
>> To: rsyslog-users
>> Subject: Re: [rsyslog] rsyslog still crashes
>>
>> Hi all,
>>
>> thanks to Lorenzo's help, we made good progress. It is too much to post
>> inside a mail, please have a look at my analysis of the bug:
>>
>> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html
>>
>> The short story is that we have at least improved the situation very
>> much and I hope to have fixes for all branches within the next couple of
>> days.
>>
>> Rainer
>>
>>> -----Original Message-----
>>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards
>>> Sent: Friday, January 16, 2009 3:22 PM
>>> To: rsyslog-users
>>> Subject: Re: [rsyslog] rsyslog still crashes
>>>
>>> Lorenzo,
>>>
>>> I have created a new branch "raceDebug" and done a first commit to it.
>>> The change is very lightweight. Please pull, compile as usual and give
>>> it a try. It spits out some info to stdout from time to time
>>> (hopefully). I am not sure if it aborts, depending on the output it may
>>> or may not.
Even if we get messages, they are probably not enough to >>> pinpoint the bug, but I wanted to do something very light to see if >> the >>> bug stays. >>> >>> Feedback appreciated. >>> >>> Rainer >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From rgerhards at hq.adiscon.com Thu Jan 29 11:08:04 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 11:08:04 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com><577465F99B41C842AAFBE9ED71E70ABA44FA9A@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FA9B@grfint2.intern.adiscon.com> A full answer follows soon, but in essence you got it :) I will be working on the 4.1 version today, thus the brief reply ;) > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > Sent: Thursday, January 29, 2009 12:06 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog still crashes > > On Thu, 29 Jan 2009, Rainer Gerhards wrote: > > > Hi all, > > > > I had another interesting discussion with Lorenzo today. Those of you > > interested in details my find the chatlog interesting: > > > > http://blog.gerhards.net/2009/01/some-more-on-rsyslog-data-race.html > > so, distilling this down I think I am reading the following. > > 1. mixing mutex and atomic operations is a problem, one or the other is > safe > > 2. 
reliable duplication of the problem requires > > fast machine > multiple cores _not_ sharing L1 cache (early Intel 4-core machines or > multi-socket machines) > a complex rsyslog config that uses multiple thread heavily > high traffic log volume to heavily load rsyslog > high system load external to rsyslog increases the chancesof the race > > question, have you tried enabling/disabling preemption in the kernel on > these systems to see if that affects the probability of having a > problem? > > I'm eagerly waiting for the fixes to appear in the 4.1 branch to test > them > out. > > David Lang > > > > Rainer > > > >> -----Original Message----- > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > >> Sent: Wednesday, January 28, 2009 6:32 PM > >> To: rsyslog-users > >> Subject: Re: [rsyslog] rsyslog still crashes > >> > >> Hi all, > >> > >> thanks to Lorenzo's help, we made good progress. It is too much to > > post > >> inside a mail, please have a look at my analysis of the bug: > >> > >> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > >> > >> The short story is that we have at least improved the situation very > >> much and I hope to have fixes for all branches within the next > couple > >> of > >> days. > >> > >> Rainer > >> > >>> -----Original Message----- > >>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >>> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > >>> Sent: Friday, January 16, 2009 3:22 PM > >>> To: rsyslog-users > >>> Subject: Re: [rsyslog] rsyslog still crashes > >>> > >>> Lorenzo, > >>> > >>> I have created a new branch "raceDebug" and done a first commit to > >> it. > >>> The change is very lightweight. Please pull, compile as usual and > >> give > >>> it a try. It spits out some info to stdout from time to time > >>> (hopefully). I am not sure if it aborts, depending on the output it > >> may > >>> or may not. 
Even if we get messages, they are probably not enough > to > >>> pinpoint the bug, but I wanted to do something very light to see if > >> the > >>> bug stays. > >>> > >>> Feedback appreciated. > >>> > >>> Rainer > >> _______________________________________________ > >> rsyslog mailing list > >> http://lists.adiscon.net/mailman/listinfo/rsyslog > >> http://www.rsyslog.com > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From mrdemeanour at jackpot.uk.net Thu Jan 29 12:12:41 2009 From: mrdemeanour at jackpot.uk.net (Mr. Demeanour) Date: Thu, 29 Jan 2009 11:12:41 +0000 Subject: [rsyslog] rsyslog still crashes In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain><577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Message-ID: <49818F29.7070000@jackpot.uk.net> Rainer Gerhards wrote: > Hi all, > > thanks to Lorenzo's help, we made good progress. It is too much to post > inside a mail, please have a look at my analysis of the bug: > > http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > > The short story is that we have at least improved the situation very > much and I hope to have fixes for all branches within the next couple of > days. Bravo, Rainer! That is the most challenging and tricky to nail of all kinds of bug, and I'm very impressed. -- Jack. 
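
The first point in David's summary quoted in this thread (mixing mutex and atomic operations is a problem, one or the other is safe) can be sketched as follows. This is a hedged illustration with hypothetical names, assuming a simple reference count; it does not reproduce the actual rsyslog code or the actual fix.

```c
#include <pthread.h>

/* Hypothetical shared object; names are illustrative, not rsyslog's. */
struct shared {
    pthread_mutex_t lock;
    int refcount;
};

/* The increment path takes the lock around a plain (non-atomic) update. */
static void shared_addref(struct shared *s)
{
    pthread_mutex_lock(&s->lock);
    s->refcount++;
    pthread_mutex_unlock(&s->lock);
}

/* The release path uses the same lock. Returns 1 when the last reference
 * is dropped. If this function instead decremented with an atomic builtin
 * and skipped the lock, it could interleave with the plain increment in
 * shared_addref() and an update could be lost: that is the mixed pattern. */
static int shared_release(struct shared *s)
{
    int last;

    pthread_mutex_lock(&s->lock);
    last = (--s->refcount == 0);
    pthread_mutex_unlock(&s->lock);
    return last;
}
```

The design point is only that every access to a given field goes through one mechanism; using atomic builtins for all accesses to `refcount` would be equally consistent.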
From friedl at hq.adiscon.com Thu Jan 29 17:16:57 2009
From: friedl at hq.adiscon.com (Florian Riedl)
Date: Thu, 29 Jan 2009 17:16:57 +0100
Subject: [rsyslog] rsyslog 4.1.4 (devel) released
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FABC@grfint2.intern.adiscon.com>

Hi all,

rsyslog 4.1.4, a member of the development branch, has been released today. It is primarily a stability update. Most importantly, this version addresses a potential segfault that occurred rarely and primarily on very fast and busy systems. The only other change is a fix for the $PreserveFQDN config directive, which did not properly affect locally emitted messages.

This is a recommended update for all users of the development branch.

Download:
http://www.rsyslog.com/Downloads-req-viewdownloaddetails-lid-147.phtml

Changelog:
http://www.rsyslog.com/Article341.phtml

As always, feedback is appreciated.

Florian Riedl

--
Support
=======
Improving rsyslog is costly, but you can help! We are looking for organizations that find rsyslog useful and wish to contribute back. You can contribute by reporting bugs, improving the software, or donating money or equipment. Commercial support contracts for rsyslog are available, and they help finance continued maintenance. Adiscon GmbH, a privately held German company, is currently funding rsyslog development. We are always looking for interesting development projects. For details on how to help, please see http://www.rsyslog.com/doc-how2help.html .
From rgerhards at hq.adiscon.com Thu Jan 29 17:36:41 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 17:36:41 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> Message-ID: <1233247001.19733.14.camel@rf10up.intern.adiscon.com> On Thu, 2009-01-29 at 00:36 -0800, david at lang.hm wrote: > On Wed, 28 Jan 2009, Rainer Gerhards wrote: > > > Hi all, > > > > thanks to Lorenzo's help, we made good progress. It is too much to post > > inside a mail, please have a look at my analysis of the bug: > > > > http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html > > > > The short story is that we have at least improved the situation very > > much and I hope to have fixes for all branches within the next couple of > > days. > > I just finished reading through this excellant write-up > > one small thing. > > you quote the spec > > Accesses to cacheable memory that are split across bus widths, cache > lines, and page boundaries are not guaranteed to be atomic > > and then conclude that > > So aligned word-access does not guarantee (not even enhance the chance) of > atomicity. > > I read that to mean that the alignment requirements are more complicated, > not that alignment is useless. I should probably have quoted more of Intel's manual. But in essence you need to read at least the first full two pages to get the in-depth idea. The issue is not alignment requirements. As hardware gets more and more parallel, and caches get to more and more levels, and on-chip cores coexist with those from other sockets ... keeping memory coherent is a costly job. In early CPUs, Intel made memory access atomic if some alignment requirements were met. That was cheap. 
In new CPUs that atomicity is expensive. On the other hand, most data accesses do not need atomicity. So why incur the cost for many operations when only few need it? As a result, Intel has removed guaranteed atomicity from those memory accesses. In order to get atomicity, the program must tell the CPU *explicitly* that it wants that feature. To do so, a "LOCK" prefix (opcode) must be placed before the actual opcode (note that this is only supported for some operations). So you get the best of both worlds: fast execution time for the majority of code and atomicity where you need it (but it then incurs the cost).

The bottom line is that what was an atomic operation on an old CPU is no longer an atomic operation on a new CPU. If you need that, you need to include that extra "LOCK" opcode.

As I briefly said in the blogpost, I have not checked old Intel manuals. So I do not know if they formerly guaranteed, as part of the instruction set architecture, that these operations were atomic. I guess they did not. If so, I as a programmer made some assumptions about the micro-architecture that no longer hold true. My fault... But even if it is Intel's fault, the C programming language does not guarantee atomicity, nor does the compiler guarantee a specific translation to machine code. So I, working on the C level, used assumptions that were not valid (and as I said I knew it was dangerous, but it worked too well for too long... ;))

>
> you should also look at the code that's generated by -Os, with the heavily
> cached systems that we have nowadays it's common that the code being
> smaller (and therefore more of the code fitting into the L1 cache) is more
> of an advantage than the optimizations that -O3 provides.

That's a good reminder. I've just checked the gcc docs. There are some things that I do not like about -Os, especially as it disables proper alignment of many structures, including code. That can lead to sub-optimal cache performance.
On the other hand -O3 does things like loop unrolling, which definitely is a bad idea with modern cache systems. My preliminarily conclusion is that -O2 is probably best, and may be tuned by turning on and off specific optimizations via their specific compiler switches. > > congradulations on tracking down a nasty and subtle issue. Thanks - but let's first see if this was the only issue and if things run smooth everywhere. But it looks very promising. Rainer > > David Lang > > > > Rainer > > > >> -----Original Message----- > >> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > >> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards > >> Sent: Friday, January 16, 2009 3:22 PM > >> To: rsyslog-users > >> Subject: Re: [rsyslog] rsyslog still crashes > >> > >> Lorenzo, > >> > >> I have created a new branch "raceDebug" and done a first commit to it. > >> The change is very lightweight. Please pull, compile as usual and give > >> it a try. It spits out some info to stdout from time to time > >> (hopefully). I am not sure if it aborts, depending on the output it > > may > >> or may not. Even if we get messages, they are probably not enough to > >> pinpoint the bug, but I wanted to do something very light to see if > > the > >> bug stays. > >> > >> Feedback appreciated. 
> >> > >> Rainer > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From david at lang.hm Fri Jan 30 04:51:28 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 19:51:28 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1233247001.19733.14.camel@rf10up.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, Rainer Gerhards wrote: > On Thu, 2009-01-29 at 00:36 -0800, david at lang.hm wrote: >> On Wed, 28 Jan 2009, Rainer Gerhards wrote: >> >>> Hi all, >>> >>> thanks to Lorenzo's help, we made good progress. It is too much to post >>> inside a mail, please have a look at my analysis of the bug: >>> >>> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html >>> >>> The short story is that we have at least improved the situation very >>> much and I hope to have fixes for all branches within the next couple of >>> days. >> >> I just finished reading through this excellant write-up >> >> one small thing. >> >> you quote the spec >> >> Accesses to cacheable memory that are split across bus widths, cache >> lines, and page boundaries are not guaranteed to be atomic >> >> and then conclude that >> >> So aligned word-access does not guarantee (not even enhance the chance) of >> atomicity. >> >> I read that to mean that the alignment requirements are more complicated, >> not that alignment is useless. > > I should probably have quoted more of Intel's manual. 
But in essence you > need to read at least the first full two pages to get the in-depth idea. > The issue is not alignment requirements. As hardware gets more and more > parallel, and caches get to more and more levels, and on-chip cores > coexist with those from other sockets ... keeping memory coherent is a > costly job. > > In early CPUs, Intel made memory access atomic if some alignment > requirements were met. That was cheap. In new CPUs that atomicity is > expensive. On the other hand, most data access do not need atomicity. So > why incur the cost for many operations when only few need it? In the end > result, Intel has remove guaranteed atomicity from those memory > accesses. In order to get atomicity, the program must tell the CPU > *explicitly* that it wants that feature. To do so, a "LOCK" prefix > (opcode) must be placed before the actual opcode (note that this is only > supported for some operations). So you get the best of two world: fast > execution time for the majority of code and atomicity where you need it > (but it then incurs the cost). > > The bottom line is that what was an atomic operation on an old CPU is no > longer an atomic operation on a new CPU. If you need that, you need to > include that extra "LOCK" opcode. > > As I briefly said in the blogpost, I have not check old Intel manuals. > So I do not know if they formerly guaranteed, as part of the instruction > set architecture, that these operations were atomic. I guess they did > not. If so, I as a programmer made some assumptions about the > micro-architecture that no longer hold true. My fault... But even if it > is Intel's fault, the C programming language does not guarantee > atomicity nor does the compiler guarantee a specific translation to > machine code. So I, working on the C level, used assumptions that were > not valid (and as I said I knew it was dangerous, but it worked too well > for too long... 
;)) the new C0x standard will add atomic ops and guarentees (some of which are not nessasarily provided by the chip, but have to be provided by the compiler/library instead), so watch for it, but test the performance of them before you trust them >> >> you should also look at the code that's generated by -Os, with the heavily >> cached systems that we have nowdays it's common that the code being >> smaller (and therefor more of the code fitting into the L1 cache) is more >> of an advantage than the optimizations that -O3 provides. > > That's a good reminder. I've just checked the gcc docs. There are some > things that I do not like about -Os, especially as it disables proper > alignment of many structures, including code. That can lead to > sub-optimal cache performance. I know the linux kernel has many things where the alignment is critical for proper functioning, but they are still able to support -Os, so there is some way to specify alignment even for -Os > On the other hand -O3 does things like loop unrolling, which definitely > is a bad idea with modern cache systems. > > My preliminarily conclusion is that -O2 is probably best, and may be > tuned by turning on and off specific optimizations via their specific > compiler switches. this has been the prevailing wisdom for many years, but I've seen myself many cases where -Os has ended up being faster in the real world, in spite of the various things that -O2 does 'better' is it the case that -Os would break things? or just that you think it's alignment may not be as good? David Lang >> congradulations on tracking down a nasty and subtle issue. > > Thanks - but let's first see if this was the only issue and if things > run smooth everywhere. But it looks very promising. 
> > Rainer >> >> David Lang >> >> >>> Rainer >>> >>>> -----Original Message----- >>>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- >>>> bounces at lists.adiscon.com] On Behalf Of Rainer Gerhards >>>> Sent: Friday, January 16, 2009 3:22 PM >>>> To: rsyslog-users >>>> Subject: Re: [rsyslog] rsyslog still crashes >>>> >>>> Lorenzo, >>>> >>>> I have created a new branch "raceDebug" and done a first commit to it. >>>> The change is very lightweight. Please pull, compile as usual and give >>>> it a try. It spits out some info to stdout from time to time >>>> (hopefully). I am not sure if it aborts, depending on the output it >>> may >>>> or may not. Even if we get messages, they are probably not enough to >>>> pinpoint the bug, but I wanted to do something very light to see if >>> the >>>> bug stays. >>>> >>>> Feedback appreciated. >>>> >>>> Rainer >>> _______________________________________________ >>> rsyslog mailing list >>> http://lists.adiscon.net/mailman/listinfo/rsyslog >>> http://www.rsyslog.com >>> >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From david at lang.hm Fri Jan 30 05:56:55 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 20:56:55 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1233247001.19733.14.camel@rf10up.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, Rainer Gerhards wrote: >> >> congradulations on tracking down a 
nasty and subtle issue. > > Thanks - but let's first see if this was the only issue and if things > run smooth everywhere. But it looks very promising. > bad news, on my system the HUP doesn't always reopen the files now. high speed box receiving messages via UDP, idle except for a gzip compressing the files (which are rotated once a min), the system runs fine for a few min (higher performance than before, it's now writing ~93,000 messages/sec instead of ~78,000 messages/sec), but it sometimes mangles handling a HUP and gets stuck. I have to do a kill -9 to kill and restart it. this is with the new HUP behavior. David Lang From david at lang.hm Fri Jan 30 06:13:07 2009 From: david at lang.hm (david at lang.hm) Date: Thu, 29 Jan 2009 21:13:07 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, david at lang.hm wrote: > On Thu, 29 Jan 2009, Rainer Gerhards wrote: > >>> >>> congradulations on tracking down a nasty and subtle issue. >> >> Thanks - but let's first see if this was the only issue and if things >> run smooth everywhere. But it looks very promising. >> > > bad news, on my system the HUP doesn't always reopen the files now. > > high speed box receiving messages via UDP, idle except for a gzip > compressing the files (which are rotated once a min), the system runs fine > for a few min (higher performance than before, it's now writing ~93,000 > messages/sec instead of ~78,000 messages/sec), but it sometimes mangles > handling a HUP and gets stuck. I have to do a kill -9 to kill and restart > it. > > this is with the new HUP behavior. interesting note on memory useage. 
I'm using the default fixed array queue type on this box with a 1K max message length. if I hammer the box with a steady ~120K messages/sec (while it can write 93K/sec) the queue builds up to where it takes ~12G of ram. at this point the throughput takes a nose dive (not just dropping inbound packets, but also the number of packets written is much less) if I kill the sender, it starts emptying it's queue (interestingly, not quite as fast as if it is also recieving some messages), but the memory isn't freed up until I start sending it messages again. David Lang From rgerhards at hq.adiscon.com Thu Jan 29 19:34:50 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 19:34:50 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: <1233254090.19733.22.camel@rf10up.intern.adiscon.com> On Thu, 2009-01-29 at 21:13 -0800, david at lang.hm wrote: > On Thu, 29 Jan 2009, david at lang.hm wrote: > interesting note on memory useage. > > I'm using the default fixed array queue type on this box with a 1K max > message length. if I hammer the box with a steady ~120K messages/sec > (while it can write 93K/sec) the queue builds up to where it takes ~12G of > ram. at this point the throughput takes a nose dive (not just dropping > inbound packets, but also the number of packets written is much less) > > if I kill the sender, it starts emptying it's queue (interestingly, not > quite as fast as if it is also recieving some messages), but the memory > isn't freed up until I start sending it messages again. This actually is expected behavior - and it has lots to do with "last message repeated n time". 
In order to implement that functionality, I need to hold on to the last message until a new one comes in (so that I can compare new to old). As such, a message that is fully processed can not immediately be freed. That can happen only when the next message comes in - whenever that may be. Note that each output has separate "last message..." status, so each action keeps a copy of the previous message until a new one arrives.

What now happens is that when the queue builds up, malloc extends the data segment size. It is fair to assume that the last message received on a very busy system will probably end up at a high location in the data segment (but note that this is only a probability - it may even receive a very low location, if that was freed immediately before). When the queue is now drained, we free everything but this message. As the message is still referenced for "last m...", it can not be freed. As it has a high address, the data segment size can not be reduced. As such, rsyslog still holds the whole data segment, even though it contains almost no actually allocated memory. I do not know if the runtime system has a way to tell the OS it now uses a "sparse data segment", but I guess it doesn't do that.

When the next message comes in (hours later?), the previous message can be freed, and the runtime can then reduce the data segment size (which should result in a sharp decrease of the memory usage seen).

This is one of the reasons I don't like "last message...".

I hope this clarifies.
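The retention effect described above can be illustrated with a minimal sketch. It is not rsyslog's actual code: the `demo_msg` and `demo_action` types, the reference-counting scheme and the `demo_freed` counter are hypothetical names invented for this illustration, and the sketch is single-threaded on purpose (the reference count is a plain int).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified message object with a reference count. */
typedef struct demo_msg {
    int refs;
    char text[64];
} demo_msg;

static int demo_freed = 0;   /* counts destroyed messages, for illustration */

static demo_msg *demo_msg_new(const char *text)
{
    demo_msg *m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->refs = 1;             /* reference held by the creator (the "queue") */
    snprintf(m->text, sizeof m->text, "%s", text);
    return m;
}

static void demo_msg_unref(demo_msg *m)
{
    if (m != NULL && --m->refs == 0) {
        free(m);
        demo_freed++;
    }
}

/* Hypothetical action state: remembers the previous message so the
 * next one can be compared against it. */
typedef struct demo_action {
    demo_msg *prev;
    int repeats;
} demo_action;

/* Process one message; returns 1 if it repeats the previous one. */
static int demo_action_process(demo_action *a, demo_msg *m)
{
    if (a->prev != NULL && strcmp(a->prev->text, m->text) == 0) {
        a->repeats++;
        return 1;            /* duplicate: keep holding the old message */
    }
    demo_msg_unref(a->prev); /* only now can the old message be released */
    m->refs++;               /* the action keeps its own reference */
    a->prev = m;
    a->repeats = 0;
    return 0;
}
```

In this sketch the previous message is released only inside `demo_action_process`, that is, only when a different message arrives. Until then it stays allocated, wherever in the heap it happens to live.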
Rainer From rgerhards at hq.adiscon.com Thu Jan 29 20:40:33 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 20:40:33 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: <1233258033.19733.27.camel@rf10up.intern.adiscon.com> Hi David, thanks for this note, but I think it is not related to the fix (I'll think a bit harder about that, but so far I can not find any connection between the two). The way the HUP is done is sub-optimal. Under typical load (one hup a day), you don't see any issue. If you hup very frequently (like the once a min you do) and have heavy traffic, that's another story. To solve that case, some rework on the hup internals, actually even on the interface definition, is needed. I'd hold all such work unless I found a solution to the race bug - because it would have made the environment even more different. Now that I have at least one issue, I think I can go ahead and begin to introduce more intrusive changes again. In any case, I'll have a more in-depth look at the hup handlers. The new non-restart type of hup should be almost resistant against the issue you report. Rainer On Thu, 2009-01-29 at 20:56 -0800, david at lang.hm wrote: > On Thu, 29 Jan 2009, Rainer Gerhards wrote: > > >> > >> congradulations on tracking down a nasty and subtle issue. > > > > Thanks - but let's first see if this was the only issue and if things > > run smooth everywhere. But it looks very promising. > > > > bad news, on my system the HUP doesn't always reopen the files now. 
> > high speed box receiving messages via UDP, idle except for a gzip > compressing the files (which are rotated once a min), the system runs fine > for a few min (higher performance than before, it's now writing ~93,000 > messages/sec instead of ~78,000 messages/sec), but it sometimes mangles > handling a HUP and gets stuck. I have to do a kill -9 to kill and restart > it. > > this is with the new HUP behavior. > > David Lang > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com From rgerhards at hq.adiscon.com Thu Jan 29 21:25:27 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Thu, 29 Jan 2009 21:25:27 +0100 Subject: [rsyslog] rsyslog still crashes In-Reply-To: References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> Message-ID: <1233260727.19733.71.camel@rf10up.intern.adiscon.com> On Thu, 2009-01-29 at 19:51 -0800, david at lang.hm wrote: > the new C0x standard will add atomic ops and guarentees (some of which are > not nessasarily provided by the chip, but have to be provided by the > compiler/library instead), so watch for it, but test the performance of > them before you trust them This is very important work, especially if you think about future advances in hardware design. However, I think we will be years away from the point where one can actually use this and hope to be somewhat portable. Same for performance: early implementation will probably be sub-optimal (though it should be fairly simple to map current compiler-specific options for atomic ops to the new standard once... but we know what happens when new standards come out...). 
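A minimal sketch of such a mapping layer, under stated assumptions: it uses the standard `<stdatomic.h>` header (C11) when the compiler advertises it and otherwise falls back to GCC's `__sync` builtins. The `DEMO_*` macros, the `demo_counter` type and `demo_roundtrip` are hypothetical names made up for this illustration. They are not rsyslog's macros.

```c
#include <assert.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L \
    && !defined(__STDC_NO_ATOMICS__)
/* Standard atomics are available. */
#include <stdatomic.h>
typedef atomic_int demo_counter;
#define DEMO_INC(p)  atomic_fetch_add((p), 1)
#define DEMO_DEC(p)  atomic_fetch_sub((p), 1)
#define DEMO_LOAD(p) atomic_load((p))
#else
/* Fallback: compiler-specific GCC builtins. */
typedef volatile int demo_counter;
#define DEMO_INC(p)  __sync_fetch_and_add((p), 1)
#define DEMO_DEC(p)  __sync_fetch_and_sub((p), 1)
#define DEMO_LOAD(p) __sync_fetch_and_add((p), 0)
#endif

/* Returns the counter value after two increments and one decrement. */
static int demo_roundtrip(void)
{
    demo_counter c = 0;
    DEMO_INC(&c);
    DEMO_INC(&c);
    DEMO_DEC(&c);
    return DEMO_LOAD(&c);
}
```

Callers use only the `DEMO_*` names, so the choice between the standard and the compiler-specific variant stays in one place.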
> > On the other hand -O3 does things like loop unrolling, which definitely
> > is a bad idea with modern cache systems.
> >
> > My preliminary conclusion is that -O2 is probably best, and may be
> > tuned by turning on and off specific optimizations via their specific
> > compiler switches.
>
> this has been the prevailing wisdom for many years, but I've seen myself
> many cases where -Os has ended up being faster in the real world, in spite
> of the various things that -O2 does 'better'

I think the phrase "it depends on the scenario" is very important here.

> is it the case that -Os would break things? or just that you think its
> alignment may not be as good?

It does not break things. The alignment for any structures that are passed as part of the API should be properly contained in the header files. However, I have not specifically tested this. The point is just that, at least on some machines, non-aligned addresses severely hit cache performance. So optimizing for size, and as a side-effect generating unaligned data accesses, can be a real performance drawback. It may well cost more performance than the improved L1 (or trace cache) performance offers.

In any case, if we go down to that level, I think there are better places to test and optimize - not to mention that on the upper layer (OS calls!) there is still room for improvement.

One of my favorite CPU-level optimizations is the "exception system" that is currently in use in rsyslog. Thanks to your message, I've finally written down some information on it. I've done that on the forum, so that I can easily keep a permanent record of the discussion (and in an easier-to-follow form than with the mail archive):

http://kb.monitorware.com/optimizing-exception-handling-t8911.html

Feedback is appreciated.
Rainer

From rgerhards at hq.adiscon.com Fri Jan 30 14:34:07 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Fri, 30 Jan 2009 14:34:07 +0100
Subject: [rsyslog] rsyslog 4.1.4 - one (small) bug left
Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com>

Hi all,

I have now basically ported the race bugfix to all branches (verification and double-check still in the works). While doing this, I noticed that one small issue escaped my attention in yesterday's 4.1.4 version. If compiled with atomics, I unlock an already unlocked mutex (which is destroyed with the very next statement) in msgDestruct. That should not have any really bad effects (but you never know...). The master branch is now updated, so you may want to pull a fixed version from there. I will not do a new release just for this reason - it'll be included in the next version.

Please note that git as of now already contains the race fixes for all branches, but they are mostly untested. I mention it just in case you'd like to get them quickly.

I will keep you posted.
Rainer

From rgerhards at hq.adiscon.com Fri Jan 30 16:47:55 2009
From: rgerhards at hq.adiscon.com (Rainer Gerhards)
Date: Fri, 30 Jan 2009 16:47:55 +0100
Subject: [rsyslog] hang on HUP - was: rsyslog still crashes
In-Reply-To:
References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com>
Message-ID: <1233330475.19733.88.camel@rf10up.intern.adiscon.com>

On Thu, 2009-01-29 at 20:56 -0800, david at lang.hm wrote:
> high speed box receiving messages via UDP, idle except for a gzip
> compressing the files (which are rotated once a min), the system runs fine
> for a few min (higher performance than before, it's now writing ~93,000
> messages/sec instead of ~78,000 messages/sec), but it sometimes mangles
> handling a HUP and gets stuck. I have to do a kill -9 to kill and restart
> it.
>
> this is with the new HUP behavior.

I cross-checked the HUP processing. So far, I do not see why it hangs (or whether it is related to the HUP processing at all). Can you reproduce it with the debug log running? I guess not, but if so, could you provide me a log with ~1000 log lines before the hang? If the debug log is no option, a stack trace from the abort would be great.
Rainer From david at lang.hm Fri Jan 30 18:19:21 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 30 Jan 2009 09:19:21 -0800 (PST) Subject: [rsyslog] rsyslog still crashes In-Reply-To: <1233258033.19733.27.camel@rf10up.intern.adiscon.com> References: <1232046859.22744.39.camel@localhost.localdomain> <577465F99B41C842AAFBE9ED71E70ABA44F9B8@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44F9BE@grfint2.intern.adiscon.com> <577465F99B41C842AAFBE9ED71E70ABA44FA8F@grfint2.intern.adiscon.com> <1233247001.19733.14.camel@rf10up.intern.adiscon.com> <1233258033.19733.27.camel@rf10up.intern.adiscon.com> Message-ID: On Thu, 29 Jan 2009, Rainer Gerhards wrote: > Hi David, > > thanks for this note, but I think it is not related to the fix (I'll > think a bit harder about that, but so far I can not find any connection > between the two). > > The way the HUP is done is sub-optimal. Under typical load (one hup a > day), you don't see any issue. If you hup very frequently (like the once > a min you do) and have heavy traffic, that's another story. To solve > that case, some rework on the hup internals, actually even on the > interface definition, is needed. I'd hold all such work unless I found a > solution to the race bug - because it would have made the environment > even more different. Now that I have at least one issue, I think I can > go ahead and begin to introduce more intrusive changes again. > > In any case, I'll have a more in-depth look at the hup handlers. The new > non-restart type of hup should be almost resistant against the issue you > report. I was using the new non-restart type. I'll be doing more testing today and over the weekend. 
it's posible that I ended up with mixed versions with the modules again (just before going home last night I deleted them all and then did the install to make sure) David Lang > Rainer > > On Thu, 2009-01-29 at 20:56 -0800, david at lang.hm wrote: >> On Thu, 29 Jan 2009, Rainer Gerhards wrote: >> >>>> >>>> congradulations on tracking down a nasty and subtle issue. >>> >>> Thanks - but let's first see if this was the only issue and if things >>> run smooth everywhere. But it looks very promising. >>> >> >> bad news, on my system the HUP doesn't always reopen the files now. >> >> high speed box receiving messages via UDP, idle except for a gzip >> compressing the files (which are rotated once a min), the system runs fine >> for a few min (higher performance than before, it's now writing ~93,000 >> messages/sec instead of ~78,000 messages/sec), but it sometimes mangles >> handling a HUP and gets stuck. I have to do a kill -9 to kill and restart >> it. >> >> this is with the new HUP behavior. >> >> David Lang >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From david at lang.hm Fri Jan 30 18:28:56 2009 From: david at lang.hm (david at lang.hm) Date: Fri, 30 Jan 2009 09:28:56 -0800 (PST) Subject: [rsyslog] rsyslog 4.1.4 - one (small) bug left In-Reply-To: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com> References: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com> Message-ID: On Fri, 30 Jan 2009, Rainer Gerhards wrote: > Hi all, > > I have now basically ported the race bugfix to all branches > (verification and double-check still in the works). While doing this, I > noticed that one small issues slipped my attention with yesterday's > 4.1.4 version. 
If compiled with atomics, I unlock an already unlocked > mutex (which is destroyed with the very next statement) in msgDestruct. > That should not have any really bad effects (but you never know...). The > master branch is now updated, so you may want to pull a fixed version > from there. I will not do a new release just for this reason - it'll be > included in the next version. so 4.1.4 should be using the atomics for queue management not mutexes? David Lang > Please note that git as of now already contains all the race fix for all > branches, but mostly untested. Just in case if you'd like to get them > quickly. > > I will keep you posted. > > Rainer > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com > From rgerhards at hq.adiscon.com Fri Jan 30 17:28:47 2009 From: rgerhards at hq.adiscon.com (Rainer Gerhards) Date: Fri, 30 Jan 2009 17:28:47 +0100 Subject: [rsyslog] rsyslog 4.1.4 - one (small) bug left In-Reply-To: References: <577465F99B41C842AAFBE9ED71E70ABA44FADC@grfint2.intern.adiscon.com> Message-ID: <577465F99B41C842AAFBE9ED71E70ABA44FAE1@grfint2.intern.adiscon.com> > -----Original Message----- > From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog- > bounces at lists.adiscon.com] On Behalf Of david at lang.hm > Sent: Friday, January 30, 2009 6:29 PM > To: rsyslog-users > Subject: Re: [rsyslog] rsyslog 4.1.4 - one (small) bug left > > On Fri, 30 Jan 2009, Rainer Gerhards wrote: > > > Hi all, > > > > I have now basically ported the race bugfix to all branches > > (verification and double-check still in the works). While doing this, > I > > noticed that one small issues slipped my attention with yesterday's > > 4.1.4 version. If compiled with atomics, I unlock an already unlocked > > mutex (which is destroyed with the very next statement) in > msgDestruct. > > That should not have any really bad effects (but you never know...). 
> > The master branch is now updated, so you may want to pull a fixed version
> > from there. I will not do a new release just for this reason - it'll be
> > included in the next version.
>
> so 4.1.4 should be using the atomics for queue management not mutexes?

It depends... If atomics are available, they are the preferred method. If not available, the code falls back to mutexes.

Rainer
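The "atomics if available, otherwise mutexes" choice can be sketched as a compile-time switch. This is only an illustration under assumptions, not the rsyslog implementation: `DEMO_HAVE_ATOMICS`, `DEMO_FORCE_MUTEX` and the `demo_ref_*` functions are hypothetical names, and the atomic path assumes a compiler with GCC's `__sync` builtins.

```c
#include <assert.h>
#include <pthread.h>

/* DEMO_HAVE_ATOMICS is a hypothetical build-time switch for this sketch.
 * Define DEMO_FORCE_MUTEX to exercise the fallback path. */
#if defined(__GNUC__) && !defined(DEMO_FORCE_MUTEX)
#define DEMO_HAVE_ATOMICS 1
#endif

typedef struct demo_refcount {
    int value;
#ifndef DEMO_HAVE_ATOMICS
    pthread_mutex_t mut;     /* only needed when atomics are unavailable */
#endif
} demo_refcount;

static void demo_ref_init(demo_refcount *r, int initial)
{
    r->value = initial;
#ifndef DEMO_HAVE_ATOMICS
    pthread_mutex_init(&r->mut, NULL);
#endif
}

/* Increment and return the new value. */
static int demo_ref_inc(demo_refcount *r)
{
#ifdef DEMO_HAVE_ATOMICS
    return __sync_add_and_fetch(&r->value, 1);
#else
    int v;
    pthread_mutex_lock(&r->mut);
    v = ++r->value;
    pthread_mutex_unlock(&r->mut);
    return v;
#endif
}

/* Decrement and return the new value. */
static int demo_ref_dec(demo_refcount *r)
{
#ifdef DEMO_HAVE_ATOMICS
    return __sync_sub_and_fetch(&r->value, 1);
#else
    int v;
    pthread_mutex_lock(&r->mut);
    v = --r->value;
    pthread_mutex_unlock(&r->mut);
    return v;
#endif
}

static void demo_ref_destroy(demo_refcount *r)
{
#ifndef DEMO_HAVE_ATOMICS
    pthread_mutex_destroy(&r->mut);  /* the mutex must be unlocked here */
#else
    (void)r;                         /* nothing to release on the atomic path */
#endif
}
```

Keeping both paths behind the same small set of functions means an unlock can never end up in the atomic build by accident, because the mutex does not exist there.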