[rsyslog] ultra-reliable speed test
Kenneth Marshall
ktm at rice.edu
Fri May 8 19:54:38 CEST 2009
On Fri, May 08, 2009 at 10:38:55AM -0700, david at lang.hm wrote:
> On Fri, 8 May 2009, Kenneth Marshall wrote:
>
> > On Fri, May 08, 2009 at 01:18:37AM -0700, david at lang.hm wrote:
> >> On Fri, 8 May 2009, Rainer Gerhards wrote:
> >>
> >>>> -----Original Message-----
> >>>> From: rsyslog-bounces at lists.adiscon.com [mailto:rsyslog-
> >>>> bounces at lists.adiscon.com] On Behalf Of david at lang.hm
> >>>>
> >>>> I have a box put together for a first cut at a speed test of rsyslog in
> >>>> ultra-reliable mode. the outline below is intended to minimize the number
> >>>> of variables.
> >>>>
> >>>> the box is a dual quad-core opteron with 8G of ram, one SATA drive and a
> >>>> fusionIO SSD PCIe drive, currently running RHEL 5.3 kernel 2.6.18-53
> >>>> (redhat stock kernel). I intend to format the SSD with ext2 (as the
> >>>> application is providing data integrity, and to avoid the known
> >>>> performance problems with ext3 and fsync)
> >>>
> >>> Just a question, because I do not know enough about ext2: does ext2 guarantee
> >>> that when an application does fsync, all data, INCLUDING related file system
> >>> control structures are written to disk? Or, to phrase it the other way
> >>> around, can ext2 guarantee that fsync'ed data can always be read after a
> >>> power failure? I think along the lines of some control structures not being
> >>> written, thus the fsynced app data may be present on the disk, but cannot be
> >>> accessed any longer. In the worst case, would it be possible that a whole
> >>> file be lost during a file system check after reboot?
> >>>
> >>> My *uneducated* understanding is that ext3 does guard against this (thus the
> >>> performance problems) but ext2 does not.
> >>
> >> the performance problem with ext3 is that it forces ALL pending writes to
> >> disk when anything does a fsync
> >>
> >> now that you mention it, I think that with all filesystems other than ext2
> >> you need to do a fsync on the directory as well as on the file
> >>
> >>> If my understanding would be correct (and I don't say so), we would need to
> >>> use ext3.
> >>
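For what it's worth, here is a minimal sketch in Python of the "fsync the file, then fsync the directory" pattern discussed above, assuming a POSIX system. The name `durable_append` is made up for illustration, and this is not a claim about how rsyslog itself writes its files.

```python
import os

def durable_append(path, data):
    """Append bytes to path, then fsync the file and its parent directory."""
    # Append the data and force the file contents and inode to disk.
    with open(path, "ab") as f:
        f.write(data)
        f.flush()                 # push Python's buffer to the kernel
        os.fsync(f.fileno())      # ask the kernel to push it to the device

    # fsync the parent directory so that a newly created file name
    # is also on disk, not just the file's data blocks.
    dir_path = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Whether the directory fsync is strictly needed depends on the filesystem and mount options, which is the open question in this thread; the sketch only shows what doing both looks like.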
> > FYI,
> >
> > I think if you use ext3 with data=writeback, you will not have the
> > flush everything problem. Of course, you will need to precreate the
> > files.
>
> with an application that does fsync, at that point you have no reliability
> gain compared to ext2, but you still have the overhead of the journal
> (including the need to write the data twice, once to the journal, once to
> its final location)
>
> David Lang
>
I thought that with data=writeback only meta-data is committed to the
journal, not the file data.
Ken
> > Regards,
> > Ken
> >
> >> I'll try both (and later on, when I use my own kernel rather than the
> >> redhat one I'll also test XFS)
> >>
> >> I think that if no other disk activity is taking place ext3 may not be too
> >> bad (one other advantage that ext2 would have over ext3 and XFS is that
> >> journaling filesystems have to write whatever they journal twice, once to
> >> the journal and once to the final location)
> >>
> >>>> for the rsyslog test I am thinking the following
> >>>>
> >>>> using rsyslog 4.1.7
> >>>> enable input file
> >>>
> >>> Not sure if I got this bullet point right. Do you mean you intend to use
> >>> imfile for input generation?
> >>
> >> yes, that was my intent. just to simplify things by making the test
> >> completely self contained to the one box.
> >>
> >>> In any case, I would suggest to do a test with UDP and one with TCP senders,
> >>> both sending at maximum rate. With UDP, we would see a message loss rate,
> >>> while with TCP we would see the actual number of messages that the system can
> >>> process. So TCP is probably the more meaningful number, but packet loss rate
> >>> for UDP - a common use case - would also be interesting, at least I think so.
> >>
> >> will do.
> >>
> >> I will be interested in seeing the UDP loss rate, I suspect that with
> >> appropriate OS tuning I can get it down to zero loss rate at the data
> >> rates that the rest of the system maintains (the OS has a buffer prior to
> >> rsyslog's input process that can cover delays on the input threads)
> >>
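The OS buffer mentioned above is, as far as I understand it, the socket receive buffer. A rough Python sketch, assuming Linux, of requesting a larger one and reading back what the kernel actually granted. The function name and the 8 MiB figure are arbitrary choices for illustration; this is not how rsyslog's UDP input is configured, it only shows the kernel knob involved.

```python
import socket

def udp_socket_with_rcvbuf(requested_bytes):
    """Create a UDP socket and ask the kernel for a larger receive buffer.

    Returns the socket and the buffer size the kernel reports afterwards.
    On Linux the reported value is typically double the request and is
    capped by the net.core.rmem_max sysctl.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, effective

if __name__ == "__main__":
    sock, effective = udp_socket_with_rcvbuf(8 * 1024 * 1024)
    print("effective receive buffer:", effective, "bytes")
    sock.close()
```

If the reported size is much smaller than requested, the system-wide maximum would have to be raised before the per-socket setting can help.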
> >>>> set the main queue mode to disk
> >>>> enable fsyncs everywhere
> >>>
> >>>
> >>> Just as a reminder: this includes $MainMsgQueueCheckpointInterval 1 (which is
> >>> a *real* performance eater and puts a lot of burden on the consistency of the
> >>> file system's control structures, thus my question on ext2 vs. ext3 above).
> >>
> >> does this do a fsync on the directory?
> >>
> >>>> set the output to log *.* to a file
> >>>>
> >>>> run a cron job that rolls the log file once a min and sends a HUP to
> >>>> rsyslog
> >>>>
> >>>> create a large file of log information
> >>>>
> >>>> run this for a while and then count the number of logs in each rolled log
> >>>> file. hopefully the number will be reasonably consistent.
> >>>>
> >>>> does this sound like a reasonable approach? or is this going to not be
> >>>> representative for some reason?
> >>>
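The counting step in the plan above could be sketched roughly like this in Python. The glob pattern for the rolled files and the function names are hypothetical, and the 60-second interval just reflects the once-a-minute cron job; it assumes one log message per line.

```python
import glob
import statistics

def count_lines(path):
    """Count newline-terminated records in one rolled log file."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)

def summarize_rolled_logs(pattern, interval_seconds=60):
    """Summarize message counts across rolled log files matching pattern.

    Returns None if no files match, otherwise a dict with the number of
    files, the mean and standard deviation of lines per file, and the
    mean rate in messages per second.
    """
    counts = [count_lines(p) for p in sorted(glob.glob(pattern))]
    if not counts:
        return None
    mean = statistics.mean(counts)
    return {
        "files": len(counts),
        "mean_per_file": mean,
        "stdev_per_file": statistics.pstdev(counts),
        "mean_msgs_per_sec": mean / interval_seconds,
    }

if __name__ == "__main__":
    # Hypothetical location of the rolled files.
    print(summarize_rolled_logs("/var/log/test/messages.*"))
```

A small standard deviation relative to the mean would suggest the per-minute counts are consistent enough to quote a single rate.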
> >>> With the few comments above, I think this is a very reasonable approach and
> >>> should provide very good insight.
> >>>
> >>> Actually, I hope that it can prove wrong my point that this setup is too
> >>> slow...
> >>
> >> there will definitely be a performance issue at some point here, the
> >> question is if it's fast enough to be usable.
> >>
> >> the drive claims to be able to do >100,000 I/O ops/sec. if we can manage
> >> to get a few thousand logs/sec written on this, it will be extremely
> >> usable.
> >>
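As a back-of-the-envelope check on those numbers, here is a tiny Python sketch. It assumes, purely for illustration, that each log message costs a fixed number of synchronous I/O operations (queue write, checkpoint, output file write and so on); the per-message counts below are guesses, not measurements, and the function name is made up.

```python
def estimated_msgs_per_sec(device_iops, synced_ios_per_msg):
    """Upper-bound message rate if every message needs a fixed number
    of synchronous I/O operations and the device is the only limit."""
    if synced_ios_per_msg <= 0:
        raise ValueError("synced_ios_per_msg must be positive")
    return device_iops / synced_ios_per_msg

if __name__ == "__main__":
    # 100,000 is the drive's claimed I/O ops/sec from this thread;
    # the per-message I/O counts are hypothetical.
    for ios in (1, 3, 10, 30):
        rate = estimated_msgs_per_sec(100000, ios)
        print("%2d synced I/Os per message: about %.0f msgs/sec" % (ios, rate))
```

Under these assumptions, even 30 synced I/Os per message would still leave a few thousand messages per second, provided the drive really sustains its claimed rate under fsync-heavy load.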
> >> David Lang
> >> _______________________________________________
> >> rsyslog mailing list
> >> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >> http://www.rsyslog.com
> >>