[rsyslog-notify] Forum Thread: Re: Dequeue Perfomance - (Mode 'reply')

noreply at adiscon.com noreply at adiscon.com
Sun May 3 02:22:20 CEST 2015


User: kamermans 
Forumlink: http://kb.monitorware.com/viewtopic.php?p=25464#p25464

Message: 
----------
After some more investigating, it seems rsyslog consumes all the system
memory, then resets the stats, as is seen here, where I've pasted all the
forwardRelp queue logs for the entire event window.

This happens twice and is easily seen in the memory logs from the first
post.

[code:1t83qgmw]
Fri May  1 14:42:58 2015: forwardRelp:
origin=core.queue size=0 enqueued=2481042698 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 14:47:58 2015: forwardRelp:
origin=core.queue size=0 enqueued=2481579868 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 14:52:58 2015: forwardRelp:
origin=core.queue size=0 enqueued=2482160130 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 14:57:58 2015: forwardRelp:
origin=core.queue size=0 enqueued=2482714001 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:02:58 2015: forwardRelp:
origin=core.queue size=0 enqueued=2483254939 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:07:58 2015: forwardRelp:
origin=core.queue size=363652 enqueued=2483834606 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:12:58 2015: forwardRelp:
origin=core.queue size=814483 enqueued=2484411901 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:17:58 2015: forwardRelp:
origin=core.queue size=1274067 enqueued=2484980029 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:22:58 2015: forwardRelp:
origin=core.queue size=1744294 enqueued=2485553680 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:27:58 2015: forwardRelp:
origin=core.queue size=2158328 enqueued=2486106466 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:32:59 2015: forwardRelp:
origin=core.queue size=2535338 enqueued=2486636564 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:37:59 2015: forwardRelp:
origin=core.queue size=2948052 enqueued=2487210558 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:42:59 2015: forwardRelp:
origin=core.queue size=3354548 enqueued=2487782430 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:47:59 2015: forwardRelp:
origin=core.queue size=3755921 enqueued=2488346107 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:52:59 2015: forwardRelp:
origin=core.queue size=4144260 enqueued=2488897774 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 15:57:59 2015: forwardRelp:
origin=core.queue size=4504873 enqueued=2489416083 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 16:02:59 2015: forwardRelp:
origin=core.queue size=4875391 enqueued=2489942249 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 16:07:59 2015: forwardRelp:
origin=core.queue size=5238472 enqueued=2490464562 full=0
discarded.full=0 discarded.nf=0 maxqsize=5388178
Fri May  1 16:12:59 2015: forwardRelp:
origin=core.queue size=5604250 enqueued=2490988548 full=0
discarded.full=0 discarded.nf=0 maxqsize=5604250
Fri May  1 16:17:59 2015: forwardRelp:
origin=core.queue size=6002156 enqueued=2491552342 full=0
discarded.full=0 discarded.nf=0 maxqsize=6002569
Fri May  1 16:23:00 2015: forwardRelp:
origin=core.queue size=6367317 enqueued=2492087487 full=0
discarded.full=0 discarded.nf=0 maxqsize=6367317
Fri May  1 16:28:00 2015: forwardRelp:
origin=core.queue size=6733870 enqueued=2492617368 full=0
discarded.full=0 discarded.nf=0 maxqsize=6734380
Fri May  1 16:33:00 2015: forwardRelp:
origin=core.queue size=7104686 enqueued=2493146904 full=0
discarded.full=0 discarded.nf=0 maxqsize=7104686
Fri May  1 16:38:00 2015: forwardRelp:
origin=core.queue size=7487485 enqueued=2493690983 full=0
discarded.full=0 discarded.nf=0 maxqsize=7487901
Fri May  1 16:43:00 2015: forwardRelp:
origin=core.queue size=7874621 enqueued=2494236327 full=0
discarded.full=0 discarded.nf=0 maxqsize=7874621
## Reset here at about 16:45 UTC, 12:45 EST ##
Fri May  1 16:48:43 2015: forwardRelp:
origin=core.queue size=371954 enqueued=526464 full=0
discarded.full=0 discarded.nf=0 maxqsize=371954
Fri May  1 16:53:43 2015: forwardRelp:
origin=core.queue size=788688 enqueued=1103454 full=0
discarded.full=0 discarded.nf=0 maxqsize=789194
Fri May  1 16:58:43 2015: forwardRelp:
origin=core.queue size=1187523 enqueued=1662545 full=0
discarded.full=0 discarded.nf=0 maxqsize=1187523
Fri May  1 17:03:43 2015: forwardRelp:
origin=core.queue size=1566792 enqueued=2210262 full=0
discarded.full=0 discarded.nf=0 maxqsize=1566792
Fri May  1 17:08:43 2015: forwardRelp:
origin=core.queue size=1983428 enqueued=2787666 full=0
discarded.full=0 discarded.nf=0 maxqsize=1983428
Fri May  1 17:13:43 2015: forwardRelp:
origin=core.queue size=2400295 enqueued=3358133 full=0
discarded.full=0 discarded.nf=0 maxqsize=2400295
Fri May  1 17:18:43 2015: forwardRelp:
origin=core.queue size=2794020 enqueued=3910578 full=0
discarded.full=0 discarded.nf=0 maxqsize=2794020
Fri May  1 17:23:43 2015: forwardRelp:
origin=core.queue size=3196027 enqueued=4474889 full=0
discarded.full=0 discarded.nf=0 maxqsize=3196027
Fri May  1 17:28:43 2015: forwardRelp:
origin=core.queue size=3586002 enqueued=5024096 full=0
discarded.full=0 discarded.nf=0 maxqsize=3586241
Fri May  1 17:33:44 2015: forwardRelp:
origin=core.queue size=3987742 enqueued=5577388 full=0
discarded.full=0 discarded.nf=0 maxqsize=3988245
Fri May  1 17:38:44 2015: forwardRelp:
origin=core.queue size=4408286 enqueued=6147948 full=0
discarded.full=0 discarded.nf=0 maxqsize=4408372
Fri May  1 17:43:44 2015: forwardRelp:
origin=core.queue size=4822375 enqueued=6698229 full=0
discarded.full=0 discarded.nf=0 maxqsize=4822398
Fri May  1 17:48:44 2015: forwardRelp:
origin=core.queue size=5235788 enqueued=7249370 full=0
discarded.full=0 discarded.nf=0 maxqsize=5235788
Fri May  1 17:53:44 2015: forwardRelp:
origin=core.queue size=5663951 enqueued=7823453 full=0
discarded.full=0 discarded.nf=0 maxqsize=5663951
Fri May  1 17:58:44 2015: forwardRelp:
origin=core.queue size=6079632 enqueued=8388638 full=0
discarded.full=0 discarded.nf=0 maxqsize=6079632
Fri May  1 18:03:44 2015: forwardRelp:
origin=core.queue size=6485067 enqueued=8938969 full=0
discarded.full=0 discarded.nf=0 maxqsize=6485283
Fri May  1 18:08:44 2015: forwardRelp:
origin=core.queue size=6896346 enqueued=9490024 full=0
discarded.full=0 discarded.nf=0 maxqsize=6896346
Fri May  1 18:13:44 2015: forwardRelp:
origin=core.queue size=7350013 enqueued=10081419 full=0
discarded.full=0 discarded.nf=0 maxqsize=7350013
Fri May  1 18:18:44 2015: forwardRelp:
origin=core.queue size=7770721 enqueued=10637807 full=0
discarded.full=0 discarded.nf=0 maxqsize=7771205
Fri May  1 18:23:45 2015: forwardRelp:
origin=core.queue size=8173250 enqueued=11169872 full=0
discarded.full=0 discarded.nf=0 maxqsize=8173250
Fri May  1 18:28:45 2015: forwardRelp:
origin=core.queue size=8589705 enqueued=11713303 full=0
discarded.full=0 discarded.nf=0 maxqsize=8589705
Fri May  1 18:33:45 2015: forwardRelp:
origin=core.queue size=9012898 enqueued=12258864 full=0
discarded.full=0 discarded.nf=0 maxqsize=9013266
Fri May  1 18:38:45 2015: forwardRelp:
origin=core.queue size=9452544 enqueued=12814222 full=0
discarded.full=0 discarded.nf=0 maxqsize=9452544
## Reset here at about 18:40 UTC, 14:40 EST ##
Fri May  1 18:44:21 2015: forwardRelp:
origin=core.queue size=462548 enqueued=574360 full=0
discarded.full=0 discarded.nf=0 maxqsize=462548
Fri May  1 18:49:21 2015: forwardRelp:
origin=core.queue size=920664 enqueued=1143068 full=0
discarded.full=0 discarded.nf=0 maxqsize=920664
Fri May  1 18:54:22 2015: forwardRelp:
origin=core.queue size=1378725 enqueued=1708649 full=0
discarded.full=0 discarded.nf=0 maxqsize=1378725
Fri May  1 18:59:22 2015: forwardRelp:
origin=core.queue size=1825761 enqueued=2261157 full=0
discarded.full=0 discarded.nf=0 maxqsize=1826229
Fri May  1 19:04:22 2015: forwardRelp:
origin=core.queue size=2274764 enqueued=2806928 full=0
discarded.full=0 discarded.nf=0 maxqsize=2274764
Fri May  1 19:09:22 2015: forwardRelp:
origin=core.queue size=2740240 enqueued=3367124 full=0
discarded.full=0 discarded.nf=0 maxqsize=2740240
Fri May  1 19:14:22 2015: forwardRelp:
origin=core.queue size=3200280 enqueued=3918812 full=0
discarded.full=0 discarded.nf=0 maxqsize=3200562
Fri May  1 19:19:22 2015: forwardRelp:
origin=core.queue size=3668806 enqueued=4478474 full=0
discarded.full=0 discarded.nf=0 maxqsize=3668806
Fri May  1 19:24:22 2015: forwardRelp:
origin=core.queue size=4128953 enqueued=5029757 full=0
discarded.full=0 discarded.nf=0 maxqsize=4128953
Fri May  1 19:29:22 2015: forwardRelp:
origin=core.queue size=4577887 enqueued=5568291 full=0
discarded.full=0 discarded.nf=0 maxqsize=4577887
Fri May  1 19:34:22 2015: forwardRelp:
origin=core.queue size=5021051 enqueued=6097471 full=0
discarded.full=0 discarded.nf=0 maxqsize=5021051
Fri May  1 19:39:22 2015: forwardRelp:
origin=core.queue size=5495478 enqueued=6655866 full=0
discarded.full=0 discarded.nf=0 maxqsize=5495478
Fri May  1 19:44:23 2015: forwardRelp:
origin=core.queue size=5500960 enqueued=7191780 full=0
discarded.full=0 discarded.nf=0 maxqsize=5703695
Fri May  1 19:49:23 2015: forwardRelp:
origin=core.queue size=5291930 enqueued=7762014 full=0
discarded.full=0 discarded.nf=0 maxqsize=5703695
Fri May  1 19:54:23 2015: forwardRelp:
origin=core.queue size=4577750 enqueued=8283802 full=0
discarded.full=0 discarded.nf=0 maxqsize=5703695
Fri May  1 19:59:23 2015: forwardRelp:
origin=core.queue size=3486336 enqueued=8811332 full=0
discarded.full=0 discarded.nf=0 maxqsize=5703695
Fri May  1 20:04:23 2015: forwardRelp:
origin=core.queue size=2035104 enqueued=9313380 full=0
discarded.full=0 discarded.nf=0 maxqsize=5703695
Fri May  1 20:09:23 2015: forwardRelp:
origin=core.queue size=446460 enqueued=9802432 full=0
discarded.full=0 discarded.nf=0 maxqsize=5703695
Fri May  1 20:14:23 2015: forwardRelp:
origin=core.queue size=981 enqueued=10316870 full=0
discarded.full=0 discarded.nf=0 maxqsize=5703695
Fri May  1 20:19:23 2015: forwardRelp:
origin=core.queue size=0 enqueued=10828560 full=0 discarded.full=0
discarded.nf=0 maxqsize=5703695
Fri May  1 20:24:23 2015: forwardRelp:
origin=core.queue size=411 enqueued=11333581 full=0
discarded.full=0 discarded.nf=0 maxqsize=5703695
Fri May  1 20:29:23 2015: forwardRelp:
origin=core.queue size=87 enqueued=11830337 full=0 discarded.full=0
discarded.nf=0 maxqsize=5703695
Fri May  1 20:34:24 2015: forwardRelp:
origin=core.queue size=0 enqueued=12305650 full=0 discarded.full=0
discarded.nf=0 maxqsize=5703695
Fri May  1 20:39:24 2015: forwardRelp:
origin=core.queue size=0 enqueued=12822335 full=0 discarded.full=0
discarded.nf=0 maxqsize=5703695
Fri May  1 20:44:24 2015: forwardRelp:
origin=core.queue size=0 enqueued=13302503 full=0 discarded.full=0
discarded.nf=0 maxqsize=5703695
Fri May  1 20:49:24 2015: forwardRelp:
origin=core.queue size=0 enqueued=13769247 full=0 discarded.full=0
discarded.nf=0 maxqsize=5703695
Fri May  1 20:54:24 2015: forwardRelp:
origin=core.queue size=0 enqueued=14221109 full=0 discarded.full=0
discarded.nf=0 maxqsize=5703695
Fri May  1 20:59:24 2015: forwardRelp:
origin=core.queue size=0 enqueued=14666145 full=0 discarded.full=0
discarded.nf=0 maxqsize=5703695
Fri May  1 21:04:24 2015: forwardRelp:
origin=core.queue size=0 enqueued=15096044 full=0 discarded.full=0
discarded.nf=0 maxqsize=5703695
[/code:1t83qgmw]

I've looked back at my historical logs and it seems this happens (to some
extent) every other day or so.  The pstats log suggests to me that RELP is
not able to dequeue messages fast enough to keep up, and when a slight
burst pushes something into the queue for any length of time, the cost of
dequeuing these items steals CPU time from main dequeuer, which causes more
queuing.  Basically, once things get really busy, RELP starts delaying its
own delivery, so it is not able to clear the queue until the incoming
traffic decreases [b:1t83qgmw]significantly[/b:1t83qgmw].

It looks like I'll need to try one of these:
  1. Get a bigger server
  2. Switch out LinkedList for FixedArray or disable DA mode, but I can't
risk losing messages
  3. Add more RELP forwarding queues and round-robin incoming requests to
them to make better use of the CPUs (maybe there is an rsyslog global lock
that prevents this from scaling)
  4. Switch out RELP for something faster, maybe omfwd via UDP, but the
whole point for RELP was to avoid lost messages, so perhaps TCP would be
more resilient to this.


More information about the rsyslog-notify mailing list