[rsyslog] Out of memory

mostolog at gmail.com mostolog at gmail.com
Wed Jan 18 13:04:06 CET 2017


> when you give us your config, please include the name= portion so we 
> can see what actions match up with what in impstats
Pasted at the end, but they are named according to what they do.

> at 10:37, the queue is small and maxrss is still low (8092)
>
> at 10:42, you got a huge burst of new traffic (104878-1889=102989 
> messages in this 5 min period), but the number processed is low. see 
> the submitted counts that only processed 70614-1889=68725 messages, so 
> 35001 in the queue with memory usage climbing to 75992
100k/5min is our expected load. Bursts will be much higher.

>
> at 10:47 you had received an additional 196015-104878=91137 messages, 
> but processed 162117-70614=91503 messages, gaining slightly with the 
> queue dropping to 33978 (with memory usage climbing to 104684)

> the fact that memory use climbed so much with the queue size dropping 
> is odd, but that could have been a burst from something processing 
> messages
>
> at 10:52 you had received an additional 299439-196015=103424 
> messages, but processed 265461-162118=103343 messages, losing a 
> little ground with the queue size at 35002 (with memory usage climbing 
> to 135732)
>
> at 10:57 the pattern continues, with lots of messages being received 
> and processed (no significant change to the queue size), but memory 
> use continuing to climb.
And that's what I don't get.

> at 11:32 the number of new messages drops significantly and that lets 
> rsyslog deliver everything in the queue to catch up.
>
> what happens after that point?

In atop, VSIZE=842 / RSIZE=512 remain stuck for rsyslogd, even though 
there's no traffic anymore.

> it's worth noting that the queue sizes are instantaneous sizes, not 
> average sizes, so if you have a lot of messages that are delivered 
> just before the stats run happens, it will show a lot of messages in 
> the queue. your stats times only vary by a few seconds over the time 
> of the run, so it's impossible to say if the ~35K queue is due to a 
> burst or what. You could try changing the stats time to something much 
> smaller to see what happens, or change it to something that's not an 
> even number of minutes so that if something is generating logs every X 
> minutes, you won't stay in sync with them (try something like 100 
> seconds for example)
Setting impstats interval to 100 for the next round ;)
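
A minimal sketch of that change, assuming the impstats line from the config 
pasted below is otherwise kept as it is (only interval= is new; with no 
interval set, impstats uses its 300-second default, which matches the 
5-minute spacing of the stats above):

```
# interval is in seconds; 100 avoids staying in sync with anything
# that logs on whole-minute boundaries
module(load="impstats" interval="100" log.file="/data/stats.log")
```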

> But overall, it does look like there is a memory leak somewhere, can 
> you run a copy of rsyslog somewhere that will allow you to tinker with 
> the config significantly? change the output to go to a file instead of 
> ES (using the same template that you are using in ES would be good), 
> and then see what happens. If the memory leak stops, it's an 
> omelasticsearch issue, if not, we can try tinkering with the other 
> actions and see what makes a difference.
Going to do that, as this is actually the testing env.
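
A rough sketch of what I plan to swap in for the "elastic" action during the 
test, assuming the same "json" template is reused. The action name and the 
output file path are placeholders I made up for this test:

```
# Test-only replacement for the omelasticsearch action.
# name= and file= are assumptions; template="json" is the one already defined.
action(
    name="elastic_file"
    type="omfile"
    file="/data/rsyslog-es-test.log"
    template="json"
)
```

The "index" and "type" templates are not used by omfile, so they simply go 
unused in this variant. If memory stops growing with this action in place, 
that would point at omelasticsearch.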

Thanks as usual.


module(load="impstats" log.file="/data/stats.log")
syslog.=debug /data/rsyslog-stats
global(
     MaxMessageSize="32k"
     workDirectory="/data"
     parser.escapeControlCharactersOnReceive="off"
)
module(load="imrelp")
input(
     port="20514"
     type="imrelp"
     name="imrelp"
     ruleset="relp"
)

template(name="json" type="string" string="%$!%\n")
template(name="index" type="string" string="%$.index%@%$.interval%")
template(name="type" type="string" string="%$.type%")
template(name="ts" type="string" string="%timestamp:::date-rfc3339%")

module(load="mmjsonparse")
module(load="mmnormalize")
module(load="omelasticsearch")
ruleset(
     name="relp"
     queue.filename="relp"
     queue.maxdiskspace="1G"
     queue.SaveOnShutdown="on"
     queue.type="LinkedList"
     ) {
     action(
         name="json"
         cookie=""
         type="mmjsonparse"
     )
     if $parsesuccess == "FAIL" then {
         call error
         stop
     }
     action(
         name="norm"
         type="mmnormalize"
         variable="$!msg"
         rulebase="/etc/rsyslog.d/rsyslog.rb"
     )
     $IncludeConfig /etc/rsyslog.d/apps/conf/1*.conf
     $IncludeConfig /etc/rsyslog.d/apps/conf/2*.conf
     #there are neither 1* nor 2* files

     # Set default index and type
     set $.index="unknown";
     set $.type="unknown";
     #defaults
     set $.interval=$$now & ":" & $$hour;
     if $!app != $!app then {
         call unknown
         stop
     }
     $IncludeConfig /etc/rsyslog.d/apps/conf/3*.conf
     #a few files like
     #else if $!app == "myapp" then {
     #    set $.index="account-app@" & $$now;
     #    set $.type="logs";
     #    call geoip
     #}
     call clean

     set $!host_forwarded=$hostname;
     set $!host_received=$$myhostname;
     set $!time_received=$timegenerated;
     set $@timestamp=exec_template("ts");
     action(
         name="elastic"
         action.resumeRetryCount="-1"
         action.reportsuspension="on"
         type="omelasticsearch"
         server="server"
         serverport="9200"
         searchIndex="index"
         dynSearchIndex="on"
         searchType="type"
         dynSearchType="on"
         template="json"
     )
}
$IncludeConfig /etc/rsyslog.d/apps/conf/4*.conf
#a few files like:
#ruleset(name="geoip"){
#    if $!ip != "" then {
#        set $!geo="true";
#        unset $!ip;
#    }
#}

module(load="builtin:omfile")
ruleset(name="error"){
     action(
         name="error"
         type="omfile"
         file="/data/rsyslog-errors.log"
     )
}
ruleset(name="unknown"){
     action(
         name="unk"
         type="omfile"
         file="/data/rsyslog-unknown.log"
     )
}



