[rsyslog] rsyslog - what's next?
david at lang.hm
Mon Jul 13 22:38:52 CEST 2009
On Mon, 13 Jul 2009, RB wrote:
> On Mon, Jul 13, 2009 at 12:11, <david at lang.hm> wrote:
>> for Internet access I completely agree with you, but for a log server you
>> don't have IPs changing names very frequently. As a result it becomes
>> practical to cache the lookup unconditionally until a restart/HUP. even in
>> a 'highly dynamic' environment you are usually adding servers instead of
>> changing the names of IP addresses.
>
> The problem with that approach is that it assumes it is acceptable to
> interrupt service for any trivial DNS change. How do you propose to
> deal with DNS-level availability tools like round-robin or Cisco's GSS?
> There are many approaches that use DNS as both a scalability and a
> redundancy tool that TTL-agnostic caching will not interact well with.
HUP does not interrupt services.
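As a minimal sketch of the cache-until-HUP idea (Python for brevity, not rsyslog's actual C code): the class name HupFlushedCache and the injectable resolve function are hypothetical, made up for illustration. The point is only that a SIGHUP handler can empty the cache while the process keeps running.

```python
import signal
import socket


class HupFlushedCache:
    """Cache reverse lookups indefinitely; forget everything on SIGHUP."""

    def __init__(self, resolve=None):
        # resolve is injectable (an assumption of this sketch) so the
        # cache can be exercised without touching the real resolver.
        self._resolve = resolve or self._system_resolve
        self._names = {}

    @staticmethod
    def _system_resolve(ip):
        try:
            return socket.gethostbyaddr(ip)[0]
        except OSError:
            return ip  # no name available: fall back to the bare address

    def lookup(self, ip):
        name = self._names.get(ip)
        if name is None:
            # First time this address is seen: pay for the lookup once.
            name = self._names[ip] = self._resolve(ip)
        return name

    def flush(self, signum=None, frame=None):
        # Signature matches what signal.signal() passes to a handler.
        self._names.clear()

    def install(self):
        # Unix only: SIGHUP empties the cache, the process keeps running.
        signal.signal(signal.SIGHUP, self.flush)
```

After install(), sending the process a HUP only clears the table; the next message from each address triggers one fresh lookup and everything else carries on.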
>> the problem is that to do a name lookup requires a LOT of steps
>
> I'm well aware of the series of steps, but am unconvinced caching
> belongs in the application. Have you tried using local caching (nscd)
> or segregating the DNS traffic from the production syslog traffic?
> Most enterprise-grade configurations (for whatever app) I've seen use
> a separate interface for administration and metadata like reverse
> lookups anyway.
yes. running a local DNS caching server does not avoid all the local
filesystem lookups and hosts file parsing. it does speed up the network
part of the lookup.
>> even if you have the name in the /etc/hosts file, the overhead of looking
>> in the filesystem and parsing the file is significant.
>
> How significant? Unless the host is poorly configured, /etc/hosts is
> going to be in the filesystem cache and presented at near-memory
> speeds until it gets invalidated or evicted.
the system calls needed to access it (and to first check whether it should
be consulted at all) take enough time to be significant at these speeds.
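A rough way to check this on your own hardware, sketched in Python. This is not the C library resolver, and the file it reads is a temporary hosts-format file generated by the script, not the real /etc/hosts. The function names and the 200-entry file size are assumptions chosen for illustration. It compares opening and parsing the file on every lookup against a plain in-memory table lookup.

```python
import os
import tempfile
import timeit


def lookup_in_hosts_file(path, ip):
    """Open and parse a hosts-format file for every single lookup."""
    with open(path) as fh:
        for line in fh:
            fields = line.split("#", 1)[0].split()  # drop comments
            if len(fields) >= 2 and fields[0] == ip:
                return fields[1]
    return None


def compare(entries=200, repeats=2000):
    """Return (seconds per file lookup, seconds per dict lookup)."""
    with tempfile.NamedTemporaryFile("w", suffix=".hosts", delete=False) as fh:
        for i in range(entries):
            fh.write(f"10.0.{i // 256}.{i % 256} host{i}.example.com\n")
        path = fh.name
    try:
        last = entries - 1
        target = f"10.0.{last // 256}.{last % 256}"  # worst case: last line
        table = {target: lookup_in_hosts_file(path, target)}
        per_file = timeit.timeit(
            lambda: lookup_in_hosts_file(path, target), number=repeats
        ) / repeats
        per_dict = timeit.timeit(lambda: table[target], number=repeats) / repeats
        return per_file, per_dict
    finally:
        os.unlink(path)


if __name__ == "__main__":
    per_file, per_dict = compare()
    print(f"file: {per_file * 1e6:.2f} us/lookup, dict: {per_dict * 1e6:.3f} us/lookup")
```

The file will be served from the page cache after the first read, so whatever gap remains is the open/read/close calls plus the parsing. The absolute numbers depend on the machine and say nothing exact about the real resolver path.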
>> without doing DNS lookups, rsyslog is able to receive UDP logs at almost
>> 400,000 per second (effectively Gig-E wire speed with 256 byte log
>> messages), _very_ few DNS servers can handle requests at that speed. in
>
> To clarify since I'm not looking at the code right now: are you saying
> rsyslog performs a blocking lookup per-packet? If so that's certainly
> a sub-optimal approach, but there are often better ways (namely
> delayed resolution) to fix that than breaking RFC compatibility.
correct, it does all input processing in a single thread. so if that
thread is stalled doing a lookup it's not processing other packets. when
the data is first inserted into the message queue it is complete with all
lookups done.
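To put numbers on why a stalled input thread matters, here is a small back-of-the-envelope sketch in Python. The 400,000 messages per second and 256 byte figures are the ones quoted above; the 100 microsecond lookup latency used in the note below is purely an assumed value for illustration, not a measurement, and the function names are hypothetical.

```python
def per_message_budget_us(messages_per_second):
    """Time available per message, in microseconds, for one input thread."""
    return 1_000_000 / messages_per_second


def max_rate_with_blocking_lookup(lookup_latency_us, other_work_us=0.0):
    """Messages per second if every message waits for one blocking lookup."""
    return 1_000_000 / (lookup_latency_us + other_work_us)


def payload_mbit_per_second(messages_per_second, message_bytes):
    """Payload bandwidth only; Ethernet/IP/UDP framing is not counted."""
    return messages_per_second * message_bytes * 8 / 1_000_000


if __name__ == "__main__":
    print(per_message_budget_us(400_000))           # 2.5
    print(max_rate_with_blocking_lookup(100.0))     # 10000.0 (assumed latency)
    print(payload_mbit_per_second(400_000, 256))    # 819.2
```

At 400,000 messages per second the single input thread has 2.5 microseconds per message. If every message blocked for an assumed 100 microseconds on an uncached lookup, the same thread would top out at 10,000 messages per second. The payload alone at the quoted rate is about 819 Mbit/s before framing overhead.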
>> fact, at that request rate, you use about half of a Gig-E connection just
>> for the DNS requests (more if you request more data, like what the TTL
>> would be, or don't give it the best name the first time and need to make
>> multiple requests with different domains attached or something like that)
>
> Your example invalidates itself - the TTL comes with the query whether
> or not you use it, and any caching mechanism (TTL-sensitive or not)
> will shortcut such sub-optimal queries after the first one.
>
> Caching is a good thing as long as it's done in a compliant manner or
> documented in a fashion that clearly identifies it as a broken (if
> performant) approach.
you want the ability to cache the name lookup no matter which lookup
method is used. only DNS provides a TTL; hosts files, NIS, LDAP, etc. do
not include one.
I agree that the details of how the cache works should be documented.
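One way a cache could cover both cases, sketched in Python under stated assumptions: the NameCache class and its method names are hypothetical, and how a TTL would actually be obtained from the resolver is left open here. Entries stored with a TTL expire on their own; entries stored without one (hosts files, NIS, LDAP) stay until an explicit flush.

```python
import time


class NameCache:
    """Cache names from any source; honour a TTL only when one is given."""

    def __init__(self, clock=time.monotonic):
        # clock is injectable (an assumption of this sketch) for testing.
        self._clock = clock
        self._entries = {}  # ip -> (name, expiry time or None)

    def store(self, ip, name, ttl=None):
        # ttl=None models sources with no expiry information at all.
        expiry = None if ttl is None else self._clock() + ttl
        self._entries[ip] = (name, expiry)

    def get(self, ip):
        entry = self._entries.get(ip)
        if entry is None:
            return None
        name, expiry = entry
        if expiry is not None and self._clock() >= expiry:
            del self._entries[ip]  # TTL ran out: caller must look up again
            return None
        return name

    def flush(self):
        # For example on HUP or restart: drop everything, TTL or not.
        self._entries.clear()
```

A None return from get() means the caller has to do a real lookup and store() the result. Whether TTL-less entries should live until a flush or get some configured maximum age is exactly the kind of detail that would need documenting.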
David Lang