[rsyslog] PostgreSQL: Problems with character encoding

Rainer Gerhards rgerhards at hq.adiscon.com
Fri Jan 22 10:51:41 CET 2010


V5 has the capability to discard messages that cause an action failure.
However, this is still mostly untested, AND the action must support it by
providing proper status information - it must differentiate between
system-induced errors (which can be retried) and message-induced errors
(which need the discard). ompgsql currently does not provide that status
information.
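
To illustrate the distinction, here is a minimal sketch in Python (not rsyslog
code). All names in it (ActionStatus, ConnectionLost, InvalidByteSequence,
process_queue) are hypothetical stand-ins, not part of rsyslog or ompgsql. It
only shows the idea: a system-induced error stops processing so the remaining
messages can be retried later, while a message-induced error causes just that
one message to be discarded.

```python
from enum import Enum


class ActionStatus(Enum):
    OK = "ok"
    SUSPENDED = "suspended"    # system-induced error: retry later
    DATA_ERROR = "data_error"  # message-induced error: discard the message


# Hypothetical exception types standing in for what a database layer might raise.
class ConnectionLost(Exception):
    pass


class InvalidByteSequence(Exception):
    pass


def classify(exc):
    """Map an exception to a status; anything unknown is treated as retryable."""
    if isinstance(exc, InvalidByteSequence):
        return ActionStatus.DATA_ERROR
    return ActionStatus.SUSPENDED


def process_queue(messages, insert):
    """Feed messages to insert(); return (stored, discarded, pending)."""
    stored, discarded = [], []
    pending = list(messages)
    while pending:
        msg = pending[0]
        try:
            insert(msg)
            status = ActionStatus.OK
        except Exception as exc:
            status = classify(exc)
        if status is ActionStatus.SUSPENDED:
            break  # keep this message and everything after it for a retry
        pending.pop(0)
        if status is ActionStatus.OK:
            stored.append(msg)
        else:
            discarded.append(msg)
    return stored, discarded, pending
```

Without the DATA_ERROR case, every failure looks retryable and a single bad
message blocks the queue, which is the behaviour described in the report below.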

Rainer

> -----Original Message-----
> From: rsyslog-bounces at lists.adiscon.com
> [mailto:rsyslog-bounces at lists.adiscon.com] On Behalf Of Jakob Haufe
> Sent: Wednesday, January 20, 2010 7:21 PM
> To: rsyslog at lists.adiscon.com
> Subject: Re: [rsyslog] PostgreSQL: Problems with character encoding
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On Wed, 6 Jan 2010 16:14:59 +0100
> Marc Schiffbauer <marc.schiffbauer at mightycare.de> wrote:
> 
> > which encoding should be chosen for the database when using postgres?
> 
> As far as I understand the syslog protocol (at least the legacy one), it has
> no concept of character encodings at all.  So if you simply want to make sure
> that everything ends up in the database "as is", then choose SQL_ASCII.
> 
> > My rsyslog version is 4.4.3.
> >
> > Which client_encoding does rsyslog use in ompgsql?
> 
> Right now, it does not set an encoding by itself, so the database default
> applies. If I'm not mistaken, you can even set that per user from inside of
> postgres. So I would rather vote against another configuration parameter here.
> 
> > I currently have set UTF-8 on the database. It worked for a while until
> > some special message arrived at the server where postgres denies the INSERT:
> >
> > 2010-01-06 16:13:11 CET syslog syslog ERROR:  invalid byte sequence for encoding "UTF8": 0xd220
> > 2010-01-06 16:13:11 CET syslog syslog HINT:  This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
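
For reference, the reported bytes 0xd2 0x20 can be checked with a few lines of
Python. In UTF-8, 0xd2 starts a two-byte sequence and must be followed by a
continuation byte in the range 0x80-0xbf; 0x20 (a space) is not one, so the
pair is invalid. The Latin-1 decoding at the end is only an assumption about
what the sender might have used; the original message was not identified in
this thread.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The two bytes quoted in the PostgreSQL error message.
raw = bytes([0xD2, 0x20])

valid = is_valid_utf8(raw)  # False: 0x20 is not a continuation byte

# Assumption: if the sender used Latin-1, 0xd2 is U+00D2 followed by a space.
as_latin1 = raw.decode("latin-1")
```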
> 
> Were you able to isolate the message? Or find out which program was sending
> it?
> 
> > Now rsyslog is not able to log anything... it is currently spooling to disk
> > because it "hangs" at this message not being accepted by postgres.
> 
> This is bad, because if the machine is an open syslog server that simply
> collects everything it gets, we have a potential DoS vector here.
> 
> I can think of three options:
> 
> * Drop the message and report that we did so. That would be rather easy,
>   but might not be what people want.
> 
> * Re-insert the message after converting it from ASCII to UTF-8 or whatever
>   the DB encoding is. But this might/will produce garbage if the input is not
>   ASCII. It also creates more load on the system if these messages are
>   frequent. Guessing the input encoding is hard or even impossible, depending
>   on the set you guess from.
> 
> * Make the database SQL_ASCII. This will silently accept anything but will
>   create nonsense from UTF/UCS encoded messages. Also might create trouble
>   for programs like phplogcon that analyze the logs.
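
The second option can be sketched roughly as follows, in Python rather than in
ompgsql's C. The function names and the Latin-1 fallback are assumptions made
for illustration; nothing here reflects what ompgsql does. The sketch accepts
valid UTF-8 unchanged and otherwise reinterprets the bytes in a guessed
encoding, which is exactly where the "garbage" risk comes from when the guess
is wrong. A second helper shows the alternative of replacing undecodable bytes
with U+FFFD instead of guessing.

```python
from typing import Tuple


def to_utf8_text(raw: bytes, fallback: str = "latin-1") -> Tuple[str, bool]:
    """Return (text, was_converted).

    Valid UTF-8 passes through untouched. Anything else is decoded with the
    fallback encoding, which is a guess and may yield wrong characters.
    """
    try:
        return raw.decode("utf-8"), False
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this cannot fail.
        return raw.decode(fallback), True


def replace_invalid(raw: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for each undecodable byte."""
    return raw.decode("utf-8", errors="replace")
```

Either result can be encoded back to UTF-8 without error, so the insert would
no longer fail on encoding grounds, at the cost of possibly altering the
message text.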
> 
> For me, this sums up to one question:
> 
> Can we make ompgsql UTF/UCS-clean and at the same time not choke on non-UTF8
> strings? Everyone is trying to be UTF-8 clean these days, so it would be bad
> if ompgsql could not keep up.
> 
> Comments please.
> 
> Regards,
> Jakob Haufe (sur5r)
> 
> - --
> ceterum censeo microsoftem esse delendam.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
> 
> iEYEARECAAYFAktXSW8ACgkQ1YAhDic+adbqXACeIJcx6GW6PhSXFO1YF72PafJG
> 7t8AoLNwnJYMZ4bssqMZt/nkTIPWs0LI
> =vuWN
> -----END PGP SIGNATURE-----
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com


