[rsyslog] PostgreSQL: Problems with character encoding
david at lang.hm
Wed Jan 20 19:44:42 CET 2010
On Wed, 20 Jan 2010, Jakob Haufe wrote:
> This is bad, because if the machine is an open syslog server that simply
> collects everything it gets, we have a potential DoS vector here.
>
> I can think of three options:
>
> * Drop the message and report that we did so. That would be rather easy,
> but might not be what people want.
>
> * Re-insert the message after converting it from ASCII to UTF-8 or whatever
> the DB encoding is. But this might/will produce garbage if the input is not
> ASCII. It also creates more load on the system if these messages are
> frequent. Guessing the input encoding is hard or even impossible, depending
> on the set you guess from.
>
> * Make the database SQL_ASCII. This will silently accept anything but will
> create nonsense from UTF/UCS encoded messages. Also might create trouble
> for programs like phplogcon that analyze the logs.
>
> For me, this sums up to one question:
>
> Can we make ompgsql UTF/UCS-clean and at the same time not choke on non-UTF8
> strings? Everyone is trying to be UTF-8 clean these days, so it would be bad
> if ompgsql could not keep up.
My thought is that, just as we have a filter to change control characters
to escape sequences, it would be good to have a filter to escape non-ASCII
characters. This will mangle other character sets, but they are unlikely to
go through cleanly anyway.
David Lang
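
A rough sketch of the filter idea above, in Python rather than rsyslog's own C, just to illustrate the behaviour. Nothing here is rsyslog or ompgsql code: the function names are hypothetical, and the `#` prefix with three octal digits is an assumption chosen to resemble control-character escaping. The second function is one possible answer to the question of staying UTF-8 clean without choking on other input: pass valid UTF-8 through untouched and escape only when the bytes do not decode.

```python
def escape_non_ascii(raw: bytes) -> str:
    """Return an ASCII-only string; every byte >= 0x80 becomes #ooo (octal).

    Hypothetical helper. Bytes below 0x80 (including control characters)
    are passed through unchanged, on the assumption that a separate
    control-character filter deals with them.
    """
    out = []
    for b in raw:
        if b < 0x80:
            out.append(chr(b))
        else:
            out.append("#%03o" % b)
    return "".join(out)


def utf8_or_escaped(raw: bytes) -> str:
    """Keep the message as-is if it is valid UTF-8, otherwise escape it.

    Hypothetical helper. No attempt is made to guess the input encoding.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return escape_non_ascii(raw)


if __name__ == "__main__":
    print(escape_non_ascii("café".encode("utf-8")))  # UTF-8 bytes, escaped
    print(utf8_or_escaped("café".encode("utf-8")))   # valid UTF-8, kept
    print(utf8_or_escaped(b"caf\xe9"))               # Latin-1 byte, escaped
```

Either way the string handed to the database is always valid in a UTF-8 database, so a single odd message should no longer cause the insert to fail. The cost is the one already mentioned: text in other character sets ends up as escape sequences instead of readable characters.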