[rsyslog] utf-8 encoded MSG

Rainer Gerhards rgerhards at hq.adiscon.com
Wed Aug 12 21:49:25 CEST 2009


The simple answer (unfortunately) is that UTF-8 is not yet supported. There are a number of subtle issues, and so far I have been hesitant to address them (to do it right, we would need to work on changes to the logger API...).

----- Original Message -----
From: "Stanislav" <sparf at vingrad.ru>
To: "rsyslog at lists.adiscon.com" <rsyslog at lists.adiscon.com>
Sent: 12.08.09 15:33
Subject: [rsyslog] utf-8 encoded MSG

Hello,



First of all, sorry for my English. It is very bad.

I am trying to make MS Log Parser (
http://www.microsoft.com/technet/scriptcenter/tools/logparser/default.mspx)
work correctly with an rsyslog server.

Now I have one problem: I can’t understand how to create a correct UTF-8 syslog message.

I have read in the documentation (doc/syslog-protocol.html) that:

Conclusions/Suggestions

- As it is not possible to definitely know the character encoding of
the application-provided message, MSG should *not* be specified to use UTF-8
exclusively. Instead, it is suggested that any encoding may be used but
UTF-8 is preferred. To detect UTF-8, the MSG should start with the UTF-8
byte order mark of "EF BB BF" if it is UTF-8 encoded (see section 15.9 of
http://www.unicode.org/versions/Unicode4.0.0/ch15.pdf)
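
The detection rule quoted above can be sketched on the receiving side roughly as follows. This is only an illustration in Python, not rsyslog's actual code: the function name decode_msg and the latin-1 fallback are assumptions made for the example.

```python
import codecs

def decode_msg(msg: bytes, fallback: str = "latin-1") -> str:
    """Decode the MSG part of a syslog packet (hypothetical helper).

    If MSG starts with the UTF-8 byte order mark (EF BB BF), strip the
    mark and decode the rest as UTF-8. Otherwise the encoding is unknown,
    so fall back to an assumed single-byte encoding.
    """
    if msg.startswith(codecs.BOM_UTF8):
        return msg[len(codecs.BOM_UTF8):].decode("utf-8")
    return msg.decode(fallback, errors="replace")

# MSG with the mark: decoded as UTF-8, mark removed.
with_bom = codecs.BOM_UTF8 + "Служба".encode("utf-8")
# MSG without the mark: handled with the fallback encoding.
without_bom = b"plain ASCII text"
```

Here decode_msg(with_bom) gives back the Cyrillic word without the three marker bytes, while decode_msg(without_bom) goes through the fallback path.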



For example, here we have “EF BB BF” before the UTF-8 encoded string:



0000   3c 33 30 3e 41 70 72 20 31 35 20 31 38 3a 33 36  <30>Apr 15 18:36
0010   3a 35 37 20 50 4b 2d 35 38 30 20 53 65 72 76 69  :57 PK-580 Servi
0020   63 65 43 6f 6e 74 72 6f 6c 4d 61 6e 61 67 65 72  ceControlManager
0030   20 ef bb bf d0 a1 d0 bb d1 83 d0 b6 d0 b1 d0 b0   ...............
0040   20 22 d0 90 d0 b4 d0 b0 d0 bf d1 82 d0 b5 d1 80   "..............
0050   20 d0 bf d1 80 d0 be d0 b8 d0 b7 d0 b2 d0 be d0   ...............
0060   b4 d0 b8 d1 82 d0 b5 d0 bb d1 8c d0 bd d0 be d1  ................
0070   81 d1 82 d0 b8 20 57 4d 49 22 20 d0 bf d0 b5 d1  ..... WMI" .....
0080   80 d0 b5 d1 88 d0 bb d0 b0 20 d0 b2 20 d1 81 d0  ......... .. ...
0090   be d1 81 d1 82 d0 be d1 8f d0 bd d0 b8 d0 b5 20  ...............
00a0   d0 a0 d0 b0 d0 b1 d0 be d1 82 d0 b0 d0 b5 d1 82  ................
00b0   2e                                               .
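
For reference, a packet with the same layout as the dump above could be assembled as in the following Python sketch. The helper names build_packet and send_udp are hypothetical, and the use of UDP port 514 is an assumption about the setup, not something stated in this message.

```python
import codecs
import socket

def build_packet(pri: int, timestamp: str, host: str, tag: str, text: str) -> bytes:
    """Assemble a syslog packet in the layout shown in the dump (hypothetical helper)."""
    # ASCII header: "<PRI>timestamp host tag " including the trailing space.
    header = f"<{pri}>{timestamp} {host} {tag} ".encode("ascii")
    # The byte order mark EF BB BF comes first in MSG, then the UTF-8 text.
    return header + codecs.BOM_UTF8 + text.encode("utf-8")

def send_udp(packet: bytes, server: str, port: int = 514) -> None:
    """Send the packet as one UDP datagram (port 514 is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (server, port))

# Values taken from the dump above.
packet = build_packet(
    30,
    "Apr 15 18:36:57",
    "PK-580",
    "ServiceControlManager",
    'Служба "Адаптер производительности WMI" перешла в состояние Работает.',
)
```

With these inputs the packet is 177 bytes long (offsets 0x00 to 0xb0), with the byte order mark at offsets 0x31 to 0x33, as in the dump. Building such a packet does not by itself mean the receiving rsyslog version will interpret the mark.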



But in the database I see only text with a broken encoding (with one space at
the beginning of the string):


Служба "Адаптер производительности WMI"
перешла в состояние Работает.


In its correct form it should look like:
Служба "Адаптер производительности WMI" перешла в состояние Работает.


P.S. We have rsyslog-3.18.6-4 installed, logging to a MySQL database.


Best regards.

Stanislav.


