Opened 9 years ago

Closed 9 years ago

Last modified 8 years ago

#10833 closed defect (fixed)

Offline ICQ Messages Sent with Wrong Encoding

Reported by: assen
Owned by: MarkDoliner
Milestone: 2.7.4
Component: ICQ
Version: 2.6.3
Keywords: ICQ SoC



I had this problem with Pidgin 2.5 and 2.6 (Fedora 10/11), and now I have it with Empathy (Fedora 12), which makes me believe the cause lies somewhere in libpurple.

The problem: my locale is UTF-8. When I send an online ICQ message in Cyrillic (or any other non-ASCII text), standard ICQ clients receive it as expected. When I send the same message as an offline message, the other end shows meaningless characters.

I made a tcpdump file (here attached) while sending the offline message. It shows two things:

  • message is sent as UTF-8 (verified with a text editor), but
  • "Block Character Set" field in the same packet is sent as "0x0003", which stands for "ISO-8859-1".

Interestingly enough, the same dump for an online message (also attached here) shows a "Block Character Set" value of 0x0002, which is Unicode.

I'm therefore assuming that ICQ offline messages are sent with an incorrect character set label.
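
To illustrate why the recipient sees meaningless characters: if the payload is UTF-8 but the receiving client trusts the 0x0003 label, it treats every byte as a separate ISO-8859-1 character. The following is a minimal sketch of that misreading, not code from libpurple; `latin1_to_utf8` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libpurple): treat every input byte as
 * one ISO-8859-1 character and write the result as UTF-8, which is what a
 * receiver trusting the 0x0003 label would effectively display.
 * `out` must hold at least 2 * n + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
size_t latin1_to_utf8(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (in[i] < 0x80) {
            /* 7-bit bytes are identical in ASCII, ISO-8859-1 and UTF-8. */
            out[o++] = in[i];
        } else {
            /* U+0080..U+00FF need two bytes in UTF-8. */
            out[o++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[o++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

For example, the Cyrillic letter П is the UTF-8 byte pair D0 9F. Read as ISO-8859-1, those two bytes become Ð followed by the control character U+009F, so one letter turns into two unrelated characters.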

If I can provide any further information, please, let me know.

Attachments (2)

offline_message_dump.cap (435 bytes) - added by assen 9 years ago.
online_message_dump.cap (709 bytes) - added by assen 9 years ago.


Change History (7)

comment:1 Changed 9 years ago by darkrain42

It looks like this is being caused by purple_plugin_oscar_convert_to_best_encoding. In particular, it works for online messages likely because of this note:

    /*
     * If we're sending to an ICQ user, and they are in our
     * buddy list, and they are advertising the Unicode
     * capability, and they are online, then attempt to send
     * as UTF-16BE.
     */

When the buddy is offline, my guess is that the following code is encoding to UTF-8, but the charset is stuck as LATIN_1, which seems wrong. I'm not familiar with the OSCAR prpls, so I'm not positive what should be happening here:

    /*
     * If this is AIM then attempt to send as ISO-8859-1.  If this is
     * ICQ then attempt to send as the user specified character encoding.
     */
    charsetstr = "ISO-8859-1";
    if ((destbn != NULL) && oscar_util_valid_name_icq(destbn))
        charsetstr = purple_account_get_string(account, "encoding", OSCAR_DEFAULT_CUSTOM_ENCODING);

    /*
     * XXX - We need a way to only attempt to convert if we KNOW "from"
     * can be converted to "charsetstr"
     */
    *msg = g_convert(from, -1, charsetstr, "UTF-8", NULL, &msglen, &err);
    if (*msg != NULL) {
        *charset = AIM_CHARSET_LATIN_1;
        *charsubset = 0x0000;
        *msglen_int = msglen;

You'd probably have better luck if you set the encoding value to UTF-16BE (which is more or less what is used for sending Unicode messages), although I wonder how ICQ clients interpret, or are supposed to interpret, the charset value.

comment:2 Changed 9 years ago by ivan.komarov@…

  • Resolution set to fixed
  • Status changed from new to closed

(In d1c89d7bc669f5fa5636b54af6a36e6640d021e1):
Stop using custom encodings (and LATIN-1, for that matter) for sending OSCAR messages (ICBM, chat, Direct IM). Now, we use ASCII if a message contains ASCII characters only, and UTF-16 in all other cases.

That fixes #10833 (offline messages now will be sent as UTF-16) and also a whole bunch of potential problems we can get with charset 0x3. Different clients tend to interpret this charset differently; for instance, the official client always interprets it as LATIN-1, while alternative clients may decode it as some other user-specified 8-bit encoding. On the other hand, ASCII messages (charset 0x0) and UTF-16 messages (charset 0x2) are understood uniformly by all clients.

I also cleaned up the code a little (got rid of code paths that were never executed, flags that were always set, unused struct members, etc.).
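
The selection rule described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the function and constant names are hypothetical, the input is assumed to be well-formed UTF-8, and big-endian byte order is assumed based on the mention of UTF-16BE in comment:1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Charset values as quoted in this ticket; the names are illustrative. */
enum { CHARSET_ASCII = 0x0000, CHARSET_UNICODE = 0x0002 };

/* Returns CHARSET_ASCII if every byte is 7-bit, otherwise CHARSET_UNICODE. */
uint16_t pick_charset(const unsigned char *utf8, size_t n)
{
    assert(utf8 != NULL);
    for (size_t i = 0; i < n; i++)
        if (utf8[i] >= 0x80)
            return CHARSET_UNICODE;
    return CHARSET_ASCII;
}

/*
 * Encodes UTF-8 as UTF-16BE. `out` must hold at least 2 * n bytes.
 * Returns the number of bytes written.
 * Assumes well-formed input; no validation is performed.
 */
size_t utf8_to_utf16be(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;

    while (i < n) {
        unsigned char b = in[i++];
        uint32_t cp;
        size_t extra;

        /* The lead byte tells us how many continuation bytes follow. */
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
        else               { cp = b & 0x07; extra = 3; }

        /* Each continuation byte contributes six more bits. */
        while (extra-- > 0 && i < n)
            cp = (cp << 6) | (in[i++] & 0x3F);

        if (cp >= 0x10000) {
            /* Code points above the BMP become a surrogate pair. */
            cp -= 0x10000;
            uint32_t hi = 0xD800 | (cp >> 10);
            uint32_t lo = 0xDC00 | (cp & 0x3FF);
            out[o++] = (unsigned char)(hi >> 8);
            out[o++] = (unsigned char)(hi & 0xFF);
            out[o++] = (unsigned char)(lo >> 8);
            out[o++] = (unsigned char)(lo & 0xFF);
        } else {
            out[o++] = (unsigned char)(cp >> 8);
            out[o++] = (unsigned char)(cp & 0xFF);
        }
    }
    return o;
}
```

A caller would first call `pick_charset` and only run `utf8_to_utf16be` when the result is `CHARSET_UNICODE`; pure ASCII text can be sent unchanged with charset 0x0000. Because the decision depends only on the message bytes, the same label is produced whether the buddy is online or offline.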

comment:3 Changed 9 years ago by MarkDoliner

  • Keywords ICQ SoC added

This change will most likely be merged into im.pidgin.pidgin after 2.7.3 is released, so it should be released as 2.7.4 (or 2.8.0, if we bump the minor version).

comment:4 Changed 9 years ago by MarkDoliner

  • Milestone set to 2.7.4

comment:5 Changed 8 years ago by Robby

Ticket #8844 has been marked as a duplicate of this ticket.
