Unicode character problem #399

Closed
SombraRO opened this issue May 12, 2018 · 20 comments

Comments

@SombraRO

SombraRO commented May 12, 2018

Every message that uses the character ç next to another non-ASCII character comes through garbled.

Using encoding: UTF-8

çã shows as згo
çõ shows as уш

This is a problem with iconv-lite.

This can only be reproduced if the message is sent from IRC to Discord, and the IRC client must not be using UTF-8.

@SombraRO
Author

@Throne3d and @ekmartin Can you confirm?

@Throne3d
Collaborator

Looks to be reproducible. If I install mIRC, and disable the "UTF-8 encode/decode messages" option, and then send çõ and çã, it reproduces. (ça and ço do not, so it presumably needs to be a non-ASCII character following the ç.)

I would expect this not to be a problem with iconv-lite, though, but instead with jschardet, as it's presumably when it's trying to figure out what the encoding of the string is instead of when it's actually performing the change to the string encoding. ashtuchkin/iconv-lite#187 should probably be closed, in favor of opening an issue over at https://github.com/aadsm/jschardet/issues.

@SombraRO
Author

@Throne3d Before, when it was using iconv and node-icu-charset-detector, it worked well. The problem started after the update that was released yesterday.

@Throne3d
Collaborator

@SombraRO Yep, I would expect that. It updated to use iconv-lite and jschardet. iconv-lite only converts between two known encodings, whereas jschardet is the place that tries to figure out what encoding the original message is in. It's more likely that jschardet is simply guessing wrong than that iconv-lite is converting between known encodings incorrectly, I expect.
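To illustrate the difference between detecting and converting, here is a minimal sketch using only Node's built-in `Buffer`. It rests on two assumptions that the thread does not confirm: that the IRC client sent "çã" as Latin-1 bytes, and that the detector guessed windows-1251 (a Cyrillic encoding). `decodeCp1251Letters` is a hypothetical helper that stands in for a real converter and only handles the letter range 0xC0-0xFF.

```javascript
// Assumption: the IRC client sends "çã" as Latin-1 (ISO-8859-1), one byte per character.
const rawBytes = Buffer.from('\u00e7\u00e3', 'latin1'); // bytes 0xE7 0xE3

// Hypothetical helper: decodes only the windows-1251 letter range 0xC0-0xFF,
// which maps linearly onto U+0410..U+044F. Other bytes are passed through as-is.
function decodeCp1251Letters(buf) {
  return Array.from(buf, (b) =>
    String.fromCharCode(b >= 0xc0 ? 0x0410 + (b - 0xc0) : b)
  ).join('');
}

// Same bytes, two different guesses about the source encoding.
const decodedAsLatin1 = rawBytes.toString('latin1'); // 'çã'
const decodedAsCp1251 = decodeCp1251Letters(rawBytes); // 'зг'

console.log(decodedAsLatin1, decodedAsCp1251);
```

In both cases the conversion step does exactly what it was asked to do. Only the guess about the source encoding differs, which is consistent with the detector being the more likely culprit.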

@SombraRO
Author

Thank you, I will wait for confirmation from jschardet.

@SombraRO
Author

@Throne3d If jschardet doesn't respond, is there any possibility of changing the dependency?

I was looking on GitHub and found chardet:

https://github.com/runk/node-chardet

@Mikaela
Contributor

Mikaela commented May 13, 2018

I started seeing this or something similar after updating to https://github.com/reactiflux/discord-irc/releases/tag/v2.6.1, but my users and I are all using UTF-8.

For example when I say Kyllä (Yes in Finnish) at IRC, it comes to Discord as Kyllä. And in longer strings of text, the ä/ö start turning into Chinese or Japanese characters randomly, but not always.

Someone following this said that this is a result of double-UTF-8 even if I am not sure what that means.
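For readers wondering what "double UTF-8" means, here is a small sketch using Node's built-in `Buffer`. It assumes the intermediate misreading is Latin-1; that is an assumption for illustration, and the thread does not establish which step in the bridge, if any, actually does this.

```javascript
const original = 'Kyll\u00e4'; // "Kyllä"

// Step 1: the sender encodes correctly as UTF-8 ("ä" becomes bytes 0xC3 0xA4).
const utf8Bytes = Buffer.from(original, 'utf8');

// Step 2 (assumed): something wrongly treats those bytes as Latin-1,
// so each byte becomes its own character.
const misread = utf8Bytes.toString('latin1'); // "KyllÃ¤"

// Step 3: the misread string is encoded as UTF-8 a second time.
const doubleEncoded = Buffer.from(misread, 'utf8');

console.log(misread, utf8Bytes.length, doubleEncoded.length); // KyllÃ¤ 6 8

// Undoing step 2 recovers the original text.
const repaired = Buffer.from(misread, 'latin1').toString('utf8');
console.log(repaired === original); // true
```

The single character "ä" ends up as two visible characters, which is the typical signature of text that was already UTF-8 being converted to UTF-8 again.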

@redfellow

redfellow commented May 16, 2018

I too started experiencing this today after switching from latest commit in this repo to the npm package. All ä and ö characters from IRC display as ä in Discord. I've got the fresh webhook functionality enabled as well.

If I disable ircOptions.encoding: "utf-8" the characters start to display as .

Update: lines break at random. Around half are OK; the rest are full of ä or other garbled characters.

@JoinedSenses

JoinedSenses commented May 24, 2018

I'm running into this problem as well, and it causes some issues, since I use a game server plugin that relays messages containing UTF-8 characters to an IRC channel.

@Mikaela
Contributor

Mikaela commented May 30, 2018

I managed to work around this by adding "encoding": null to "ircOptions": { }, as I don't need charset recoding: everyone is either using UTF-8, or that is not my problem.
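As a rough sketch of where that setting goes, the fragment below shows only the relevant keys. It is written as a JavaScript object so it can be checked by running it; the same keys apply in a JSON config file. The `config` variable name is illustrative, and every other setting in your existing config is assumed to stay unchanged.

```javascript
// Illustrative fragment only: other keys of an existing discord-irc
// config entry are omitted and should be left as they are.
const config = {
  ircOptions: {
    // null turns off charset detection and conversion for IRC messages,
    // which is only appropriate when everyone on the channel sends UTF-8.
    encoding: null,
  },
};

console.log(JSON.stringify(config, null, 2));
```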

Thanks to:

@Throne3d
Collaborator

Throne3d commented May 30, 2018

It looks like swapping to jschardet was a mistake. I'll look into https://github.com/runk/node-chardet, as @SombraRO suggested, over the next few weeks, and if that doesn't fix it I'll revert to using ICU. (I figure that it's better for it to work properly with the right dependencies, and fail otherwise, than for it to fail weirdly across the board.)

(Edit: Made an issue over at irc-upd: Throne3d/node-irc#59.)

@Throne3d
Collaborator

Throne3d commented Jun 2, 2018

Alright, I should have the issue fixed in the upstream repository. I'd like to check if it works in practice, before trying to get another release out with it in, and so I've updated a fork of discord-irc to use this updated version, over at https://github.com/Throne3d/discord-irc. If someone could try using that version of discord-irc (likely by checking out the repository from source), and reporting whether it resolves the issue, I'd appreciate it – otherwise, I'll probably cut a new release of the upstream library within a week, and then discord-irc should update to that soon afterwards.

@andreas-it-dev

@Throne3d i can confirm that you fixed the issue for me in your upstream version.

thank you!

@andreas-it-dev

But only for a few hours. I kept it running, and after a while it stopped working: special characters from IRC were "screwed" again on the Discord side.

@Throne3d
Collaborator

Throne3d commented Sep 5, 2018

That's really weird! Are they reoccurring in the exact same way as before? Did you see an error in the console when it started to fail again?

@ghost

ghost commented Sep 25, 2018

I'm currently experiencing this issue with accented characters á, é, í, ó and ú which show up as a, e, i, o and u in Discord. Latest version of discord-irc is 2.6.1 at time of writing which we are running.

Is the version at https://github.com/Throne3d/discord-irc newer or older than this reactiflux version? I'm happy to test whichever needs testing.

@JoinedSenses

If you don't use any sort of special encoding, one of the above comments solved my problem:

#399 (comment)

@Throne3d
Collaborator

Throne3d commented Oct 8, 2018

@tenleftfingers Sorry to take so long to get back on this. The version at the time you wrote should actually have been 2.6.2. It is in the code on the master branch; we just didn't add a new tag or release for it. This version (reactiflux, 2.6.2) should currently be more up to date than the code in my fork.

Do you still have trouble if you use 2.6.2?

@ghost

ghost commented Oct 16, 2018

@Throne3d I used Mikaela's workaround with 2.6.1 to solve this issue #399 (comment)

(Sorry too, for my delay!)

@Throne3d
Collaborator

Throne3d commented Aug 4, 2019

Closing this as it should be fixed in recent versions of discord-irc.

Throne3d closed this as completed Aug 4, 2019