Unicode (UTF-8) support...
 


Started by PPK, 03 September, 2011, 00:36:15


PPK

One of the things I'm missing in PtokaX (and in the whole NMDC protocol) is full Unicode support (using UTF-8 encoding). Many users already use it because many clients support it, but PtokaX doesn't, and it would be nice to get proper support for it.
The advantage, of course, is that everyone will see characters correctly; right now, for example, users with a Central European encoding don't see Russian characters correctly.
The disadvantage is that users without a UTF-8 client won't see special characters correctly (in much the same way as CE users don't see RUS characters correctly now). They will still see standard low ASCII characters correctly (characters 0 - 127 in the ASCII table).
The NMDC protocol is UTF-8 compatible, so it will not break anything (even the original old Neo-Modus hub and client version 1 have no problem with UTF-8 encoded protocol messages; they just don't show special characters correctly).
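
A minimal sketch of the trade-off described above, in Python rather than in PtokaX's own code. Python's codecs stand in for what a client does when it displays chat bytes; the variable names are made up for illustration.

```python
# Illustration only: the same chat bytes seen by two kinds of clients.
text = "Hello Привет"            # low ASCII plus Russian characters

wire = text.encode("utf-8")      # bytes as they would travel in a message

# A client with UTF-8 support decodes the bytes correctly.
utf8_view = wire.decode("utf-8")

# A client that assumes Central European win-1250 shows wrong characters
# for the Russian part, but the low ASCII part (0 - 127) is unchanged.
legacy_view = wire.decode("cp1250", errors="replace")

print(utf8_view)
print(legacy_view)
```

The ASCII prefix survives in both views because UTF-8 encodes characters 0 - 127 as the same single bytes that ASCII uses.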

Please vote on whether that change should happen ::)
"Most of you are familiar with the virtues of a programmer. There are three, of course: laziness, impatience, and hubris." - Larry Wall

Fox_home

I agree, and I also request support for UTF-8 in scripts.

WAJIM


PPK

[Example image]

On top is a smart client with UTF-8 support. It shows UTF-8 text correctly and is smart enough to detect non-UTF-8 strings and show them correctly using the current system locale (in my case Central European, win-1250 encoding).

On the bottom is the PtokaX UDP-Debug Receiver, which serves here as a client without UTF-8 support ::)

Quote from: Fox_home on 03 September, 2011, 04:46:45
I agree, and I also request support for UTF-8 in scripts.
The question is what you mean by UTF-8 support in scripts. You can already use UTF-8 in scripts (as shown in my image), but you need to avoid the BOM (i.e. you can't save scripts as UTF-8 in Windows Notepad, but you can, for example, in PSPad, where it is possible to disable the BOM). The problem with the BOM is (or should be, because I've seen code related to that in the source) fixed in Lua 5.2 :P
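
For scripts that were already saved with a BOM, the three marker bytes (EF BB BF) can be removed from the start of the file. The following is only a sketch in Python, not something PtokaX or Lua ships; `strip_utf8_bom` is a hypothetical helper name.

```python
import codecs


def strip_utf8_bom(script_bytes: bytes) -> bytes:
    """Return the script without a leading UTF-8 BOM (EF BB BF), if present."""
    if script_bytes.startswith(codecs.BOM_UTF8):
        return script_bytes[len(codecs.BOM_UTF8):]
    return script_bytes


# Hypothetical script content, once with and once without the BOM.
without_bom = 'print("ahoj")'.encode("utf-8")
with_bom = codecs.BOM_UTF8 + without_bom

print(strip_utf8_bom(with_bom) == without_bom)
```

A script that has no BOM passes through unchanged, so the helper is safe to run on every file.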

Quote from: WAJIM on 04 September, 2011, 12:10:16
Optional, of course!  :-\
:'(

Fox_home

Quote from: PPK on 04 September, 2011, 19:05:13
The question is what you mean by UTF-8 support in scripts. You can already use UTF-8 in scripts (as shown in my image), but you need to avoid the BOM (i.e. you can't save scripts as UTF-8 in Windows Notepad, but you can, for example, in PSPad, where it is possible to disable the BOM). The problem with the BOM is (or should be, because I've seen code related to that in the source) fixed in Lua 5.2 :P
:'(
thanks

dmvn

It would be interesting to know how much code (an estimate, of course!) would have to be revised to support UTF-8. Basic functions like strlen() work fine with UTF-8 (they just count bytes), so where is the root of all evil in the PX sources?
And what are your plans for this refactoring?
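
One caveat on strlen(): it reports bytes, not characters, so the two counts differ as soon as the text is not plain ASCII. A small Python illustration (an assumption-level sketch, with `len()` on bytes standing in for strlen()):

```python
word = "Привет"                  # 6 Russian characters
encoded = word.encode("utf-8")

byte_length = len(encoded)       # what strlen() would report on the UTF-8 data
char_length = len(word)          # number of characters (code points)

print(byte_length, char_length)

# Cutting the byte string at an arbitrary position can split a character.
try:
    encoded[:5].decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False
```

So code that only copies or compares whole byte strings is fine, while code that truncates or counts characters is where care is needed.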

PPK

In the PtokaX core that should be really simple, because the core handles most text data as data of some size in bytes and doesn't care about encoding. Here, all input data (from users, scripts, language files, settings) will just be checked for being valid UTF-8 and converted if it is not.
In the GUI it is more complicated, as the GUI on Windows doesn't use UTF-8 or ASCII (one byte per character) encoding to display text. So here all text data will be converted from/to the Unicode encoding used by WinAPI (UTF-16).
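
The check-then-convert idea for the core could look roughly like this. It is a Python sketch, not the PtokaX implementation; `ensure_utf8` is a hypothetical name, and win-1250 as the fallback encoding is an assumption taken from the locale mentioned earlier in the thread.

```python
def ensure_utf8(data: bytes, fallback_encoding: str = "cp1250") -> bytes:
    """Return data unchanged if it is valid UTF-8, otherwise convert it.

    fallback_encoding is an assumption: the legacy encoding the input is
    presumed to use when it does not decode as UTF-8.
    """
    try:
        data.decode("utf-8")     # validation only, result is discarded
        return data
    except UnicodeDecodeError:
        text = data.decode(fallback_encoding, errors="replace")
        return text.encode("utf-8")


legacy = b"P\xf8\xedli\x9a"      # the Czech word "Příliš" in win-1250
converted = ensure_utf8(legacy)

print(converted.decode("utf-8"))
```

The check is a heuristic: a legacy string that happens to be valid UTF-8 passes through untouched, and plain ASCII input is never modified.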
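
For the GUI side, the conversion between UTF-8 and the wide strings WinAPI expects can be sketched as below. In C++ on Windows this is typically done with MultiByteToWideChar and WideCharToMultiByte using CP_UTF8; the Python version only illustrates the data transformation, and both function names are hypothetical.

```python
def utf8_to_wide(data: bytes) -> bytes:
    """Convert UTF-8 bytes to UTF-16 little-endian, the layout of Windows wide strings."""
    return data.decode("utf-8").encode("utf-16-le")


def wide_to_utf8(wide: bytes) -> bytes:
    """Convert UTF-16 little-endian bytes back to UTF-8."""
    return wide.decode("utf-16-le").encode("utf-8")


original = "Hub topic: Привет".encode("utf-8")
wide = utf8_to_wide(original)
round_trip = wide_to_utf8(wide)

print(round_trip == original)
```

Valid UTF-8 survives the round trip unchanged, so the core can keep storing UTF-8 and convert only at the GUI boundary.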

I don't know when I'll have time to make a change as big as this one  :(

RPGamer

That's great, you must be a god-level programmer.  :angel:
Is it on Github? How can one contribute to it?
