Topic: Unicode paths

I posted an earlier version of this patch here, but since that discussion is about something that has been completed, I am starting a new topic for it.

The purpose of this patch is to convert the UTF-8 strings Simutrans uses internally to UTF-16 when compiling for Windows, and then call the wide Windows API functions, or the similar Windows-specific extensions of the C and gzip APIs. There are some hacks in the code already, but they do things in a very odd way and only in a few places.
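To make the conversion step concrete, here is a minimal, portable sketch of UTF-8 to UTF-16 re-encoding. It is not taken from the patch: the name `utf8_to_utf16` is hypothetical, and on Windows one would more likely call `MultiByteToWideChar` with `CP_UTF8`. The sketch replaces malformed input with U+FFFD and, as a simplification, does not reject overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not the patch's code): decode UTF-8, re-encode as UTF-16.
// Malformed sequences become U+FFFD; overlong forms are NOT detected.
std::u16string utf8_to_utf16(const std::string &in)
{
	std::u16string out;
	std::size_t i = 0;
	while (i < in.size()) {
		const unsigned char c = static_cast<unsigned char>(in[i]);
		std::uint32_t cp;
		std::size_t extra; // number of continuation bytes expected
		if (c < 0x80)                { cp = c;        extra = 0; }
		else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
		else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
		else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
		else { out.push_back(0xFFFD); ++i; continue; } // stray or invalid lead byte

		if (i + extra >= in.size()) { out.push_back(0xFFFD); break; } // truncated input

		bool ok = true;
		for (std::size_t k = 1; k <= extra; ++k) {
			const unsigned char b = static_cast<unsigned char>(in[i + k]);
			if ((b & 0xC0) != 0x80) { ok = false; break; }
			cp = (cp << 6) | (b & 0x3F);
		}
		if (!ok) { out.push_back(0xFFFD); ++i; continue; } // resync on the next byte
		i += extra + 1;

		// Surrogate code points and values beyond U+10FFFF are not valid scalars.
		if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) cp = 0xFFFD;
		assert(cp <= 0x10FFFF);

		if (cp < 0x10000) {
			out.push_back(static_cast<char16_t>(cp));
		} else {
			cp -= 0x10000; // split into a surrogate pair
			out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
			out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
		}
	}
	return out;
}
```

Characters outside the Basic Multilingual Plane come out as two UTF-16 code units, so the result can be longer than the number of characters in the path.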

I've been running Simutrans with these changes for a year and have never noticed any problems. However, I have only done sporadic testing of running Simutrans in a directory with a non-ASCII name. There might also be recent problems from merging in other changes. And the fact that Simutrans reads Unicode paths correctly does not mean that it can display the names correctly if the font does not contain the glyphs.

Although prissi was against it, the implementation is still located in simio.cc and not simsys.c, because some of the modified code is shared between simutrans and makeobj, and dragging simsys into makeobj just causes problems.

Simutrans works well with Unicode paths. I tested it with Japanese (for instance with a Japanese user name). It works, and Japanese file names work too: they are saved correctly and load fine. However, at least my version of bzip2 cannot open a file with a UTF-16 name and needs to use the short name anyway. And Linux and Mac use UTF-8, as does Simutrans internally for all display actions. So where is the advantage?

Using Windows-specific extensions in the OS-independent part does not sound like a great idea to me if it does not fix a real problem.

The display problem will not be solved by UTF-16, since the standard font does not have the needed characters. Internally everything is UTF-8 already (which would allow even for more characters than UTF-16). Changing to freefont lib will solve the display problem; then you will see the correct name even in another language.

Well, there is some strange code involving short path names and the creation of non-existent files. My code just does things straight. And bzip2 does not open any files in Simutrans; it just operates on files already opened by fopen. From what I can tell, it has no idea what the file name is.

Yes, Linux and Mac use UTF-8. I said this was for Windows only; however, all path handling goes through the new functions to keep the platform-conditional compilation contained in one place (plus simsys_*.cc). Windows uses either "ANSI" or UTF-16 in its APIs. Since Simutrans is all UTF-8 internally, one must convert to one or the other. "ANSI" is deprecated. Most new parts of the API only get a Unicode implementation.
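As an illustration of keeping the platform split in one place, a wrapper along the following lines takes a UTF-8 path everywhere and only converts on Windows. This is a sketch under assumptions, not the patch itself: the name `fopen_utf8` is hypothetical, while `MultiByteToWideChar` and `_wfopen` are the usual Windows calls for this job.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <wchar.h>
#include <windows.h>
#endif

// Hypothetical wrapper (illustrative name): open a file given a UTF-8 path.
// Callers never see the platform difference; returns nullptr on failure.
FILE *fopen_utf8(const char *path, const char *mode)
{
	assert(path != nullptr && mode != nullptr);
#ifdef _WIN32
	// First call asks for the required length in wchar_t units (terminator
	// included because the input length is -1); second call converts.
	auto widen = [](const char *s) -> std::wstring {
		const int len = MultiByteToWideChar(CP_UTF8, 0, s, -1, nullptr, 0);
		if (len <= 0) return std::wstring();
		std::wstring w(static_cast<size_t>(len), L'\0');
		MultiByteToWideChar(CP_UTF8, 0, s, -1, &w[0], len);
		w.resize(static_cast<size_t>(len) - 1); // drop the terminator
		return w;
	};
	const std::wstring wpath = widen(path);
	const std::wstring wmode = widen(mode);
	if (wpath.empty() || wmode.empty()) return nullptr;
	return _wfopen(wpath.c_str(), wmode.c_str());
#else
	// Linux and Mac accept UTF-8 paths as plain byte strings.
	return std::fopen(path, mode);
#endif
}
```

Since the wrapper returns an ordinary `FILE *`, code that works on an already opened file, such as the bzip2 reading mentioned above, would not need to know how the name was encoded.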

gzopen did not read Unicode filenames when I last tested it. It choked on real Japanese characters. Maybe I need to test it again. (It is rarely used nowadays, only for network games, since saving defaults to bzip2. Are you sure it does the correct thing with UTF-8 characters?)

Oh, searchfolder indeed works only for MSVC; the MinGW builds just display garbage names. That is another very longstanding bug in the 102.2.2 release. It must have been there for a long time. The Japanese community should have complained!