I've just got a minor note that everyone who might deal with character encodings (probably most of us) needs to be aware of.
We're basically dealing with three different string encodings (not just two).
1) The system/GTK+ encoding, which is UTF-8
2) The locale encoding
3) The filesystem encoding
Now, though the latter two are often set to the same value, they do not have to be.
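To make the distinction concrete, here is a minimal sketch in Python (not our own code) that reports the three encodings for the running process. The function names and the dictionary keys are made up for illustration; only the standard-library calls are real.

```python
import codecs
import locale
import sys


def encodings_in_play():
    """Return the three encodings; the keys are hypothetical labels."""
    return {
        "internal": "utf-8",  # what GTK+ expects, fixed by convention
        "locale": locale.getpreferredencoding(False),
        "filesystem": sys.getfilesystemencoding(),
    }


def same_codec(a, b):
    """Compare two encoding names after normalizing aliases (UTF-8 vs utf8)."""
    return codecs.lookup(a).name == codecs.lookup(b).name


if __name__ == "__main__":
    enc = encodings_in_play()
    for role, name in enc.items():
        print(f"{role:10s} {name}")
    if not same_codec(enc["locale"], enc["filesystem"]):
        print("locale and filesystem encodings differ on this system")
```

On many systems all three will report UTF-8, which is exactly why bugs from mixing them up tend to go unnoticed.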
I'm going through and cleaning up some file operations, and it looks like a few subtle bugs could be hiding here and there, depending on how well each piece of code understood the encodings and the differences between them.
For the most part, we want the strings we deal with to be UTF-8 as much as possible. When strings come in or go out through calls that might produce or require something other than UTF-8, we should translate at that point. So we end up using UTF-8 everywhere except for some filename I/O.
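As a rough illustration of "translate at the boundary", here is a small Python sketch. The helpers to_utf8 and from_utf8 are hypothetical names, and the Latin-1 filesystem encoding is an assumption chosen only to make the byte differences visible. In GLib-based C code the corresponding calls would be g_filename_to_utf8()/g_filename_from_utf8() for filenames and g_locale_to_utf8()/g_locale_from_utf8() for locale strings.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Convert bytes that arrived in some external encoding to UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def from_utf8(utf8: bytes, target_encoding: str) -> bytes:
    """Convert internal UTF-8 bytes to an external encoding on the way out.

    Raises UnicodeEncodeError if the target encoding cannot represent
    the text, which is better than silently writing a mangled name.
    """
    return utf8.decode("utf-8").encode(target_encoding)


if __name__ == "__main__":
    # Assumption for the demo: the filesystem stores names as Latin-1.
    fs_encoding = "latin-1"

    on_disk = "caf\u00e9.svg".encode(fs_encoding)  # name as read from disk
    internal = to_utf8(on_disk, fs_encoding)       # convert once, on the way in

    # The same name is a different byte sequence in each encoding.
    print(on_disk)
    print(internal)

    # Convert back only at the point where we call into file I/O.
    assert from_utf8(internal, fs_encoding) == on_disk
```

The point of keeping the conversions in two small functions is that everything between them can assume UTF-8 without checking.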
Jon A. Cruz