
On Sun, 2005-01-09 at 15:02, Jon A. Cruz wrote:
> Bryce Harrington wrote:
> > I think you missed the context. This is for code that is not UTF-8.
> Are you *sure*???
> I did scan for context and didn't see anything that seemed to guarantee non-UTF-8.
> Given that RedHat switched the default locale to UTF-8 back in RH 8.0, and Solaris back with Solaris 7 or 8, many base assumptions are now dangerous.
Not dangerous, wrong. And they always were, UTF-8 or no, ever since the advent of multi-byte encodings decades ago. There are only two cases possible in the Inkscape codebase:
1. we're dealing with an internal string, which should always be UTF-8, in which case c+N to advance N chars is hopelessly broken
2. we're dealing with an external string in the current locale's encoding, which may be any single or multibyte encoding (not just UTF-8), in which case c+N to advance N chars is hopelessly broken
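To make the failure concrete, here is a minimal sketch in plain C++ of what goes wrong with c+N on a UTF-8 string. The helper utf8_advance() is a hypothetical name invented for this illustration, and it assumes valid, NUL-terminated UTF-8; in real code glib's g_utf8_offset_to_pointer() does this job.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: advance p by n UTF-8 characters (not bytes).
// Assumes p points at the start of a character in valid,
// NUL-terminated UTF-8. Stops early at the terminating NUL.
const char* utf8_advance(const char* p, std::size_t n) {
    while (n > 0 && *p != '\0') {
        ++p;  // step over the lead byte
        // Skip continuation bytes, which have the bit pattern 10xxxxxx.
        while ((static_cast<unsigned char>(*p) & 0xC0) == 0x80) {
            ++p;
        }
        --n;
    }
    return p;
}

// Self-check contrasting byte arithmetic with character stepping.
void check_utf8_advance() {
    // "naïve": the 'ï' (U+00EF) is encoded as the two bytes 0xC3 0xAF.
    // The literal is split so the hex escape cannot absorb later letters.
    const char* c = "na\xC3\xAF" "ve";

    // c + 3 lands on 0xAF, the second byte of 'ï', not on 'v'.
    assert(static_cast<unsigned char>(c[3]) == 0xAF);

    // Stepping three characters lands on "ve".
    assert(std::strcmp(utf8_advance(c, 3), "ve") == 0);
}
```

The same mismatch between bytes and characters shows up in any multibyte locale encoding, which is why the second case is no safer than the first.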
If you need to do nontrivial string manipulation in the first case, please use the appropriate glib functions (documented at http://developer.gnome.org/doc/API/2.0/glib/glib-Unicode-Manipulation.html), or just use Glib::ustring, which does everything for you.
If you need to do nontrivial string manipulation in the second case, the easiest thing to do is to convert the string from the locale encoding to UTF-8 first. Glib::ustring also handles that automatically in some situations, but check its documentation at http://www.gtkmm.org/gtkmm2/docs/reference/html/classGlib_1_1ustring.html, because it is not always obvious when it converts and when it does not.
The easiest way is to always use Glib::ustring, and to call Glib::locale_to_utf8() and Glib::utf8_to_locale() whenever you need to convert between it and strings in the current locale's encoding.
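As a rough illustration of what "convert to UTF-8 first" involves, here is a sketch in plain C++ for one fixed encoding. It assumes the external string is ISO-8859-1, which is only an assumption made for the example; latin1_to_utf8() is a hypothetical name, not a glib function. Glib::locale_to_utf8() is the call to use in practice, since it handles whatever encoding the current locale actually has.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Glib::locale_to_utf8(), valid ONLY when the
// external string is ISO-8859-1 (an assumption made for this sketch).
// Each Latin-1 byte maps to the code point of the same value, so bytes
// at or above 0x80 become a two-byte UTF-8 sequence.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char b : in) {
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));                  // ASCII, unchanged
        } else {
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));    // lead byte
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));  // continuation byte
        }
    }
    return out;
}

// Self-check: same text, different byte length after conversion.
void check_latin1_to_utf8() {
    // "café" in ISO-8859-1: four bytes, four characters.
    const std::string external = "caf\xE9";
    const std::string internal = latin1_to_utf8(external);

    // In UTF-8 the 'é' (U+00E9) becomes 0xC3 0xA9: five bytes, four characters.
    assert(internal == "caf\xC3\xA9");
    assert(internal.size() == 5);
}
```

Once the string is in UTF-8 it falls under the first case, and the glib functions or Glib::ustring can step through it by character.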
-mental