Alphabetical order internationalization
If I want to sort a list of strings in alphabetical order, obviously that is dependent on the alphabet being used. How do I find out the order of the alphabet being used? Or, better yet, is there a function to rank two strings based on that alphabet?
I don't see anything to do this. The standard functions just seem to use ASCII numbering, which of course doesn't really work for any language besides English. What happens when you sort by name in Nautilus? Is this a known problem that I'm just not aware of?
--Ted
On Dec 5, 2005, at 11:37 PM, Ted Gould wrote:
If I want to sort a list of strings in alphabetical order, obviously that is dependent on the alphabet being used. How do I find out the order of the alphabet being used? Or, better yet, is there a function to rank two strings based on that alphabet?
Check strcoll()
That and qsort() might make you happy.
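A minimal sketch of the strcoll() plus qsort() combination suggested above, assuming the strings are held in a plain array of C strings. The names compare_by_locale and sort_names are hypothetical, chosen only for illustration.

```cpp
#include <cstdlib>
#include <cstring>

// qsort hands the callback pointers to the array elements, and each
// element here is itself a `const char*`, hence the double indirection.
int compare_by_locale(const void* a, const void* b)
{
    const char* lhs = *static_cast<const char* const*>(a);
    const char* rhs = *static_cast<const char* const*>(b);
    // strcoll compares according to the current LC_COLLATE locale.
    return std::strcoll(lhs, rhs);
}

// Hypothetical helper: sorts an array of C strings in place.
void sort_names(const char** names, std::size_t count)
{
    std::qsort(names, count, sizeof names[0], compare_by_locale);
}
```

strcoll() only follows the user's language once the program has called std::setlocale(LC_COLLATE, "") (from <clocale>) at startup; until then the default "C" locale applies and the order is the same as strcmp().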
On Tue, Dec 06, 2005 at 08:37:44AM -0800, Jon A. Cruz wrote:
Check strcoll()
That and qsort() might make you happy.
(I'd suggest std::sort from <algorithm> instead of C's qsort: it's both more type-safe and, incidentally, faster.)
I'd suggest using g_utf8_collate_key and sorting based on the keys rather than sorting using strcoll (or g_utf8_collate) directly, for two reasons:
(i) The documentation of g_utf8_collate recommends using g_utf8_collate_key when sorting a significant number of strings. (It cites speed reasons: n collation calculations instead of n log n.)
(ii) We can be certain that strcmp (on g_utf8_collate_key values) satisfies the strict weak ordering conditions that sorting requires (http://www.sgi.com/tech/stl/StrictWeakOrdering.html), whereas I wouldn't be so confident that all implementations of strcoll do: in my experience, it's very common for custom comparison functions not to satisfy them.
pjrm.
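The key-based approach Peter describes can be sketched with the standard library alone. This is an assumption-laden stand-in: it swaps in std::strxfrm, the C library counterpart of strcoll, where the thread recommends g_utf8_collate_key, so that the example needs no GLib. The names collation_key and sort_by_keys are hypothetical.

```cpp
#include <algorithm>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Builds a key such that strcmp on two keys orders them the way
// strcoll orders the original strings in the current locale.
std::string collation_key(const std::string& s)
{
    // With a null buffer and size 0, strxfrm only reports the length needed.
    std::size_t len = std::strxfrm(nullptr, s.c_str(), 0);
    std::string key(len + 1, '\0');
    std::strxfrm(&key[0], s.c_str(), key.size());
    key.resize(len);
    return key;
}

// Hypothetical helper: sorts names by precomputed collation keys.
void sort_by_keys(std::vector<std::string>& names)
{
    typedef std::pair<std::string, std::string> Keyed;  // (key, original)

    // Each key is computed once: n transformations in total, instead of
    // one collation per comparison made by the sort.
    std::vector<Keyed> keyed;
    keyed.reserve(names.size());
    for (const std::string& name : names)
        keyed.emplace_back(collation_key(name), name);

    // Plain strcmp on the keys is the comparison used for sorting.
    std::sort(keyed.begin(), keyed.end(),
              [](const Keyed& a, const Keyed& b) {
                  return std::strcmp(a.first.c_str(), b.first.c_str()) < 0;
              });

    for (std::size_t i = 0; i < names.size(); ++i)
        names[i] = std::move(keyed[i].second);
}
```

With GLib available, the same structure should carry over by building the keys with g_utf8_collate_key instead of strxfrm.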
Thanks guys! I knew I couldn't be the first one doing this :)
I'll be inserting one string into a list of already sorted strings, so the keying and the choice of algorithm probably won't affect things that much. Also, we only have about 20 effects right now.
It seems that Glib::ustring does this by default in its implementation of the comparison functions. I think I'll probably go with that first, and if speed becomes an issue, optimize.
Thanks again, Ted
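For the case Ted describes, adding one string to a list that is already sorted, a binary search for the insertion point is enough. The sketch below is an illustration only: it assumes std::string with std::strcoll rather than the Glib::ustring comparison mentioned in the thread, and insert_sorted is a hypothetical name.

```cpp
#include <algorithm>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: inserts `name` into a vector that is already
// sorted in locale order, keeping it sorted.
void insert_sorted(std::vector<std::string>& sorted_names,
                   const std::string& name)
{
    // upper_bound finds the first element that collates after `name`,
    // so an equal entry already in the list stays ahead of the new one.
    std::vector<std::string>::iterator pos = std::upper_bound(
        sorted_names.begin(), sorted_names.end(), name,
        [](const std::string& a, const std::string& b) {
            return std::strcoll(a.c_str(), b.c_str()) < 0;
        });
    sorted_names.insert(pos, name);
}
```

The search takes O(log n) comparisons, so precomputed keys would save very little on a list of about 20 entries.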
participants (3)
- Jon A. Cruz
- Peter Moulder
- Ted Gould