
On 29-Aug-2013 17:59, Krzysztof Kosiński wrote:
The cross-platform EMF/WMF support that was merged recently needs some cleanups.
- libunicode-convert. I don't exactly see why this particular set of
functions was separated into a mini-library; I would expect that to be part of libuemf. It is also somewhat misleadingly named. I would name these files e.g. ms-symbol-font-utils.[hc].
EMF, the file format, supports Unicode, and everything that is done in a reasonable manner using Unicode in an EMF file is in libUEMF.
Unfortunately, PowerPoint still has some pre-Unicode behavior left in it. It uses uint8_t values to encode characters in the Symbol, ZapfDingBats, and Wingdings fonts, and stuffs these values into the Unicode Private Use Area that Microsoft uses. Strictly speaking this isn't EMF related, even though it comes in through an EMF file. I am pretty sure that PowerPoint does similar things when it writes to other formats like .cgm or its own .ppt. In any case, Inkscape cannot handle these non-Unicode text strings, so libunicode-convert is used to translate them into and out of Unicode for Inkscape, as best it can. (Some translations are 1 to none, others are 1 to many, or many to 1.)
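To make the translation concrete, here is a minimal sketch of the one-to-one case only. It is not the libunicode-convert code. It assumes that the 8-bit font code b is stored at U+F000 + b (for b in 0x20..0xFF), and the function names and the three-entry Symbol table are hypothetical, chosen just for illustration. The 1-to-none, 1-to-many, and many-to-1 cases are not handled.

```c
#include <stdint.h>

/* Assumption: the 8-bit font code b is stored at U+F000 + b, b in 0x20..0xFF. */
#define MS_PUA_BASE 0xF000u

/* Return the 8-bit font code carried by a PUA code point,
   or 0 if cp is outside the assumed range. */
static uint8_t pua_to_byte(uint32_t cp)
{
    if (cp >= MS_PUA_BASE + 0x20u && cp <= MS_PUA_BASE + 0xFFu) {
        return (uint8_t)(cp - MS_PUA_BASE);
    }
    return 0;
}

/* Illustrative three-entry excerpt of a Symbol-font table.
   A real table would cover the whole font. */
static uint32_t symbol_byte_to_unicode(uint8_t b)
{
    switch (b) {
    case 0x61: return 0x03B1u; /* GREEK SMALL LETTER ALPHA */
    case 0x62: return 0x03B2u; /* GREEK SMALL LETTER BETA */
    case 0x70: return 0x03C0u; /* GREEK SMALL LETTER PI */
    default:   return 0xFFFDu; /* not covered by this sketch */
    }
}

/* Map one code point: PUA-wrapped Symbol codes are translated,
   everything else is passed through unchanged. */
static uint32_t symbol_cp_to_unicode(uint32_t cp)
{
    uint8_t b = pua_to_byte(cp);
    return (b != 0) ? symbol_byte_to_unicode(b) : cp;
}
```

For example, symbol_cp_to_unicode(0xF061) yields 0x03B1 under these assumptions, while an ordinary code point such as 0x0041 comes back untouched.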
I am not wedded to the current name. This is a one-line description of what it does:
Convert between characters encoded in non-Unicode fonts (Symbol, ZapfDingBats, or Wingdings) and the closest Unicode equivalents.
Is this better:
Symbolic-Unicode-NonUnicode-Convert
?
- I think libuemf should be in a separate directory, like libcola,
not tucked away in src/extension/internal.
Agreed, it is where it is primarily as a historical accident.
- The new code should be made to conform to the Inkscape coding
style. In most places it's a matter of running the code through astyle to fix the spacing, but in some cases manual adjustments are needed. For example, the function names in libunicode-convert are utterly non-descriptive ('isNon()' and 'CanUTN()' are not good names for library functions), and the return statements in src/extension/emf-inout.cpp incorrectly use parentheses around return values. http://staging.inkscape.org/en/develop/coding-style/
Will reformat the files that are Inkscape specific. The complete libUEMF and libTERE packages are distributed independently (only some of their files are included with Inkscape). Speaking of libTERE, at some point that capability should be added to the other import drivers. Pretty much all of the graphics file formats have the same formatted string issue that EMF does. See
http://saf.bio.caltech.edu/PPT_G_P_I/#I2P_formatted_strings
http://saf.bio.caltech.edu/PPT_G_P_I/#P2I_formatted_strings
- Some functions, such as msdepua(), have a rather strange mode of
operation, modifying the string in place. The interface of those should be changed so that they are easier to use.
I have a vague recollection of doing it that way because there was an issue related to how data was being passed around through the various levels of the Inkscape output drivers. When I get a chance I will revisit this and see if it can be cleaned up.
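For discussion, one possible shape for a non-mutating interface is sketched below. This is not the existing msdepua() signature. The name depua_copy is hypothetical, and the sketch assumes the operation is simply removing a U+F000 offset from code points in U+F020..U+F0FF in a 0-terminated 32-bit string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical alternative to an in-place conversion: read src, write the
   converted copy to dst (capacity dst_cap code points, terminator included),
   and return the source length in code points, snprintf-style, so the caller
   can detect truncation (return value >= dst_cap). */
static size_t depua_copy(const uint32_t *src, uint32_t *dst, size_t dst_cap)
{
    size_t n = 0;

    assert(src != NULL);
    assert(dst != NULL || dst_cap == 0);

    for (; src[n] != 0; ++n) {
        uint32_t cp = src[n];
        /* Assumption: PUA-wrapped font codes live in U+F020..U+F0FF. */
        if (cp >= 0xF020u && cp <= 0xF0FFu) {
            cp -= 0xF000u;
        }
        if (n + 1 < dst_cap) {
            dst[n] = cp; /* keep one slot free for the terminator */
        }
    }
    if (dst_cap > 0) {
        dst[n < dst_cap ? n : dst_cap - 1] = 0;
    }
    return n;
}
```

The caller keeps ownership of both buffers and the input stays const, which may or may not fit how the output drivers pass data around.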
Unfortunately, I do not expect to have time to do any of this code cleanup for at least a couple of weeks.
Regards,
David Mathog
mathog@...1176...
Manager, Sequence Analysis Facility, Biology Division, Caltech