On 29-Aug-2013 17:59, Krzysztof Kosiński wrote:
The cross-platform EMF/WMF support that was merged recently needs
some cleanups.
1. libunicode-convert. I don't exactly see why this particular set of
functions was separated into a mini-library; I would expect that to
be
part of libuemf. It is also somewhat misleadingly named. I would name
these files e.g. ms-symbol-font-utils.[hc].
EMF, the file format, supports Unicode and everything that is done in a
reasonable manner using Unicode in an EMF file is in libUEMF.
Unfortunately, PowerPoint has some preUnicode left in it. It uses
uint8_t values to code characters in symbol, ZapfDingBats, and
WingDings,
with these values stuffed into the Unicode Private Use area that
Microsoft uses.
Strictly speaking this isn't EMF related, even though it comes in
through an
EMF file. I am pretty sure that PowerPoint is doing similar things
when it writes
to other formats like .cgm or its own .ppt. In any case, Inkscape
cannot handle
these nonUnicode text strings, so libunicode-convert is used to
translate them
into/out of Unicode for Inkscape, as best it can. (Some translations
are 1 to none,
others are 1 to many, or many to 1.)
I am not wedded to the current name. This is a one line description of
what it does:
Convert between characters encoded in nonUnicode fonts (in Symbol,
ZapfDingBats, or Wingdings fonts)
and the closest Unicode equivalent.
Is this better:
Symbolic-Unicode-NonUnicode-Convert
?
2. I think libuemf should be in a separate directory, like libcola,
not tucked away in src/extension/internal.
Agreed, it is where it is primarily as a historical accident.
3. The new code should be made to conform to the Inkscape coding
style. In most places it's a matter of running the code through
astyle
to fix the spacing, but in some cases manual adjustments are needed.
For example, the function names in libunicode-convert are utterly
non-descriptive ('isNon()' and 'CanUTN()' are not good names for
library functions), and the return statements in
src/extension/emf-inout.cpp incorrectly use parentheses around return
values.
http://staging.inkscape.org/en/develop/coding-style/
Will reformat the files that are Inkscape specific. The complete
libUEMF
and libTERE packages are distributed independently (only some of their
files
are included with Inkscape). Speaking of libTERE, at some point that
capability
should be added to the other import drivers. Pretty much all of the
graphics file
formats have the same formatted string issue that EMF does. See
http://saf.bio.caltech.edu/PPT_G_P_I/#I2P_formatted_strings
http://saf.bio.caltech.edu/PPT_G_P_I/#P2I_formatted_strings
4. Some functions, such as msdepua(), have a rather strange mode of
operation, modifying the string in place. The interface of those
should be changed so that they are easier to use.
I have a vague recollection of doing it that way because there was an
issue related to how data
was being passed around through the various levels of the Inkscape
output drivers. When I get
a chance I will revisit this and see if it can be cleaned up.
Unfortunately, I do not expect to have time to do any of this code
clean up for at least a couple of weeks.
Regards,
David Mathog
mathog@...1176...
Manager, Sequence Analysis Facility, Biology Division, Caltech