On 23-Nov-2012 19:48, Martin Owens wrote:
hey guys,
I'm developing an extension to manage translations (which I do via
launchpad and xml2po) but I'm having trouble with tspans.
The problem seems to be that inkscape saves multiple values for the x
attribute for some (not all) tspan sections. Specifying the letter
placements is death to translations as the number and size of letters
is guaranteed to be different.
Is there any api way to strip out these bumbling attributes or better
have them not appear in the first place?
Hmm. Well, this may not be what you are after, but...
I have been working on code to reassemble formatted, editable text from
component pieces. The idea is that something like this in Inkscape:
(E:bold)(=mc:no special formatting)(2:superscript)
when present in an EMF or PS file, for instance, is represented by 3
separately formatted text strings:
{E,=mc,2}
These are currently read back into Inkscape as just those pieces. It
looks exactly like the original, but the pieces are not assembled, so
the whole is not editable. My code tries to reassemble the pieces from
their positions, font information, etc. and makes <text><tspan>
elements to match. This work is not done, but the current version does
pretty well at figuring out where paragraphs start and end, working out
the justification and so forth, and generating editable Inkscape SVG.
It works with rotated text, but at present cannot figure out when the
first sentence of a paragraph belongs with the remainder if the first
is indented by starting it at an offset (as opposed to by using leading
spaces).
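
A minimal sketch of the output side of that reassembly, assuming the
pieces have already been grouped and ordered. This is an illustration
only, not the code described above: the function name assemble_text,
the (string, style) piece format, and the particular style values are
hypothetical, and it uses only Python's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def assemble_text(pieces, x=0.0, y=0.0):
    """Build one <text> element holding one <tspan> per formatted piece.

    pieces: list of (string, style_dict) tuples in reading order
            (hypothetical input format, assumed for this sketch).
    """
    text = ET.Element(f"{{{SVG_NS}}}text", {"x": str(x), "y": str(y)})
    for content, style in pieces:
        tspan = ET.SubElement(text, f"{{{SVG_NS}}}tspan")
        if style:
            # Serialize the style dict as a CSS-like "key:value;key:value" string.
            tspan.set("style", ";".join(f"{k}:{v}" for k, v in style.items()))
        tspan.text = content
    return text


# The three pieces from the example: bold E, plain =mc, superscript 2.
pieces = [
    ("E", {"font-weight": "bold"}),
    ("=mc", {}),
    ("2", {"baseline-shift": "super", "font-size": "65%"}),
]
elem = assemble_text(pieces, 10, 20)
svg_fragment = ET.tostring(elem, encoding="unicode")
```

Because only the parent <text> carries a position, the resulting string
stays editable as a whole; the hard part, deciding which pieces belong
together, is not shown here.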
For your purposes, would it be sufficient if after reassembly the
formatting information was discarded and just the logical
information retained? That would give you sentences and paragraphs
(super and subscripts would be problematical.)
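
On the narrower question of stripping the per-letter x values, a
minimal sketch of post-processing the saved SVG directly, using only
Python's standard xml.etree.ElementTree rather than any Inkscape API.
The function name collapse_glyph_positions is hypothetical, and keeping
the first value (so each line still starts in the right place) is an
assumption about what the translation workflow needs.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def collapse_glyph_positions(root, attrs=("x",)):
    """Reduce multi-valued position attributes on <tspan> to their first value.

    Returns the number of attributes that were changed.
    """
    changed = 0
    for tspan in root.iter(f"{{{SVG_NS}}}tspan"):
        for name in attrs:
            value = tspan.get(name)
            if value is None:
                continue
            # SVG allows the list to be separated by whitespace and/or commas.
            parts = value.replace(",", " ").split()
            if len(parts) > 1:
                tspan.set(name, parts[0])
                changed += 1
    return changed


sample = (
    '<svg xmlns="http://www.w3.org/2000/svg"><text>'
    '<tspan x="10 18.5 27" y="20">Hello</tspan>'
    '<tspan x="10" y="40">World</tspan>'
    "</text></svg>"
)
root = ET.fromstring(sample)
count = collapse_glyph_positions(root)
tspans = list(root.iter(f"{{{SVG_NS}}}tspan"))
```

Passing attrs=("x", "dx") would treat dx lists the same way; whether
that is wanted depends on how the text was kerned in the first place.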
Regards,
David Mathog
mathog@...1176...
Manager, Sequence Analysis Facility, Biology Division, Caltech