
On 23-Nov-2012 19:48, Martin Owens wrote:
hey guys,
I'm developing an extension to manage translations (which I do via Launchpad and xml2po), but I'm having trouble with tspans.
The problem seems to be that Inkscape saves multiple values for the x attribute on some (not all) tspan elements. Specifying the letter placements is death to translations, as the number and size of letters are guaranteed to be different.
Is there any way through the API to strip out these bumbling attributes or, better, to keep them from appearing in the first place?
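For illustration only, here is a minimal sketch of stripping the per-letter positions outside of Inkscape. It is not an Inkscape API. It uses Python's standard xml.etree.ElementTree, and the function name collapse_tspan_x and the sample SVG are made up for this example. It assumes that keeping only the first value of a multi-valued x attribute (the position of the first glyph) is what a translation workflow wants.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def collapse_tspan_x(root):
    """Reduce multi-valued x attributes on tspans to their first value.

    Hypothetical helper: returns the number of tspans that were changed.
    """
    changed = 0
    for tspan in root.iter("{%s}tspan" % SVG_NS):
        x = tspan.get("x")
        if x is None:
            continue
        # SVG allows the list to be separated by commas and/or whitespace.
        values = x.replace(",", " ").split()
        if len(values) > 1:
            tspan.set("x", values[0])  # keep only the start of the run
            changed += 1
    return changed

# Made-up sample: the first tspan carries one x value per letter.
SAMPLE = (
    '<svg xmlns="http://www.w3.org/2000/svg"><text>'
    '<tspan x="10 18.5 27 35" y="20">Text</tspan>'
    '<tspan x="10" y="40">ok</tspan>'
    '</text></svg>'
)

root = ET.fromstring(SAMPLE)
changed_count = collapse_tspan_x(root)
first_tspan = next(root.iter("{%s}tspan" % SVG_NS))
print(changed_count, first_tspan.get("x"))
```

The same loop could be extended to dx, dy and y if those also carry per-letter lists. That is a guess about what else might get in the way, not something reported above.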
Hmm. Well, this may not be what you are after, but...
I have been working on code to reassemble formatted, editable text from component pieces. The idea is that something like this in Inkscape:
(E:bold)(=mc:no special formatting)(2:superscript)
when present in an EMF or PS file, for instance, is represented by 3 separately formatted text strings: {E,=mc,2}
These are currently read back into Inkscape as just those pieces. The result looks exactly like the original, but the pieces are not assembled, so the whole is not editable. My code tries to reassemble the pieces from their positions, font information, etc., and makes <text><tspan> records to match. This work is not done, but the current version does pretty well at figuring out where paragraphs start and end, working out the justification and so forth, and generating editable Inkscape SVG. It works with rotated text, but at present it cannot figure out when the first sentence of a paragraph belongs with the remainder if the first is indented by starting it at an offset (as opposed to by using leading spaces).
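To make the idea concrete, here is a rough sketch of one way such reassembly could work. It is not the code described above. The Fragment type, the baseline tolerance, and the rule "smaller and raised means superscript" are all assumptions chosen for this example, and it ignores rotation, justification and paragraphs entirely.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape

@dataclass
class Fragment:
    """One separately formatted text string (hypothetical structure)."""
    text: str
    x: float
    y: float          # baseline; SVG y grows downward
    size: float       # font size
    bold: bool = False

def group_into_lines(fragments, tolerance=0.6):
    """Group fragments whose baselines are close, then order each line by x."""
    lines = []
    for frag in sorted(fragments, key=lambda f: f.y):
        for line in lines:
            anchor = line[0]
            limit = tolerance * max(frag.size, anchor.size)
            if abs(frag.y - anchor.y) <= limit:
                line.append(frag)
                break
        else:
            lines.append([frag])  # no nearby baseline: start a new line
    return [sorted(line, key=lambda f: f.x) for line in lines]

def line_to_svg(line):
    """Emit one <text> element with a <tspan> per fragment."""
    base = max(line, key=lambda f: f.size)  # largest fragment sets the baseline
    spans = []
    for frag in line:
        styles = []
        if frag.bold:
            styles.append("font-weight:bold")
        if frag.size < base.size and frag.y < base.y:
            # Assumed heuristic: smaller and raised is treated as superscript.
            styles.append("font-size:65%")
            styles.append("baseline-shift:super")
        attr = ' style="%s"' % ";".join(styles) if styles else ""
        spans.append("<tspan%s>%s</tspan>" % (attr, escape(frag.text)))
    return '<text x="%g" y="%g" style="font-size:%gpx">%s</text>' % (
        line[0].x, base.y, base.size, "".join(spans))

pieces = [
    Fragment("2", x=40, y=94, size=8),
    Fragment("E", x=10, y=100, size=12, bold=True),
    Fragment("=mc", x=19, y=100, size=12),
    Fragment("Next line", x=10, y=120, size=12),
]
lines = group_into_lines(pieces)
svg_first = line_to_svg(lines[0])
print(svg_first)
```

Grouping by baseline first and sorting by x afterwards keeps the raised "2" on the same line as "E" and "=mc" while the fragment at y=120 starts a new line.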
For your purposes, would it be sufficient if, after reassembly, the formatting information was discarded and just the logical information retained? That would give you sentences and paragraphs (superscripts and subscripts would be problematic).
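As a sketch of what "just the logical information" might look like in practice, the snippet below flattens a <text> element to plain strings, one per line. It assumes each direct child tspan of <text> is one line, which is how Inkscape typically writes multi-line text. The function name logical_lines and the sample are invented for this example. Note that the superscript comes out as an ordinary "2", which is the problem mentioned above.

```python
import xml.etree.ElementTree as ET

SVG = "{http://www.w3.org/2000/svg}"

def logical_lines(text_elem):
    """Return the plain text of each direct child tspan, formatting dropped."""
    result = []
    for child in text_elem:
        if child.tag == SVG + "tspan":
            # itertext() walks nested tspans, so inner styling is discarded.
            result.append("".join(child.itertext()))
    return result

# Made-up sample with nested formatting inside the first line.
DOC = (
    '<svg xmlns="http://www.w3.org/2000/svg"><text x="10" y="20">'
    '<tspan x="10" y="20"><tspan style="font-weight:bold">E</tspan>=mc'
    '<tspan style="baseline-shift:super">2</tspan></tspan>'
    '<tspan x="10" y="40">Second line</tspan>'
    '</text></svg>'
)

text_elem = ET.fromstring(DOC).find(SVG + "text")
plain = logical_lines(text_elem)
print(plain)
```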
Regards,
David Mathog mathog@...1176... Manager, Sequence Analysis Facility, Biology Division, Caltech