hey guys,
I'm developing an extension to manage translations (which I do via Launchpad and xml2po), but I'm having trouble with tspans.
The problem seems to be that Inkscape saves multiple values for the x attribute on some (not all) tspan elements. Specifying individual letter placements is death to translations, as the number and size of letters is guaranteed to differ.
Is there any way through the API to strip out these bumbling attributes or, better, to keep them from appearing in the first place?
Thanks for your time.
Best Regards, Martin Owens
On 23-Nov-2012 19:48, Martin Owens wrote:
> hey guys,
> I'm developing an extension to manage translations (which I do via Launchpad and xml2po), but I'm having trouble with tspans.
> The problem seems to be that Inkscape saves multiple values for the x attribute on some (not all) tspan elements. Specifying individual letter placements is death to translations, as the number and size of letters is guaranteed to differ.
> Is there any way through the API to strip out these bumbling attributes or, better, to keep them from appearing in the first place?
Hmm. Well, this may not be what you are after, but...
I have been working on code to reassemble formatted, editable text from component pieces. The idea is that something like this in Inkscape:
(E:bold)(=mc:no special formatting)(2:superscript)
when present in an EMF or PS file, for instance, is represented by 3 separately formatted text strings: {E,=mc,2}
These are currently read back into Inkscape as just those pieces. The result looks exactly like the original, but the pieces are not assembled, so the whole is not editable. My code tries to reassemble the pieces from their positions, font information, etc. and makes <text><tspan> records to match. This work is not done, but the current version does pretty well at figuring out where paragraphs start and end, working out the justification and so forth, and generating editable Inkscape SVG. It works with rotated text, but at present it cannot figure out when the first sentence of a paragraph belongs with the remainder if the first is indented by starting it at an offset (as opposed to by using leading spaces).
For your purposes, would it be sufficient if after reassembly the formatting information was discarded and just the logical information retained? That would give you sentences and paragraphs (super and subscripts would be problematical.)
Regards,
David Mathog mathog@...1176... Manager, Sequence Analysis Facility, Biology Division, Caltech
On Mon, 2012-11-26 at 13:08 -0800, mathog wrote:
> For your purposes, would it be sufficient if after reassembly the formatting information was discarded and just the logical information retained?
Hey mathog, interesting work you have there. A friend of mine has been pulling her hair out because no tool can take a PDF and turn it even roughly into a working document (I did try to explain the situation to her, though).
The problem here is that we know with certainty that these letters are part of the same piece of text. They're in the same tspan and text box. When you look in the SVG you see something like:
<tspan x="0,1,2,3,4" y="0">Hello</tspan>
Which basically means: put the first letter at 0, the second at 1, and so on. These documents were all created natively in Inkscape and don't involve imports (so far).
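Since only some tspans carry these per-letter lists, it may help to find out which ones do before changing anything. Below is a minimal sketch in Python using only the standard library. The function name `tspans_with_per_glyph_x` and the sample document are made up for illustration, and the code assumes the file uses the standard SVG namespace.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def tspans_with_per_glyph_x(svg_source):
    """Return (x_value, text) pairs for tspans whose x holds several coordinates.

    Hypothetical helper for illustration; it is not part of any Inkscape API.
    """
    root = ET.fromstring(svg_source)
    hits = []
    for tspan in root.iter("{%s}tspan" % SVG_NS):
        x = tspan.get("x", "")
        # SVG coordinate lists may be separated by commas and/or whitespace.
        coords = [c for c in re.split(r"[\s,]+", x.strip()) if c]
        if len(coords) > 1:
            hits.append((x, tspan.text or ""))
    return hits


# Made-up sample: one tspan with per-letter positions, one without.
sample = (
    '<svg xmlns="http://www.w3.org/2000/svg"><text>'
    '<tspan x="0,1,2,3,4" y="0">Hello</tspan>'
    '<tspan x="0" y="10">World</tspan>'
    '</text></svg>'
)

print(tspans_with_per_glyph_x(sample))  # [('0,1,2,3,4', 'Hello')]
```

Only the first tspan is reported, because the second has a single x value and so positions the whole run rather than each letter.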
Martin,
On 26-Nov-2012 13:51, Martin Owens wrote:
> The problem here is that we know with certainty that these letters are part of the same piece of text. They're in the same tspan and text box. When you look in the SVG you see something like:
> <tspan x="0,1,2,3,4" y="0">Hello</tspan>
Select the affected text and do:
Text -> Remove Manual Kerns
and the per-letter x="" values will go, at least for everything but the first letter.
Regards,
David Mathog mathog@...1176... Manager, Sequence Analysis Facility, Biology Division, Caltech
On Mon, 2012-11-26 at 14:43 -0800, mathog wrote:
> Select the affected text and do:
> Text -> Remove manual kerns
> and the x="" stuff will go, at least it will for everything but the first letter.
Is there any way to remove that from a whole bunch of SVG files? At least I know the function now :-D
Martin,
On 26-Nov-2012 21:05, Martin Owens wrote:
> On Mon, 2012-11-26 at 14:43 -0800, mathog wrote:
>> Select the affected text and do:
>> Text -> Remove manual kerns
>> and the x="" stuff will go, at least it will for everything but the first letter.
> Is there any way to remove that from a whole bunch of SVG files? At least I know the function now :-D
I don't think the command-line options in Inkscape are going to help (somebody please correct me if I'm wrong). I would probably do it with Perl, but there is also inkscapebatch
http://sourceforge.net/projects/inkscapebatch/
which looks like maybe it could be used for this. (I have never used it.)
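For what it's worth, the same kind of script could be written in Python instead of Perl. The following is only a rough sketch under stated assumptions: it trims every multi-valued x list on a tspan down to its first value, which approximates the visible effect described above but does not claim to reproduce what Inkscape's menu command does internally. The function names `strip_per_glyph_x` and `process_file` are invented for this example. ElementTree rewrites the file, so comments are dropped and any namespace prefix not registered below gets renamed; try it on copies first.

```python
import re
import sys
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Register the prefixes Inkscape files commonly use so they survive a rewrite.
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", "http://www.w3.org/1999/xlink")
ET.register_namespace("inkscape", "http://www.inkscape.org/namespaces/inkscape")
ET.register_namespace("sodipodi", "http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd")


def strip_per_glyph_x(root):
    """Keep only the first x coordinate on each tspan; return how many changed.

    Hypothetical helper, not an Inkscape function.
    """
    changed = 0
    for tspan in root.iter("{%s}tspan" % SVG_NS):
        # Coordinate lists may be separated by commas and/or whitespace.
        coords = [c for c in re.split(r"[\s,]+", tspan.get("x", "").strip()) if c]
        if len(coords) > 1:
            tspan.set("x", coords[0])
            changed += 1
    return changed


def process_file(path):
    """Rewrite one SVG file in place, but only if something was changed."""
    tree = ET.parse(path)
    changed = strip_per_glyph_x(tree.getroot())
    if changed:
        tree.write(path, encoding="utf-8", xml_declaration=True)
    return changed


if __name__ == "__main__":
    # Assumed usage: python strip_x.py one.svg two.svg ...
    for name in sys.argv[1:]:
        print(name, process_file(name))
```

Keeping the first value rather than deleting the attribute leaves the start of each line where it was, which matches the "everything but the first letter" behaviour.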
Regards,
David Mathog mathog@...1176... Manager, Sequence Analysis Facility, Biology Division, Caltech