On Mon, 2012-11-26 at 13:08 -0800, mathog wrote:
> For your purposes, would it be sufficient if after reassembly the formatting information was discarded and just the logical information retained?
Hey mathog, interesting work you have there. A friend of mine has been pulling her hair out because no tool can take a PDF and turn it into even a roughly working document (I did try to explain the situation to her, though).
The problem here is that we know with certainty that these letters are part of the same piece of text: they're in the same tspan and text box. When you look in the SVG you see something like:
<tspan x="0,1,2,3,4" y="0">Hello</tspan>
Which basically means: put the first letter at x=0, the second at x=1, and so on. These documents were all created natively in Inkscape and don't involve imports (so far).
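
To make the per-letter positioning concrete, here is a rough Python sketch of how a tspan like the one above could be read back. It only uses the standard library's xml.etree.ElementTree; the function names glyph_positions and logical_text are made up for this example and are not Inkscape's code, and it assumes a bare tspan with no SVG namespace and no nested children.

```python
import xml.etree.ElementTree as ET


def glyph_positions(tspan_xml):
    """Pair each character of a tspan with its explicit x coordinate.

    Hypothetical helper for illustration only.
    """
    tspan = ET.fromstring(tspan_xml)
    text = tspan.text or ""
    # SVG lets the x list be separated by commas and/or whitespace.
    xs = [float(v) for v in tspan.get("x", "").replace(",", " ").split()]
    # zip() stops at the shorter sequence, so characters without an
    # explicit x value are simply left out of the result here.
    return list(zip(text, xs))


def logical_text(tspan_xml):
    """Drop the positioning and keep only the characters."""
    return ET.fromstring(tspan_xml).text or ""


sample = '<tspan x="0,1,2,3,4" y="0">Hello</tspan>'
pairs = glyph_positions(sample)   # one (character, x) pair per letter
plain = logical_text(sample)      # the text with formatting discarded
```

The second function is the "discard the formatting, keep the logical information" case from the quoted question: since the letters already share one tspan, the logical string is just the element's text.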