On Wed, 2012-11-28 at 08:48 -0800, mathog wrote:
There is another problem with some PS files: they drop all the spaces, so that "this is text" becomes the character sequence {t,h,i,s,i,s,t,e,x,t}. The code I'm working on will have an option to try to reinsert the spaces based on the letter spacing.
For your next trick, try to work out where the text starts and ends in its flow and construct a text box to contain it.
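A minimal sketch of the space-reinsertion idea, not the actual code being worked on. It assumes each glyph on a line is available as a hypothetical (char, x_start, width) tuple, sorted left to right, and treats any gap wider than a fraction of the median glyph width as a word break. The name `reinsert_spaces` and the `gap_factor` threshold are illustrative assumptions and would need tuning per font.

```python
from statistics import median


def reinsert_spaces(glyphs, gap_factor=0.3):
    """Rebuild a line of text from positioned glyphs, guessing word breaks.

    glyphs: list of (char, x_start, width) tuples on one baseline,
            sorted left to right (hypothetical input format).
    gap_factor: a gap larger than gap_factor * median glyph width
                is treated as a missing space (assumed heuristic).
    """
    if not glyphs:
        return ""

    # Use the median width as a font-size-independent yardstick.
    typical_width = median(width for _, _, width in glyphs)

    out = [glyphs[0][0]]
    for (_, prev_x, prev_w), (ch, x, _) in zip(glyphs, glyphs[1:]):
        # Horizontal distance between the end of one glyph and the next.
        gap = x - (prev_x + prev_w)
        if gap > gap_factor * typical_width:
            out.append(" ")
        out.append(ch)
    return "".join(out)
```

A fixed threshold like this will misfire on justified or heavily kerned text; clustering the observed gaps into "letter" and "word" groups may be more robust.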
Martin,
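As a rough illustration of the text-box suggestion: a sketch that assumes the glyphs belonging to one flow have already been identified and are given as hypothetical (x, y, width, height) rectangles. It only computes the enclosing rectangle; deciding which glyphs belong to the same flow is the hard part and is not attempted here. The name `text_box` and the `padding` parameter are illustrative.

```python
def text_box(glyph_boxes, padding=0.0):
    """Return (x, y, width, height) of the rectangle enclosing all glyphs.

    glyph_boxes: iterable of (x, y, width, height) tuples, one per glyph
                 (hypothetical input format).
    padding: extra margin added on every side of the result.
    """
    boxes = list(glyph_boxes)
    if not boxes:
        raise ValueError("need at least one glyph box")

    # Extreme edges over all glyph rectangles.
    left = min(x for x, _, _, _ in boxes)
    top = min(y for _, y, _, _ in boxes)
    right = max(x + w for x, _, w, _ in boxes)
    bottom = max(y + h for _, y, _, h in boxes)

    return (left - padding,
            top - padding,
            (right - left) + 2 * padding,
            (bottom - top) + 2 * padding)
```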
Subject: Re: [Inkscape-devel] Translations and r11895 bug in flow text
On Thu, 2012-11-29 at 06:42 -0800, inkscape-devel.neophyte_rep@...2295... wrote:
Perhaps a collaboration with the authors of "Layout-aware text extraction from full-text PDF of scientific articles" <http://code.google.com/p/lapdftext/> would be productive? It is reviewed here: <http://www.scfbm.org/content/7/1/7>.
On Wed, Nov 28, 2012 at 10:41 AM, Martin Owens - doctormo@...400... wrote:
On Wed, 2012-11-28 at 08:48 -0800, mathog wrote:
There is another problem with some PS files: they drop all the spaces, so that "this is text" becomes the character sequence {t,h,i,s,i,s,t,e,x,t}. The code I'm working on will have an option to try to reinsert the spaces based on the letter spacing.
For your next trick, try to work out where the text starts and ends in its flow and construct a text box to contain it.
Martin,