Preserving shapes better when deleting nodes
Hi,
I've been using Inkscape quite a while now for tracing cartoon-like images. The paths usually end up having way too many nodes which is a problem for the vinyl cutter I use to plot these images.
I tried using the simplify path tool but it makes the path usually too simple and changes the overall appearance.
Therefore I use the path tool (F2) to select unneeded nodes and delete them, but this leads to curves which poorly resemble the original curve, so I have to adjust the curve by hand which consumes a lot of time since I have to clone the path first so I can see the original shape and adjust the simplified curve to fit it better.
I made some examples, see attachment demonstration.svgz.
I also formulated a target function for minimizing the distance between two curves and calculated the partial derivatives of this function. I'm pretty confident I will be able to use a gradient based method for finding the two control points which lead to minimal distance between the original curve and the new one.
Thoughts?
Sincerely Yours Alexander
could you indicate how the original image was produced (the one with 1393 nodes) ? are you using something like Trace Bitmap?
Alvin
-- View this message in context: http://inkscape.13.x6.nabble.com/Preserving-shapes-better-when-deleting-node... Sent from the Inkscape - Dev mailing list archive at Nabble.com.
On 11/04/13 05:37, alvinpenner wrote:
could you indicate how the original image was produced (the one with 1393 nodes) ?
I used trace bitmap with some brightness cutoff value I forgot in this image:
http://izaak.jellinek.com/tuxes/images/big%20tux.png
Alexander
in Trace Bitmap, in the Options tab, there are parameters called Threshold and Tolerance which apparently can be used for smoothing. Have you tried increasing these parameters?
Alvin
-- View this message in context: http://inkscape.13.x6.nabble.com/Preserving-shapes-better-when-deleting-node... Sent from the Inkscape - Dev mailing list archive at Nabble.com.
IMHO Alvin suggestions are correct but are moving the problem away so I judge them a bit off-topic. There are many cases where you start from an unecessarily complex path which you need to simplify and working on the source is not always possible. Personally I'd prefer to see the proposal from Alexander realized: find the control points that minimize the distance from the original path. It would be very useful for my workflow too: I often have to do the same, i.e. keeping the reference in a bottom locked layer and readjusting the (usually very wrong) simplified semi-transparent lines. It's a very time-consuming process.
-- View this message in context: http://inkscape.13.x6.nabble.com/Preserving-shapes-better-when-deleting-node... Sent from the Inkscape - Dev mailing list archive at Nabble.com.
On 11-04-13 02:11, Alexander Brock wrote:
... I also formulated a target function for minimizing the distance between two curves and calculated the partial derivatives of this function. I'm pretty confident I will be able to use a gradient based method for finding the two control points which lead to minimal distance between the original curve and the new one.
Thoughts?
I see some issues with your maths, in particular that you seem to suggest integrating by arc-length, while the curves generally will not have the same length. In general, it is not that easy to get a natural correspondence between two parametric curves. However, don't let that stop you from experimenting. Feel free to come up with a proposal for how to do this, with some examples of the effect. (A patch would obviously be welcome if your suggestion is an improvement.)
One thing you might try to avoid needing a correspondence between positions on the curves is to look at the area between the curves, and see if you can find a decent approach of minimizing that. There should also be a fairly large amount of literature on approximating a sequence of points by a spline.
BTW, I would double-check what functionality lib2geom already has.
On 11/04/13 14:03, Jasper van de Gronde wrote:
I see some issues with your maths, in particular that you seem to suggest integrating by arc-length, while the curves generally will not have the same length. In general, it is not that easy to get a natural correspondence between two parametric curves.
I want to address that issue by choosing equidistant integration points on the original curve and then find for each of them the parameter t for which the distance between the new curve and the chosen point on the original curve becomes minimal. The latter unfortunately needs to be calculated in every iteration but I think the set of parameters (t_1, ... t_n) created by the iteration before will be a good initial guess and it wont change much near the optimal point.
However, don't let that stop you from experimenting. Feel free to come up with a proposal for how to do this, with some examples of the effect. (A patch would obviously be welcome if your suggestion is an improvement.)
I found the lib2geom files but I didn't find the code which handles deleting nodes, can somebody please tell me where to look for that?
One thing you might try to avoid needing a correspondence between positions on the curves is to look at the area between the curves, and see if you can find a decent approach of minimizing that.
That's interesting, I will check if I can calculate derivatives for that.
There should also be a fairly large amount of literature on approximating a sequence of points by a spline.
I found some interesting articles and I will read them later:
http://jimherold.com/2012/04/20/least-squares-bezier-fit/ http://iut-arles.univ-provence.fr/web/romain-raffin/sites/romain-raffin/IMG/...
Alexander
On 11/04/13 21:32, Alexander Brock wrote:
I found the lib2geom files but I didn't find the code which handles deleting nodes, can somebody please tell me where to look for that?
I found it: src/ui/tool/path-manipulator.cpp
Line 644 points to the problem // TODO the fitting algorithm sucks - rewrite it to be awesome
I will try :-)
Alexander
On 11-04-13 21:32, Alexander Brock wrote:
On 11/04/13 14:03, Jasper van de Gronde wrote:
... However, don't let that stop you from experimenting. Feel free to come up with a proposal for how to do this, with some examples of the effect. (A patch would obviously be welcome if your suggestion is an improvement.)
I found the lib2geom files but I didn't find the code which handles deleting nodes, can somebody please tell me where to look for that?
lib2geom does not handle it itself, but it provides the basic tools used for manipulating paths internally. This includes some "not so basic" stuff that might be useful to you. Most of it is based on representing paths using an alternative basis that is slightly easier to work with than the Bézier basis, but this is mostly transparent, as all sorts of tools exist to manipulate this representation (including computing derivates and such).
The actual deleteNodes method seems to be the one in src/ui/tool/path-manipulator.cpp.
On 11/04/13 02:11, Alexander Brock wrote:
I also formulated a target function for minimizing the distance between two curves and calculated the partial derivatives of this function. I'm pretty confident I will be able to use a gradient based method for finding the two control points which lead to minimal distance between the original curve and the new one.
I have implemented a simple gradient based method and I'm very pleased with (most of) the results. I have discovered a few problems and I have questions:
1. The original fitting method fits the curve to a set of points obtained by taking 10 samples from each segment. This is a problem if the segments have very different lengths, the influence of very small parts of the original shape is way too big. I could handle this by multiplying each term by the length of the corresponding segment but I think it's better to get a set of evenly distributed points in the first place.
See Path 1 in the attached svg file for an example of a bad fit due to different segment lengths
2. In most tests the method stops at some point because no improvement occurs no matter how small the step size is chosen allthough the fit isn't perfect yet. See path 2/3 for examples. Currently the algorithm just stops but I'd like to continue somehow. I could choose a random search direction and try to find a better solution or maybe just take random samples from a small neighbourhood of the current set of parameters.
3. make -j3 takes > 2m to finish even when I only change one line in bezier-utils.cpp on a Core2Duo, 2,4GHz, is that normal? make install then takes >1m. I use ccache, g++ and turned optimization off.
4. Where can I find examples for automated tests?
5. I Attached the result of bzr diff >> path-optimization.patch for those interested in playing with the code. I don't think it's suitable for production yet.
Alexander
On 13/04/13 00:42, Alexander Brock wrote:
- The original fitting method fits the curve to a set of points
obtained by taking 10 samples from each segment. This is a problem if the segments have very different lengths, the influence of very small parts of the original shape is way too big. I could handle this by multiplying each term by the length of the corresponding segment
Turns out this doesn't help, see attached example. I will try and get a evenly distributed set of points on the original curve.
Alexander
On 13-04-13 00:42, Alexander Brock wrote:
... 2. In most tests the method stops at some point because no improvement occurs no matter how small the step size is chosen allthough the fit isn't perfect yet. See path 2/3 for examples. Currently the algorithm just stops but I'd like to continue somehow. I could choose a random search direction and try to find a better solution or maybe just take random samples from a small neighbourhood of the current set of parameters.
If I understand correctly you roughly do the following: 1) Find points on original curve(s). 2) Estimate new curve. 3) Find correspondence between points on the original curve and the new curve. 4) Move new curve closer to original curve. 5) Repeat 3 and 4 until convergence.
This is not an uncommon set up (lookup EM optimization for example, there is also a less statistically-minded method like that, but I can't remember the name right now), but can be tricky to get right. In particular, if you consider both the the correspondence you compute and the "fit", then I doubt both steps improve (or maintain) both, and this might give you some trouble.
- make -j3 takes > 2m to finish even when I only change one line in
bezier-utils.cpp on a Core2Duo, 2,4GHz, is that normal? make install then takes >1m. I use ccache, g++ and turned optimization off.
More or less, linking often takes most time for these small compiles.
- Where can I find examples for automated tests?
Is this what you're aiming at? http://wiki.inkscape.org/wiki/index.php/Testing#Developers
On 15/04/13 10:16, Jasper van de Gronde wrote:
If I understand correctly you roughly do the following:
- Find points on original curve(s).
- Estimate new curve.
- Find correspondence between points on the original curve and the new
curve. 4) Move new curve closer to original curve. 5) Repeat 3 and 4 until convergence.
Exactly.
- make -j3 takes > 2m to finish even when I only change one line in
bezier-utils.cpp on a Core2Duo, 2,4GHz, is that normal? make install then takes >1m. I use ccache, g++ and turned optimization off.
More or less, linking often takes most time for these small compiles.
I'd like to compile only the parts which are necessary for testing the algorithm. I will try to export sets of parameters given to the optimization function and have a smaller executable without the gui stuff for faster development.
I think I found a serious problem with the current code. _deleteStretch in src/ui/tool/path-manipulator.cpp puts 10 points from each bezier segment of the original curve into bezier_data but it seems to do a very poor job. I patched path-manipulator.cpp to write the points into "/tmp/inkscape-logfile" and plotted it using gnuplot, the points taken from the second segment don't lie on the original curve at all.
Can somebody verify that the error is not in my code?
Alexander
I forgot to attach the drawing I used in the test.
I started with the black line, added a point by double clicking to get the red line and deleted the new point to get the green line.
Alexander
On 15/04/13 23:28, Alexander Brock wrote:
I think I found a serious problem with the current code. _deleteStretch in src/ui/tool/path-manipulator.cpp puts 10 points from each bezier segment of the original curve into bezier_data but it seems to do a very poor job. I patched path-manipulator.cpp to write the points into "/tmp/inkscape-logfile" and plotted it using gnuplot, the points taken from the second segment don't lie on the original curve at all.
I think I figured this out. I made a simple curve with two segments, looked at the control points of the segments and found that the order was wrong:
Line 635 in path-manipulator.cpp:
Geom::CubicBezier bc(*cur, *cur->front(), *cur.next(), *cur.next()->back());
The third and the fourth argument should be switched:
Geom::CubicBezier bc(*cur, *cur->front(), *cur.next()->back(), *cur.next());
Alexander
2013/4/16 Alexander Brock <a.brock@...2965...>:
I think I figured this out. I made a simple curve with two segments, looked at the control points of the segments and found that the order was wrong:
Line 635 in path-manipulator.cpp:
Geom::CubicBezier bc(*cur, *cur->front(), *cur.next(), *cur.next()->back());
The third and the fourth argument should be switched:
Geom::CubicBezier bc(*cur, *cur->front(), *cur.next()->back(), *cur.next());
Alexander
You are entirely correct. I don't know what I thought when I wrote this.
By the way - some things I wrote, such as caching the number of selected nodes in the path _num_selected, are a big mistake (pointless premature optimization) and should be removed. I know there is at least one bug caused by not updating _num_selected correctly which leads to an infinite loop.
Regards, Krzysztof
On 16/04/13 00:59, Krzysztof Kosiński wrote:
You are entirely correct. I don't know what I thought when I wrote this.
It happens to everyone. I just figured out that I mixed up Bernstein basis polynomials when calculating the gradient. Now I get convergence in more cases than before and it's 4 times faster.
By the way - some things I wrote, such as caching the number of selected nodes in the path _num_selected, are a big mistake (pointless premature optimization) and should be removed. I know there is at least one bug caused by not updating _num_selected correctly which leads to an infinite loop.
I will have a look at this.
I'm now wondering how I should address performance. I want the whole process to finish within 0.1s because that's what seems to be considered a instant reaction by most humans. I don't like the idea of stopping after 0.1s passed or convergence whatever happens first because it would lead to worse fits on slower machines / non-deterministic behaviour on machines under heavy load. But I also don't want users on slow machines having to wait a second for the result of a simple node deletion. Thoughts?
Alexander
On Fri, Apr 19, 2013 at 2:02 AM, Alexander Brock <a.brock@...2965...> wrote:
I'm now wondering how I should address performance. I want the whole process to finish within 0.1s because that's what seems to be considered a instant reaction by most humans. I don't like the idea of stopping after 0.1s passed or convergence whatever happens first because it would lead to worse fits on slower machines / non-deterministic behaviour on machines under heavy load. But I also don't want users on slow machines having to wait a second for the result of a simple node deletion. Thoughts?
My main thought is that, I'm personally more interested in more correct behavior over performance. Are things any slower? Is there any noticeable difference in performance? If it's not slower, it's already an improvement then.
Cheers, Josh
2013/4/19 Alexander Brock <a.brock@...2965...>:
I'm now wondering how I should address performance. I want the whole process to finish within 0.1s because that's what seems to be considered a instant reaction by most humans. I don't like the idea of stopping after 0.1s passed or convergence whatever happens first because it would lead to worse fits on slower machines / non-deterministic behaviour on machines under heavy load. But I also don't want users on slow machines having to wait a second for the result of a simple node deletion. Thoughts?
My opinion is mostly the same as Josh's. Aim for the completely correct behavior first, then look for simplifications if there are performance problems with the completely correct solution.
Regards, Krzysztof
On Tue, 2013-04-16 at 00:51 +0200, Alexander Brock wrote:
The third and the fourth argument should be switched
Nice work Alexander! is there any way to turn this into a test so we can detect such a problem in the future?
I've been watching your adventure with the deletion of nodes, I oft been bitten by massive inaccuracies which gnash at my workflow. I'm keen to test the patch, although I don't have a local build infrastructure atm and inkscape is non-trivial to build.
Best Regards, Martin Owens
On 16/04/13 01:09, Martin Owens wrote:
On Tue, 2013-04-16 at 00:51 +0200, Alexander Brock wrote:
The third and the fourth argument should be switched
Nice work Alexander! is there any way to turn this into a test so we can detect such a problem in the future?
You could test if the created points lie on the original curve, but I'd rather test if deleting previously inserted, unneccessary nodes leads to a set of control points within a sufficiently small neighbourhood of the original points.
I've been watching your adventure with the deletion of nodes, I oft been bitten by massive inaccuracies which gnash at my workflow. I'm keen to test the patch, although I don't have a local build infrastructure atm and inkscape is non-trivial to build.
Which platform are you running? I'm running debian testing and building worked without any problems.
Alexander
On Tue, 2013-04-16 at 01:49 +0200, Alexander Brock wrote:
I'm running debian testing and building worked without any problems.
This is Ubuntu 13.04, Inkscape builds fine, it's just non-trivial because it takes about 3 days to compile fully on this computer's CPU. Although I don't have to deal with the hard drive space problems I used to which hurt inkscape devel too.
I was dreaming the other day of how hard it would be to have compiled binaries for each of the unlinked parts for commits somewhere. Such that a developer could run a special make command to download all those relevant binaries and have them for the next step. They'd obviously need to be date/time stamped correctly to skip correctly, not sure how hard such a thing would be to set up.
Martin,
I'd like to have a very small exxecutable which only executes a number of little tests with the algorithm I wrote, so I can play with it and fine-tune it faster, without having to compile and link the complete inkscape every time. I placed the attached bezier-tests.cpp in the src directory and tried to compile it:
g++ -Wall -pedantic -I2geom -Ibind -Idebug -Idialogs -Idisplay -Idom -Iextension -Ifilters -Ihelper -Iio -Ilibavoid -Ilibcola -Ilibcroco -Ilibgdl -Ilibnrtype -Ilibvpsc -Ilivarot -Ilive_effects -Ipixmaps -Isvg -Itrace -Iui -Iutil -Iwidgets -Ixml -I. bezier-tests.cpp
But it tells me that there is a file missing:
./2geom/point.h:39:20: fatal error: config.h: No such file or directory
Can somebody tell me how to set it up properly? I'd like to have something like a make target which only compiles the stuff neccessary for using code from bezier-utils.cpp
Alexander
Hey Alexander,
You probably need our geometry library (or parts of it) as well which you can find out a little more at http://lib2geom.sourceforge.net/
I'm not sure if what you're working on should actually be implemented upstream in lib2geom or directly in Inkscape as you're looking to do, perhaps someone with more insight could chime in.
Cheers, Josh
On Wed, Apr 17, 2013 at 1:58 PM, Alexander Brock <a.brock@...2965...> wrote:
I'd like to have a very small exxecutable which only executes a number of little tests with the algorithm I wrote, so I can play with it and fine-tune it faster, without having to compile and link the complete inkscape every time. I placed the attached bezier-tests.cpp in the src directory and tried to compile it:
g++ -Wall -pedantic -I2geom -Ibind -Idebug -Idialogs -Idisplay -Idom -Iextension -Ifilters -Ihelper -Iio -Ilibavoid -Ilibcola -Ilibcroco -Ilibgdl -Ilibnrtype -Ilibvpsc -Ilivarot -Ilive_effects -Ipixmaps -Isvg -Itrace -Iui -Iutil -Iwidgets -Ixml -I. bezier-tests.cpp
But it tells me that there is a file missing:
./2geom/point.h:39:20: fatal error: config.h: No such file or directory
Can somebody tell me how to set it up properly? I'd like to have something like a make target which only compiles the stuff neccessary for using code from bezier-utils.cpp
Alexander
Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis & visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter _______________________________________________ Inkscape-devel mailing list Inkscape-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/inkscape-devel
2013/4/17 Alexander Brock <a.brock@...2965...>:
I'd like to have a very small exxecutable which only executes a number of little tests with the algorithm I wrote, so I can play with it and fine-tune it faster, without having to compile and link the complete inkscape every time. I placed the attached bezier-tests.cpp in the src directory and tried to compile it:
g++ -Wall -pedantic -I2geom -Ibind -Idebug -Idialogs -Idisplay -Idom -Iextension -Ifilters -Ihelper -Iio -Ilibavoid -Ilibcola -Ilibcroco -Ilibgdl -Ilibnrtype -Ilibvpsc -Ilivarot -Ilive_effects -Ipixmaps -Isvg -Itrace -Iui -Iutil -Iwidgets -Ixml -I. bezier-tests.cpp
Direct compiling is very hard to get right in this way. For geometry-related work you only really need lib2geom. I would try checking out the source of lib2geom ( https://launchpad.net/lib2geom ), compiling it, then using this:
g++ -Wall bezier-tests.cpp -l2geom -o bezier-tests
You may need to add -I and -L flags which point to directories with lib2geom's config.h file and the lib2geom.a file, respectively.
This library is not yet API-stable, so you can make big changes as long as you then change Inkscape to be compatible. Once you have something interesting to submit, I can give you commit access. The objects: Point, Rect, Interval, Affine, Zoom, Scale, Translate, HShear, VShear, and those derived from Curve are considered "mostly done" so you should model any new API on them. Other objects may or may not have sensible APIs.
Regards, Krzysztof
On 18/04/13 02:47, Krzysztof Kosiński wrote:
Direct compiling is very hard to get right in this way. For geometry-related work you only really need lib2geom. I would try checking out the source of lib2geom ( https://launchpad.net/lib2geom ), compiling it, then using this:
g++ -Wall bezier-tests.cpp -l2geom -o bezier-tests
You may need to add -I and -L flags which point to directories with lib2geom's config.h file and the lib2geom.a file, respectively.
I switched to building lib2geom instead of inkscape which allows me to build an optimized binary within 2sec, thank you very much for suggesting that :-)
Alexander
participants (7)
-
Alexander Brock
-
alvinpenner
-
Jasper van de Gronde
-
Josh Andler
-
Krzysztof Kosiński
-
LucaDC
-
Martin Owens