Making Inkscape faster by avoiding excessive updates
Hello List
Whenever something is changed in the document, Inkscape does the following:
1. Store the change in the DOM tree (aka SP tree).
2. Write the change to the XML, also storing undo data.
3. In response to this XML change, update / recompute the values in the DOM tree.
This means that when a path is changed in the node editor, the following happens:
1. Geom::PathVector is constructed from the UI nodes.
2. A path string is constructed.
3. Path string is written to XML.
4. XML observer catches the XML write event.
5. Path string which was just written is parsed, and Geom::PathVector is created.
I guess you can see where this is inefficient. Note, however, that because we support "output precision" (by default the values written into the file have at most 8 digits after the decimal point, so we lose any precision beyond that), the PathVector created in step 5 differs from the one created in step 1.
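To make the roundtrip concrete, here is a minimal standalone sketch of the precision loss described above. The serialize_coord() and parse_coord() helpers are hypothetical stand-ins for the real path writer and parser, not actual Inkscape functions:

    // Sketch of the write-then-readback cycle for a single coordinate.
    #include <cstdio>
    #include <cstdlib>
    #include <string>

    static std::string serialize_coord(double v) {
        char buf[64];
        std::snprintf(buf, sizeof(buf), "%.8f", v);   // "output precision": 8 decimal digits
        return buf;
    }

    static double parse_coord(const std::string &s) {
        return std::strtod(s.c_str(), nullptr);       // step 5: parse what was just written
    }

    int main() {
        double original = 1.0 / 3.0;                        // value held in the DOM tree (step 1)
        std::string xml_text = serialize_coord(original);   // steps 2-3: write to XML
        double readback = parse_coord(xml_text);            // steps 4-5: readback into the DOM tree
        std::printf("original = %.17g\n", original);        // 0.33333333333333331
        std::printf("xml text = %s\n", xml_text.c_str());   // 0.33333333
        std::printf("readback = %.17g\n", readback);        // slightly different from the original
        std::printf("equal?   = %s\n", original == readback ? "yes" : "no");  // no
        return 0;
    }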
One way to improve Inkscape's performance is to avoid the readback step after every change. In many cases, it leads to large portions of the DOM tree being unnecessarily recomputed. However, not doing readback is rather dangerous - if what we write does not actually parse back to what we have, we end up with data corruption - we are displaying a document that does not match the one that we would display when reopening the file.
Therefore, the way to go would be to have the development version always do readback, while the release version would just assume it wrote the correct data. In the case of paths and other coordinate values, I think we should always store them with full double precision, taking advantage of the double-conversion library (I incorporated it into 2Geom), and only allow reducing the precision in a separate output filter.
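For reference, a minimal sketch of what the shortest-roundtrip conversion could look like, using double-conversion's DoubleToStringConverter interface (the shortest() wrapper is hypothetical; the actual integration point in Inkscape/2Geom would differ):

    // Emit the shortest decimal string that parses back to the same double.
    #include <double-conversion/double-conversion.h>
    #include <cstdio>
    #include <cstdlib>
    #include <string>

    static std::string shortest(double value) {
        char buf[32];
        double_conversion::StringBuilder sb(buf, static_cast<int>(sizeof(buf)));
        double_conversion::DoubleToStringConverter::EcmaScriptConverter()
            .ToShortest(value, &sb);
        return std::string(sb.Finalize());
    }

    int main() {
        const double values[] = { 0.1, 1.0 / 3.0, 123456.789 };
        for (double v : values) {
            std::string s = shortest(v);
            double back = std::strtod(s.c_str(), nullptr);
            // The shortest representation still parses back to the identical double.
            std::printf("%-22s roundtrips: %s\n", s.c_str(), v == back ? "yes" : "no");
        }
        return 0;
    }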
What do you think?
Best regards, Krzysztof
On 05-Apr-2015 14:53, Krzysztof Kosiński wrote:
Whenever something is changed in the document, Inkscape does the following:
- Store the change in the DOM tree (aka SP tree).
- Write the change to the XML, also storing undo data.
- In response to this XML change, update / recompute the values in the DOM tree.
Seems like (1) is redundant given (3). In other words, and I'm not saying this very well, if we view the "job" of Inkscape as just this:
A. maintain a syntactically correct SVG file.
B. maintain an object model of that SVG file for the purposes of rendering, selecting, moving things around.
Then any time a user's action working on the objects makes a change which should be reflected in the SVG, it should do (A) and then (B). From what you are saying, it is doing (B) (A) (B).
One way to improve Inkscape's performance is to avoid the readback step after every change. In many cases, it leads to large portions of the DOM tree being unnecessarily recomputed.
Yeah, I've seen that. It also can make debugging harder because you make one little change and all hell breaks loose, which you have to wade through in the debugger until you finally get to the one object change which is actually a result of the change.
However, not doing readback is rather dangerous - if what we write does not actually parse back to what we have, we end up with data corruption - we are displaying a document that does not match the one that we would display when reopening the file.
If, in terms of the above, (A)->(B) can be limited with some sort of conserved mapping between the two, then the extent of the recalculation can be limited as well.
Therefore, the way to go would be to have the development version always do readback, while the release version would just assume it wrote the correct data.
Big red flag. Any time the release acts differently than the development you open the door for release bugs which cannot be reproduced in the development. In my experience this can be something as minor as changing the optimization level, which often exposes/hides memory corruption bugs.
In the case of paths and other coordinate values, I think we should always store them with full double precision, taking advantage of the double-conversion library (I incorporated it into 2Geom), and only allow reducing the precision in a separate output filter.
That goes back to the SVG -> Object conversion. The one that counts is the value that is held in the SVG, and I believe that precision is set by that standard. That said, given the amount of memory in most computers these days, there isn't much reason to use single precision in favor of double precision.
Regards,
David Mathog
mathog@...1176...
Manager, Sequence Analysis Facility, Biology Division, Caltech
2015-04-06 18:14 GMT+02:00 mathog <mathog@...1176...>:
Then any time a user's action working on the objects makes a change which should be reflected in the SVG, it should do (A) and then (B). From what you are saying, it is doing (B) (A) (B).
No. The problem is that you have absolutely no idea what to do in step A until you have done step B.
For instance, if you want to transform an object, you need to first compute its new coordinates. You can't do this given only the unparsed XML.
The XML tree essentially stores text. The DOM tree stores the 'interpretation' of this text, e.g. coordinates of the object as doubles, path data, color values, and so on. To determine what to write in the XML as text, you first have to compute the new 'interpretation', i.e. the coordinates of the transformed object, and then construct new XML from that. Reading back from XML is thus always redundant, as long as serialization to XML is not accidentally asymmetric, i.e. as long as the function that reads and interprets the XML gives back exactly the same DOM object as the one that generated that XML.
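To make the ordering concrete, here is a schematic sketch with simplified, hypothetical types (SimpleRect and to_xml_attributes() stand in for an SP-tree object and its serializer; they are not Inkscape's real types). The interpreted coordinates are updated first, and only then is the XML text regenerated from them:

    // Schematic only: not Inkscape's real object model or serializer.
    #include <cstdio>
    #include <string>

    struct SimpleRect {            // interpreted values, as held in the DOM tree
        double x, y, w, h;
    };

    static void translate(SimpleRect &r, double dx, double dy) {
        // The transform operates on parsed doubles, not on XML text.
        r.x += dx;
        r.y += dy;
    }

    static std::string to_xml_attributes(const SimpleRect &r) {
        // The new XML text is generated from the already-updated interpretation.
        char buf[128];
        std::snprintf(buf, sizeof(buf),
                      "x=\"%g\" y=\"%g\" width=\"%g\" height=\"%g\"",
                      r.x, r.y, r.w, r.h);
        return buf;
    }

    int main() {
        SimpleRect rect{10, 20, 100, 50};
        translate(rect, 5, -3);
        std::printf("<rect %s/>\n", to_xml_attributes(rect).c_str());
        // Parsing this text back would only reproduce the SimpleRect we already have.
        return 0;
    }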
Therefore, the way to go would be to have the development version always do readback, while the release version would just assume it wrote the correct data.
Big red flag. Any time the release acts differently than the development you open the door for release bugs which cannot be reproduced in the development. In my experience this can be something as minor as changing the optimization level, which often exposes/hides memory corruption bugs.
Then perhaps we should use unit tests to verify that the serialization of objects to XML is correct, instead of making the versions behave differently.
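For example, a roundtrip test along these lines could catch asymmetric serialization in every build, without the release/development split. This is a GTest-style sketch; Inkscape's actual test framework may differ, and serialize_coord()/parse_coord() are placeholders for the real writer and parser:

    // Sketch of a serialization roundtrip unit test (placeholder helpers).
    #include <gtest/gtest.h>
    #include <cstdio>
    #include <cstdlib>
    #include <string>

    static std::string serialize_coord(double v) {
        char buf[64];
        std::snprintf(buf, sizeof(buf), "%.17g", v);   // full double precision
        return buf;
    }

    static double parse_coord(const std::string &s) {
        return std::strtod(s.c_str(), nullptr);
    }

    TEST(Serialization, CoordinateRoundtrip) {
        const double samples[] = { 0.0, 0.1, 1.0 / 3.0, -123456.789, 1e-12 };
        for (double v : samples) {
            // Exact equality is intentional: the point is a lossless roundtrip.
            EXPECT_EQ(v, parse_coord(serialize_coord(v)))
                << "value " << v << " did not survive the XML roundtrip";
        }
    }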
That goes back to the SVG -> Object conversion. The one that counts is the value that is held in the SVG, and I believe that precision is set by that standard. That said, given the amount of memory in most computers these days, there isn't much reason to use single precision in favor of double precision.
SVG does not specify what the binary precision of numeric values in the user agent should be once they are parsed. It only specifies the textual format of numbers.
Currently Inkscape limits output to a specific number of decimal digits, and the default number of digits does not capture the full precision of doubles. What I propose is to use a library (double-conversion) that generates the shortest possible string that parses back to the same double value. This would allow us to avoid writing many unnecessary digits and guarantee a perfect roundtrip between doubles and XML.
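As a quick, self-contained illustration of the difference (assuming the double-conversion headers bundled with 2Geom are available), for the double nearest to 0.1:

    // Compare three output strategies for the same double value.
    #include <double-conversion/double-conversion.h>
    #include <cstdio>

    int main() {
        const double v = 0.1;
        char buf[32];
        double_conversion::StringBuilder sb(buf, static_cast<int>(sizeof(buf)));
        double_conversion::DoubleToStringConverter::EcmaScriptConverter()
            .ToShortest(v, &sb);

        std::printf("fixed 8 digits : %.8f\n", v);            // 0.10000000 (lossy in general)
        std::printf("17 sig. digits : %.17g\n", v);           // 0.10000000000000001 (roundtrips, verbose)
        std::printf("shortest       : %s\n", sb.Finalize());  // 0.1 (roundtrips, minimal)
        return 0;
    }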
Regards, Krzysztof