Rewrite of style handling code, code review requested.

Hi all,
I have rewritten most of the style handling code in style.h and style.cpp to use real C++ classes. I would like you all to review the code and give feedback. The code is available at:
lp:~inkscape.dev/inkscape/style
Background:
I have found it quite difficult to add new properties to the Inkscape code as I try to implement SVG 2 features. The code in style.cpp is a convoluted mess. To add a new property, one needs to find a similar, already existing property and search through the code for it to find all the places where one should add code for the new property. This is quite error prone and time consuming.
Following the advice in the book "Refactoring, Improving the Design of Existing Code" (thanks to the Inkscape fund for a copy), I started to refactor the code bit by bit, while adding a bunch of unit tests to style-testing.h.
Scope:
I've limited my changes to the following files: style.h style.cpp style-internal.h style-internal.cpp, style-enums.h (plus minor changes to color.h, id-class.cpp, Makefile_insert). There are opportunities for further refactoring but they require modifying code in other files, something I wanted to avoid doing until I got the basic refactoring done.
Advantages:
It is now quite straight forward to add a new property, especially if it fits one of the pre-existing models. It can be as easy as adding the property to the SPStyle class declaration in style.h, initializing it in SPStyle::SPstyle, storing its address in the std::vector in SPStyle::SPStyle, and adding an entry in the switch statement in SPStyle::read_if_unset (this last requirement, could possibly be eliminated).
Not only will it be easier to add new properties, but it will also be easy to add new functionality. For example, it will be easy to add a variable to keep track of where a property is set (in a style sheet, in an attribute statement, or in a style-attribute), and write out the property in the same place. This is a requirement if we are ever to include stylesheet editing in Inkscape.
Disadvantages:
The code could be slower as there is more overhead in function calls. I've tried it out with a large file (the Tango icons) and don't see any measurable slow down.
The code does use more memory. It appears to use about 4k bytes more per object (this is really hard to measure accurately). One thing to think about in the code review is how memory use can be reduced.
(One possible place memory might be reduced is that it appears that in the rendering of Inkscape's icons, the instances of SPStyle that are created are not released. This means that around 4000 objects are being kept around.)
Status:
The code is mostly done. Here is a list of possible things to be worked on:
Short term (all internal to the SPStyle and SPIxxx classes):
* Finish SPColor class and use where appropriate ('color' and 'text-decoration-color').
* Memory use issues:
Remove unneeded destructors. Check variable order (to insure minimal memory use).
* Class Documentation
* Move SPIPaint/SPIFilter HREF functions into classes
* Find out why SPStyle::read() requires call to clear().
* Investigate using std::map instead of std::vector to allow removal of switch in SPStyle::read_if_unset(). A hash is already being used to look up the property so it is a more a question of memory usage than efficiency.
Medium term (requires changes external to style code):
* Update rest of code to use new SPStyle member functions.
* Restructure 'text' (i.e. font info) to match other properties
* Restructure SPIPaint/SPIColor to match other properties
* Convert text-decoration-style to SPIEnum
* Move sp_css_attr_xxxx functions to separate file.
* Correct quoting in 'font-family'
* Proper 'font' class
* SPIColor: Enum for color format: hsl(), named color.
Long term:
* Keep track of where a property is set (style-sheet, attribute, style attribute) and reuse on write.
* Proper 'ex' height handling
* CSS3 Color (HSL) + named colors (i.e. no longer convert red to #ff0000)
* CSS4 Color
* Proper handling of SPIFilter::cascade() and SPIFilter::merge()
* Dash array with units.
* Release SPStyle instances after icon generation.
* Check for code duplication between SPStyle classes and XML style handling.
* Add the ability to choose shorthand or longhand property formats.
Tav

2014-03-27 16:01 GMT+01:00 Tavmjong Bah <tavmjong@...8...>:
Hi all,
I have rewritten most of the style handling code in style.h and style.cpp to use real C++ classes. I would like you all to review the code and give feedback. The code is available at:
lp:~inkscape.dev/inkscape/style
First impressions:
1. The classes do not follow the coding style guidelines. Public methods should be in lowerCamelCase not in snake_case. Members supposed to be private / protected should start with _. http://www.inkscape.org/en/develop/coding-style/
2. If there is operator==, the class should privately derive from boost::equality_comparable<SPStyle> to automatically provide operator!=. There is no need to allow to override it separately. Moreover, some classes seem to overload operator!= with code identical to that present in their base class.
3. std::vector<SPIBase *> properties; - I understand this is supposed to allow easy iteration over style properties. This could be done without any per-instance memory overhead by defining custom iterators that use a static table of member pointers. I can provide a proof of concept implementation. Another approach would be to have a boost::ptr_map or boost::ptr_unordered_map of properties, so that only the properties which are actually present in the style are stored in memory, but I'm not sure how that would perform.
4. The unit-related functions commented with "no change" could be changed to use the functions from util/units.h. (Note that in general I think the recent unit refactoring needs more work, because as it stands now it doesn't solve all the problems and severely impacts performance in certain cases.)
5. Constructor of SPIBase should take Glib::ustring by const reference instead of by value. Members intended to be private should be prefixed with _.
6. Some classes unnecessarily implement an assignment operator which is completely equivalent to the default one.
7. Nitpick: In the SPIString constructor, no_name_string should be anonymous_string, sounds better.
8. The pattern of calling setStylePointer is suboptimal. If these properties always need an SPStyle pointer and cannot be sensibly instantiated without it, they should just take a reference to SPStyle as a construction parameter.
I'll send a second e-mail once I read the code more throrughly
Regards, Krzysztof

Hi Krzysztof,
Thanks for the quick feedback!
On Thu, 2014-03-27 at 16:32 +0100, Krzysztof Kosiński wrote:
2014-03-27 16:01 GMT+01:00 Tavmjong Bah <tavmjong@...8...>:
Hi all,
I have rewritten most of the style handling code in style.h and style.cpp to use real C++ classes. I would like you all to review the code and give feedback. The code is available at:
lp:~inkscape.dev/inkscape/style
First impressions:
- The classes do not follow the coding style guidelines. Public
methods should be in lowerCamelCase not in snake_case. Members supposed to be private / protected should start with _. http://www.inkscape.org/en/develop/coding-style/
Easy to do. I just pretty much followed the existing naming in the files, including the preexisting SPIPaint class.
- If there is operator==, the class should privately derive from
boost::equality_comparable<SPStyle> to automatically provide operator!=. There is no need to allow to override it separately. Moreover, some classes seem to overload operator!= with code identical to that present in their base class.
You'll need to explain this a bit... I am guessing I can just delete the operator!= from all except the base class as all it does is invert the operator== in all classes.
- std::vector<SPIBase *> properties; - I understand this is supposed
to allow easy iteration over style properties. This could be done without any per-instance memory overhead by defining custom iterators that use a static table of member pointers. I can provide a proof of concept implementation. Another approach would be to have a boost::ptr_map or boost::ptr_unordered_map of properties, so that only the properties which are actually present in the style are stored in memory, but I'm not sure how that would perform.
This sounds interesting. I'll need an example. There is already a name look up with a hash table taking place. Using a map in the SPStyle class would eliminate the need for this so it might not be a performance problem (in order to eliminate the long switch).
I thought about only keeping the properties that are actually used but that, I think, requires modifying the way the values are referenced throughout the code base (they are directly accessed). So this might be a long term goal.
- The unit-related functions commented with "no change" could be
changed to use the functions from util/units.h. (Note that in general I think the recent unit refactoring needs more work, because as it stands now it doesn't solve all the problems and severely impacts performance in certain cases.)
This can be a future goal.
- Constructor of SPIBase should take Glib::ustring by const reference
instead of by value. Members intended to be private should be prefixed with _.
Will change the Glib::ustring use. At the moment, almost everything is public due to the way the values are referenced in the rest of the code.
- Some classes unnecessarily implement an assignment operator which
is completely equivalent to the default one.
These can probably be removed now. In the original C code there was a template that required explicit assignment operators (at least the code wouldn't compile without them).
- Nitpick: In the SPIString constructor, no_name_string should be
anonymous_string, sounds better.
OK, I just need to put in something to make debugging easier. All the classes follow the same pattern (e.g. no_name_paint).
- The pattern of calling setStylePointer is suboptimal. If these
properties always need an SPStyle pointer and cannot be sensibly instantiated without it, they should just take a reference to SPStyle as a construction parameter.
I know this is a big ugly but the classes don't always need an SPStyle pointer to be useful. There is at least one instance in the code base where the SPIPaint class (which pre-existed my changes) is used to just read in a a string value.
I'll send a second e-mail once I read the code more throrughly
Looking forward to it.
I am also interested in general feedback... is this the proper way to go?
Tav

2014-03-27 17:56 GMT+01:00 Tavmjong Bah <tavmjong@...8...>:
- If there is operator==, the class should privately derive from
boost::equality_comparable<SPStyle> to automatically provide operator!=. There is no need to allow to override it separately. Moreover, some classes seem to overload operator!= with code identical to that present in their base class.
You'll need to explain this a bit... I am guessing I can just delete the operator!= from all except the base class as all it does is invert the operator== in all classes.
See here: http://www.boost.org/doc/libs/1_55_0b1/libs/utility/operators.htm
- std::vector<SPIBase *> properties; - I understand this is supposed
to allow easy iteration over style properties. This could be done without any per-instance memory overhead by defining custom iterators that use a static table of member pointers. I can provide a proof of concept implementation. Another approach would be to have a boost::ptr_map or boost::ptr_unordered_map of properties, so that only the properties which are actually present in the style are stored in memory, but I'm not sure how that would perform.
This sounds interesting. I'll need an example. There is already a name look up with a hash table taking place. Using a map in the SPStyle class would eliminate the need for this so it might not be a performance problem (in order to eliminate the long switch).
The iterator would contain a pointer to SPStyle and an integer index into a table like this (note, I'm not 100% sure about the typedef syntax):
typedef SPStyle::*SPIBase SPStyleMember; static SPStyleMember members[] = { &SPStyle::font_size, &SPStyle::font_style, ... NULL };
Incrementing the operator would increase the index, and dereferencing would access the member via the pointer, e.g.
SPIBase const &operator*() const { return _style->*(members[_member_index]); }
I thought about only keeping the properties that are actually used but that, I think, requires modifying the way the values are referenced throughout the code base (they are directly accessed). So this might be a long term goal.
There is a way to do this with zero-size mock objects, but this hack would be very elaborate. Converting everything to use getters would be a better long term solution.
- The pattern of calling setStylePointer is suboptimal. If these
properties always need an SPStyle pointer and cannot be sensibly instantiated without it, they should just take a reference to SPStyle as a construction parameter.
I know this is a big ugly but the classes don't always need an SPStyle pointer to be useful. There is at least one instance in the code base where the SPIPaint class (which pre-existed my changes) is used to just read in a a string value.
OK, I see. Another idea is to use offsetof. The object doesn't need to store the pointer to know where its parent is, it can just subtract the offset from the beginning of SPStyle to itself from its 'this' pointer. This technique is used extensively in the Linux kernel.
I am also interested in general feedback... is this the proper way to go?
At first glance, this is definitely a substantial improvement over the existing code.
Regards, Krzysztof

On 27-Mar-2014 08:01, Tavmjong Bah wrote:
Hi all,
I have rewritten most of the style handling code in style.h and
One thing that should probably change in that file are these structs:
struct SPILengthOrNormal { unsigned set : 1; unsigned inherit : 1; unsigned normal : 1; unsigned unit : 4; float value; float computed; };
struct SPILength { unsigned set : 1; unsigned inherit : 1; unsigned unit : 4; float value; float computed; };
The problem is that one is often cast from the other, and the "normal" field falls within "unit". There is room for error there, with bits intended for one field ending up in the other. To avoid this sort of issue entirely it would be better if these were:
struct SPILength { unsigned set : 1; unsigned inherit : 1; unsigned unit : 4; unsigned unused : 26 float value; float computed; };
struct SPILengthOrNormal { unsigned set : 1; unsigned inherit : 1; unsigned unit : 4; unsigned normal : 1; unsigned unused : 25 float value; float computed; };
with all bits set to zero on creation. (Declaring the unused bit field isn't strictly necessary, but it sometimes makes it easier to debug memory access issues if every bit can be accessed by name.)
Regards,
David Mathog mathog@...1176... Manager, Sequence Analysis Facility, Biology Division, Caltech

On 2014-03-27 16:01 +0100, Tavmjong Bah wrote:
I've limited my changes to the following files: style.h style.cpp style-internal.h style-internal.cpp, style-enums.h (plus minor changes to color.h, id-class.cpp, Makefile_insert).
JFYI - 'src/style-internal.cpp' is missing in the branch (r13221) but referenced in 'src/Makefile_insert' (-> build failure).
http://bazaar.launchpad.net/~inkscape.dev/inkscape/style/revision/13221
Regards, V

On Thu, 2014-03-27 at 18:50 +0100, su_v wrote:
On 2014-03-27 16:01 +0100, Tavmjong Bah wrote:
I've limited my changes to the following files: style.h style.cpp style-internal.h style-internal.cpp, style-enums.h (plus minor changes to color.h, id-class.cpp, Makefile_insert).
JFYI - 'src/style-internal.cpp' is missing in the branch (r13221) but referenced in 'src/Makefile_insert' (-> build failure).
http://bazaar.launchpad.net/~inkscape.dev/inkscape/style/revision/13221
Sorry, fixed
Tav

On 27-3-2014 16:01, Tavmjong Bah wrote:
Hi all,
I have rewritten most of the style handling code in style.h and style.cpp to use real C++ classes. I would like you all to review the code and give feedback. The code is available at:
lp:~inkscape.dev/inkscape/style
Some quick comments about all the bitfield usage:
1. style.h, lines135-
Don't use bitfields there. Simply use bool. Much less bugprone. I don't think there is any reason to use bitfields there.
2. style-internal.h
Have a look at the use of all bitfields there. A 1-bit bitfield to store some boolean value, I think is best replaced with a bool.
(can you run clang-format on the code? thanks!)
-Johan

Johan,
Thanks for the comments.
On Sun, 2014-03-30 at 21:50 +0200, Johan Engelen wrote:
On 27-3-2014 16:01, Tavmjong Bah wrote:
Hi all,
I have rewritten most of the style handling code in style.h and style.cpp to use real C++ classes. I would like you all to review the code and give feedback. The code is available at:
lp:~inkscape.dev/inkscape/style
Some quick comments about all the bitfield usage:
- style.h, lines135-
Don't use bitfields there. Simply use bool. Much less bugprone. I don't think there is any reason to use bitfields there.
Or, even better, get rid of them completely. They were not being used.
- style-internal.h
Have a look at the use of all bitfields there. A 1-bit bitfield to store some boolean value, I think is best replaced with a bool.
Done where it doesn't increase class memory size.
(can you run clang-format on the code? thank
I did a test. While some formatting is improved, it messes up other formatting. As it also makes diffing against previous code more difficult, I would prefer to wait on this.
Tav

---- Tavmjong Bah <tavmjong@...8...> wrote:
(can you run clang-format on the code? thank
I did a test. While some formatting is improved, it messes up other formatting. As it also makes diffing against previous code more difficult, I would prefer to wait on this.
Yeah, it cannot be run on the whole file yet. (I hope to work on clang-format a bit to make it better suited for our tastes) But perhaps you can run it on part of the source. I use the following, perhaps it is also available for your editor: http://www.sublimetext.com/forum/viewtopic.php?f=5&t=15133
I find it a hugely convenient tool. The offending parts of your code that I especially remember are the positioning of parens and braces.
cheers, Johan
participants (6)
-
unknown@example.com
-
Johan Engelen
-
Krzysztof Kosiński
-
mathog
-
su_v
-
Tavmjong Bah