
On Tue, 16 Nov 2004, MenTaLguY wrote:
On Tue, 16 Nov 2004, Bryce Harrington wrote:
This approach is consistent with Cairo, which is even more attentive to correctness (and correspondingly has similar performance concerns.)
I actually suspect that even current versions of Cairo, used carefully, should prove faster than the livarot-based renderer.
I'm basing the above off of comments from ishmal; he did a rough comparison iirc a couple weeks ago, and found it rendered right but was noticeably slower.
Perhaps for 0.41 we could establish some performance improvement goals, like we've been able to do so well with bug count? E.g., select a few test conditions and aim to improve the total performance on those tests by 15% or something?
I think 200% improvement is would be the minimum reasonable goal, if we were to take a benchmark-based approach.
Okay, 200% would be a nice achievement. :-)
I sort of assume that if we pick N testcases, we may be able to get order of magnitude gains out of some of them, but if we average over all of them the summed up number may be different. I guess we can figure all that out when those test cases are selected.
Micro-optimizations that yield ~15% improvements are easy to do (given a profiler), but they tend to mangle code in ways that obscure the patterns that can be exploited to yield order-of-magnitude improvements (through the use of a better algorithm for example).
Yes, although there are micro-optimizations that can be made that also result in making the code clearer, such as eliminating redundant code or making initializations only occur when they're really needed, etc. In theory, a really smart compiler should catch all these things, but gcc doesn't always catch these things.
That's not to say some smaller-scale optimizations aren't worthwhile. As you experienced when you were rewriting some of the if-cascades in livarot, simplification of code can have measurable (if not always order-of-magnitude) performance benefits.
*Nod* Exactly. That was also reasonably easy to do without understanding all of the algorithms and design. When you find a routine that's called a bunch of times, one tiny tweak can result in a big improvement However, there's only a limited amount that can be done that will make a significant effect, since there's a fixed number of routines that get called that much.
Also, some optimizations are path dependent. For example, with if loops, if you put the most likely case first in the list, it'll give a benefit (at least, it did for gcc), but if you run it a different way, then a different case may be the most frequently invoked, so it hurts. This is why I'd want to make sure to track across multiple test cases, so you can account for that.
Bryce