
On Fri, Nov 11, 2005 at 11:43:59AM -0500, mental@...3... wrote:
Quoting bulia byak <buliabyak@...400...>:
- When rendering transparent overlays, Inkscape spends a lot of time calculating NR_PREMUL(c,a), defined as
(((c) * (a) + 127) / 255)
I changed the division to
FAST_DIVIDE_BY_255(v) ((((v) << 8) + (v) + 257) >> 16)
borrowed from Mozilla, and got about a 10% speedup on complex documents with transparency.
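Ah, careful with macros. Because FAST_DIVIDE_BY_255 is a macro, its argument is evaluated twice (once for each occurrence of (v) in the expansion), so an argument that is expensive or has side effects gets computed twice.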
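To make the hazard concrete, here is a minimal sketch. The macro is the one quoted above; the inline function fast_divide_by_255 and the call-counting helper next_value are hypothetical names introduced only for this illustration, not anything in the Inkscape tree.

```cpp
#include <cassert>
#include <cstdint>

// Macro as quoted in the thread: (v) appears twice in the expansion.
#define FAST_DIVIDE_BY_255(v) ((((v) << 8) + (v) + 257) >> 16)

// Hypothetical alternative: an inline function evaluates its argument once.
inline std::uint32_t fast_divide_by_255(std::uint32_t v)
{
    return ((v << 8) + v + 257) >> 16;
}

// Hypothetical helper with a visible side effect: it counts its own calls.
int g_calls = 0;
std::uint32_t next_value()
{
    ++g_calls;
    return 510;
}

// Number of times next_value() runs when passed through the macro.
int calls_via_macro()
{
    g_calls = 0;
    std::uint32_t r = FAST_DIVIDE_BY_255(next_value());
    assert(r == 2);   // 510 / 255
    return g_calls;
}

// Number of times next_value() runs when passed through the function.
int calls_via_function()
{
    g_calls = 0;
    std::uint32_t r = fast_divide_by_255(next_value());
    assert(r == 2);   // 510 / 255
    return g_calls;
}
```

With the macro, next_value() runs twice per use; with the inline function it runs once. If the argument is something like c * a + 127, the macro also leaves it to the optimizer to avoid repeating the multiplication.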
An alternative is to change ((v) << 8) + (v) to 257 * (v): I'm told that gcc has for ages emitted equivalent code for these, and I've just verified that g++-4.0.2 -O2 emits equivalent code on x86 (except swapping the roles of %eax and %edx). I believe it's also more readable. (I'd consider 0x101 instead of 257, though that assumes that v is already unsigned or at least non-negative.)
We could further change (v) * 257 + 257 to ((v) + 1) * 257. Again, this makes no difference to code emitted by g++-4.0.2 -O2 on x86 (not even differing in register assignment this time), so this one's just a readability decision (at least on this platform).
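For anyone who wants to check the arithmetic rather than the generated assembly, the following sketch compares the three spellings against plain integer division. The function names are hypothetical, and it assumes unsigned 32-bit inputs; it says nothing about what code a particular compiler emits.

```cpp
#include <cassert>
#include <cstdint>

// Original form quoted in the thread.
inline std::uint32_t div255_shift(std::uint32_t v)
{
    return ((v << 8) + v + 257) >> 16;
}

// 257 * v in place of (v << 8) + v.
inline std::uint32_t div255_mul(std::uint32_t v)
{
    return (257 * v + 257) >> 16;
}

// Factored form: (v + 1) * 257.
inline std::uint32_t div255_factored(std::uint32_t v)
{
    return ((v + 1) * 257) >> 16;
}

// Returns true if all three forms equal v / 255 for every v in [0, limit].
bool forms_agree_up_to(std::uint32_t limit)
{
    for (std::uint32_t v = 0; v <= limit; ++v) {
        const std::uint32_t expected = v / 255;
        if (div255_shift(v) != expected ||
            div255_mul(v) != expected ||
            div255_factored(v) != expected) {
            return false;
        }
    }
    return true;
}

// Self-check over every value NR_PREMUL can divide for 8-bit c and a:
// c * a + 127 is at most 255 * 255 + 127 = 65152.
void check_premul_range()
{
    assert(forms_agree_up_to(255 * 255 + 127));
}
```

Note that the shortcut is only exact over a limited range: all three forms match v / 255 up to v = 65789 and first go wrong at v = 65790, which is comfortably above the largest value that c * a + 127 can take for 8-bit inputs.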
pjrm.