Quoting bulia byak <buliabyak@...400...>:
- When rendering transparent overlays, Inkscape spends a lot of
time calculating NR_PREMUL(c,a) defined as (((c) * (a) + 127)/255). I changed the division to
FAST_DIVIDE_BY_255(v) ((((v)<< 8) + (v) + 257) >> 16)
borrowed from Mozilla and got about 10% speedup on complex documents with transparency.
Ah, careful with macros. Because FAST_DIVIDE_BY_255 is a macro, the parameters get evaluated multiple times.
Given:
#define NR_PREMUL(c,a) (FAST_DIVIDE_BY_255(((c) * (a) + 127)))
You get expansions like:
NR_PREMUL(exp1, exp2) =>
(FAST_DIVIDE_BY_255 (((exp1) * (exp2) + 127))) =>
(((((((exp1) * (exp2) + 127))<< 8) + (((exp1) * (exp2) + 127)) + 257) >> 16))
Where exp1 and/or exp2 are complex expressions, this could be much slower than it needs to be, and if they have side-effects, the duplication may also alter the semantics of the code.
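To make the duplication concrete, here is a small sketch. Only the two macros come from the definitions above; the counting helper next_channel() and the wrapper evaluations_per_use() are made up for the illustration.

```c
#include <assert.h>

/* The two macros as defined above. */
#define FAST_DIVIDE_BY_255(v) ((((v) << 8) + (v) + 257) >> 16)
#define NR_PREMUL(c, a) (FAST_DIVIDE_BY_255(((c) * (a) + 127)))

/* Hypothetical helper: a "complex expression" with a side effect.
   It counts how often it is called. */
static int calls = 0;

static int next_channel(void)
{
    ++calls;
    return 200;
}

/* Returns how many times the first argument was evaluated
   during a single use of NR_PREMUL. */
static int evaluations_per_use(void)
{
    calls = 0;
    int result = NR_PREMUL(next_channel(), 128);
    /* The value is still correct here only because next_channel()
       returns the same number on every call. */
    assert(result == (200 * 128 + 127) / 255);
    return calls;
}
```

evaluations_per_use() comes back with 2 rather than 1, because the macro body mentions its argument twice. If next_channel() advanced a pointer or read from a stream, the two evaluations would see different values.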
I suspect you will be able to get an additional performance benefit if you convert these macros to inline functions, because each parameter would then be evaluated exactly once.
e.g.:
inline int fast_divide_by_255(int v) { return ( ( v << 8 ) + v + 257 ) >> 16; }
inline int nr_premul(int c, int a) { return fast_divide_by_255(c * a + 127); }
(I am making their arguments and return values 'int', since that is normally what the compiler promotes char arithmetic to anyway.)
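If you want to convince yourself that the inline versions are a drop-in replacement, a brute-force check over all 8-bit inputs is cheap. This is only a sketch: count_mismatches() is a made-up helper, nr_premul() is written here so that it divides by 255 just as NR_PREMUL does, and I use 'static inline' on the assumption that this is compiled as plain C, where a bare 'inline' can leave you without an external definition at link time.

```c
#include <assert.h>

/* Inline replacements for the macros; each argument is evaluated once. */
static inline int fast_divide_by_255(int v)
{
    return ((v << 8) + v + 257) >> 16;
}

static inline int nr_premul(int c, int a)
{
    return fast_divide_by_255(c * a + 127);
}

/* Hypothetical self-check: counts the (c, a) pairs in 0..255 for which
   the fast path differs from the original (c * a + 127) / 255. */
static int count_mismatches(void)
{
    int mismatches = 0;

    /* Fully opaque colour at full alpha must stay fully opaque. */
    assert(nr_premul(255, 255) == 255);

    for (int c = 0; c <= 255; ++c) {
        for (int a = 0; a <= 255; ++a) {
            if (nr_premul(c, a) != (c * a + 127) / 255)
                ++mismatches;
        }
    }
    return mismatches;
}
```

The shift trick computes 257 * (v + 1) >> 16, which agrees with v / 255 as long as v stays within the range that c * a + 127 can reach for 8-bit c and a, so count_mismatches() should report zero.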
-mental