
I just saw that this is (probably) already fixed in the Cairo branch (although the calculation used does have a small problem, see below). So it might be better to copy that. Or... we could finally make the switchover. Seriously, if switching to the Cairo branch fixes some things and "only" introduces some problems with certain transformation matrices (and only in the context of gradients?) then I think it might be time to just switch (well, move the changes to trunk). Especially since we've just released.
As for the computation in the Cairo branch (in struct MaskLuminanceToAlpha), it basically multiplies the coefficients by 255 and does the computation in integers. In principle not a bad way to do it, except that the rounded coefficients don't add up to 1. I would recommend adding one to the green component (so use 183 instead of 182), as its fractional part is closest to one (0.427, vs. 0.1875 and 0.3855). Also, since it's using 32 bit ints anyway a bit more precision could easily be used.
As for speed, the present code is probably fast enough. It might benefit from SSE or something like that though. It might also be possible to speed it up a bit by (partially) expanding the multiplies using shifts and adds. But this would have to be tried, with compilers and processors being quite good at this sort of stuff, and the coefficients not being that "sparse". In any case, there is no good reason to multiply the coefficients by 255, just use 256 (or some other, higher, power of 2), that way the divide can be done with a single shift (and that should definitely be somewhat faster).
On 2010-11-22 10:46, Jasper van de Gronde wrote:
On 2010-11-21 21:31, Tavmjong Bah wrote:
... The code to fix is in nr-arena-item.cpp at line 518 and 536... but as this is speed critical code, I don't know the best way of doing this. Can someone who is familiar with nr-pixops, etc. have a look?
This is the current code:
m = NR_PREMUL_112 (s[0] + s[1] + s[2], s[3]); d[0] = FAST_DIV_ROUND< 3 * 255> (m);
This is the desired code equivalent:
d[0] = (int)((s[0]*0.2125 + s[1]*0.7154 + s[2]*0.0721) * s[3]/255.0);
Thanks a lot for finding this, I think I'll be able to fix this reasonably fast, so I'll have a look. ...