
2010/9/5 Joshua A. Andler <scislac@...400...>:
> If I am not mistaken, Krzysztof has already done the parallelization of the rest of the filters in his GSoC branch.
> Cheers, Josh
That's right, but they use OpenMP only, no OpenCL for now.
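For readers unfamiliar with the approach, here is a minimal sketch of what OpenMP parallelization of a per-pixel filter loop can look like. It is not taken from the GSoC branch: the function `scale_channels` and its parameters are hypothetical, chosen only to show the pattern.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-pixel filter (not from the Inkscape sources): scales
// every 8-bit channel value by `factor` and clamps the result to 255.
void scale_channels(const std::vector<std::uint8_t>& src,
                    std::vector<std::uint8_t>& dst,
                    double factor)
{
    assert(src.size() == dst.size());
    assert(factor >= 0.0);
    const long n = static_cast<long>(src.size());

    // Each iteration reads and writes only its own element, so the loop
    // can be split across threads without locking. When compiled without
    // OpenMP support the pragma is ignored and the loop runs serially,
    // producing the same result.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        const double v = src[i] * factor;
        dst[i] = static_cast<std::uint8_t>(v > 255.0 ? 255.0 : v);
    }
}
```

The data never leaves main memory in this model, which is why there is no transfer cost to account for on the OpenMP side.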
Converting the filters to OpenCL may or may not give a performance gain. With multiple filters, the overhead of transferring the data to and from the graphics card will dominate, and we might end up being slower than on the CPU. Additionally, older Nvidia cards and all ATI cards older than the HD 5xxx series cannot create OpenCL image objects. It's theoretically possible to implement the filters without them, using generic memory objects, but I'm not sure what the performance would be like.
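To make the transfer-overhead argument concrete, here is a rough back-of-the-envelope cost model. It is only a sketch: the struct `FilterChainCost`, its fields, and both functions are hypothetical, and every value fed into it would have to be measured on real hardware.

```cpp
#include <cassert>

// Hypothetical inputs; none of these are measured values.
struct FilterChainCost {
    int    filters;            // number of filter primitives in the chain
    double cpu_ms_per_filter;  // CPU time to run one filter
    double gpu_ms_per_filter;  // GPU kernel time to run one filter
    double transfer_ms;        // one upload plus download of the image
};

// Total time when the whole chain runs on the CPU.
double cpu_total_ms(const FilterChainCost& c)
{
    return c.filters * c.cpu_ms_per_filter;
}

// Total time on the GPU for a given number of host/device roundtrips.
// Passing roundtrips == c.filters models a transfer around every filter;
// roundtrips == 1 models keeping the data on the card until the end.
double gpu_total_ms(const FilterChainCost& c, int roundtrips)
{
    assert(roundtrips >= 1);
    return c.filters * c.gpu_ms_per_filter + roundtrips * c.transfer_ms;
}
```

With made-up inputs of 4 filters, 2 ms per filter on the CPU, 0.5 ms per filter on the GPU, and 3 ms per roundtrip, the model gives 8 ms for the CPU, 14 ms for the GPU with one roundtrip per filter, and 5 ms with a single roundtrip. The point is only that the roundtrip count, not the kernel speed, decides which side wins.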
The best approach would be to use cairo-gl surfaces and create OpenCL contexts that directly refer to OpenGL pixmap contents. This would bring the count of roundtrips down to 1 (at the end of rendering, to draw to the X surface provided by GTK). Unfortunately, the performance of cairo-gl is rather bad, especially on ATI hardware. I have a Radeon HD 4850, which is a mid-range gaming card, and the Cairo performance tests run ~10x slower on cairo-gl than on the image backend.
Regards, Krzysztof