Joel Holdsworth wrote:
Would it be possible to optimise the code with SSE vectorisation at all?
Most likely. Note, though, that this would be largely orthogonal to multithreading and/or algorithmic enhancements, which certainly doesn't make it any less attractive.
That would require hand optimised assembly.
More or less. There are intrinsics that can be used, which make it a lot easier. In fact, I have some rendering code written this way. I haven't incorporated it yet because I wanted to check out liboil at the time, and I may have to double-check the code to see whether it's still okay.
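To give an idea of what intrinsics look like compared to hand-written assembly, here is a minimal sketch. It is not the rendering code mentioned above and not anything from Inkscape; the function name `accumulate_weighted` and the choice of operation (adding one weighted row of float samples into another, as a single convolution tap would) are assumptions made purely for illustration.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_mul_ps, ...
#define SKETCH_HAVE_SSE 1
#endif

// Hypothetical helper (not Inkscape code): dst[i] += weight * src[i],
// i.e. one tap of a convolution applied to a row of float samples.
void accumulate_weighted(float *dst, const float *src, float weight, std::size_t n)
{
    std::size_t i = 0;
#ifdef SKETCH_HAVE_SSE
    const __m128 w = _mm_set1_ps(weight);      // broadcast weight to all 4 lanes
    for (; i + 4 <= n; i += 4) {
        __m128 s = _mm_loadu_ps(src + i);      // unaligned load of 4 floats
        __m128 d = _mm_loadu_ps(dst + i);
        d = _mm_add_ps(d, _mm_mul_ps(s, w));   // 4 multiplies and 4 adds at once
        _mm_storeu_ps(dst + i, d);
    }
#endif
    // Scalar tail for the leftover elements (and the whole row on
    // targets where SSE is not available).
    for (; i < n; ++i) {
        dst[i] += weight * src[i];
    }
}
```

The compiler still handles register allocation and scheduling, which is most of what makes this easier than assembly. The unaligned loads are a conservative choice here; whether aligned loads are usable would depend on how the buffers are allocated.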
Alternatively I wonder if liboil would have any loops that could be brought in?
Liboil is a very nice initiative, but it's fairly limited in scope. In theory we could use some of their compositing code, but:
- Inkscape uses quite a few more pixel formats (I doubt they are all actually used, so this may be a moot point).
- I have serious doubts about the quality of the computations. Based on a quick look at the source, they repeatedly round the results (this makes it possible to optimize a bit more), which Inkscape also used to do and which actually leads to visible artifacts.
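To illustrate why repeated rounding worries me, here is a toy sketch. It is not taken from liboil or Inkscape: the two function names and the "scale a pixel value several times" setup are assumptions chosen only to show the effect. One version rounds to an integer after every step, the other keeps full precision and rounds once at the end.

```cpp
#include <cmath>

// Hypothetical illustration: scale `value` by `factor` `steps` times,
// rounding to the nearest integer after every step.
int scale_round_each_step(int value, double factor, int steps)
{
    int v = value;
    for (int i = 0; i < steps; ++i) {
        v = static_cast<int>(std::lround(v * factor));
    }
    return v;
}

// Same computation, but intermediate results stay in floating point
// and only the final result is rounded.
int scale_round_once(int value, double factor, int steps)
{
    double v = value;
    for (int i = 0; i < steps; ++i) {
        v *= factor;
    }
    return static_cast<int>(std::lround(v));
}
```

With a factor of 0.5, a value of 5 scaled twice comes out as 2 when rounding after each step (5 → 3 → 2) but as 1 when rounding once (1.25 → 1). Worse, a value of 1 never fades at all with per-step rounding, since 0.5 rounds back up to 1 every time, while the round-once result after eight steps is 0.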
For the blurring code I don't think much of their current code applies, and at the moment I'm not too keen on taking on the work of getting liboil changed/improved. But if you (or anyone else) feel up to it, be my guest :)