2008/11/18 Jasper van de Gronde <th.v.d.gronde@...528...>
Using GCC's OpenMP support (now also in MinGW!) I had a stab at making the outer loop of the blur code parallel (the loops are typical DSP loops and shouldn't even throw an exception, so it should be perfectly safe). On my dual-core system, blurring a large image is approximately twice as fast, so it seems to scale really well.
As I don't have access to a Linux system at the moment, I haven't modified the Makefiles to add support, but it shouldn't be too difficult (see the changes to build.xml). Also, I've made it write a log so the performance can be analyzed, but it uses Win32-specific functions for timing. So, if anyone would like to have a go at making it work on Linux and/or testing it, that would be great :)
The number of threads defaults to whatever omp_get_num_procs() returns, but on my system this doesn't work... So to use it, set the number of threads in the preferences, using: <group id="threading" numthreads="1" /> in the "options" group (so it's /options/threading/numthreads).
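To illustrate the fallback described above, here is a minimal sketch of how the thread count could be picked. It is not taken from the patch: `choose_num_threads` and its `pref_value` parameter are hypothetical names, and it assumes the preference is read elsewhere and passed in as 0 when unset. Only `omp_get_num_procs()` is the real OpenMP call.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: pref_value stands for /options/threading/numthreads,
// with 0 (or a negative value) meaning "not set in the preferences".
int choose_num_threads(int pref_value)
{
    // An explicit preference always wins.
    if (pref_value > 0) {
        return pref_value;
    }
#ifdef _OPENMP
    // Default: one thread per processor reported by the OpenMP runtime.
    int n = omp_get_num_procs();
#else
    // Built without OpenMP: run single-threaded.
    int n = 1;
#endif
    // Guard against a runtime that reports nothing useful.
    return n > 0 ? n : 1;
}
```

With this shape, a broken default only costs parallelism (it falls back to one thread) rather than breaking the blur.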
Obviously this is not the most efficient way to do this, as new threads are created for each blur. But so far the overhead doesn't appear to be too bad compared with the actual blurring operation. And with OpenMP the impact on the code is minimal: I only had to put two #pragmas before the blur loops, allocate the temporary data it needs for all threads, and add a preference for the number of threads. So if this works out it might provide a relatively easy way to start making use of all these dual- and quad-core machines people have lying around doing almost nothing these days :)
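For readers who haven't used OpenMP, the pattern looks roughly like the sketch below. This is not the code from the patch: `blur_rows` is a hypothetical stand-in (a simple horizontal box blur) and the image layout is assumed. It only shows the two ingredients mentioned above, a `#pragma omp parallel for` on the outer loop and temporary data allocated up front for every thread.

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical stand-in for the real filter: a horizontal box blur of radius r
// applied independently to each row of a width*height float image.
void blur_rows(std::vector<float> &img, int width, int height, int r, int nthreads)
{
    if (width <= 0 || height <= 0) return;
    if (nthreads < 1) nthreads = 1;

    // One scratch row per thread, allocated once before the parallel loop,
    // so threads never share temporary storage.
    std::vector<std::vector<float>> scratch(nthreads, std::vector<float>(width));

#ifdef _OPENMP
#pragma omp parallel for num_threads(nthreads)
#endif
    for (int y = 0; y < height; ++y) {
#ifdef _OPENMP
        int tid = omp_get_thread_num();  // index into the per-thread scratch
#else
        int tid = 0;                     // serial build: single scratch row
#endif
        std::vector<float> &tmp = scratch[tid];
        float *row = &img[static_cast<std::size_t>(y) * width];

        // Average over the window [x - r, x + r], clamped to the row.
        for (int x = 0; x < width; ++x) {
            int lo = x - r < 0 ? 0 : x - r;
            int hi = x + r >= width ? width - 1 : x + r;
            float sum = 0.0f;
            for (int k = lo; k <= hi; ++k) sum += row[k];
            tmp[x] = sum / static_cast<float>(hi - lo + 1);
        }
        // Write the blurred row back; each row is touched by one thread only.
        for (int x = 0; x < width; ++x) row[x] = tmp[x];
    }
}
```

Because every iteration reads and writes only its own row, the rows can be split across threads without locking. Compiled without `-fopenmp`, the pragma is skipped and the loop simply runs serially.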
Patch attached, if wanted I can also provide a binary for Windows. This version writes a log file (blurlog.txt) that lists how long each blur takes.
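For anyone wanting to try the logging on Linux, a portable variant of the timing could look something like this sketch. It is an assumption, not the patch's code: the patch uses Win32 timing functions, whereas this uses `std::chrono::steady_clock` (which needs a C++11 compiler), and both the `timed_blur` helper and the log line format are made up for illustration.

```cpp
#include <chrono>
#include <fstream>
#include <string>

// Hypothetical helper: runs `work`, appends one line with the elapsed time
// to the file at `path`, and returns the elapsed time in milliseconds.
template <typename F>
double timed_blur(F work, const std::string &path)
{
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();

    double ms = std::chrono::duration<double, std::milli>(stop - start).count();

    // Append, so successive blurs accumulate in the same log file.
    std::ofstream log(path, std::ios::app);
    log << "blur took " << ms << " ms\n";
    return ms;
}
```

A call would then wrap the blur itself, for example `timed_blur([&] { /* run the blur */ }, "blurlog.txt");`.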
Note that Inkscape also needs mingwm10.dll to run with this patch. As it isn't included with devlibs at the moment, you'll have to copy it to Inkscape's directory manually (it's in the MinGW bin directory).
Very cool, am seeing a 40-50% speed-up on my dual core on the bigger objects. Total times quoted in the blurlog for 1 vs 2 threads, with 150-odd blurs in the drawing, are 3.6 sec vs 2.24 sec. It took a lot longer than that to render tho, so we're clearly eating a lot of time elsewhere too. Will stick this build on a USB drive and try it on a dual-processor machine at work with 2x quad Xeons in it :D Will try to get some better user-time estimates too. Definite step in the right direction tho...
Cheers
Sim