On 09/19/2016 02:25 PM, Yale Zhang wrote:
I'm proud to announce that I have completed vectorizing both the IIR and FIR filters. Speedups are about 5x for IIR and ~20x for FIR (see spreadsheet). The code is a monstrosity, but given all the different cases ({FIR, IIR} x {int16, float, double} x {RGBA, grayscale}), as Jasper pointed out, it's expected. Earlier, I had only worked on the IIR, RGBA case, so it wasn't complete. It's unfortunate I let it languish for 3 years, but now, hopefully, everyone can get a smoother experience.
Cool :-)
Please test it out and send some feedback and a path for checking in.
Here's my experience with your patch on Debian stretch (native, latest upgrades installed), gcc version 6.1.1 20160802 (Debian 6.1.1-11):
1. I needed to remove the "-DWIN32" flag, since it triggered use of the "Sleep" function, which led to a compiler error.
2. I needed to place the keyword "inline" before the four PartialVectorMask* functions at lines 983, 988, 994, and 1000 of the file nr-filter-gaussian.cpp. This problem was the reason:
http://stackoverflow.com/questions/13472341/inlining-failed-function-body-ca...
After that it compiled.
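For reference, here is a minimal sketch of the kind of change made in item 2. It assumes the helpers carry GCC's always_inline attribute; the patch itself is not quoted here, so that is a guess. In that situation GCC can report that inlining failed because the function body can be overwritten at link time, and adding "inline" avoids the error. PartialMaskExample and EXAMPLE_ALWAYS_INLINE are hypothetical names for illustration only; the real PartialVectorMask* functions work on SIMD vectors.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the original helpers use GCC's always_inline attribute.
#if defined(__GNUC__)
#define EXAMPLE_ALWAYS_INLINE __attribute__((always_inline))
#else
#define EXAMPLE_ALWAYS_INLINE
#endif

// Hypothetical stand-in for a PartialVectorMask* helper: returns a mask
// with the lowest n bits set (0 <= n <= 32).
// The leading `inline` keyword is the fix described in item 2.
inline EXAMPLE_ALWAYS_INLINE std::uint32_t PartialMaskExample(int n)
{
    assert(n >= 0 && n <= 32);
    // Shifting a 32-bit value by 32 is undefined, so handle n == 32 separately.
    return n == 32 ? 0xFFFFFFFFu : ((std::uint32_t{1} << n) - 1u);
}
```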
A unit test & benchmark function is included. The accuracy is always +- 1 intensity level, which should be quite safe.
To build it, please add this to your cmake command: -DCMAKE_CXX_FLAGS="-DWIN32 -mavx2 -mfma -fpermissive -flax-vector-conversions"
You'll also need to turn on -std=c++14 if compiling the unit test & benchmark
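The "+- 1 intensity level" bound mentioned above could be checked along these lines. This is only a sketch and not the unit test shipped with the patch: WithinTolerance is a hypothetical helper, and 8-bit samples are assumed for simplicity.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper (not part of the patch): returns true if both buffers
// have the same length and every pair of samples differs by at most
// `tolerance` intensity levels. Assumes 8-bit samples.
bool WithinTolerance(const std::vector<std::uint8_t>& reference,
                     const std::vector<std::uint8_t>& candidate,
                     int tolerance = 1)
{
    if (reference.size() != candidate.size()) {
        return false;
    }
    for (std::size_t i = 0; i < reference.size(); ++i) {
        // Widen to int before subtracting so the difference cannot wrap around.
        const int diff = static_cast<int>(reference[i]) - static_cast<int>(candidate[i]);
        if (std::abs(diff) > tolerance) {
            return false;
        }
    }
    return true;
}
```

A comparison of the scalar and vectorized outputs would then pass both result buffers to this function.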
I ran cmake with -DCMAKE_CXX_FLAGS="-mavx2 -mfma -fpermissive -flax-vector-conversions -std=c++14".
How can I run the test and the benchmark? In the "bin" directory, these files showed up: attributes-test, color-profile-test, dir-util-test, inkscape, inkview, object-set-test, sp-object-test.
Best Regards, Alexander