Tavmjong, Good, that was an effortless speedup, but I'll still go ahead with the vectorized implementation.
Before I set the filters back to sRGB, I'd like to know which is more intuitive to the user (e.g. which gives a linear response). According to your explanation,
http://tavmjong.free.fr/blog/?p=765
you say that for color interpolation sRGB is clearly better, but I couldn't tell which is better for filters.
On Sun, Mar 3, 2013 at 6:39 AM, Tavmjong Bah <tavmjong@...8...> wrote:
Just checked in a patch that should use OpenMP when converting between sRGB and linearRGB. On my laptop with 8 threads it results in about an eightfold increase in speed with a Gaussian blur of radius 100.
Of course, further speed increases are welcome.
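A minimal sketch of the kind of loop such a patch parallelizes, not the actual Inkscape change. The helper names and the flat channel buffer are hypothetical; the real code works on cairo surfaces. Each value converts independently, which is why the speedup scales with the thread count.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Exact sRGB -> linear transfer function for one 8-bit channel value.
static std::uint8_t srgb_to_linear_u8(std::uint8_t v) {
    double s = v / 255.0;
    double lin = (s <= 0.04045) ? s / 12.92
                                : std::pow((s + 0.055) / 1.055, 2.4);
    return static_cast<std::uint8_t>(std::lround(lin * 255.0));
}

// Hypothetical helper: converts every channel value in place.
// Iterations are independent, so OpenMP can split the loop across threads.
// Without -fopenmp the pragma is ignored and the loop runs serially.
void convert_channels_srgb_to_linear(std::vector<std::uint8_t>& channels) {
    const long n = static_cast<long>(channels.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        channels[i] = srgb_to_linear_u8(channels[i]);
    }
}
```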
Tav
On Sun, 2013-03-03 at 02:26 -0800, Yale Zhang wrote:
Hi. I'm using Inkscape to author a comic, and the slow speed of certain operations is very annoying. I already have OpenMP turned on and am using 4 cores.
- Large blurs are slow. I'm an expert at writing SIMD code, so I was thinking about vectorizing the Gaussian IIR filter with SIMD intrinsics, even though it's harder than for an FIR filter. But I noticed there isn't any SIMD code in Inkscape, so does that mean it's something to avoid? I'm pretty sure no current compiler is smart enough to vectorize it, and besides, Inkscape is compiled with -O2, meaning -ftree-vectorize isn't on by default.
- When there's a large (raster) image in the background, scrolling in a zoomed region is very slow. I compiled the latest 0.49 code with GCC profiling and it shows this:
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
33.98     22.47    22.47                             exp2l
21.29     36.55    14.08                             log2l
17.57     48.17    11.62                             pow
 7.12     52.88     4.71      658     0.01     0.01  ink_cairo_surface_srgb_to_linear(_cairo_surface*)
 6.72     57.32     4.44      563     0.01     0.01  ink_cairo_surface_linear_to_srgb(_cairo_surface*)
 5.51     60.96     3.64     1216     0.00     0.00  Inkscape::Filters::FilterGaussian::~FilterGaussian()
 5.23     64.42     3.46                             internal_modf
 0.59     64.81     0.39                             _mcount_private
 0.41     65.08     0.27                             __fentry__
 0.12     65.16     0.08                             GC_mark_from
 0.09     65.22     0.06     5579     0.00     0.00  Geom::parse_svg_path(char const*, Geom::SVGPathSink&)
 0.06     65.26     0.04    35320     0.00     0.00  bounds_exact_transformed(std::vector<Geom::Path, std::allocator<Geom::Path> > const&, Geom::Affine const&)
 0.06     65.30     0.04        8     0.01     0.01  convert_pixbuf_normal_to_argb32(_GdkPixbuf*)
 0.05     65.33     0.03   885444     0.00     0.00  std::vector<Geom::Linear, std::allocator<Geom::Linear> >::_M_fill_insert(__gnu_cxx::__normal_iterator<Geom::Linear*, std::vector<Geom::Linear, std::allocator<Geom::Linear> > >, unsigned long long, Geom::Linear const&)
The cost is absolutely dominated by ink_cairo_surface_srgb_to_linear() and ink_cairo_surface_linear_to_srgb(), together with the exp2l, log2l and pow calls at the top of the profile. My first instinct was to optimize those two functions, but then I thought: why are they even being called every time I scroll through the image? Why not convert the images to linear up front and keep them that way in memory?
If that can't be done, then my optimization approach is:
- Replace ink_cairo_surface_srgb_to_linear() with a simple 3rd-degree polynomial approximation (0.902590573087882 - 0.010238759806148x + 0.002825455367280x^2 + 0.000004414767235x^3) and vectorize it with SSE intrinsics. The approximation was calculated by minimizing the squared error (maxError = 0.313) over the range [10, 255]. For x < 10, it uses simple scaling.
- Replace ink_cairo_surface_linear_to_srgb() with a vectorized implementation of pow(). Unlike srgb_to_linear(), a low-degree polynomial can't be used because the curve has larger high-order derivatives. An alternative would be piecewise, low-order polynomials.
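The cubic in the first item can be checked against the exact sRGB curve with a short scalar sketch (no SSE here, just the arithmetic). The function names are hypothetical, and it assumes that the "simple scaling" below 10 is the linear sRGB segment x / 12.92; input and output are both on a 0..255 scale.

```cpp
#include <algorithm>
#include <cmath>

// Exact sRGB -> linear, input and output both on a 0..255 scale.
double srgb_to_linear_exact(double x) {
    double s = x / 255.0;
    double lin = (s <= 0.04045) ? s / 12.92
                                : std::pow((s + 0.055) / 1.055, 2.4);
    return lin * 255.0;
}

// Cubic from the message, evaluated with Horner's rule.
// Assumption: for x < 10 the "simple scaling" is x / 12.92.
double srgb_to_linear_poly(double x) {
    if (x < 10.0) return x / 12.92;
    return 0.902590573087882
         + x * (-0.010238759806148
         + x * (0.002825455367280
         + x * 0.000004414767235));
}

// Largest absolute deviation from the exact curve over all 8-bit inputs.
double max_abs_error() {
    double worst = 0.0;
    for (int x = 0; x <= 255; ++x) {
        double err = std::fabs(srgb_to_linear_poly(x) - srgb_to_linear_exact(x));
        worst = std::max(worst, err);
    }
    return worst;
}
```

Under that assumption the worst case falls at x = 10, where the cubic takes over from the linear segment.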
The main question I have is what degree of accuracy is desired. Certainly, it doesn't need double-precision pow() since the input is only 8 bits! Is ±0.5 from the true value (before quantization) OK, or do people depend on getting pixel-perfect results?
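Because the input is only 8 bits, one alternative to the polynomial and vectorized pow() proposed above is a pair of 256-entry lookup tables built once with the exact formula. This is a different technique from the ones in this message, shown only as a sketch; the function names are hypothetical. Each entry is the exact value rounded to nearest, so it stays within ±0.5 of the true value before quantization.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Build the sRGB -> linear table once; afterwards each conversion is one load.
std::array<std::uint8_t, 256> build_srgb_to_linear_lut() {
    std::array<std::uint8_t, 256> lut{};
    for (int i = 0; i < 256; ++i) {
        double s = i / 255.0;
        double lin = (s <= 0.04045) ? s / 12.92
                                    : std::pow((s + 0.055) / 1.055, 2.4);
        lut[static_cast<std::size_t>(i)] =
            static_cast<std::uint8_t>(std::lround(lin * 255.0));
    }
    return lut;
}

// Build the linear -> sRGB table the same way, using the inverse curve.
std::array<std::uint8_t, 256> build_linear_to_srgb_lut() {
    std::array<std::uint8_t, 256> lut{};
    for (int i = 0; i < 256; ++i) {
        double lin = i / 255.0;
        double s = (lin <= 0.0031308) ? lin * 12.92
                                      : 1.055 * std::pow(lin, 1.0 / 2.4) - 0.055;
        lut[static_cast<std::size_t>(i)] =
            static_cast<std::uint8_t>(std::lround(s * 255.0));
    }
    return lut;
}
```

Note that a round trip through 8-bit linear values is lossy in the dark range, since several sRGB inputs map to the same linear entry.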
UJ
_______________________________________________
Inkscape-devel mailing list
Inkscape-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/inkscape-devel