Tavmjong,
Good, that was an effortless speedup, but I'll still go ahead with the vectorized implementation.

Before I set the filters back to sRGB, I'd like to know which color space is more intuitive to the user (e.g. which gives a linear response). In your explanation at

http://tavmjong.free.fr/blog/?p=765

you say that sRGB is clearly better for color interpolation, but I couldn't tell which is better for filters.


On Sun, Mar 3, 2013 at 6:39 AM, Tavmjong Bah <tavmjong@...8...> wrote:

Just checked in a patch that should use OpenMP when converting between
sRGB and linearRGB. On my laptop with 8 threads it results in about an
eightfold increase in speed with a Gaussian blur of radius 100.

Of course, further speed increases are welcome.
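
For readers following along, a minimal sketch of this kind of OpenMP loop is shown below. It is not the checked-in patch: it assumes a plain buffer of 8-bit channel values (ignoring the premultiplied alpha that Cairo ARGB32 surfaces carry), the standard sRGB transfer function, and the hypothetical names `srgb_to_linear_u8` and `convert_srgb_to_linear`.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Standard sRGB decoding for one 8-bit channel value (0..255 in, 0..255 out).
// Hypothetical helper, not an Inkscape function.
inline std::uint8_t srgb_to_linear_u8(std::uint8_t v) {
    double c = v / 255.0;
    double lin = (c <= 0.04045) ? c / 12.92
                                : std::pow((c + 0.055) / 1.055, 2.4);
    return static_cast<std::uint8_t>(std::lround(lin * 255.0));
}

// Converts every channel in place. Each element is independent of the others,
// so the loop can be split across threads. When compiled without OpenMP
// support the pragma is ignored and the loop runs serially.
inline void convert_srgb_to_linear(std::vector<std::uint8_t>& channels) {
    const long n = static_cast<long>(channels.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        channels[i] = srgb_to_linear_u8(channels[i]);
    }
}
```

Because every iteration still calls `pow()`, threading divides the cost across cores but does not reduce the per-pixel work.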

Tav

On Sun, 2013-03-03 at 02:26 -0800, Yale Zhang wrote:
> Hi. I'm using Inkscape to author a comic and the slow speed for
> certain things is very annoying. I already have OpenMP turned on and
> am using 4 cores.
>
>
> 1. Large blurs are slow - I'm an expert at writing SIMD code, so I
> was thinking about vectorizing the Gaussian IIR filter with SIMD
> intrinsics, even though it's harder than for an FIR. But I noticed
> there isn't any SIMD code in Inkscape, so does that mean it's something
> to avoid?
> I'm pretty sure no current compiler is smart enough to vectorize it,
> and besides, Inkscape is compiled with -O2, meaning -ftree-vectorize
> isn't on by default.
>
>
>
>
> 2. when there's a large image (raster based) background - scrolling in
> a zoomed region is very slow
> I compiled the latest 0.49 code with GCC profiling and it shows this:
>
>
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls   s/call   s/call  name
>  33.98     22.47    22.47                             exp2l
>  21.29     36.55    14.08                             log2l
>  17.57     48.17    11.62                             pow
>   7.12     52.88     4.71      658     0.01     0.01  ink_cairo_surface_srgb_to_linear(_cairo_surface*)
>   6.72     57.32     4.44      563     0.01     0.01  ink_cairo_surface_linear_to_srgb(_cairo_surface*)
>   5.51     60.96     3.64     1216     0.00     0.00  Inkscape::Filters::FilterGaussian::~FilterGaussian()
>   5.23     64.42     3.46                             internal_modf
>   0.59     64.81     0.39                             _mcount_private
>   0.41     65.08     0.27                             __fentry__
>   0.12     65.16     0.08                             GC_mark_from
>   0.09     65.22     0.06     5579     0.00     0.00  Geom::parse_svg_path(char const*, Geom::SVGPathSink&)
>   0.06     65.26     0.04    35320     0.00     0.00  bounds_exact_transformed(std::vector<Geom::Path, std::allocator<Geom::Path> > const&, Geom::Affine const&)
>   0.06     65.30     0.04        8     0.01     0.01  convert_pixbuf_normal_to_argb32(_GdkPixbuf*)
>   0.05     65.33     0.03   885444     0.00     0.00  std::vector<Geom::Linear, std::allocator<Geom::Linear> >::_M_fill_insert(__gnu_cxx::__normal_iterator<Geom::Linear*, std::vector<Geom::Linear, std::allocator<Geom::Linear> > >, unsigned long long, Geom::Linear const&)
>
>
>
>
> The cost is absolutely dominated by ink_cairo_surface_srgb_to_linear()
> and ink_cairo_surface_linear_to_srgb().  My first instinct was to
> optimize those 2 functions, but then I thought why are those even
> being called every time I scroll through the image?
> Why not convert the images up front to linear and stay that way in
> memory?
>
>
> If that can't be done, then my optimization approach is:
> 1. replace ink_cairo_surface_srgb_to_linear()  with a simple 3rd
> degree polynomial approximation (0.902590573087882 -
> 0.010238759806148x + 0.002825455367280x^2 +  0.000004414767235x^3) and
> vectorize with SSE intrinsics. The approximation was calculated by
> minimizing the square error (maxError = 0.313) over the range [10,
> 255]. For x < 10, it uses simple scaling.
>
>
> 2. replace ink_cairo_surface_linear_to_srgb() with a vectorized
> implementation of pow(). Unlike srgb_to_linear(), a low-degree
> polynomial can't be used due to the curve having larger high-order
> derivatives. An alternative would be piecewise, low-order
> polynomials.
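
The cubic from item 1 can be sketched as follows. The coefficients are the ones quoted above; everything else is an assumption made for illustration: the reference curve is the standard sRGB decoding, "simple scaling" below 10 is taken to mean the linear segment `x / 12.92`, and all function names are hypothetical. The SSE variant is only compiled where `__SSE__` is defined.

```cpp
#include <algorithm>
#include <cmath>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Coefficients quoted in the message: input x is an 8-bit sRGB value in
// [10, 255], output is the linear value on the same 0..255 scale.
constexpr float C0 = 0.902590573087882f;
constexpr float C1 = -0.010238759806148f;
constexpr float C2 = 0.002825455367280f;
constexpr float C3 = 0.000004414767235f;

// Reference: standard sRGB decoding, result left unquantized.
inline double srgb_to_linear_exact(int x) {
    double c = x / 255.0;
    double lin = (c <= 0.04045) ? c / 12.92
                                : std::pow((c + 0.055) / 1.055, 2.4);
    return lin * 255.0;
}

// Scalar approximation. ASSUMPTION: below 10 the "simple scaling" is the
// linear segment of the sRGB curve, x / 12.92.
inline float srgb_to_linear_poly(int x) {
    float xf = static_cast<float>(x);
    if (x < 10) return xf / 12.92f;
    return C0 + xf * (C1 + xf * (C2 + xf * C3));  // Horner form
}

// Largest absolute deviation from the reference curve over [10, 255].
inline double poly_max_error() {
    double worst = 0.0;
    for (int x = 10; x <= 255; ++x) {
        worst = std::max(worst,
                         std::fabs(srgb_to_linear_poly(x) - srgb_to_linear_exact(x)));
    }
    return worst;
}

#if defined(__SSE__)
// Same Horner evaluation on four values at once (all assumed to be >= 10).
inline void srgb_to_linear_poly4(const float* in, float* out) {
    __m128 x = _mm_loadu_ps(in);
    __m128 r = _mm_set1_ps(C3);
    r = _mm_add_ps(_mm_mul_ps(r, x), _mm_set1_ps(C2));
    r = _mm_add_ps(_mm_mul_ps(r, x), _mm_set1_ps(C1));
    r = _mm_add_ps(_mm_mul_ps(r, x), _mm_set1_ps(C0));
    _mm_storeu_ps(out, r);
}
#endif
```

Horner form needs three multiplies and three adds per value and has no branches inside the polynomial range, which is what makes it map directly onto packed SSE arithmetic.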
>
>
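
The piecewise alternative from item 2 could look roughly like the sketch below. It uses the simplest low-order case, straight-line segments with equal-width knots on a linear input in [0, 1], and the standard sRGB encoding as the reference. The segment layout and the names `PiecewiseSrgb` and `piecewise_max_error` are illustrative assumptions, not a proposal for the actual implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Exact sRGB encoding of a linear value in [0, 1].
inline double linear_to_srgb_exact(double c) {
    return (c <= 0.0031308) ? 12.92 * c
                            : 1.055 * std::pow(c, 1.0 / 2.4) - 0.055;
}

// Piecewise-linear approximation with `segments` equal-width pieces.
// Knot values are computed once; evaluation is a lookup plus one lerp.
struct PiecewiseSrgb {
    std::vector<double> knots;  // exact curve sampled at i / segments

    explicit PiecewiseSrgb(int segments) : knots(segments + 1) {
        for (int i = 0; i <= segments; ++i) {
            knots[i] = linear_to_srgb_exact(static_cast<double>(i) / segments);
        }
    }

    double operator()(double c) const {
        const int n = static_cast<int>(knots.size()) - 1;
        double t = std::min(std::max(c, 0.0), 1.0) * n;  // clamp, then scale
        int i = std::min(static_cast<int>(t), n - 1);    // segment index
        double frac = t - i;                             // position in segment
        return knots[i] + frac * (knots[i + 1] - knots[i]);
    }
};

// Max absolute error on the 0..255 output scale, sampled at 4097 inputs.
inline double piecewise_max_error(const PiecewiseSrgb& approx) {
    double worst = 0.0;
    for (int k = 0; k <= 4096; ++k) {
        double c = k / 4096.0;
        worst = std::max(worst,
                         255.0 * std::fabs(approx(c) - linear_to_srgb_exact(c)));
    }
    return worst;
}
```

With equal-width segments the largest error sits in the first segment, where the curve is steepest, so a practical version would likely want finer (or non-uniform) knots near zero or higher-order pieces there.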
> The main question I have is what degree of accuracy is desired?
> Certainly, it doesn't need double precision pow() since the input is
> only 8 bits! Is +/- 0.5 from the true value (before quantization) OK,
> or do people depend on getting pixel-perfect results?
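
One way to put numbers on the precision question is to evaluate the same conversion in single and double precision and compare the results before and after rounding to 8 bits. The sketch below does that for linear to sRGB, assuming the standard sRGB encoding formula; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>

// linear -> sRGB for an 8-bit input, evaluated in double precision.
inline double encode_double(int v) {
    double c = v / 255.0;
    double s = (c <= 0.0031308) ? 12.92 * c
                                : 1.055 * std::pow(c, 1.0 / 2.4) - 0.055;
    return s * 255.0;
}

// Same formula evaluated entirely in single precision.
inline float encode_float(int v) {
    float c = v / 255.0f;
    float s = (c <= 0.0031308f) ? 12.92f * c
                                : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f;
    return s * 255.0f;
}

// Largest pre-quantization gap between the two over all 256 inputs.
inline double max_precision_gap() {
    double worst = 0.0;
    for (int v = 0; v <= 255; ++v) {
        worst = std::max(worst,
                         std::fabs(encode_double(v) - static_cast<double>(encode_float(v))));
    }
    return worst;
}

// Number of inputs whose rounded 8-bit result differs between the two.
inline int quantized_mismatches() {
    int count = 0;
    for (int v = 0; v <= 255; ++v) {
        if (std::lround(encode_double(v)) != std::lround(encode_float(v))) ++count;
    }
    return count;
}
```

`max_precision_gap()` shows how far single precision strays from double before rounding, and `quantized_mismatches()` shows whether that ever changes an output pixel. The same harness can be pointed at any cheaper approximation to check it against a +/- 0.5 budget.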
>
>
>
>
>  UJ
>
>