I found that Inkscape runs 6% to 7% faster if I replace in nr-point.h
Coord operator[](unsigned i) const throw(std::out_of_range) { if ( i > Y ) { throw std::out_of_range("index out of range"); } return _pt[i]; }
with
inline Coord operator[](unsigned i) const { return _pt[i]; }
and the same for non-const version. I think it's safe to get rid of the range check because the indices are never calculated and are extremely unlikely to go off limits. So if nobody objects I will commit this change.
On Mon, 2 Aug 2004, bulia byak wrote:
I found that Inkscape runs 6% to 7% faster if I replace in nr-point.h
Coord operator[](unsigned i) const throw(std::out_of_range) { if ( i > Y ) { throw std::out_of_range("index out of range"); } return _pt[i]; }
with
inline Coord operator[](unsigned i) const { return _pt[i]; }
and the same for non-const version. I think it's safe to get rid of the range check because the indices are never calculated and are extremely unlikely to go off limits. So if nobody objects I will commit this change.
I'd have no problem with that commit. How are you benchmarking the speed-up? It might be worth unwrapping the loops in that file, as well.
Carl
I'd have no problem with that commit. How are you benchmarking the speed-up?
Simply by timing loading/quitting with a big file (tutorial-basic now loads in 2.59 sec averaged). But these two [] operators are on top of the profile no matter what you do, called hundreds of millions of times. Here's a profile I got for a lengthy node-editing session:
% cumulative self self total time seconds seconds calls s/call s/call name 13.79 9.54 9.54 644838540 0.00 0.00 NR::Point::operator[](unsigned int) const 12.86 18.45 8.90 140250097 0.00 0.00 NR::Point::Point(NR::Point const&) 10.84 25.95 7.50 295333634 0.00 0.00 NR::Point::operator[](unsigned int) 4.68 29.19 3.24 1828 0.00 0.00 Shape::BeginQuickRaster(float&, int&, float) 4.03 31.98 2.79 66910369 0.00 0.00 NR::operator-(NR::Point const&, NR::Point const&) 3.99 34.74 2.76 3717 0.00 0.01 Shape::DistanceLE(NR::Point, double) 3.90 37.44 2.70 95559872 0.00 0.00 FloatLigne::CmpBord(void const*, void const*) 3.71 40.01 2.56 40584303 0.00 0.00 NR::Point::operator=(NR::Point const&) 3.51 42.44 2.43 6697126 0.00 0.00 FloatLigne::InsertBord(int, float, int) 3.11 44.59 2.15 4114 0.00 0.00 Shape::PtWinding(NR::Point) const 2.12 46.05 1.47 50742 0.00 0.00 FloatLigne::Flatten() 1.94 47.40 1.34 30402734 0.00 0.00 NR::L1(NR::Point const&) 1.57 48.49 1.09 36406933 0.00 0.00 Shape::CmpQRs(void const*, void const*) 1.42 49.47 0.98 50742 0.00 0.00 Shape::QuickScan(float&, int&, float, FloatLigne*, bool, floa 1.40 50.44 0.97 15261101 0.00 0.00 NR::dot(NR::Point const&, NR::Point const&) 1.37 51.39 0.95 13770865 0.00 0.00 NR::operator/(NR::Point const&, double) 1.24 52.25 0.86 112635275 0.00 0.00 NR::Point::Point() 1.20 53.08 0.83 1004 0.00 0.01 Shape::ConvertToShape(Shape*, fill_typ, bool) 1.16 53.88 0.80 2457838 0.00 0.00 Shape::SwapPoints(int, int) 1.13 54.66 0.78 50742 0.00 0.00 IntLigne::Copy(FloatLigne*) 0.92 55.30 0.64 1482 0.00 0.00 Shape::DirectQuickScan(float&, int&, float, bool, float) ....
It might be worth unwrapping the loops in that file, as well.
Tried that, did not help, probably because the compiler unrolls them by itself. I will try inlining and removing checks in more NR:: stuff.
Bryce, you mentioned you have more ideas for optimizing livarot functions that you would try after the 0.39 - I think now is the time :)
Peter, since this is your code, do you have any objections to removing the checks?
On Tue, 2004-08-03 at 08:03, bulia byak wrote:
Peter, since this is your code, do you have any objections to removing the checks?
I think I was the one who added those particular checks, actually.
My hope had been that we could eventually abandon the operator[] that took ints, and use soley NR::Dim2 version instead.
It turns out there was no clean way to use the Dim2 version in for loops though.
I am a little surprised that the check itself was not optimized out, however. In most cases the compiler should be able to know the value of i at compile time.
-mental
On Tue, Aug 03, 2004 at 09:03:12AM -0300, bulia byak wrote:
I'd have no problem with that commit. How are you benchmarking the speed-up?
Simply by timing loading/quitting with a big file (tutorial-basic now loads in 2.59 sec averaged). But these two [] operators are on top of the profile no matter what you do,
This seems suspicious to me: if they have an item in the profile, then that suggests that they aren't getting inlined.
Offhand, I'd have expected that the test can be optimized away if the function is inlined, but of course there's no way of optimizing it away if it isn't inlined.
What compilation flags are you using? What compiler?
pjrm.
On Wed, Aug 04, 2004 at 11:04:12PM +1000, Peter Moulder wrote:
On Tue, Aug 03, 2004 at 09:03:12AM -0300, bulia byak wrote:
Simply by timing loading/quitting with a big file (tutorial-basic now loads in 2.59 sec averaged). But these two [] operators are on top of the profile no matter what you do,
This seems suspicious to me: if they have an item in the profile, then that suggests that they aren't getting inlined.
Offhand, I'd have expected that the test can be optimized away if the function is inlined, but of course there's no way of optimizing it away if it isn't inlined.
What compilation flags are you using? What compiler?
With g++-3.3 -O2 -pg -g2, adding `g_assert( i < 2 )' makes the function appear in the profile, presumably meaning it isn't inlined.
That's disappointing (and surprising). Could it be due to variadic g_log?
*compile*
Yes, changing it to `if ( i >= 2 ) abort();' makes it disappear from the profile again, and the times are indistinguishable from without having the check.
That's quite disturbing. That suggests that g_assert should be changed: currently it calls g_log with 7 arguments, which could easily be trimmed (though with trade-offs involved).
pjrm.
On Mon, 2004-08-02 at 19:20, bulia byak wrote:
I found that Inkscape runs 6% to 7% faster if I replace in nr-point.h
Coord operator[](unsigned i) const throw(std::out_of_range) { if ( i > Y ) { throw std::out_of_range("index out of range"); } return _pt[i]; }
with
inline Coord operator[](unsigned i) const { return _pt[i]; }
Well, I think that you are making two changes here. One that removes the check, and one that make the function inline. I'm guessing that the inline is the reason for the speed up, not removing the check. I think that this function is called in two ways. One directly: mypoint[X] and one from a loop: for(int i = 0; i < 2; i++) mypoint[i]. In both cases the compiler will optimize out the check (in the later it will have to happen after the loop is unrolled) if the function is inline.
Ofcourse, measured results trump theoretical ones. But I'm guessing the inline is the larger speed up. It would be nice to keep the check in if it isn't high cost.
--Ted
On Tue, 2004-08-03 at 23:55, Ted Gould wrote:
Ofcourse, measured results trump theoretical ones. But I'm guessing the inline is the larger speed up. It would be nice to keep the check in if it isn't high cost.
Unless I am much mistaken, aren't member functions whose bodies are defined within the class definition automatically inline?
It might be instructive to examine the generated assembly.
-mental
On Wed, Aug 04, 2004 at 09:42:17PM -0400, MenTaLguY wrote:
Unless I am much mistaken, aren't member functions whose bodies are defined within the class definition automatically inline?
In absence of a certain g++ flag, I believe they're automatically assumed to have inlining requested. But compilers don't always inline things one requests to be inlined. It seems there's something about the g_log call that g++-3.3 doesn't like inlining (maybe too many args, maybe the fact that it's variadic, maybe something in the rest of the g_assert macro).
pjrm.
On Wed, 2004-08-04 at 21:54, Peter Moulder wrote:
In absence of a certain g++ flag, I believe they're automatically assumed to have inlining requested. But compilers don't always inline things one requests to be inlined. It seems there's something about the g_log call that g++-3.3 doesn't like inlining (maybe too many args, maybe the fact that it's variadic, maybe something in the rest of the g_assert macro).
It might behoove us to see if we can develop a variation on the g_assert macro that g++'s optimizer doesn't hate, and if we can submit a patch to glib...
-mental
participants (5)
-
bulia byak
-
Carl Hetherington
-
MenTaLguY
-
Peter Moulder
-
Ted Gould