Windows: why do we build with -O2?
Hi all,
Why do we build with -O2 instead of -O3? We are missing out on:
-O3: Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-loop-vectorize, -ftree-slp-vectorize, -fvect-cost-model, -ftree-partial-pre and -fipa-cp-clone options.
regards, Johan
2014-05-15 23:56 GMT+02:00 Johan Engelen <jbc.engelen@...2592...>:
Hi all, Why do we build with -O2 instead of O3?
Because -O3 compilation takes a lot more time and memory than -O2, yet the improvement is minor in most cases.
It may make sense for compiling releases, but for regular development -O2 is enough. So unless you can show that under some specific circumstances Inkscape performs much better with -O3, I think the default should stay at -O2.
Regards, Krzysztof
Krzysztof summarized it pretty well here. However, I did want to highlight the need for actual performance numbers. We would generally need to check both interactive and command-line performance. The type of data file or files involved can also be a big factor. Finally, some edge cases (such as zoomed far in, multiple windows for a single doc, etc.) need to be covered.
For my dev builds I have to use -O0 (to turn off optimizations); otherwise stepping through code in a debugger can be quite misleading. Stack traces from crashes can also be affected.
The -O2 level is a pretty common target; however, some products actually choose to ship with -O0 instead. Some very high load/performance server systems I've worked on were like that. In general one can often architect code well enough not to require compiler optimizations, and can then benefit from clear debugging data when the rare crash or forced dump occurs.
One other factor is code clarity. Sometimes it is easy to get better performance not by bumping up the optimization level, but instead by "de-optimizing" the source. That is, if one removes a coder's attempts at micro-optimizations, it often leads to code that looks slower but, since it is clearer, will actually be much faster, as a good compiler can do more with it.
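As an illustration of what "de-optimizing" can look like, here is a minimal sketch. Both function names are hypothetical and not taken from Inkscape. The first version is hand-unrolled; the second states the intent plainly and leaves unrolling and vectorization to the compiler. Whether the plain version is actually faster depends on the compiler, flags, and data, so it would have to be measured.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hand-"optimized" version (hypothetical): manual 4x unrolling with
// separate accumulators and raw pointer indexing.
long long sum_unrolled(const std::vector<int>& v) {
    const int* p = v.data();
    const std::size_t n = v.size();
    long long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += p[i];
        s1 += p[i + 1];
        s2 += p[i + 2];
        s3 += p[i + 3];
    }
    for (; i < n; ++i) {  // leftover elements
        s0 += p[i];
    }
    return s0 + s1 + s2 + s3;
}

// "De-optimized" version (hypothetical): the same sum written plainly,
// leaving loop transformations to the optimizer.
long long sum_plain(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0LL);
}
```

Both functions return the same value for the same input; the only question is which one the compiler can do more with.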
But remember, the main take-away is that we need to get performance data from live end-to-end full test scenarios. And keep in mind that studies have shown developer intuition on performance to be wrong 80%+ of the time.
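A minimal sketch of how such numbers could be collected, assuming the scenario under test can be wrapped in a callable. The names `median` and `median_runtime_ms` are hypothetical. The median is used here as an assumption about what is robust against outliers; a real comparison of -O2 against -O3 would run the same scenarios against two separately built binaries.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Median of a set of samples; takes a copy so the caller's data is untouched.
double median(std::vector<double> samples) {
    if (samples.empty()) {
        return 0.0;
    }
    std::sort(samples.begin(), samples.end());
    const std::size_t mid = samples.size() / 2;
    return samples.size() % 2 ? samples[mid]
                              : (samples[mid - 1] + samples[mid]) / 2.0;
}

// Runs `scenario` `repetitions` times and returns the median wall-clock
// time in milliseconds. `scenario` stands in for a full end-to-end test.
template <typename Fn>
double median_runtime_ms(Fn&& scenario, int repetitions) {
    std::vector<double> samples;
    for (int i = 0; i < repetitions; ++i) {
        const auto start = std::chrono::steady_clock::now();
        scenario();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return median(samples);
}
```

For example, `median_runtime_ms([] { /* load and render a test file */ }, 10)` would time ten runs of one scenario.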
Please don't change the topic. The question is about a compiler flag that may give better code for free, not about coding for performance.
I raise the question because perhaps the person who set -O2 knew of bugs caused by -O3. I don't see any other reason for not using it. Sure, it won't improve performance (we all know why Inkscape is slow), but it won't make it slower either. Compile speed is not very much affected, and if one wants to use a debugger (I hardly ever do), one should change the flags anyway. Changing from -O2 to -O3 only for release seems to me like asking for trouble.
regards, Johan
Off-topic: For debug builds you can use -Og (in recent GCC versions), which enables optimizations that don't interfere with debugging. Independently, compiling with -fno-omit-frame-pointer (regardless of -O level) improves stack tracing, particularly on Windows, where MS stack tracing code may not work at all without a frame pointer.
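To check which kind of build a given translation unit ended up in, GCC's predefined macros can be inspected. The following is a minimal sketch, assuming GCC or a compatible compiler; `build_mode` is a hypothetical helper and the file name in the comments is a placeholder.

```cpp
#include <string>

// Hypothetical helper: reports the optimization mode this translation unit
// was compiled in, based on GCC's predefined macros.
// Example invocations (demo.cpp is a placeholder file name):
//   g++ -O0 -g3 demo.cpp
//   g++ -Og -g -fno-omit-frame-pointer demo.cpp
//   g++ -O2 demo.cpp
std::string build_mode() {
#if defined(__OPTIMIZE_SIZE__)
    return "optimized for size";
#elif defined(__OPTIMIZE__)
    return "optimized";
#else
    return "unoptimized";
#endif
}
```

Printing the result at startup of a dev build is one way to confirm that a flag change actually reached the compiler.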
On-topic: It's simple. If -O3 doesn't introduce any bugs (which can be verified by extensive testing, I guess) AND gives a measurable performance bump, do use it. Otherwise (if it breaks things or if it reduces performance, which is known to happen), don't.
Now, I haven't debugged much software built with -O3, so I can't say whether doing default dev builds with -O3 is a wise choice. Debugging -O2 kind of works most of the time, once you learn its quirks, and you can always rebuild the offending module with -Og or -O0 (and maybe -g3 to boot!) if needed.
-- O< ascii ribbon - stop html email! - www.asciiribbon.org
Johan,
on Windows (or Mac, for that matter), I have built the dev libraries with the following flags: "-std=c++11 -O3 -ffast-math -ftree-vectorize"
Hope that helps.
Thanks, Partha
By the way, this reminded me: you may be able to get better results by building with appropriate -march and -mtune options.
I am not sure you need -mtune and -march together. Also, how would this work if someone uses a Pentium 3 and you use AMD64?
The same way it would work if someone uses an i686 OS while you're using an x86_64 OS: provide separate binaries.
As far as I've been told, the best results (optimization-wise) are achieved by "-march=the_oldest_supported_CPU -mtune=generic". Hence my suggestion.
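One way to see what a given -march setting allows the compiler to use is to look at the instruction-set macros it predefines. A minimal sketch, assuming GCC or Clang on x86; `enabled_simd` is a hypothetical helper, the file name is a placeholder, and only three extensions are checked for brevity.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: lists a few x86 SIMD extensions the compiler was
// allowed to use for this translation unit. The macros are predefined by
// GCC and Clang when the matching instruction set is enabled, for example:
//   g++ -O2 -march=x86-64 -mtune=generic demo.cpp
//   g++ -O2 -march=native demo.cpp
std::vector<std::string> enabled_simd() {
    std::vector<std::string> features;
#if defined(__SSE2__)
    features.push_back("SSE2");
#endif
#if defined(__AVX__)
    features.push_back("AVX");
#endif
#if defined(__AVX2__)
    features.push_back("AVX2");
#endif
    return features;
}
```

A binary built for the oldest supported CPU should report only what that CPU has; anything beyond that would fail on older hardware, which is why separate binaries come up above.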
Participants (6): unknown@example.com, Johan Engelen, Jon A. Cruz, Krzysztof Kosiński, LRN, Partha Bagchi