Windows: why do we build with -O2?


Johan Engelen

15 May 2014

9:56 p.m.

Hi all,

Why do we build with -O2 instead of -O3? We are missing out on:

-O3: Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-loop-vectorize, -ftree-slp-vectorize, -fvect-cost-model, -ftree-partial-pre and -fipa-cp-clone options.

regards, Johan


Krzysztof Kosiński

15 May

10:53 p.m.

2014-05-15 23:56 GMT+02:00 Johan Engelen <jbc.engelen@...2592...>:

> Hi all, Why do we build with -O2 instead of -O3?

Because -O3 compilation takes a lot more time and memory than -O2, yet the improvement is minor in most cases.

It may make sense for compiling releases, but for regular development -O2 is enough. So unless you can show that under some specific circumstances Inkscape performs much better with -O3, I think the default should stay at -O2.

Regards, Krzysztof

Jon A. Cruz

11:12 p.m.


Krzysztof summarized it pretty well here. However, I did want to highlight the need for actual performance numbers. We would generally need to check both interactive and command-line performance. The type of data file or files involved can also be a big factor. Finally, some edge cases (such as zoomed far in, multiple windows for a single document, etc.) need to be covered.

For my dev builds I have to use -O0 (to turn off optimizations); otherwise stepping through code in a debugger can be quite misleading. Stack traces from crashes can also be affected.

The -O2 level is pretty common to target; however, some products actually choose to ship with -O0 instead. Some very high load/performance server systems I'd worked on were like that. In general, one can often architect code well enough not to require compiler optimizations, and can then benefit from clear debugging data when the rare crash or forced dump occurs.

One other factor is code clarity. Sometimes it is easy to get better performance not by bumping up the optimization level, but instead by "de-optimizing" the source. That is, removing a coder's attempts at micro-optimization often leads to code that looks slower but, since it is clearer, will actually be much faster, as a good compiler can do more with it.

But remember, the main take-away is that we need to get performance data from live end-to-end full test scenarios. And keep in mind that studies have shown developer intuition on performance to be wrong 80%+ of the time.

-- Jon A. Cruz jon@...18...
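A minimal sketch of how such numbers could be collected for the command-line case, assuming two builds of the same binary are available side by side. The `median_runtime` and `relative_gain` helpers are hypothetical and not part of Inkscape's build system; interactive performance would need a different kind of measurement.

```python
import statistics
import subprocess
import sys
import time


def median_runtime(cmd, runs=5):
    """Run `cmd` `runs` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True aborts the benchmark if the command itself fails.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def relative_gain(baseline, candidate):
    """Fraction by which `candidate` is faster than `baseline` (negative if slower)."""
    return (baseline - candidate) / baseline


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. In practice this would be
    # the -O2 and -O3 builds, each invoked on the same set of test documents.
    cmd = [sys.executable, "-c", "pass"]
    print(f"median: {median_runtime(cmd, runs=3):.4f} s")
```

Using the median rather than the mean keeps a single slow run (cold cache, background load) from dominating the result. The same pair of measurements would have to be repeated per document type and per edge case before drawing any conclusion.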

unknown＠example.com

16 May

8:05 a.m.

Please don't change the topic. The question is about a compiler flag that may give better code for free, not about coding for performance.

I raise the question because perhaps the person who set -O2 knew of bugs caused by -O3. I don't see any other reason for not using it. Sure, it won't improve performance (we all know why Inkscape is slow), but it won't make it slower either. Compile speed is not affected very much, and if one wants to use a debugger (I hardly ever do), one should change the flags anyway. Changing from -O2 to -O3 only for releases seems to me like asking for trouble.

regards, Johan


LRN

8:18 a.m.


Off-topic: For debug builds you can use -Og (in recent GCC versions), which enables only those optimizations that don't interfere with debugging. Independently, compiling with -fno-omit-frame-pointer (regardless of -O level) improves stack tracing, particularly on Windows, where the MS stack tracing code may not work at all without a frame pointer.

On-topic: It's simple. If -O3 doesn't introduce any bugs (which can be verified by extensive testing, I guess) AND gives a measurable performance bump, do use it. Otherwise (if it breaks things or reduces performance, which is known to happen), don't.

Now, I haven't debugged much software built with -O3, so I can't say whether doing default dev builds with -O3 is a wise choice. Debugging -O2 builds kind of works most of the time once you learn its quirks, and you can always rebuild the offending module with -Og or -O0 (and maybe -g3 to boot!) if needed.
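The flag combinations mentioned in this message can be summarized as a small lookup, sketched here in Python purely for illustration. The profile names and the `flags_for` helper are hypothetical; only the GCC flags themselves come from the discussion, and Inkscape's actual build scripts may be organized quite differently.

```python
# Hypothetical build profiles; the names are illustrative only.
PROFILES = {
    # -Og (GCC 4.8 and later) keeps optimizations that do not hinder debugging.
    "debug": ["-Og", "-g3", "-fno-omit-frame-pointer"],
    # The default optimization level discussed in this thread.
    "dev": ["-O2", "-fno-omit-frame-pointer"],
    # The candidate level; only worth adopting if tests pass and benchmarks improve.
    "o3-trial": ["-O3", "-fno-omit-frame-pointer"],
}


def flags_for(profile, extra=()):
    """Return the compiler flags for `profile`, followed by any `extra` flags."""
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    return [*PROFILES[profile], *extra]


if __name__ == "__main__":
    # Rebuilding a single offending module would use the "debug" profile.
    print(" ".join(flags_for("debug")))
```

Keeping -fno-omit-frame-pointer in every profile follows the suggestion to use it regardless of the -O level.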


Partha Bagchi

10:44 a.m.

Johan,

On Windows (or Mac, for that matter), I have built the dev libraries with the following flags: "-std=c++11 -O3 -ffast-math -ftree-vectorize"

Hope that helps.

Thanks, Partha


LRN

11:03 a.m.


By the way, this reminded me: you may be able to get better results by building with appropriate -march and -mtune options.


Partha Bagchi

11:20 a.m.

I am not sure you need -mtune and -march together. Also, how would this work if someone uses a Pentium 3 and you use AMD64?


LRN

18 May

10:39 a.m.


The same way it would work if someone uses an i686 OS while you're using an x86_64 OS: provide separate binaries.

As far as I've been told, the best results (optimization-wise) are achieved by "-march=the_oldest_supported_CPU -mtune=generic". Hence my suggestion.
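As a rough sketch of the "separate binaries" approach, the per-target flags could be derived from a small table. The `TARGETS` table, the choice of `pentium3` as the oldest supported 32-bit CPU, and the `arch_flags` helper are all assumptions made for illustration; the thread does not say which CPUs Inkscape actually supports.

```python
# Hypothetical target table: one binary per architecture, each built for the
# oldest CPU that binary is meant to run on.
TARGETS = {
    "i686": "pentium3",   # assumed oldest supported 32-bit CPU
    "x86_64": "x86-64",   # baseline 64-bit instruction set
}


def arch_flags(target):
    """Return the -march/-mtune flags for one of the separately shipped binaries."""
    oldest_cpu = TARGETS[target]
    # -march restricts which instructions may be emitted; -mtune=generic
    # schedules code for a broad range of CPUs rather than one specific model.
    return [f"-march={oldest_cpu}", "-mtune=generic"]


if __name__ == "__main__":
    for target in TARGETS:
        print(target, " ".join(arch_flags(target)))
```

The two options are not redundant: -march sets the minimum CPU the binary will run on, while -mtune only affects how the code is scheduled within that instruction set.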

