C++ and the -O3 compilation flag

_CG_Maxime · June 15, 2016, 6:39am

I think the reason is that we won’t be able to provide the line number where the program crashes if we add optimization flags. In my opinion, we should let users choose whether or not to enable the optimization flags; otherwise beginners will lose a feature. And we can’t change the optimization flags for submissions only, because -O3 can generate bugs you don’t see without the optimizations (it would be a nightmare to debug something that only happens when you submit).

What do you suggest?

Neumann · June 15, 2016, 7:26am

That’s the main argument here. The purpose of this website is to promote people’s skills to companies. C++ coders should be able to write beautiful code AND have a good ranking.

Agade · June 15, 2016, 8:18am

The absolute best thing would probably be, as you said, to give people the choice of the compilation options, probably in some “advanced” tab. It would also allow people to activate warnings which are currently disabled. However, I don’t know how easy it would be for you to add that feature to the website.

_CPC_Herbert · June 15, 2016, 9:27am

The line numbers will still be there as long as there is the debug info compiled in (-g), so this is not a valid argument.

Aveuh · June 15, 2016, 9:40am

Yep, the only thing that would change for the optimization would be if we wanted to do step by step debugging.

In any case @MaximeC I guess two options are viable: including at least -O2 in the compilation, or leaving the choice of compilation options to the user.

The former is nice but, as @player_one said, it won’t solve the problem that someone might then want -march=native or -funroll-loops and blahdiblah. The second option would actually be awesome for every language, but it might raise some security/exploit issues.

I kind of agree with @JBM on his argument towards less “computationally expensive contests”. But I also think that Codingame is somewhat close to real life. You want performance, you use a fast language, end of story. I love coding in Python, but I would never use it for HPC code. So in that sense, there is no reason C++ should be penalized because other languages are slower.

And in the worst case you can still add Fortran to have another fast language

_CPC_Herbert · June 15, 2016, 9:48am

Regarding this topic, here is some more information about how things work.

Basically, CG compiles C++ code using the “-O0 -g” compilation options, which means no optimizations, no inlining, nothing at all. For std::max, std::vector and std::sort dummy samples, this gives this kind of generated code: https://godbolt.org/g/YOoPB0
As you can see on the assembly side, this is really bad, every single function is called separately, nothing is optimized.

For comparison, here is the code generated when O3 is passed on the command-line instead: https://godbolt.org/g/kLkX62
You see that the compiler inlines a lot of functions, and does aggressive optimization. The call to std::max completely disappears, and the sort implementation is almost fully inlined.

When one uses the #pragma GCC optimize("O3") trick, here’s what happens: https://godbolt.org/g/EZdEqB
As you can see, each function gets optimized according to the O3 flag, but they aren’t inlined at all. This is why any call to std::max is slower than a macro.
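
To make the std::max versus macro comparison concrete, here is a minimal sketch. It is not the code behind the godbolt links; the names MAX_MACRO, largest_with_std_max and largest_with_macro are made up for illustration. Both functions compute the same result, and the difference discussed here is only whether the call to std::max gets inlined.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical macro of the kind people use to avoid a function call.
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

// Largest element via std::max. Without inlining, every iteration
// performs a real call to std::max.
int largest_with_std_max(const std::vector<int>& values) {
    assert(!values.empty());
    int best = values[0];
    for (int v : values) best = std::max(best, v);
    return best;
}

// Same computation via the macro: the comparison is expanded in place
// by the preprocessor, so no call exists at any optimization level.
int largest_with_macro(const std::vector<int>& values) {
    assert(!values.empty());
    int best = values[0];
    for (int v : values) best = MAX_MACRO(best, v);
    return best;
}
```

Calling either function on {3, 9, -2, 7} returns 9; what changes between -O0, the pragma and -O3 is the generated assembly, not the result.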

So, what happens here? Well, it looks like the pragma only tells GCC to optimize each function O3-style, but it doesn’t activate all the global optimization flags, such as inlining and stuff, and GCC still does this part with the O0-style…

Is it possible to do better with pragmas? Yes. Not as good as command-line O3, but still, quite good: https://godbolt.org/g/syhzgm
By adding another #pragma GCC optimize("inline"), we can override the implicit -fno-inline that comes from O0 optimization, and tell GCC to try inlining the functions that are explicitly marked as inline. Also, the #pragma GCC optimize("omit-frame-pointer") removes the stores of the frame pointer, which are kept at O0 but useless most of the time.

As you can see, for std::max, which is marked as inline in the STL headers, these additional tricks make it as good as if it were compiled with command-line O3.
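
Put together, the pragmas would sit at the top of the source file, roughly like this. This is a sketch assuming GCC; the compiler guard and the sorted_median and clamp_low functions are illustrative additions, not part of the original samples.

```cpp
// GCC-specific pragmas from the post. The guard is an added assumption
// so that other compilers simply skip them.
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC optimize("O3")                  // optimize each function O3-style
#pragma GCC optimize("inline")              // override the implicit -fno-inline of -O0
#pragma GCC optimize("omit-frame-pointer")  // drop the frame pointer stores
#endif

#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical workload: takes a copy, sorts it, returns the middle element.
int sorted_median(std::vector<int> values) {
    assert(!values.empty());
    std::sort(values.begin(), values.end());
    return values[values.size() / 2];
}

// std::max is declared inline in the standard headers, so it is a
// candidate for inlining once the "inline" pragma is active.
int clamp_low(int value, int floor) {
    return std::max(value, floor);
}
```

The pragmas are placed before the includes so that the standard library code pulled in by the headers is compiled with the same options as the functions below it.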

So why is this still not as good as -O3? I’m not entirely sure, but I did notice that the pragma trick for enabling inlining works for functions marked with the inline keyword and for small functions. Some functions not marked as such will not be considered for inlining, although they would be with -O3. This is also the case for every implicitly created function, such as default constructors and assignment operators. This means that you should define these explicitly and mark them with inline, even if you want the default behavior:

struct bla {
  inline bla() = default;
  inline bla(bla const&) = default;
  inline bla(bla&&) = default;
  inline bla& operator=(bla const&) = default;
  inline bla& operator=(bla&&) = default;
};

Now, using all these tricks, you should have performance almost on par with -O3.
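
As a side note on the struct above: defaulting the special members explicitly on their first declaration does not make them user-provided, so the type keeps its default semantics. The sketch below checks that with a hypothetical Vec2 type (the name, its members and make_vec2 are illustrative). Whether the inline keyword actually changes GCC’s inlining decisions is the observation from this post, not something the snippet demonstrates.

```cpp
#include <type_traits>

// Hypothetical type following the pattern above: every special member
// is declared explicitly, defaulted, and marked inline.
struct Vec2 {
    int x;
    int y;
    inline Vec2() = default;
    inline Vec2(Vec2 const&) = default;
    inline Vec2(Vec2&&) = default;
    inline Vec2& operator=(Vec2 const&) = default;
    inline Vec2& operator=(Vec2&&) = default;
};

// Defaulted-on-first-declaration members are not user-provided,
// so the type is still trivially copyable.
static_assert(std::is_trivially_copyable<Vec2>::value,
              "Vec2 should stay trivially copyable");

// Small helper, since a class with declared constructors is not
// guaranteed to be an aggregate in every language version.
inline Vec2 make_vec2(int x, int y) {
    Vec2 v;
    v.x = x;
    v.y = y;
    return v;
}
```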

player_one · June 15, 2016, 10:23am

It’s been several years since I’ve done C++ on a regular basis, so my memory / information may be a bit outdated, but aren’t optimized stack traces unreliable? I seem to remember that it can sometimes indicate the wrong line numbers if the code has been significantly altered by the optimization process.

danBhentschel

Magus · June 15, 2016, 10:28am

I don’t know if it is easy, but you could simply add a new language “C++ with -O3”. No interface feature to add.

player_one · June 15, 2016, 11:39am

There are a lot of logistical questions around this approach, though. Is this “language” available only for competitions, or is it available on all puzzles too? Is there an achievement associated with it? That would be a bit silly. What would be the upgrade path (for both CG and for users) if this stopgap solution were to eventually be expanded to a system that allowed [partial/full] control over compilation parameters?

Not trying to bash your idea. I think it has merit. Just understand that it’s not necessarily as simple as you make it sound.

Aveuh · June 15, 2016, 11:45am

If the problem is the interface to add, couldn’t we simply add a sort of hashbang on the first line of the code, allowing us to add options to the compilation command line?

Skywalker · June 15, 2016, 11:53am

Don’t forget that bash is available as a language.

Aveuh · June 15, 2016, 11:54am

Yeah you’re right. In any case I suppose everything is running in a VM so we can’t break anything

_CPC_Herbert · June 15, 2016, 12:11pm

Yes, if functions are inlined, you might only get the line number of their call site, but you will get line numbers anyway.

player_one · June 15, 2016, 12:24pm

Which, I presume, is why CG avoids optimization in C++. For new users just learning a language, this may be very confusing, especially if the new user doesn’t know the ins and outs of optimization, or even that such a thing exists! Not saying you’re wrong. Just pointing out that CG’s stance is a valid one.

Thufir_Hawat · June 15, 2016, 2:09pm

Excellent! Thanks @rOut for figuring this out!
I’ve tried my (STL heavy) solution for the Skynet Revolution - Episode 2 puzzle with this pragma:
#pragma GCC optimize "O3,omit-frame-pointer,inline"
This gives me a factor 4-5 performance increase.
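
To check a speedup like this on your own code, a rough timing helper can be enough. The following is a sketch using std::chrono; time_ms and sort_workload are hypothetical names, and the measured numbers depend on the machine and compiler, so none are claimed here.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
template <typename Work>
double time_ms(Work work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical STL-heavy workload: fill a vector in descending order,
// sort it, and return the smallest element (0 for an empty input).
int sort_workload(std::size_t n) {
    std::vector<int> values(n);
    for (std::size_t i = 0; i < n; ++i) values[i] = static_cast<int>(n - i);
    std::sort(values.begin(), values.end());
    return values.empty() ? 0 : values.front();
}
```

Compile the same file once with and once without the pragma line, then compare time_ms for the same workload, ideally over several runs.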

Magus · June 15, 2016, 6:40pm

I never said it is simple. It is just an idea of how to implement this.

ISMAX · June 15, 2016, 6:55pm

Thank you Maxime for your response.
I suggest offering a choice between at least the following options:

No optimization (for debugging purposes)
-O2 optimization flag (in case -O3 generates bugs)
-O3 optimization flag (for FULL speed C++!)

The best would be to be able to explicitly give our own compilation flags in the IDE.

Thufir_Hawat · June 15, 2016, 9:30pm

-O2 optimization flag (in case -O3 generates bugs)

Do you have any evidence that an optimization in GCC 4.9.2 introduced a bug? And I don’t mean invalid or buggy code whose bug got exposed under optimization (such as some dangerous const_cast cases which work in a debug build but fail in a release build, for example).
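
For readers unfamiliar with the const_cast case mentioned here, a minimal sketch (the function names are illustrative). Writing through a const_cast is well-defined only when the underlying object was not declared const; the undefined variant is left commented out because its behavior can legitimately differ between -O0 and optimized builds.

```cpp
// Adds one through a reference-to-const by casting the const away.
inline void bump(const int& ref) {
    const_cast<int&>(ref) += 1;
}

// Well-defined: `value` itself is not const, only the reference was.
inline int bump_non_const() {
    int value = 41;
    bump(value);
    return value;
}

// Undefined behavior (do not enable): `fixed` is a genuinely const object,
// so the compiler is allowed to assume it still holds 41 after the call.
// inline int bump_const() {
//     const int fixed = 41;
//     bump(fixed);
//     return fixed;
// }
```

In the commented-out variant the bug is in the source code, not in the optimizer, which is the distinction being drawn here.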

_CPC_Herbert · June 15, 2016, 9:51pm

It’s not that it would generate buggy code, but that it could generate suboptimal bloated code with O3 compared to O2.

ISMAX · June 15, 2016, 9:52pm

Yes, compiler optimizations can generate bugs (as @MaximeC pointed out in his initial response).
See http://stackoverflow.com/questions/2722302/can-compiler-optimization-introduce-bugs
The more aggressive or advanced the optimizations performed by the compiler, the higher the risk of introducing bugs.
But those bugs happen in exceptional cases and should not be a problem if we are able to turn compiler optimizations off (or use a lower optimization level like -O2 instead of -O3).