Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
155 views
in Technique[技术] by (71.8m points)

c++ - Is the compiler allowed to optimize out heap memory allocations?

Consider the following simple code that makes use of new (I am aware there is no delete[], but it does not pertain to this question):

int main()
{
    int* mem = new int[100];

    return 0;
}

Is the compiler allowed to optimize out the new call?

In my research, g++ (5.2.0) and Visual Studio 2015 do not optimize out the new call, while clang (3.0+) does. All tests have been made with full optimizations enabled (-O3 for g++ and clang, Release mode for Visual Studio).

Isn't new making a system call under the hood, making it impossible (and illegal) for a compiler to optimize that out?

EDIT: I have now excluded undefined behaviour from the program:

#include <new>  

int main()
{
    int* mem = new (std::nothrow) int[100];
    return 0;
}

clang 3.0 does not optimize that out anymore, but later versions do.

EDIT2:

#include <new>  

int main()
{
    int* mem = new (std::nothrow) int[1000];

    if (mem != 0)
      return 1;

    return 0;
}

clang always returns 1.

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The history seems to be that clang is following the rules laid out in N3664: Clarifying Memory Allocation which allows the compiler to optimize around memory allocations but as Nick Lewycky points out :

Shafik pointed out that seems to violate causality but N3664 started life as N3433, and I'm pretty sure we wrote the optimization first and wrote the paper afterwards anyway.

So clang implemented the optimization which later on became a proposal that was implemented as part of C++14.

The base question is whether this is a valid optimization prior to N3664, that is a tough question. We would have to go to the as-if rule covered in the draft C++ standard section 1.9 Program execution which says(emphasis mine):

The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.5

where note 5 says:

This provision is sometimes called the “as-if” rule, because an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program. For instance, an actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no side effects affecting the observable behavior of the program are produced.

Since new could throw an exception which would have observable behavior since it would alter the return value of the program, that would seem to argue against it being allowed by the as-if rule.

Although, it could be argued it is implementation detail when to throw an exception and therefore clang could decide even in this scenario it would not cause an exception and therefore eliding the new call would not violate the as-if rule.

It also seems valid under the as-if rule to optimize away the call to the non-throwing version as well.

But we could have a replacement global operator new in a different translation unit which could cause this to affect observable behavior, so the compiler would have to have some way a proving this was not the case, otherwise it would not be able to perform this optimization without violating the as-if rule. Previous versions of clang did indeed optimize in this case as this godbolt example shows which was provided via Casey here, taking this code:

#include <cstddef>

extern void* operator new(std::size_t n);

template<typename T>
T* create() { return new T(); }

int main() {
    auto result = 0;
    for (auto i = 0; i < 1000000; ++i) {
        result += (create<int>() != nullptr);
    }

    return result;
}

and optimizing it to this:

main:                                   # @main
    movl    $1000000, %eax          # imm = 0xF4240
    ret

This indeed seems way too aggressive but later versions do not seem to do this.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...