Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
239 views
in Technique[技术] by (71.8m points)

c++ - Why do (only) some compilers use the same address for identical string literals?

https://godbolt.org/z/cyBiWY

I can see two 'some' literals in assembler code generated by MSVC, but only one with clang and gcc. This leads to totally different results of code execution.

static const char *A = "some";
static const char *B = "some";

void f() {
    if (A == B) {
        throw "Hello, string merging!";
    }
}

Can anyone explain the difference and similarities between those compilation outputs? Why does clang/gcc optimize something even when no optimizations are requested? Is this some kind of undefined behaviour?

I also notice that if I change the declarations to those shown below, clang/gcc/msvc do not leave any "some" in the assembler code at all. Why is the behaviour different?

static const char A[] = "some";
static const char B[] = "some";
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This is not undefined behavior, but unspecified behavior. For string literals,

The compiler is allowed, but not required, to combine storage for equal or overlapping string literals. That means that identical string literals may or may not compare equal when compared by pointer.

That means the result of A == B might be true or false, on which you shouldn't depend.

From the standard, [lex.string]/16:

Whether all string literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...