Yes, ISO C++ allows (but doesn't require) implementations to make this choice.
But also note that ISO C++ allows a compiler to emit code that crashes on purpose (e.g. with an illegal instruction) if the program encounters UB, e.g. as a way to help you find errors. (Or because it's a DeathStation 9000. Being strictly conforming is not sufficient for a C++ implementation to be useful for any real purpose.) So ISO C++ would allow a compiler to make asm that crashed (for totally different reasons) even on similar code that read an uninitialized uint32_t, even though that's required to be a fixed-layout type with no trap representations.
It's an interesting question about how real implementations work, but remember that even if the answer was different, your code would still be unsafe because modern C++ is not a portable version of assembly language.
You're compiling for the x86-64 System V ABI, which specifies that a bool
as a function arg in a register is represented by the bit-patterns false=0
and true=1
in the low 8 bits of the register (footnote 1). In memory, bool
is a 1-byte type that again must have an integer value of 0 or 1.
(An ABI is a set of implementation choices that compilers for the same platform agree on so they can make code that calls each other's functions, including type sizes, struct layout rules, and calling conventions.)
ISO C++ doesn't specify it, but this ABI decision is widespread because it makes bool->int conversion cheap (just zero-extension). I'm not aware of any ABIs that don't let the compiler assume 0 or 1 for bool
, for any architecture (not just x86). It allows optimizations like !mybool
with xor eax,1
to flip the low bit (see "Any possible code that can flip a bit/integer/bool between 0 and 1 in single CPU instruction"). Or compiling a&&b
to a bitwise AND for bool
types. Some compilers do actually take advantage of this; see "Boolean values as 8 bit in compilers. Are operations on them inefficient?".
In general, the as-if rule allows the compiler to take advantage of things that are true on the target platform being compiled for, because the end result will be executable code that implements the same externally-visible behaviour as the C++ source. (With all the restrictions that Undefined Behaviour places on what is actually "externally visible": not with a debugger, but from another thread in a well-formed / legal C++ program.)
The compiler is definitely allowed to take full advantage of an ABI guarantee in its code-gen, and make code like you found which optimizes strlen(whichString)
to
5U - boolValue
. (BTW, this optimization is kind of clever, but maybe shortsighted vs. branching and inlining memcpy
as stores of immediate data (footnote 2).)
Or the compiler could have created a table of pointers and indexed it with the integer value of the bool
, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)
Your __attribute((noinline))
constructor with optimization enabled led to clang just loading a byte from the stack to use as uninitializedBool
. It made space for the object in main
with push rax
(which is smaller and for various reasons about as efficient as sub rsp, 8
), so whatever garbage was in AL on entry to main
is the value it used for uninitializedBool
. This is why you actually got values that weren't just 0
.
5U - random garbage
can easily wrap to a large unsigned value, leading memcpy to go into unmapped memory. The destination is in static storage, not the stack, so you're not overwriting a return address or something.
Other implementations could make different choices, e.g. false=0
and true=any non-zero value
. Then clang probably wouldn't make code that crashes for this specific instance of UB. (But it would still be allowed to if it wanted to.) I don't know of any implementations that choose anything other than what x86-64 does for bool
, but the C++ standard allows many things that nobody does or even would want to do on hardware that's anything like current CPUs.
ISO C++ leaves it unspecified what you'll find when you examine or modify the object representation of a bool
. (e.g. by memcpy
ing the bool
into unsigned char
, which you're allowed to do because char*
can alias anything. And unsigned char
is guaranteed to have no padding bits, so the C++ standard does formally let you hexdump object representations without any UB. Pointer-casting to copy the object representation is different from assigning char foo = my_bool
, of course, so booleanization to 0 or 1 wouldn't happen and you'd get the raw object representation.)
You've partially "hidden" the UB on this execution path from the compiler with noinline
. Even if it doesn't inline, though, interprocedural optimizations could still make a version of the function that depends on the definition of another function. (First, clang is making an executable, not a Unix shared library where symbol-interposition can happen. Second, the definition is inside the class{}
definition so all translation units must have the same definition. Like with the inline
keyword.)
So a compiler could emit just a ret
or ud2
(illegal instruction) as the definition for main
, because the path of execution starting at the top of main
unavoidably encounters Undefined Behaviour. (Which the compiler can see at compile time if it decided to follow the path through the non-inline constructor.)
Any program that encounters UB is totally undefined for its entire existence. But UB inside a function or if()
branch that never actually runs doesn't corrupt the rest of the program. In practice that means that compilers can decide to emit an illegal instruction, or a ret
, or not emit anything and fall into the next block / function, for the whole basic block that can be proven at compile time to contain or lead to UB.
GCC and Clang in practice do actually sometimes emit ud2
on UB, instead of even trying to generate code for paths of execution that make no sense. Or for cases like falling off the end of a non-void
function, gcc will sometimes omit a ret
instruction. If you were thinking that "my function will just return with whatever garbage is in RAX", you are sorely mistaken. Modern C++ compilers don't treat the language like a portable assembly language any more. Your program really has to be valid C++, without making assumptions about how a stand-alone non-inlined version of your function might look in asm.
Another fun example is "Why does unaligned access to mmap'ed memory sometimes segfault on AMD64?". x86 doesn't fault on unaligned integers, right? So why would a misaligned uint16_t*
be a problem? Because alignof(uint16_t) == 2
, and violating that assumption led to a segfault when auto-vectorizing with SSE2.
See also "What Every C Programmer Should Know About Undefined Behavior #1/3", an article by a clang developer.
Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for bool
.
Expect total hostility toward many mistakes by the programmer, especially things modern compilers warn about. This is why you should use -Wall
and fix warnings. C++ is not a user-friendly language, and something in C++ can be unsafe even if it would be safe in asm on the target you're compiling for. (e.g. signed overflow is UB in C++ and compilers will assume it doesn't happen, even when compiling for 2's complement x86, unless you use clang/gcc -fwrapv
.)
Compile-time-visible UB is always dangerous, and it's really hard to be sure (with link-time optimization) that you've really hidden UB from the compiler and can thus reason about what kind of asm it will generate.
Not to be over-dramatic; often compilers do let you get away with some things and emit code like you're expecting even when something is UB. But maybe it will be a problem in the future if compiler devs implement some optimization that gains more info about value-ranges (e.g. that a variable is non-negative, maybe allowing it to optimize sign-extension to free zero-extension on x86-64). For example, in current gcc and clang, doing tmp = a+INT_MIN
doesn't let them optimize a<0
to always-false; they only deduce that tmp
is always negative. (Because INT_MIN
+ a=INT_MAX
is negative on this 2's complement target, and a
can't be any higher than that.)
So gcc/clang don't currently backtrack to derive range info for the inputs of a calculation, only on the results based on the assumption of no signed overflow: <a href="https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(fontScale:1.2899450879999999,j:1,lang:c%2B%2B,source:'%23include+%3Climits.h%3E%0A%0Aint+range_check(int+a,+int+*sink1,+int+*sink2)+%7B%0A++++int+tmp+%3D+(a+%2B+INT_MIN)%3B++//+a+%3E%3D+1+to+avoid+UB%0A++++//+tmp+is+definitely+negative%0A++++//+a+is+definitely+positive%0A++++if+(tmp+%3E+0)+*sink1+%3D+0%3B++++++//+optimized+away%0A++++if+(a+%3C+0)+*sink2+%3D+0%3B++++++++//+not+optimized+away%0A++++return+tmp%3B%0A%7D%0A%0A%0A//+it+seems+that+gcc+and+clang+don!'t+derive+range+info+for+a%0A//+based+on+%60a-12345%60+not+overflowing,%0A//+but+they+do+derive+range+info+for+the+result.%0Aint+signed_overflow_rangecheck(int+a,+int+*sink)+%7B%0A++++int+tmp+%3D+(a+-+12345)%3B%0A++++//if+((unsigned)a+%3D%3D+0x80000000UL)+*sink%3D0%3B%0A++++if+(tmp+%3E+0x7ffffff0)+*sink+%3D+1%3B++//+optimized+out:+it+can!'t+have+wrapped+to+a+positive+this+close+to+INT_MAX%0A++++if+(a+%3C+(INT_MIN%2B12))+*sink+%3D+2%3B++//+not+optimized+out.++But+%60a%60+this+close+to+INT_MIN+would+mean+a-12345+wrapped.%0A++++return+tmp%3B%0A%7D%0A%0A'),l:'5',n:'0',o:'C%2B%2B+source+%231',t:'0')),k:37.77562439622385,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:compiler,i:(compiler:clang700,filters:(b:'0',binary:'1',commentOnly:'0',demangle:'0',directives:'0',execute:'1',intel:'0',