Example Showing the gcc Optimization and User Code that May Fault
The function 'foo' in the snippet below will load only one of the struct members A or B; well at least that is the intention of the unoptimized code.
typedef struct {
int A;
int B;
} Pair;
int foo(const Pair *P, int c) {
int x;
if (c)
x = P->A;
else
x = P->B;
return c/102 + x;
}
Here is what gcc -O3 gives:
mov eax, esi
mov edx, -1600085855
test esi, esi
mov ecx, DWORD PTR [rdi+4] <-- ***load P->B**
cmovne ecx, DWORD PTR [rdi] <-- ***load P->A***
imul edx
lea eax, [rdx+rsi]
sar esi, 31
sar eax, 6
sub eax, esi
add eax, ecx
ret
So it appears that gcc is allowed to speculatively load both struct members in order to eliminate branching. But then, is the following code considered undefined behavior or is the gcc optimization above illegal?
#include <stdlib.h>
int naughty_caller(int c) {
Pair *P = (Pair*)malloc(sizeof(Pair)-1); // *** Allocation is enough for A but not for B ***
if (!P) return -1;
P->A = 0x42; // *** Initializing allocation only where it is guaranteed to be allocated ***
int res = foo(P, 1); // *** Passing c=1 to foo should ensure only P->A is accessed? ***
free(P);
return res;
}
If the load speculation will happen in the above scenario there is a chance that loading P->B will cause an exception because the last byte of P->B may lie in unallocated memory. This exception will not happen if the optimization is turned off.
The Question
Is the gcc optimization shown above of load speculation legal? Where does the spec say or imply that it's ok?
If the optimization is legal, how is the code in 'naughtly_caller' turn out to be undefined behavior?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…