Assuming unsigned int has no trap representations, do either or both of the statements marked (A) and (B) below provoke undefined behavior, and why or why not? Also, especially if you think one of them is well-defined but the other isn't, do you consider that a defect in the standard? I am primarily interested in the current version of the C standard (i.e. C2011), but if this is different in older versions of the standard, or in C++, I would also like to know about that.
(_Alignas is used in this program to eliminate any question of UB due to inadequate alignment. The rules I discuss in my interpretation, though, say nothing about alignment.)
#include <stdlib.h>
#include <string.h>

int main(void)
{
    unsigned int v1, v2;
    unsigned char _Alignas(unsigned int) b1[sizeof(unsigned int)];
    unsigned char *b2 = malloc(sizeof(unsigned int));

    if (!b2) return 1;

    memset(b1, 0x55, sizeof(unsigned int));
    memset(b2, 0x55, sizeof(unsigned int));

    v1 = *(unsigned int *)b1; /* (A) */
    v2 = *(unsigned int *)b2; /* (B) */

    return !(v1 == v2);
}
My interpretation of C2011 is that (A) provokes undefined behavior but (B) is well-defined (to store an unspecified value into v2), because:
memset is defined (§7.24.6.1) to write to its first argument as-if through an lvalue with character type, which is allowed for both b1 and b2 per the special case at the bottom of §6.5p7.
The object b1 has a declared type, unsigned char[n]. Therefore, its effective type for accesses is also unsigned char[n] per 6.5p6. Statement (A) reads b1 via an lvalue expression whose type is unsigned int, which is neither the effective type of b1 nor covered by any of the other exceptions in 6.5p7, so the behavior is undefined.
The object pointed-to by b2 has no declared type. The value stored into it (by memset) was (as-if) through an lvalue with character type, so the second case of 6.5p6 does not apply. The value was not copied from anywhere, so the third case of 6.5p6 does not apply either. Therefore, the effective type of the object is the type of the lvalue used for the access, which is unsigned int, and the rules of 6.5p7 are satisfied.
Finally, per 6.2.6.1, assuming unsigned int has no trap representations, the memset operation has created the representation of some unspecified unsigned int value in each of b1 and b2. Therefore, if neither (A) nor (B) provokes undefined behavior, then the actual values in v1 and v2 are unspecified but they are equal.
Commentary:
The asymmetry of the "type-based aliasing" rules (that is, 6.5p7), permitting an object with any effective type to be accessed by an lvalue with character type, but not vice versa, is a continual source of confusion. The second case of 6.5p6 seems to have been added specifically to prevent its being undefined behavior to read a value initialized by memset (or, for that matter, calloc) but, because it only applies to objects with no declared type, is itself an additional source of confusion.
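To make the permitted direction of that asymmetry concrete, here is a short sketch, again assuming unsigned int has no trap representations. The function names fill_uint_bytewise and all_bytes_equal are hypothetical. The object here has declared type unsigned int and is only ever touched either through that type or through unsigned char, both of which 6.5p7 allows, so I would expect this to be well-defined under the same reading that makes (A) undefined.

```c
#include <stddef.h>

/* Hypothetical helper: fills an unsigned int one byte at a time through
   an unsigned char lvalue, then reads it back through its declared type. */
unsigned int fill_uint_bytewise(unsigned char byte)
{
    unsigned int v;                          /* declared type: unsigned int */
    unsigned char *p = (unsigned char *)&v;  /* character-type view of v */
    size_t i;

    for (i = 0; i < sizeof v; i++)
        p[i] = byte;  /* character-type access, permitted by 6.5p7 */

    return v;  /* assumes the resulting representation is not a trap */
}

/* Hypothetical helper: returns 1 if every byte of v's object
   representation equals byte, and 0 otherwise. */
int all_bytes_equal(unsigned int v, unsigned char byte)
{
    const unsigned char *p = (const unsigned char *)&v;
    size_t i;

    for (i = 0; i < sizeof v; i++)
        if (p[i] != byte)
            return 0;
    return 1;
}
```

The reverse arrangement, an unsigned char array viewed through an unsigned int lvalue, is exactly statement (A).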