I have some undefined behaviour in a seemingly innocuous function that parses a double value from a buffer. I read the double in two halves, because I am reasonably certain the language standard says that shifting char values is only valid in a 32-bit context.
inline double ReadLittleEndianDouble( const unsigned char *buf )
{
    uint64_t lo = (buf[3] << 24) | (buf[2] << 16) | (buf[1] << 8) | buf[0];
    uint64_t hi = (buf[7] << 24) | (buf[6] << 16) | (buf[5] << 8) | buf[4];
    uint64_t val = (hi << 32) | lo;
    return *(double*)&val;
}
Since I am storing 32-bit values into the 64-bit variables lo and hi, I reasonably expect that the high-order 32 bits of these variables will always be 0x00000000. But sometimes they contain 0xffffffff or other non-zero rubbish.
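Here is a minimal standalone snippet that, as far as I can tell, reproduces the effect on my platform (assuming a 32-bit int; the byte value 0x80 is just an example I picked because its top bit is set):

#include <cstdint>
#include <cstdio>

int main()
{
    unsigned char b = 0x80;   // any byte with its top bit set
    uint64_t lo = b << 24;    // b is promoted to (signed) int before the shift
    std::printf( "%016llx\n", (unsigned long long)lo );
    // On my machine this prints ffffffff80000000, not 0000000080000000.
    return 0;
}

That looks exactly like the rubbish I am seeing in the high half of lo and hi.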
The fix is to mask it like this:
uint64_t val = ((hi & 0xffffffffULL) << 32) | (lo & 0xffffffffULL);
Alternatively, it seems to work if I mask during the assignment instead:
uint64_t lo = ((buf[3] << 24) | (buf[2] << 16) | (buf[1] << 8) | buf[0]) & 0xffffffff;
uint64_t hi = ((buf[7] << 24) | (buf[6] << 16) | (buf[5] << 8) | buf[4]) & 0xffffffff;
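For completeness, here is the whole function with the second variant applied, together with the small check I am using to exercise it. The -1.0 test value is just something I picked because its top byte has the sign bit set; in this sketch I also copy the bits with memcpy instead of the original cast, purely for the test harness.

#include <cstdint>
#include <cstdio>
#include <cstring>

inline double ReadLittleEndianDouble( const unsigned char *buf )
{
    uint64_t lo = ((buf[3] << 24) | (buf[2] << 16) | (buf[1] << 8) | buf[0]) & 0xffffffff;
    uint64_t hi = ((buf[7] << 24) | (buf[6] << 16) | (buf[5] << 8) | buf[4]) & 0xffffffff;
    uint64_t val = (hi << 32) | lo;

    double d;
    std::memcpy( &d, &val, sizeof d );   // bit copy; memcpy used here instead of the cast, just for this test
    return d;
}

int main()
{
    // -1.0 as a little-endian IEEE-754 double: 00 00 00 00 00 00 f0 bf.
    // buf[7] has its top bit set, which is the case that produced the rubbish.
    const unsigned char buf[8] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xf0, 0xbf };
    std::printf( "%f\n", ReadLittleEndianDouble( buf ) );   // I expect -1.000000
    return 0;
}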
I would like to know why this is necessary. All I can think of to explain it is that my compiler is doing all the shifting and combining for lo and hi directly in 64-bit registers, and I might expect undefined behaviour in the high-order 32 bits if that is the case.
Can someone please confirm my suspicions, or otherwise explain what is happening here, and comment on which (if any) of my two solutions is preferable?