basically with float you get 32 bits that encode
VALUE = SIGN * MANTISSA * 2 ^ (128 - EXPONENT)
32-bits = 1-bit 23-bits 8-bits
and that is stored as
MSB LSB
[SIGN][EXPONENT][MANTISSA]
since you only get 23 bits, that's the amount of "precision" you can store. If you are trying to represent a fraction that is irrational (or repeating) in base 2, the sequence of bits will be "rounded off" at the 23rd bit.
0.7 base 10 is 7 / 10 which in binary is 0b111 / 0b1010 you get:
0.1011001100110011001100110011001100110011001100110011... etc
Since this repeats, in fixed precision there is no way to exactly represent it. The
same goes for 0.8 which in binary is:
0.1100110011001100110011001100110011001100110011001101... etc
To see what the fixed precision value of these numbers is you have to "cut them off" at the number of bits you and do the math. The only trick is you the leading 1 is implied and not stored so you technically get an extra bit of precision. Because of rounding, the last bit will be a 1 or a 0 depending on the value of the truncated bit.
So the value of 0.7 is effectively 11,744,051 / 2^24 (no rounding effect) = 0.699999988 and the value of 0.8 is effectively 13,421,773 / 2^24 (rounded up) = 0.800000012.
That's all there is to it :)
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…