Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
300 views
in Technique[技术] by (71.8m points)

c - signed and unsigned arithmetic implementation on x86

C language has signed and unsigned types like char and int. I am not sure, how it is implemented on assembly level, for example it seems to me that multiplication of signed and unsigned would bring different results, so do assembly do both unsigned and signed arithmetic or only one and this is in some way emulated for the different case?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

If you look at the various multiplication instructions of x86, looking only at 32bit variants and ignoring BMI2, you will find these:

  • imul r/m32 (32x32->64 signed multiply)
  • imul r32, r/m32 (32x32->32 multiply) *
  • imul r32, r/m32, imm (32x32->32 multiply) *
  • mul r/m32 (32x32->64 unsigned multiply)

Notice that only the "widening" multiply has an unsigned counterpart. The two forms in the middle, marked with an asterisk, are both signed and unsigned multiplication, because for the case where you don't get that extra "upper part", that's the same thing.

The "widening" multiplications have no direct equivalent in C, but compilers can (and often do) use those forms anyway.

For example, if you compile this:

uint32_t test(uint32_t a, uint32_t b)
{
    return a * b;
}

int32_t test(int32_t a, int32_t b)
{
    return a * b;
}

With GCC or some other relatively reasonable compiler, you'd get something like this:

test(unsigned int, unsigned int):
    mov eax, edi
    imul    eax, esi
    ret
test(int, int):
    mov eax, edi
    imul    eax, esi
    ret

(actual GCC output with -O1)


So signedness doesn't matter for multiplication (at least not for the kind of multiplication you use in C) and for some other operations, namely:

  • addition and subtraction
  • bitwise AND, OR, XOR, NOT
  • negation
  • left shift
  • comparing for equality

x86 doesn't offer separate signed/unsigned versions for those, because there's no difference anyway.

But for some operations there is a difference, for example:

  • division (idiv vs div)
  • remainder (also idiv vs div)
  • right shift (sar vs shr) (but beware of signed right shift in C)
  • comparing for bigger than / smaller than

But that last one is special, x86 doesn't have separate versions for signed and unsigned of this either, instead it has one operation (cmp, which is really just a nondestructive sub) that does both at once, and gives several results (multiple bits in "the flags" are affected). Later instructions that actually use those flags (branches, conditional moves, setcc) then choose which flags they care about. So for example,

cmp a, b
jg somewhere

Will go somewhere if a is "signed greater than" b.

cmp a, b
jb somewhere

Would go somewhere if a is "unsigned below" b.

See Assembly - JG/JNLE/JL/JNGE after CMP for more about the flags and branches.


This won't be a formal proof that signed and unsigned multiplication are the same, I'll just try to give you insight into why they should be the same.

Consider 4-bit 2's-complement integers. The weights their individual bits are, from lsb to msb, 1, 2, 4, and -8. When you multiply two of those numbers, you can decompose one of them into 4 parts corresponding to its bits, for example:

0011 (decompose this one to keep it interesting)
0010
---- *
0010 (from the bit with weight 1)
0100 (from the bit with weight 2, so shifted left 1)
---- +
0110

2 * 3 = 6 so everything checks out. That's just regular long multiplication that most people learn in school, only binary, which makes it a lot easier since you don't have to multiply by a decimal digit, you only have to multiply by 0 or 1, and shift.

Anyway, now take a negative number. The weight of the sign bit is -8, so at one point you will make a partial product -8 * something. A multiplication by 8 is shifting left by 3, so the former lsb is now the msb, and all other bits are 0. Now if you negate that (it was -8 after all, not 8), nothing happens. Zero is obviously unchanged, but so is 8, and in general the number with only the msb set:

-1000 = ~1000 + 1 = 0111 + 1 = 1000

So you've done the same thing you would have done if the weight of the msb was 8 (as in the unsigned case) instead of -8.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...