Same way 32-bit arithmetic worked on 16-bit systems.
In this case, it uses 2 32-bit memory addresses to form a 64-bit number together. Addition/substraction is easy, you do it by parts, the only gotcha is taking the carry-over from the lower part to the higher part. For multiplication/division, it's harder (ie more instructions).
It's obviously slow, quite a bit slower than 32 bit arithmetic for multiplication, but if you need it, it's there for you. And when you upgrade to a 64-bit processor compiler, it gets automatically optimized to one instruction with the bigger word size.
The Visual Studio 2010 Professional implementation of 64 bit multiplication on a 32-bit processor, compiled in release mode, is:
_allmul PROC NEAR
A EQU [esp + 4] ; stack address of a
B EQU [esp + 12] ; stack address of b
mov eax,HIWORD(A)
mov ecx,HIWORD(B)
or ecx,eax ;test for both hiwords zero.
mov ecx,LOWORD(B)
jnz short hard ;both are zero, just mult ALO and BLO
mov eax,LOWORD(A)
mul ecx
ret 16 ; callee restores the stack
hard:
push ebx
A2 EQU [esp + 8] ; stack address of a
B2 EQU [esp + 16] ; stack address of b
mul ecx ;eax has AHI, ecx has BLO, so AHI * BLO
mov ebx,eax ;save result
mov eax,LOWORD(A2)
mul dword ptr HIWORD(B2) ;ALO * BHI
add ebx,eax ;ebx = ((ALO * BHI) + (AHI * BLO))
mov eax,LOWORD(A2) ;ecx = BLO
mul ecx ;so edx:eax = ALO*BLO
add edx,ebx ;now edx has all the LO*HI stuff
pop ebx
ret 16 ; callee restores the stack
As you can see, it's a LOT slower than normal multiplication.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…