Give a man a fish and blah-blah-blah…
It’s good that you have a code example. But do you understand the algorithm?
Okay, let’s go through it step by step on a simplified example: multiplying two 8-bit registers, AL and AH, and storing the result in DX.
BTW, you can use any registers you like, unless an instruction requires a particular one, the way SHL reg, CL needs its shift count in CL.
But before we actually start, there are a couple of optimizations for the algorithm you provided. Assembly is all about optimization, you know, either for speed or for size. Otherwise you might as well write bloatware in C# or something else.
MOV DI,AX
AND DI,01h
XOR DI,01h
JZ ADD
All this part does is check whether the lowest bit (bit #0) of AX is set.
You could simply do
TEST AX, 1
JNZ ADD
But you only need to test one bit, so TEST AL, 1 instead of TEST AX, 1 saves you one byte.
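If you want to convince yourself that the two sequences branch in exactly the same cases, here is a minimal Python sketch that models them. The function names are mine and purely illustrative; each one returns whether the jump to ADD would be taken for a given AX.

```python
def jumps_original(ax):
    """MOV DI,AX / AND DI,01h / XOR DI,01h / JZ ADD: jump taken when DI ends up 0."""
    di = ax & 0xFFFF
    di &= 0x01
    di ^= 0x01
    return di == 0          # ZF set, so JZ is taken


def jumps_test(ax):
    """TEST AL,1 / JNZ ADD: jump taken when the AND result is non-zero."""
    al = ax & 0xFF
    return (al & 1) != 0    # ZF clear, so JNZ is taken


# Compare the two over every possible 16-bit value of AX.
same = all(jumps_original(ax) == jumps_test(ax) for ax in range(0x10000))
print(same)
```

Both versions take the jump exactly when bit #0 is set, so the short form is a drop-in replacement.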
Next,
RCR DX,1
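Leave this one as a rotate. RCR shifts the carry flag into the top bit of DX, and that carry is exactly the bit that overflowed out of ADD DX, BX (when the ADD is skipped, the bit test has already cleared CF, so a zero goes in). SHR DX, 1 would throw that bit away. Both instructions take the same time to execute and are both two bytes long, so there is nothing to gain from swapping them anyway.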
Next,
DEC SI
CMP SI,0
JNZ LOOP
Never ever compare with zero after DEC. It’s bad form (mauvais ton)! DEC already sets ZF when the result reaches zero, so simply do
DEC SI
JNZ LOOP
Next, an unnecessary loop split:
JZ ADD
CONT:
. . .
JMP END
ADD:
ADD DX, BX
JMP CONT
END:
. . .
Should be
JNZ CONT
ADD DX, BX
CONT:
. . .
END:
. . .
Here is your routine, slightly optimized:
LOOP:
TEST AL, 1
JZ SHORT CONT
ADD DX, BX
CONT:
RCR DX, 1
RCR CX, 1
SHR AX, 1
DEC SI
JNZ LOOP
END:
That’s it. Now back (or forward?) to what this little piece of code actually does. The following code sample fully mimics your example, but for 8-bit registers.
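MOV AL,12h ; 8 bit multiplicand
MOV AH,34h ; 8 bit multiplier
XOR DX, DX ; result
MOV CX, 8 ; loop for 8 times
LOOP:
TEST AL, 1 ; is the lowest bit of AL set? (TEST also clears CF)
JZ SHORT CONT ; no: skip the addition
ADD DH, AH ; yes: add AH into the high byte of the result
CONT:
RCR DX, 1 ; RCR, not SHR: the carry out of the ADD goes into bit 15
SHR AL, 1 ; move on to the next bit of AL
DEC CX
JNZ LOOP
END: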
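To check the routine above without firing up an emulator, here is a small Python model of it. This is a sketch, not a cycle-accurate simulation: mul8 and its keep_carry flag are my own names. With keep_carry=True the carry out of ADD DH, AH is fed back into bit 15, which is what RCR DX, 1 does; with keep_carry=False it is dropped, which is what SHR DX, 1 does.

```python
def mul8(al, ah, keep_carry=True):
    """Model of the 8-bit loop: AL is tested bit by bit, AH is added into DH."""
    dx = 0
    for _ in range(8):                   # MOV CX, 8 ... DEC CX / JNZ LOOP
        carry = 0                        # TEST AL, 1 clears CF
        if al & 1:
            dh = (dx >> 8) + ah          # ADD DH, AH
            carry = dh >> 8              # CF = carry out of bit 7 of DH
            dx = ((dh & 0xFF) << 8) | (dx & 0xFF)
        dx >>= 1                         # shift DX right by one
        if keep_carry:
            dx |= carry << 15            # RCR: the carry re-enters as bit 15
        al >>= 1                         # SHR AL, 1
    return dx


print(hex(mul8(0x12, 0x34)))             # 0x3a8
print(mul8(0xFF, 0xFF) == 0xFF * 0xFF)   # True
```

For 12h and 34h the addition never overflows, so both variants give 03A8h. With large operands such as FFh and FFh the variant without the carry loses bits, which is why the carry matters in the general case.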
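This is the long multiplication algorithm:
12h = 00010010
x
34h = 00110100
--------
00000000
00110100
00000000
00000000
00110100
00000000
00000000
00000000
Each row corresponds to one bit of 12h, lowest bit first, and sits one position further left than the row before it. Only bits 1 and 4 are set, so add 34h shifted left by 1 and by 4:
0000000001101000
+
0000001101000000
----------------
0000001110101000 = 03A8h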
That’s it!
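The same idea, written as a minimal Python sketch: collect one shifted copy of the multiplicand for every set bit of the multiplier, then sum them. The name long_multiply and the bits parameter are mine, for illustration only.

```python
def long_multiply(multiplier, multiplicand, bits=8):
    """Sum the multiplicand shifted left once for every set bit of the multiplier."""
    partials = [multiplicand << i for i in range(bits) if (multiplier >> i) & 1]
    return partials, sum(partials)


partials, product = long_multiply(0x12, 0x34)
for p in partials:
    print(f"{p:016b}")          # one shifted copy of 34h per set bit of 12h
print(f"{product:04X}h")        # 03A8h
```

The assembly loop does the same sum, except that it shifts the accumulator right once per step instead of shifting the multiplicand left.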
To use more digits, you take the same approach. Below is an implementation in fasm syntax that multiplies two 32-bit numbers. The 64-bit result is stored in DX:CX:BX:AX.
Num1 dd 0x12345678
Num2 dd 0x9abcdef0
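mov si, word [Num1] ; DI:SI = Num1, tested bit by bit
mov di, word [Num1 + 2]
xor ax, ax ; DX:CX:BX:AX = 64-bit result
xor bx, bx
xor cx, cx
xor dx, dx
mov bp, 32 ; 32 bits to process
_loop:
test si, 1 ; lowest bit of Num1 set? (also clears CF)
jz short _cont
add cx, word [Num2] ; add Num2 into the upper half of the result
adc dx, word [Num2 + 2]
_cont:
rcr dx, 1 ; shift the result right, pulling in the carry
rcr cx, 1
rcr bx, 1
rcr ax, 1
rcr di, 1 ; shift Num1 right to expose its next bit
rcr si, 1
dec bp
jnz short _loop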
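If you’d like to sanity-check the 32-bit version without running it, here is a rough Python model of the loop above. It is a sketch under my own assumptions: rcr16 and mul32 are illustrative names, every register is treated as a plain 16-bit integer, and only the carry flag is tracked.

```python
MASK16 = 0xFFFF


def rcr16(reg, cf):
    """RCR reg, 1 on a 16-bit register: returns (new value, new carry flag)."""
    return (reg >> 1) | (cf << 15), reg & 1


def mul32(num1, num2):
    """Model of the loop: multiplier in DI:SI, 64-bit result in DX:CX:BX:AX."""
    si, di = num1 & MASK16, (num1 >> 16) & MASK16
    ax = bx = cx = dx = 0
    for _ in range(32):                          # mov bp, 32 ... dec bp / jnz
        cf = 0                                   # test clears CF
        if si & 1:                               # test si, 1
            total = cx + (num2 & MASK16)         # add cx, word [Num2]
            cx, cf = total & MASK16, total >> 16
            total = dx + ((num2 >> 16) & MASK16) + cf   # adc dx, word [Num2 + 2]
            dx, cf = total & MASK16, total >> 16
        dx, cf = rcr16(dx, cf)                   # the rcr chain, top to bottom
        cx, cf = rcr16(cx, cf)
        bx, cf = rcr16(bx, cf)
        ax, cf = rcr16(ax, cf)
        di, cf = rcr16(di, cf)
        si, cf = rcr16(si, cf)
    return (dx << 48) | (cx << 32) | (bx << 16) | ax


print(mul32(0x12345678, 0x9ABCDEF0) == 0x12345678 * 0x9ABCDEF0)   # True
```

The model agrees with ordinary multiplication, including for FFFFFFFFh times FFFFFFFFh, where every addition overflows and the carry has to be rotated back in.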
Cheers ;)