assembly - How to determine if ModR/M is needed through Opcodes?

Question



1 Reply

深蓝 · Answer 1 · 2021-10-17T03:10:08+0000

Intel's vol.2 manual has details on the encoding of operands for each form of each instruction. For example, take just the 8-bit operand-size versions of the well-known add instruction: it has two reg,r/m forms, an r/m,immediate form, and a 2-byte short form with no ModRM for add al, imm8.

Opcode    | Instruction    | Op/En | 64-bit Mode | Compat/Leg Mode | Description
04 ib     | ADD AL, imm8   | I     | Valid       | Valid           | Add imm8 to AL.
80 /0 ib  | ADD r/m8, imm8 | MI    | Valid       | Valid           | Add imm8 to r/m8.
00 /r     | ADD r/m8, r8   | MR    | Valid       | Valid           | Add r8 to r/m8.
02 /r     | ADD r8, r/m8   | RM    | Valid       | Valid           | Add r/m8 to r8.

Below that, the Instruction Operand Encoding table details what the I / MI / MR / RM codes in the Op/En (operand encoding) column above mean:

Op/En   | Operand 1        | Operand 2     | Operand 3 | Operand 4
RM      | ModRM:reg (r, w) | ModRM:r/m (r) | NA        | NA
MR      | ModRM:r/m (r, w) | ModRM:reg (r) | NA        | NA
MI      | ModRM:r/m (r, w) | imm8/16/32    | NA        | NA
I       | AL/AX/EAX/RAX    | imm8/16/32    | NA        | NA

Notice that the "I" operand form doesn't mention a ModRM, so there isn't one. But MI does have one. (In the MI form, the reg field of that ModRM byte is filled in with the /0 from the 80 /0 in the opcode table, so it acts as extra opcode bits instead of naming a register. 83 /0 add r/m64, imm8 works the same way.)
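As a rough illustration of reading those tables, here is a minimal Python sketch covering only the four 8-bit ADD forms listed above. The table name `ADD8_FORMS` and the helpers `split_modrm` and `describe` are hypothetical names made up for this example, and a real decoder would need the whole opcode map plus prefix handling.

```python
# Hypothetical lookup table for the four 8-bit ADD forms from the opcode table.
# opcode byte -> (Op/En code, has a ModRM byte?, required /digit or None)
ADD8_FORMS = {
    0x04: ("I", False, None),   # ADD AL, imm8   (no ModRM)
    0x80: ("MI", True, 0),      # ADD r/m8, imm8 (ModRM.reg must be /0)
    0x00: ("MR", True, None),   # ADD r/m8, r8
    0x02: ("RM", True, None),   # ADD r8, r/m8
}


def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def describe(code):
    """Return (Op/En, ModRM fields or None) for one prefix-free instruction."""
    op_en, has_modrm, digit = ADD8_FORMS[code[0]]
    if not has_modrm:
        return op_en, None
    mod, reg, rm = split_modrm(code[1])
    if digit is not None and reg != digit:
        # 0x80 with another /digit is a different instruction (e.g. /5 is SUB).
        raise ValueError("opcode 0x80 with this /digit is not ADD")
    return op_en, (mod, reg, rm)
```

For example, `describe(bytes([0x04, 0x05]))` (add al, 5) reports the I form with no ModRM, while `describe(bytes([0x80, 0xC1, 0x05]))` (add cl, 5) reports the MI form with mod=3, reg=0, rm=1.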

Notice that RM and MR differ only in whether the r/m operand (that can be memory) is the destination or source.
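To make the MR / RM difference concrete, the sketch below encodes a register-to-register 8-bit add both ways. The names `REG8`, `modrm` and `encode_add_r8_r8` are invented for this example; it assumes the legacy register numbering without a REX prefix and only handles register-direct operands (mod = 3).

```python
# Legacy 8-bit register numbers (no REX prefix assumed).
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}


def modrm(mod, reg, rm):
    """Pack the three ModRM fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def encode_add_r8_r8(dst, src):
    """Return both valid encodings of `add dst, src` as (MR form, RM form)."""
    d, s = REG8[dst], REG8[src]
    mr_form = bytes([0x00, modrm(3, s, d)])  # 00 /r: destination in r/m, source in reg
    rm_form = bytes([0x02, modrm(3, d, s)])  # 02 /r: destination in reg, source in r/m
    return mr_form, rm_form


mr_form, rm_form = encode_add_r8_r8("cl", "dl")
print(mr_form.hex(), rm_form.hex())  # prints "00d1 02ca"
```

Both byte sequences mean add cl, dl; when both operands are registers, an assembler is free to pick either one.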

Most x86 ALU instructions have four reg, r/m opcodes: one for each direction (MR vs. RM) at each of 8-bit and non-8-bit operand size. The non-8-bit size is selected by a 66 operand-size prefix to flip between 16 and 32, by REX.W for 64-bit, or by no prefix at all for the default operand-size (32 in modes other than 16-bit).

Plus the standard immediate form(s):

r/m8 with 8-bit immediate (opcode byte shared with other mnemonics, selected via /digit)
r/m 16/32/64-bit with 8-bit sign-extended immediate (opcode byte shared with other mnemonics, selected via /digit)
r/m 16/32/64-bit with 16/32/sign_extended_32 bit immediate (opcode byte shared with other mnemonics, selected via /digit)
AL with 8-bit immediate, no ModRM (whole opcode byte to itself)
AX/EAX/RAX with imm16 / imm32 / sign_extended_imm32, no ModRM (whole opcode byte to itself)
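The pattern in the list above can be sketched as a small table generator for the eight classic ALU mnemonics (add, or, adc, sbb, and, sub, xor, cmp). This reflects the standard one-byte opcode map as I understand it; the function name `alu_opcodes` and the dictionary keys are made up for the example, so check any value you rely on against Intel's tables.

```python
# The eight classic ALU operations, in opcode-map order. The position in this
# list is also the /digit used by the shared immediate opcodes 80, 81 and 83.
ALU_OPS = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]


def alu_opcodes(mnemonic):
    """Map each operand form to (opcode byte, /digit or None) for one mnemonic."""
    n = ALU_OPS.index(mnemonic)
    base = n * 8  # each mnemonic owns six consecutive opcodes starting here
    return {
        "r/m8, r8 (MR)": (base + 0, None),
        "r/m, r (MR)": (base + 1, None),
        "r8, r/m8 (RM)": (base + 2, None),
        "r, r/m (RM)": (base + 3, None),
        "AL, imm8 (no ModRM)": (base + 4, None),
        "AX/EAX/RAX, imm (no ModRM)": (base + 5, None),
        "r/m8, imm8": (0x80, n),
        "r/m, imm16/32": (0x81, n),
        "r/m, sign-extended imm8": (0x83, n),
    }


for form, (opcode, digit) in alu_opcodes("add").items():
    suffix = "" if digit is None else f" /{digit}"
    print(f"{opcode:02X}{suffix:3}  {form}")
```

Six of the nine forms get an opcode byte of their own per mnemonic, which is where most of the one-byte opcode space goes.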

That is a lot of opcodes for every mnemonic, which is why 8086 didn't have room for more instructions following the same pattern as the usual ones. (See: Why are there no NAND, NOR and XNOR instructions in X86?)

See also https://wiki.osdev.org/X86-64_Instruction_Encoding which covers things more concisely than Intel's manual. Also note that you can check your understanding by assembling something with an assembler like NASM or GAS and looking at the machine code, or by looking at the disassembly of an existing program, e.g. objdump -drwC -Mintel /bin/ls | less

Some disassemblers even group bytes together in the machine code for each instruction, keeping a 4-byte immediate together as a group separate from opcode and modrm for example. (Agner Fog's objconv is like this.)
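The grouping idea can be mimicked in a few lines. The following is a minimal sketch, not how objconv or any real disassembler is implemented: it only understands the four 8-bit ADD opcodes discussed above, and it assumes no prefixes and 32- or 64-bit addressing (the 16-bit ModRM rules are different). The function name `group_bytes` is hypothetical.

```python
def group_bytes(code):
    """Split one prefix-free 8-bit ADD instruction into labeled hex groups.

    Assumes 32- or 64-bit addressing; 16-bit ModRM rules are not handled.
    """
    has_modrm = {0x00, 0x02, 0x80}   # 04 (ADD AL, imm8) has no ModRM
    has_imm8 = {0x04, 0x80}
    opcode = code[0]
    if opcode not in has_modrm | has_imm8:
        raise ValueError("only the 8-bit ADD opcodes 00, 02, 04, 80 are handled")

    groups = [("opcode", code[0:1].hex())]
    pos = 1
    if opcode in has_modrm:
        byte = code[pos]
        groups.append(("modrm", code[pos:pos + 1].hex()))
        pos += 1
        mod, rm = byte >> 6, byte & 0b111
        disp_len = {0: 0, 1: 1, 2: 4, 3: 0}[mod]
        if mod != 3 and rm == 0b100:          # a SIB byte follows
            sib = code[pos]
            groups.append(("sib", code[pos:pos + 1].hex()))
            pos += 1
            if mod == 0 and (sib & 0b111) == 0b101:
                disp_len = 4                  # no base register, disp32 instead
        elif mod == 0 and rm == 0b101:
            disp_len = 4                      # disp32 (RIP-relative in 64-bit mode)
        if disp_len:
            groups.append(("disp", code[pos:pos + disp_len].hex()))
            pos += disp_len
    if opcode in has_imm8:
        groups.append(("imm8", code[pos:pos + 1].hex()))
    return groups


# add byte [esp+8], 5  ->  80 44 24 08 05
print(group_bytes(bytes.fromhex("8044240805")))
```

Here the ModRM byte 44 has mod=1 and rm=4, so a SIB byte and an 8-bit displacement follow before the immediate; for `04 05` (add al, 5) the only groups are the opcode and the imm8.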
