Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.0k views
in Technique[技术] by (71.8m points)

assembly - How to determine if ModR/M is needed through Opcodes?

I am reading the ia-32 instruction format and found that ModR/M is one byte if required, but how to determine if it is required, someone says it is determined by Opcode, but how? I want to know the details, and is there some useful and authoritative documents which explain the details?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Intel's vol.2 manual has details on the encoding of operands for each form of each instruction. e.g. taking just the 8-bit operand size versions of the well-known add instruction, which has 2 reg,rm forms ; a rm,immediate form ; and a no-ModRM 2-byte short for for add al, imm8

Opcode    Instruction    | Op/En |  64-bit Mode | Compat/Leg Mode |  Description
04 ib     ADD AL, imm8   |  I    |   Valid           Valid         Add imm8 to AL.
80 /0 ib  ADD r/m8, imm8 |  MI   |   Valid           Valid         Add imm8 to r/m8.
00 /r     ADD r/m8, r8   |  MR   |   Valid           Valid         Add r8 to r/m8.
02 /r     ADD r8, r/m8   |  RM   |   Valid           Valid         Add r/m8 to r8.

And below that, the Instruction Operand Encoding ? table details what those I / MI / MR / RM codes from the Op/En (operand encoding) column above mean:

Op/En   | Operand 1        | Operand 2     | Operand 3  Operand 4
RM      | ModRM:reg (r, w) | ModRM:r/m (r) |  NA        NA
MR      | ModRM:r/m (r, w) | ModRM:reg (r) |  NA        NA
MI      | ModRM:r/m (r, w) | imm8/16/32    |  NA        NA
I       | AL/AX/EAX/RAX    | imm8/16/32    |  NA        NA

Notice that the "I" operand form doesn't mention a ModRM, so there isn't one. But MI does have one. (With the /r field being filled in with the /0 from the 80 /0 in the opcode table: full explanation with 83 /0 add r/m64, imm8 as an example.)

Notice that RM and MR differ only in whether the r/m operand (that can be memory) is the destination or source.


Most x86 ALU instructions have four reg, r/m opcodes, one for each direction (MR vs. RM) for each of 8-bit and non-8-bit (size determined by 66 operand-size prefix to flip between 16 and 32, or REX.W for 64-bit, or none for the default operand-size (32 in modes other than 16-bit).

Plus the standard immediate form(s):

  • r/m8 bit with immediate (sharing an opcode byte overloaded via /digit)
  • r/m 16/32/64-bit with 8-bit sign-extended immediate (sharing an opcode byte overloaded via /digit)
  • r/m 16/32/64-bit with 16/32/sign_extended_32 bit immediate (sharing an opcode byte overloaded via /digit)
  • AL no modrm with 8-bit immediate (whole opcode byte to itself)
  • AX/EAX/RAX no modrm, imm16 / imm32 / sign_extended_imm32 (whole opcode byte to itself)

This is a lot of opcodes for every mnemonic, and is why 8086 didn't have room for more following the same pattern as the usual instruction. (Why are there no NAND, NOR and XNOR instructions in X86?)

See also https://wiki.osdev.org/X86-64_Instruction_Encoding which covers things more concisely than Intel's manual. Also note that you can check your understanding by assembling something with an assembler like NASM or GAS and looking at the machine code. Or just looking at disassembly of an existing program like objdump -drwC -Mintel /bin/ls | less

Some disassemblers even group bytes together in the machine code for each instruction, keeping a 4-byte immediate together as a group separate from opcode and modrm for example. (Agner Fog's objconv is like this.)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...