Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


assembly - How to interpret x86 opcode map?

In looking at an x86 opcode map such as this:

http://www.mlsite.net/8086/#tbl_map1

It defines mappings, for example:

00: ADD Eb,Gb
01: ADD Ev,Gv
...

That link has basic descriptions of what the letters mean, such as:

  • E: A ModR/M byte follows the opcode and specifies the operand. The operand is either a general-purpose register or a memory address. If it is a memory address, the address is computed from a segment register and any of the following values: a base register, an index register, a displacement.
  • b: Byte argument.
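One way to read an entry such as "ADD Eb,Gb" is to split each operand code into its two letters: the upper-case letter is the addressing method and the lower-case letter is the operand type. The sketch below is only an illustration of that reading. The E and b descriptions paraphrase the list above; the G and v descriptions are taken from the usual opcode-map key (the reg field of ModR/M, and word/doubleword by operand size) and are not quoted from the linked page. The names ADDRESSING, OPERAND_TYPE and describe are made up for this example.

```python
# Addressing-method letters (first character of an operand code).
ADDRESSING = {
    "E": "ModR/M r/m field: general-purpose register or memory operand",
    "G": "ModR/M reg field: general-purpose register",  # assumed from the usual map key
}

# Operand-type letters (second character of an operand code).
OPERAND_TYPE = {
    "b": "byte",
    "v": "word or doubleword, depending on operand size",  # assumed from the usual map key
}

def describe(operand_code):
    """Split a two-letter operand code such as 'Eb' and describe each half."""
    method, size = operand_code[0], operand_code[1:]
    return f"{operand_code}: {ADDRESSING[method]}; size: {OPERAND_TYPE[size]}"

# The two map entries quoted in the question: 00: ADD Eb,Gb and 01: ADD Ev,Gv
for code in ("Eb", "Gb", "Ev", "Gv"):
    print(describe(code))
```

So both operands of opcode 00 and 01 come out of the same ModR/M byte: E is whatever the mod and r/m fields select, G is whatever the reg field selects.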

But that's a bit too vague. How do you actually translate it into a "complete opcode" (the whole instruction plus its arguments, as machine code)? I haven't been able to figure it out from the Intel manuals either; maybe I'm looking in the wrong place (and they're a bit overwhelming). Seeing a snippet showing the output bytes for an input instruction (and how you got them) would be super helpful.


1 Reply


By all means, use the Intel manuals. For each instruction they give the machine code, and chapter 2 has a very detailed description of the instruction format.

But to give you a walkthrough, let's encode ADD EDX, [EBX+ECX*4+15h]. First we read through chapter 2 INSTRUCTION FORMAT and section 3.1 INTERPRETING THE INSTRUCTION REFERENCE PAGES to get an idea of what we will see. We are especially interested in the abbreviations listed at 3.1.1.3 Instruction Column in the Opcode Summary Table.

Armed with that information, we turn to the page describing the ADD instruction and try to identify the form that matches the one we want to encode. Our first operand is a 32-bit register and the second is a 32-bit memory location, so let's see what matches that. It's going to be the penultimate line: 03 /r ADD r32, r/m32. We go back to section 3.1.1.1 Opcode Column in the Instruction Summary Table (Instructions without VEX prefix) to see what that magical /r is: "Indicates that the ModR/M byte of the instruction contains a register operand and an r/m operand."

Okay, so Figure 2-1. Intel 64 and IA-32 Architectures Instruction Format showed us how the instruction will look. So far we know that we won't have any prefixes, the opcode will be 03, and we will need at least a ModR/M byte. So let's go see how to figure that out. Look at Table 2-2. 32-Bit Addressing Forms with the ModR/M Byte. The columns represent the register operand, the rows the memory operand. Since our register is EDX we use the 3rd column.

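The memory operand is [EBX+ECX*4+15h], which can be encoded using either an 8-bit or a 32-bit displacement. To get shorter code we will use the 8-bit version, so the row [--][--]+disp8 applies. In terms of bit fields that is mod=01 (disp8 follows), reg=010 (EDX) and r/m=100 (a SIB byte follows), so our ModR/M byte is 01 010 100 in binary, which is 54 in hex.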
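The table lookup is really just packing three bit fields into one byte. A minimal sketch of that arithmetic, assuming the standard layout of mod in bits 7-6, reg in bits 5-3 and r/m in bits 2-0 (the helper name modrm and the REG32 table are made up for this example):

```python
# Standard 32-bit register numbers as used in the reg and r/m fields.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def modrm(mod, reg, rm):
    """Pack mod (2 bits), reg (3 bits) and r/m (3 bits) into a ModR/M byte."""
    return (mod << 6) | (reg << 3) | rm

# mod=01: disp8 follows; reg=EDX; r/m=100: a SIB byte follows.
byte = modrm(0b01, REG32["edx"], 0b100)
print(f"{byte:02X}")  # prints 54
```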

We will need a SIB byte too. Those are listed in Table 2-3. 32-Bit Addressing Forms with the SIB Byte. Since our base is EBX we use column 4, and the row for [ECX*4], which gives us our SIB byte of 8B (scale=10 for *4, index=001 for ECX, base=011 for EBX).
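The SIB byte is packed the same way as ModR/M. A small sketch, assuming the standard layout of scale in bits 7-6, index in bits 5-3 and base in bits 2-0 (sib, SCALE_BITS and REG32 are names invented for this example):

```python
# Standard 32-bit register numbers as used in the index and base fields.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

# The scale factor is stored as log2(scale) in the top two bits.
SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}

def sib(scale, index, base):
    """Pack scale (2 bits), index (3 bits) and base (3 bits) into a SIB byte."""
    return (SCALE_BITS[scale] << 6) | (REG32[index] << 3) | REG32[base]

# [EBX + ECX*4]: scale=4, index=ECX, base=EBX
print(f"{sib(4, 'ecx', 'ebx'):02X}")  # prints 8B
```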

Finally we add our 8-bit displacement byte, which is 15 (hex). The complete instruction is thus 03 54 8B 15. We can verify this with an assembler:

2 00000000 03548B15                add edx, [ebx+ecx*4+15h]
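Putting the steps together, here is a rough sketch of an encoder for just this one form, ADD r32, [base+index*scale+disp8]. It is not a general assembler: it assumes 32-bit mode with no prefixes, always emits a SIB byte, and only accepts displacements that fit in a signed byte. The function name encode_add_r32_mem and its parameters are invented for this illustration.

```python
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}

def encode_add_r32_mem(dst, base, index, scale, disp):
    """Encode ADD dst, [base + index*scale + disp8] using opcode 03 /r."""
    if index == "esp":
        raise ValueError("ESP cannot be used as an index register")
    if not -128 <= disp <= 127:
        raise ValueError("this sketch only handles 8-bit displacements")
    opcode = 0x03                                   # ADD r32, r/m32
    modrm = (0b01 << 6) | (REG32[dst] << 3) | 0b100  # mod=01 disp8, r/m=100 SIB
    sib = (SCALE_BITS[scale] << 6) | (REG32[index] << 3) | REG32[base]
    return bytes([opcode, modrm, sib, disp & 0xFF])  # disp8 as two's complement

code = encode_add_r32_mem("edx", "ebx", "ecx", 4, 0x15)
print(code.hex(" ").upper())  # prints 03 54 8B 15
```

The result matches the bytes in the assembler listing above.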


...