Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


assembly - Why does the lw instruction's second argument take in both an offset and regSource?

So the lw instruction is in the following format: lw RegDest, Offset(RegSource). Why does the second argument take in both an offset and register source? Why not only one (i.e. only register source)?


1 Reply


Because what else are you going to do with the rest of the 32-bit instruction word? (Assuming you're the CPU architect designing the MIPS instruction set).

It lets LUI + LW load from any arbitrary 32-bit address in 2 instructions instead of 3. For loop unrolling or struct pointer->member access, it also avoids ADDIU instructions for pointer math. In other words, spending that much coding space on LW/SW lets MIPS programs be more efficient. Sometimes you only need 0($reg), but other times it would be a waste of instructions to compute the final address in a register first.

Leaving out the 16-bit immediate displacement can't make the instruction shorter. MIPS is a RISC with fixed-length instruction words. (It could be R-type instead of I-type, but you'd still have unused bits in that format. Classic MIPS had lots of unused coding space, and spending coding space on LW/SW, LB/LBU/SB, and so on, is worth it.)
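To make the fixed-width point concrete, here is a minimal C sketch of how an I-type word is packed. It assumes the standard MIPS32 I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate) and the LW opcode 0x23; the function name encode_lw is made up for illustration and is not part of any real toolchain.

```c
#include <stdint.h>

/* Hypothetical helper: pack the fields of "lw rt, offset(rs)" into a
 * 32-bit MIPS I-type instruction word.
 * Layout assumed: opcode[31:26] rs[25:21] rt[20:16] imm[15:0]. */
uint32_t encode_lw(unsigned rt, unsigned rs, int16_t offset)
{
    const uint32_t OPCODE_LW = 0x23;          /* 100011 in binary */
    return (OPCODE_LW << 26)
         | ((uint32_t)(rs & 0x1F) << 21)      /* base register */
         | ((uint32_t)(rt & 0x1F) << 16)      /* destination register */
         | (uint16_t)offset;                  /* 16-bit displacement */
}
```

With $t0 as register 8 and $t1 as register 9, encode_lw(9, 8, 4) gives 0x8D090004, the encoding of lw $t1, 4($t0). Dropping the offset would not shrink anything: the word is still 32 bits, and the low 16 bits would simply have nothing to do.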

MIPS doesn't have a lot of different opcodes (especially classic MIPS without any FPU instructions, and without 64-bit instructions). It uses a lot of the instruction coding space to support an immediate form for most instructions, with a large immediate. ARM32, by contrast, uses 4 bits in each instruction for predicated execution, and more bits for a "flexible" second source operand: a register with an optional rotate or shift by a constant or by another register, or an immediate constant. ARM immediates are encoded as 8 bits with a rotation, allowing lots of useful bit patterns that are common in real life.


MIPS only has one addressing mode, and imm16(reg) can save a significant number of addiu instructions vs. just (reg).

For example, consider a C function that loads from or stores to a static (or global) variable, like this:

unsigned rng(void) {
    static unsigned seed = 1234;
    return (seed = seed * 5678 + 0x1234);
}

The compiler-generated (or hand-written) asm needs to load and store seed, so you need its address in a register. But that address is a 32-bit constant, which doesn't fit in a single instruction. In hand-written asm you'd probably use a pseudo-instruction like la $t0, rng.seed, which will assemble to lui $t0, hi(rng.seed) / ori $t0, $t0, lo(rng.seed). (hi and lo each get one 16-bit half of the 32-bit address.)

But you can do better than that:

lui   $t0, hi(rng.seed)
lw    $t1, lo(rng.seed)($t0)

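i.e. use the low 16 bits of the address as the 16-bit displacement in the load instruction. (The displacement is sign-extended, so when the low half has its top bit set, the high half has to be one larger to compensate; the GNU assembler's %hi and %lo operators take care of that.) This is in fact what compilers like gcc do: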

rng:    # gcc5.4 -O3
    lui     $5,%hi(seed.1482)
    lw      $4,%lo(seed.1482)($5)
    nop                       # classic MIPS has a 1-cycle "shadow" for loads before the result is usable, with no pipeline interlock
    sll     $3,$4,5          # I should have picked a simpler multiply constant (with fewer bits set)
    sll     $2,$4,3
    subu    $2,$3,$2
    sll     $3,$2,3
    subu    $2,$3,$2
    subu    $2,$2,$4
    sll     $3,$2,4
    addu    $2,$2,$3
    sll     $2,$2,1
    addiu   $2,$2,4660
    j       $31
    sw      $2,%lo(seed.1482)($5)       # branch-delay slot

seed.1482:
    .word   1234
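The high/low split in the listing above can be sketched in C. This is a minimal model, assuming the GNU-assembler-style convention in which the high half is adjusted to compensate for the load's sign-extended displacement; the function names hi16, lo16 and effective_address are illustrative only.

```c
#include <stdint.h>

/* Low half as the load sees it: a sign-extended 16-bit displacement. */
int32_t lo16(uint32_t addr)
{
    int32_t lo = (int32_t)(addr & 0xFFFFu);
    if (lo >= 0x8000)
        lo -= 0x10000;                 /* top bit set: value is negative */
    return lo;
}

/* High half, adjusted so that (hi << 16) + lo16(addr) == addr. */
uint32_t hi16(uint32_t addr)
{
    return ((addr + 0x8000u) >> 16) & 0xFFFFu;
}

/* What lui + lw compute between them: base register plus displacement. */
uint32_t effective_address(uint32_t hi, int32_t lo)
{
    return (hi << 16) + (uint32_t)lo;
}
```

For an address such as 0x10018004, the low half is 0x8004, which sign-extends to a negative displacement, so the high half becomes 0x1002 rather than 0x1001. Adding the two back together still gives 0x10018004.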

There are lots of other uses for small immediate displacements from a register. For example:

  • Accessing locals on the stack if the compiler spills anything.
  • Struct fields.
  • Array access in an unrolled loop. (MIPS has 32 integer registers, and is pretty much designed for unrolling and software-pipelining loops.)
  • Small compile-time-constant array indices.
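Two of the uses listed above, struct fields and unrolled array access, can be modeled in C. This is a rough sketch rather than what any particular compiler emits: load_word is a hypothetical stand-in for lw dest, disp(base), and struct point is an invented example type whose three 32-bit fields sit at offsets 0, 4 and 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative struct; with three 32-bit fields the offsets are 0, 4, 8. */
struct point { uint32_t x, y, z; };

/* Model of "lw dest, disp(base)": read a 32-bit word at base + disp. */
uint32_t load_word(const void *base, ptrdiff_t disp)
{
    uint32_t value;
    memcpy(&value, (const unsigned char *)base + disp, sizeof value);
    return value;
}

/* p->z: the field offset is a compile-time constant, so it can serve as
 * the displacement instead of needing a separate add. */
uint32_t get_z(const struct point *p)
{
    return load_word(p, offsetof(struct point, z));
}

/* Loop unrolled by 4: one pointer, four constant displacements, and one
 * pointer increment per iteration. Assumes n is a multiple of 4. */
uint32_t sum_unrolled(const uint32_t *a, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i += 4, a += 4) {
        sum += load_word(a, 0);
        sum += load_word(a, 4);
        sum += load_word(a, 8);
        sum += load_word(a, 12);
    }
    return sum;
}
```

With only a (reg) addressing mode, each of those four loads would need its own pointer add first. With imm16(reg), the loop body needs just one pointer update per four elements.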

As I said, there isn't much else you could do with those extra 16 bits of the instruction word that would be a good fit for MIPS. You could leave fewer than 16 bits for the displacement, but MIPS isn't PowerPC (where there are lots and lots of opcodes).

