Because what else are you going to do with the rest of the 32-bit instruction word? (Assuming you're the CPU architect designing the MIPS instruction set).
It lets LUI + LW load from any arbitrary 32-bit address in 2 instructions instead of 3. For loop unrolling or struct pointer->member access, it also avoids ADDIU instructions for pointer math. i.e. spending that amount of coding space on LW/SW allows MIPS programs to be more efficient. Sometimes you only need 0($reg), but other times it would be a waste of instructions to compute the final address in a register.
Leaving out the 16-bit immediate displacement can't make the instruction shorter. MIPS is a RISC with fixed-length instruction words. (It could be R-type instead of I-type, but you'd still have unused bits in that format. Classic MIPS had lots of unused coding space, and spending coding space on LW/SW, LB/LBU/SB, and so on, is worth it.)
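As a rough illustration of where those 16 bits sit, the sketch below packs the I-type fields used by LW/SW into a 32-bit word. The helper name encode_itype and the enum names are made up for this example; the field layout and opcode values are the standard classic-MIPS ones.

```c
#include <assert.h>
#include <stdint.h>

/* Classic MIPS I-type layout: opcode[31:26] rs[25:21] rt[20:16] imm[15:0]. */
enum { OP_LUI = 0x0F, OP_LW = 0x23, OP_SW = 0x2B };

/* Hypothetical helper (not taken from any real assembler): pack the fields. */
static uint32_t encode_itype(unsigned opcode, unsigned rs, unsigned rt, int32_t imm)
{
    assert(opcode < 64 && rs < 32 && rt < 32);
    assert(imm >= -32768 && imm <= 65535);  /* accept signed or unsigned 16-bit values */
    return ((uint32_t)opcode << 26) | ((uint32_t)rs << 21)
         | ((uint32_t)rt << 16) | ((uint32_t)imm & 0xFFFFu);
}
```

For example, encode_itype(OP_LW, 8, 9, 4) gives 0x8D090004, the encoding of lw $t1, 4($t0) ($t0 is register 8, $t1 is register 9). With a displacement of 0 the low 16 bits are simply all zero; the word is still 32 bits long.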
MIPS doesn't have a lot of different opcodes (especially classic MIPS without any FPU instructions, and without 64-bit instructions). It uses a lot of the instruction coding space to support an immediate form for most instructions, with a large immediate. (Compare ARM32, for example, which uses 4 bits in each instruction for predicated execution, and more bits for a "flexible" second source operand: either a register with an optional rotate or shift by a constant or another register, or an immediate constant. ARM immediates are only 8 bits, but they're encoded with a rotation, allowing lots of useful bit patterns that are common in real life.)
MIPS only has one addressing mode, and imm16(reg) can save a significant number of addiu instructions vs. just (reg).
For example, consider a C function that loads from and stores to a static (or global) variable, like:
unsigned rng(void) {
    static unsigned seed = 1234;
    return (seed = seed * 5678 + 0x1234);
}
The compiler-generated (or hand-written) asm needs to load and store from seed, so you need its address in a register. But that address is a 32-bit constant that doesn't fit in a single instruction. In hand-written asm you'd probably use a pseudo-instruction like la $t0, rng.seed, which will assemble to lui $t0, hi(rng.seed) / ori $t0, $t0, lo(rng.seed). (hi and lo each get half of the 32-bit address.)
But you can do better than that:
lui $t0, hi(rng.seed)
lw $t1, lo(rng.seed)($t0)
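i.e. use the low 16 bits of the address as the 16-bit displacement in the load instruction. (The displacement is sign-extended, so the assembler's %hi adds 1 to the upper half when bit 15 of the low half is set.) This is in fact what compilers like gcc do: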
rng: # gcc5.4 -O3
lui $5,%hi(seed.1482)
lw $4,%lo(seed.1482)($5)
nop # classic MIPS has a 1-cycle "shadow" (load delay slot) before a loaded value is usable, with no pipeline interlock
sll $3,$4,5 # I should have picked a simpler multiply constant (with fewer bits set)
sll $2,$4,3
subu $2,$3,$2
sll $3,$2,3
subu $2,$3,$2
subu $2,$2,$4
sll $3,$2,4
addu $2,$2,$3
sll $2,$2,1
addiu $2,$2,4660
j $31
sw $2,%lo(seed.1482)($5) # branch-delay slot
seed.1482:
.word 1234
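To make the lui + lw split concrete, here is a minimal C sketch of how the two halves can be computed so that the sign-extended displacement still lands on the right address. The function names (hi16, lo16, lui_plus_disp) are invented for this illustration; the +0x8000 adjustment is, as far as I know, what the assembler's %hi operator does for exactly this reason.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a 16-bit field, as the load/store address adder does. */
static uint32_t sign_extend16(uint16_t v)
{
    return (uint32_t)((int32_t)(v ^ 0x8000) - 0x8000);
}

/* Low half: used verbatim as the lw/sw displacement. */
static uint16_t lo16(uint32_t addr) { return (uint16_t)(addr & 0xFFFFu); }

/* High half for lui: pre-compensated for the sign extension of the low half. */
static uint16_t hi16(uint32_t addr) { return (uint16_t)((addr + 0x8000u) >> 16); }

/* The address that lui followed by a lo16 displacement ends up accessing. */
static uint32_t lui_plus_disp(uint32_t addr)
{
    uint32_t upper = (uint32_t)hi16(addr) << 16;        /* lui        */
    uint32_t ea = upper + sign_extend16(lo16(addr));    /* base + imm */
    assert(ea == addr);                                 /* the split round-trips */
    return ea;
}
```

For an address like 0x1001FFFC, lo16 is 0xFFFC (which sign-extends to -4), so hi16 has to be 0x1002 rather than 0x1001. With ori instead of a load displacement no adjustment is needed, because ori zero-extends its immediate.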
There are lots of other uses for small immediate displacements from a register. For example:
- accessing locals on the stack if the compiler spills anything
- struct fields
- array access in an unrolled loop. (MIPS has 32 integer registers, and is pretty much designed for software-pipelining to unroll loops.)
- small compile-time constant array indices.
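The struct-field and unrolled-array cases in the list above can be sketched in C like this. The struct and function are hypothetical, and the offsets in the comments assume the usual layout where uint32_t members are packed without padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type, only for illustration. */
struct sample {
    uint32_t id;         /* offset 0 */
    uint32_t flags;      /* offset 4 */
    uint32_t values[4];  /* offsets 8, 12, 16, 20 */
};

/* Every access is a fixed distance from `s`, so a compiler can encode each
   one as disp(reg) without computing any intermediate pointer. */
static uint32_t sum_values(const struct sample *s)
{
    assert(s != NULL);
    /* manually unrolled: typically four lw instructions off the same base */
    return s->values[0] + s->values[1] + s->values[2] + s->values[3];
}
```

With only (reg) addressing, each of those four loads would need its own addiu to form the address first (or one addiu between each load to bump the pointer).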
As I said, there isn't much else you could do with those extra 16 bits of the instruction word that would be a good fit for MIPS. You could leave fewer than 16 bits for the displacement, but MIPS isn't PowerPC (where there are lots and lots of opcodes).