Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
201 views
in Technique[技术] by (71.8m points)

c++ - Strange behaviour of ldr [pc, #value]

I was debugging some c++ code (WinCE 6 on ARM platform), and i find some behavior strange:

    4277220C    mov         r3, #0x93, 30
    42772210    str         r3, [sp]
    42772214    ldr         r3, [pc, #0x69C]
    42772218    ldr         r2, [pc, #0x694]
    4277221C    mov         r1, #0
    42772220    ldr         r0, [pc, #0x688]

Line 42772214 ldr r3, [pc, #0x69C] is used to get some constant from .DATA section, at least I think so.

What is strange that according to the code r2 should be filled with memory from address pc=0x42772214 + 0x69C = 0x427728B0, but according to the memory contents it's loaded from 0x427728B8 (8bytes+), it happens for other ldr usages too.

Is it fault of the debugger or my understanding of ldr/pc? Another issue I don't get - why access to the .data section is relative to the executed code? I find it little bit strange.

And one more issue: i cannot find syntax of the 1st mov command (any one could point me a optype specification for the Thumb (1C2))

Sorry for the laic description, but I'm just familiarizing with the assemblies.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This is correct. When pc is used for reading there is an 8-byte offset in ARM mode and 4-byte offset in Thumb mode.

From the ARM-ARM:

When an instruction reads the PC, the value read depends on which instruction set it comes from:

  • For an ARM instruction, the value read is the address of the instruction plus 8 bytes. Bits [1:0] of this value are always zero, because ARM instructions are always word-aligned.
  • For a Thumb instruction, the value read is the address of the instruction plus 4 bytes. Bit [0] of this value is always zero, because Thumb instructions are always halfword-aligned.

This way of reading the PC is primarily used for quick, position-independent addressing of nearby instructions and data, including position-independent branching within a program.

There are 2 reasons for pc-relative addressing.

  1. Position-independent code, which is in your case.
  2. Get some complicated constants nearby which cannot be written in 1 simple instruction, e.g. mov r3, #0x12345678 is impossible to complete in 1 instruction, so the compiler may put this constant in the end of the function and use e.g. ldr r3, [pc, #0x50] to load it instead.

I don't know what mov r3, #0x93, 30 means. Probably it is mov r3, #0x93, rol 30 (which gives 0xC0000024)?


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...