Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


macos - What is this assembly function prologue / epilogue code doing with rbp / rsp / leave?

I am just starting to learn assembly on the Mac, using GCC to assemble my code. Unfortunately, there are VERY limited resources for learning how to do this if you are a beginner. I finally managed to find some simple sample code that I could start to wrap my head around, and I got it to assemble and run correctly. Here is the code:

.text                                           # start of code indicator.
.globl _main                                    # make the main function visible to the outside.
_main:                                          # actually label this spot as the start of our main function.
    push    %rbp                            # save the base pointer to the stack.
    mov     %rsp, %rbp                      # put the previous stack pointer into the base pointer.
    subl    $8, %esp                        # Balance the stack onto a 16-byte boundary.
    movl    $0, %eax                        # Stuff 0 into EAX, which is where result values go.
    leave                                   # leave cleans up base and stack pointers again.
    ret

The comments explain some things in the code (I kind of understand what lines 2 - 5 do), but I don't understand what most of this means. I do understand the basics of what registers are, what each register here (rbp, rsp, esp and eax) is used for and how big they are, and I also understand (generally) what the stack is, but this is still going over my head. Can anyone tell me exactly what this is doing? Also, could anyone point me in the direction of a good tutorial for beginners?



1 Reply


A stack is a data structure that follows the LIFO (last in, first out) principle. Whereas stacks in everyday life (outside computers, I mean) grow upward, the stack on x86 and x86-64 processors grows downward, toward lower addresses. See the Wikibooks article on the x86 stack (but please take into account that its code examples are 32-bit x86 code in Intel syntax, whereas your code is 64-bit x86-64 code in AT&T syntax).

So, here is what your code does (the equivalent instructions in my explanations are written in Intel syntax):

push %rbp

Pushes rbp onto the stack: it subtracts 8 from rsp (because rbp is 8 bytes wide) and then stores rbp at [ss:rsp].

So, in Intel syntax, push rbp practically does this:

sub rsp, 8
mov [ss:rsp], rbp
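
The two-step view of push can be checked with a toy model. The following is only a sketch in Python, not real machine state: registers are plain integers, memory is a dictionary keyed by address, and the starting values of rsp and rbp are made up for illustration.

```python
# Toy model of `push rbp`. Registers are ints, memory is a dict keyed by address.
# The helper name `push` and the register values below are illustrative assumptions.

def push(regs, memory, reg_name):
    regs["rsp"] -= 8                      # sub rsp, 8
    memory[regs["rsp"]] = regs[reg_name]  # mov [rsp], reg

regs = {"rsp": 0x7FFF5FBFF8A8, "rbp": 0x7FFF5FBFF8C0}  # made-up values
memory = {}

push(regs, memory, "rbp")

print(hex(regs["rsp"]))          # 0x7fff5fbff8a0 (8 lower than before)
print(hex(memory[regs["rsp"]]))  # 0x7fff5fbff8c0 (the saved rbp)
```

The stack pointer moves down first and the value is written afterwards, so rsp always points at the most recently pushed item.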

Then:

mov     %rsp, %rbp

This one is simple: it copies the value of rsp into rbp, so rbp now marks the base of this function's stack frame.

subl    $8, %esp

Subtracts 8 from esp and stores the result in esp. This is actually a bug in your code, even though it causes no problems here. In x86-64, any instruction with a 32-bit register (eax, ebx, ecx, edx, ebp, esp, esi or edi) as its destination sets the upper 32 bits of the corresponding 64-bit register (rax, rbx, rcx, rdx, rbp, rsp, rsi or rdi) to zero. The stack pointer therefore ends up pointing somewhere below the 4 GiB mark, as if you had done this (in Intel syntax):

sub rsp,8
and rsp,0x00000000ffffffff

Edit: added consequences of sub esp,8 below.

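This happens to cause no problems here, because nothing reads or writes memory through rsp between this instruction and leave, and leave (further below in your code) restores a sane value to rsp. It is still a real bug, and it does not depend on how much RAM is installed: rsp holds a virtual address, and the stack of a 64-bit process normally lives far above the 4 GiB mark, so the truncated pointer no longer points into the stack. Any push, call or other stack access before leave would most likely result in a segmentation fault. In x86-64 code you almost never need esp (apart from, possibly, some optimizations or tweaks). To fix this bug: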

subq    $8, %rsp
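
The difference between the two instructions comes down to plain integer arithmetic. Here is a minimal sketch in Python, assuming a made-up starting value for rsp; the function names sub_esp and sub_rsp are hypothetical helpers for illustration, not anything from an assembler or library.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def sub_esp(rsp, amount):
    """Model `subl $amount, %esp`: 32-bit subtract, result zero-extended into rsp."""
    return ((rsp & MASK32) - amount) & MASK32

def sub_rsp(rsp, amount):
    """Model `subq $amount, %rsp`: full 64-bit subtract."""
    return (rsp - amount) & MASK64

rsp = 0x7FFF5FBFF8A0  # made-up stack pointer above the 4 GiB mark

buggy = sub_esp(rsp, 8)
fixed = sub_rsp(rsp, 8)

print(hex(buggy))  # 0x5fbff898 (upper 32 bits lost)
print(hex(fixed))  # 0x7fff5fbff898
```

With the 32-bit form the upper half of the address is simply gone, while the 64-bit form only moves the pointer down by 8.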

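The instructions so far are the standard function entry sequence, the prologue (replace $8 with however much stack space the function needs). Note that the x86-64 calling convention expects rsp to be 16-byte aligned at every call instruction, and it already is right after push %rbp, so a function that calls other functions should subtract a multiple of 16; since your function calls nothing, $8 is harmless here. Wikibooks has a useful article on x86 functions and stack frames (but note again that it uses 32-bit x86 assembly with Intel syntax, not 64-bit x86-64 assembly with AT&T syntax).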

Then:

movl    $0, %eax

Stores 0 into eax, the register that holds a function's return value (here, the value returned from main). This has nothing to do with the stack.

leave

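This is equivalent to mov rsp, rbp followed by pop rbp (in Intel syntax). It undoes the prologue: rsp goes back to where the old rbp was saved, and the old rbp is restored from there.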

ret

And this, finally, sets rip to the value stored at [ss:rsp] and adds 8 to rsp, effectively returning the instruction pointer to where this procedure was called from.
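
To see the whole sequence fit together, here is a rough trace of the function in the same spirit: a Python sketch, not real machine state. Registers are integers, memory is a dictionary, and the entry values (ENTRY_RSP, CALLER_RBP, RETURN_ADDR) as well as the helper names run_main and fresh_state are made up for illustration. The use_esp_bug flag switches between subl $8, %esp and subq $8, %rsp.

```python
MASK32 = 0xFFFFFFFF

ENTRY_RSP = 0x7FFF5FBFF8A8    # made-up rsp on entry; [rsp] holds the return address
CALLER_RBP = 0x7FFF5FBFF8C0   # made-up rbp of the caller
RETURN_ADDR = 0x100000F00     # made-up return address

def fresh_state():
    regs = {"rsp": ENTRY_RSP, "rbp": CALLER_RBP, "rax": 0xDEADBEEF, "rip": 0}
    mem = {ENTRY_RSP: RETURN_ADDR}
    return regs, mem

def run_main(regs, mem, use_esp_bug=False):
    # push %rbp
    regs["rsp"] -= 8
    mem[regs["rsp"]] = regs["rbp"]
    # mov %rsp, %rbp
    regs["rbp"] = regs["rsp"]
    # subl $8, %esp (buggy) or subq $8, %rsp (fixed)
    if use_esp_bug:
        regs["rsp"] = ((regs["rsp"] & MASK32) - 8) & MASK32
    else:
        regs["rsp"] -= 8
    # movl $0, %eax (writing eax also clears the upper half of rax)
    regs["rax"] = 0
    # leave = mov %rbp, %rsp followed by pop %rbp
    regs["rsp"] = regs["rbp"]
    regs["rbp"] = mem[regs["rsp"]]
    regs["rsp"] += 8
    # ret = pop the return address into rip
    regs["rip"] = mem[regs["rsp"]]
    regs["rsp"] += 8
    return regs

for bug in (False, True):
    regs, mem = fresh_state()
    out = run_main(regs, mem, use_esp_bug=bug)
    print(bug, hex(out["rsp"]), hex(out["rbp"]), hex(out["rip"]), out["rax"])
```

Both runs end in the same state: rbp is back to the caller's value, rip holds the return address, and rsp sits 8 above its entry value. In this model, that is why the truncated rsp goes unnoticed: leave overwrites rsp from rbp before anything uses it.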

