Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c - How does kernel get an executable binary file running under linux?

How does the kernel get an executable binary file running under Linux?

It seems like a simple question, but can anyone help me dig deeper? How is the file loaded into memory, and how does code execution get started?

Can anyone explain what happens, step by step?


Fight dragons for too long and you become a dragon yourself; gaze into the abyss for too long and the abyss will gaze back…

1 Reply


Best moments of the exec system call on Linux 4.0

The best way to find all of this out is to step-debug the kernel with GDB under QEMU; see: How to debug the Linux kernel with GDB and QEMU?
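
Before following the kernel side, it may help to see the user-space call that triggers everything traced below. This is only a minimal sketch: run_and_wait is a hypothetical helper name, the empty environment is an arbitrary choice, and the exit code 127 for a failed exec is just the usual shell convention, not something the kernel mandates.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with `argv` in a child process and
 * return the child's exit status, or -1 if it could not be obtained. */
int run_and_wait(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {
        char *const envp[] = { NULL };  /* assumption: empty environment */
        /* This is the call that enters the kernel at the execve system call.
         * On success it never returns: the process image is replaced. */
        execve(path, argv, envp);
        _exit(127);                     /* reached only if execve failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_and_wait("/bin/sh", argv) with argv set to { "sh", "-c", "exit 7", NULL } should return 7 on a typical Linux system; setting a breakpoint on do_execve in the kernel debugger while such a program runs is one way to land at the start of the call chain below.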

  • fs/exec.c defines the system call at SYSCALL_DEFINE3(execve

    Simply forwards to do_execve.

  • do_execve

    Forwards to do_execveat_common.

  • do_execveat_common

    To find the next major function, track when return value retval is last modified.

    Starts building a struct linux_binprm *bprm to describe the program, and passes it to exec_binprm to execute.

  • exec_binprm

    Once again, follow the return value to find the next major call.

  • search_binary_handler

    • Handlers are determined by the first magic bytes of the executable.

      The two most common handlers are those for interpreted files (#! magic) and for ELF (\x7fELF magic), but there are others built into the kernel, e.g. a.out. Users can also register their own through /proc/sys/fs/binfmt_misc.

      The ELF handler is defined at fs/binfmt_elf.c.

      See also: Why do people write the #!/usr/bin/env python shebang on the first line of a Python script?

    • The formats list contains all the handlers.

      Each handler file contains something like:

      static int __init init_elf_binfmt(void)
      {
          register_binfmt(&elf_format);
          return 0;
      }
      

      and elf_format is a struct linux_binfmt defined in that file.

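      __init is magic: it puts that code into a special section whose memory the kernel frees once boot is over. The call at startup is arranged by an initcall macro next to the function (core_initcall(init_elf_binfmt); in fs/binfmt_elf.c), which places a pointer to it in a section that the kernel walks when it starts: What does __init mean in the Linux kernel code?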

      Linker-level dependency injection!

    • There is also a recursion counter, in case an interpreter executes itself infinitely.

      Try this:

      echo '#!/tmp/a' > /tmp/a
      chmod +x /tmp/a
      /tmp/a
      
    • Once again we follow the return value to see what comes next, and see that it comes from:

      retval = fmt->load_binary(bprm);
      

      where load_binary is defined for each handler on the struct: C-style polymorphism.
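
The handler lookup described above can be sketched in user space roughly as follows. This is not the kernel's code: struct binfmt, load_script, load_elf and find_handler are hypothetical stand-ins for struct linux_binfmt and its handlers, and the real search_binary_handler also deals with locking, the recursion counter, and module loading.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for struct linux_binfmt: one function pointer per format. */
struct binfmt {
    const char *name;
    /* Returns 0 if the handler accepts the file, -ENOEXEC otherwise. */
    int (*load_binary)(const unsigned char *buf, size_t len);
};

static int load_script(const unsigned char *buf, size_t len)
{
    return (len >= 2 && memcmp(buf, "#!", 2) == 0) ? 0 : -ENOEXEC;
}

static int load_elf(const unsigned char *buf, size_t len)
{
    /* The literal is split so that 'E' is not parsed as part of \x7f. */
    return (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0) ? 0 : -ENOEXEC;
}

/* Stand-in for the kernel's formats list. */
static const struct binfmt formats[] = {
    { "script", load_script },
    { "elf",    load_elf    },
};

/* Try each handler in turn; return the name of the first one that does
 * not answer -ENOEXEC, or NULL if no handler recognizes the file. */
const char *find_handler(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < sizeof formats / sizeof formats[0]; i++) {
        int retval = formats[i].load_binary(buf, len);  /* dispatch via pointer */
        if (retval != -ENOEXEC)
            return formats[i].name;
    }
    return NULL;
}
```

The point of the design is that the loop knows nothing about any particular format: adding a new one only means adding an entry to the table, which is what register_binfmt does for the kernel's list.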

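  • fs/binfmt_elf.c:load_binary

    Does the actual work. For ELF, the load_binary field points to load_elf_binary, which parses the ELF headers, maps the program's segments into memory, and sets up the initial register state in struct pt_regs.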

  • Eventually the scheduler decides to run the process, and it must then jump to the PC address stored in struct pt_regs while also moving to a less privileged CPU state such as Ring 3 / EL0; see: What are Ring 0 and Ring 3 in the context of operating systems?

    The scheduler is woken up periodically by timer hardware that generates interrupts at a rate configured earlier by the kernel, for example the old x86 PIT or the ARM timer. The kernel also registers handlers that run the scheduler code when the timer interrupts fire.
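
For a statically linked, non-PIE ELF, the PC address mentioned above comes from the e_entry field of the ELF header. The sketch below reads that header from a buffer in user space, using the definitions in <elf.h>. read_elf64_header is a hypothetical helper, and the code assumes a 64-bit ELF whose byte order matches the host; the kernel's loader performs many more checks than this.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the ELF64 header out of `buf` into `out`.
 * Returns 0 on success, -1 if the buffer is too short, lacks the
 * \x7fELF magic, or is not a 64-bit ELF.
 * Assumption: the file's byte order matches the host's. */
int read_elf64_header(const unsigned char *buf, size_t len, Elf64_Ehdr *out)
{
    assert(buf != NULL && out != NULL);

    if (len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);      /* copy to avoid unaligned reads */

    if (memcmp(out->e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                      /* not an ELF file */
    if (out->e_ident[EI_CLASS] != ELFCLASS64)
        return -1;                      /* this sketch handles ELF64 only */
    return 0;
}
```

After a successful call, out->e_entry holds the entry point address and out->e_type tells ET_EXEC from ET_DYN. When the executable requests a dynamic loader, the first user-space instruction executed belongs to that loader rather than to e_entry of the program itself.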

TODO: continue source analysis further. What I expect to happen next:

  • the kernel parses the INTERP header of the ELF to find the dynamic loader (usually set to /lib64/ld-linux-x86-64.so.2).
  • if it is present:
    • the kernel mmaps the dynamic loader and the ELF to be executed into memory
    • the dynamic loader is started, taking a pointer to the ELF in memory.
    • now in userland, the loader somehow parses the ELF headers to find the required libraries, and does dlopen on them
    • dlopen uses a configurable search path to find those libraries (ldd and friends), mmaps them into memory, and somehow informs the ELF where to find its missing symbols
    • the loader calls the _start of the ELF
  • otherwise, the kernel loads the executable into memory directly without the dynamic loader.

    It must therefore in particular check whether the executable is PIE or not, and if it is, place it in memory at a random location: What is the -fPIE option for position-independent executables in gcc and ld?
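
The first expected step, finding the dynamic loader, can be sketched in user space by scanning the program headers for a PT_INTERP entry. elf64_interp is a hypothetical helper, and the code assumes a 64-bit ELF in host byte order that is already fully in memory; it illustrates the file format, not the kernel's implementation.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the interpreter path stored
 * inside `buf`, or NULL if there is no well-formed PT_INTERP segment. */
const char *elf64_interp(const unsigned char *buf, size_t len)
{
    Elf64_Ehdr eh;

    assert(buf != NULL);
    if (len < sizeof eh)
        return NULL;
    memcpy(&eh, buf, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr))
        return NULL;

    /* Make sure the whole program header table lies inside the buffer. */
    if (eh.e_phoff > len)
        return NULL;
    if (eh.e_phnum > (len - (size_t)eh.e_phoff) / sizeof(Elf64_Phdr))
        return NULL;

    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        memcpy(&ph, buf + eh.e_phoff + i * sizeof ph, sizeof ph);
        if (ph.p_type != PT_INTERP)
            continue;
        /* The path must be inside the buffer and NUL-terminated. */
        if (ph.p_filesz == 0 || ph.p_offset > len ||
            ph.p_filesz > len - ph.p_offset)
            return NULL;
        const char *path = (const char *)buf + ph.p_offset;
        return path[ph.p_filesz - 1] == '\0' ? path : NULL;
    }
    return NULL;    /* no PT_INTERP: no dynamic loader requested */
}
```

For the PIE question, the e_type field of the same header is the relevant hint: position-independent executables are marked ET_DYN (as are shared libraries), while traditional fixed-address executables are marked ET_EXEC.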

