c - Why does GCC pad functions with NOPs?

Question

Welcome To Ask or Share your Answers For Others

c - Why does GCC pad functions with NOPs?

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

c - Why does GCC pad functions with NOPs?

I've been working with C for a short while and very recently started to get into ASM. When I compile a program:

int main(void)
  {
  int a = 0;
  a += 1;
  return 0;
  }

The objdump disassembly has the code, but nops after the ret:

...
08048394 <main>:
 8048394:       55                      push   %ebp
 8048395:       89 e5                   mov    %esp,%ebp
 8048397:       83 ec 10                sub    $0x10,%esp
 804839a:       c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%ebp)
 80483a1:       83 45 fc 01             addl   $0x1,-0x4(%ebp)
 80483a5:       b8 00 00 00 00          mov    $0x0,%eax
 80483aa:       c9                      leave  
 80483ab:       c3                      ret    
 80483ac:       90                      nop
 80483ad:       90                      nop
 80483ae:       90                      nop
 80483af:       90                      nop
...

From what I learned nops do nothing, and since after ret wouldn't even be executed.

My question is: why bother? Couldn't ELF(linux-x86) work with a .text section(+main) of any size?

I'd appreciate any help, just trying to learn.

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-16T22:31:09+0000

First of all, gcc doesn't always do this. The padding is controlled by -falign-functions, which is automatically turned on by -O2 and -O3:

-falign-functions
-falign-functions=n

Align the start of functions to the next power-of-two greater than n, skipping up to n bytes. For instance, -falign-functions=32 aligns functions to the next 32-byte boundary, but -falign-functions=24 would align to the next 32-byte boundary only if this can be done by skipping 23 bytes or less.

-fno-align-functions and -falign-functions=1 are equivalent and mean that functions will not be aligned.

Some assemblers only support this flag when n is a power of two; in that case, it is rounded up.

If n is not specified or is zero, use a machine-dependent default.

Enabled at levels -O2, -O3.

There could be multiple reasons for doing this, but the main one on x86 is probably this:

Most processors fetch instructions in aligned 16-byte or 32-byte blocks. It can be advantageous to align critical loop entries and subroutine entries by 16 in order to minimize the number of 16-byte boundaries in the code. Alternatively, make sure that there is no 16-byte boundary in the first few instructions after a critical loop entry or subroutine entry.

(Quoted from "Optimizing subroutines in assembly language" by Agner Fog.)

edit: Here is an example that demonstrates the padding:

// align.c
int f(void) { return 0; }
int g(void) { return 0; }

When compiled using gcc 4.4.5 with default settings, I get:

align.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <f>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   b8 00 00 00 00          mov    $0x0,%eax
   9:   c9                      leaveq 
   a:   c3                      retq   

000000000000000b <g>:
   b:   55                      push   %rbp
   c:   48 89 e5                mov    %rsp,%rbp
   f:   b8 00 00 00 00          mov    $0x0,%eax
  14:   c9                      leaveq 
  15:   c3                      retq

Specifying -falign-functions gives:

align.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <f>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   b8 00 00 00 00          mov    $0x0,%eax
   9:   c9                      leaveq 
   a:   c3                      retq   
   b:   eb 03                   jmp    10 <g>
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop

0000000000000010 <g>:
  10:   55                      push   %rbp
  11:   48 89 e5                mov    %rsp,%rbp
  14:   b8 00 00 00 00          mov    $0x0,%eax
  19:   c9                      leaveq 
  1a:   c3                      retq

Categories

c - Why does GCC pad functions with NOPs?

c - Why does GCC pad functions with NOPs?

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags