Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
197 views
in Technique[技术] by (71.8m points)

c - How gcc optimize sum from 1 to 100 into 5050

I compiled following c program with gcc -O3 -S a.c:

#include <stdio.h>

int main() {
    int sum = 0;
    for (int i = 0; i <= 100; i++) {
        sum += i;
    }
    printf("%d", sum);
}

The generated assembly code is as follow:

    .section    __TEXT,__text,regular,pure_instructions
    .build_version macos, 10, 15    sdk_version 10, 15, 4
    .globl  _main                   ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
    .cfi_startproc
## %bb.0:
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    leaq    L_.str(%rip), %rdi
    movl    $5050, %esi             ## imm = 0x13BA
    xorl    %eax, %eax
    callq   _printf
    xorl    %eax, %eax
    popq    %rbp
    retq
    .cfi_endproc
                                        ## -- End function
    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz  "%d"

.subsections_via_symbols

As if GCC ran the code and notice that the loop times are determined and GCC replaced the whole calculating with the result 5050.

movl $5050, %esi

How does gcc do this kind of optimization? What's the academic name of this kind of optimization?

question from:https://stackoverflow.com/questions/65896918/how-gcc-optimize-sum-from-1-to-100-into-5050

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I was trying to find the specific flag, but with no luck. I was able to reproduce your result with only -O1 so what I did after that was manually adding all the flags that are enabled by -O1. But when I did, I could not reproduce the result.

And as can be read in the documentation

Not all optimizations are controlled directly by a flag. Only optimizations that have a flag are listed in this section.

So I assume this is done by something that -O1 adds that does not have a flag.

Here is what I tried:

$ gcc -fauto-inc-dec  -fbranch-count-reg  -fcombine-stack-adjustments 
-fcompare-elim  -fcprop-registers  -fdce  -fdefer-pop  -fdelayed-branch  -fdse  
-fforward-propagate  -fguess-branch-probability  -fif-conversion 
-fif-conversion2  -finline-functions-called-once  -fipa-profile  
-fipa-pure-const  -fipa-reference  -fipa-reference-addressable 
-fmerge-constants  -fmove-loop-invariants  -fomit-frame-pointer 
-freorder-blocks  -fshrink-wrap  -fshrink-wrap-separate  -fsplit-wide-types  
-fssa-backprop  -fssa-phiopt  -ftree-bit-ccp  -ftree-ccp  -ftree-ch 
-ftree-coalesce-vars  -ftree-copy-prop  -ftree-dce  -ftree-dominator-opts 
-ftree-dse  -ftree-forwprop  -ftree-fre  -ftree-phiprop  -ftree-pta 
-ftree-scev-cprop  -ftree-sink  -ftree-slsr  -ftree-sra  -ftree-ter 
-funit-at-a-time -S k.c 
cc1: warning: this target machine does not have delayed branches
$ grep "5050" k.s
$ gcc -O1 -S k.c
$ grep "5050" k.s
    movl    $5050, %esi
$ 

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...