Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
197 views
in Technique[技术] by (71.8m points)

c++ - Which is faster in a tight loop ? [swich-cast, if-else, goto-label]

I have a function: f(param), which does a single calculation based on input (param). This function supposed to be called something around 1 million (maximum) in a tight loop:

for (std::uint_fast64_t i = 0; i < 1'000'000; ++i) {
  f (param);
}

First Question:

My (first) question is what is the most effective way to write the condition section (to what to do based on param) of the f() function. I tried some options that I know:

if-else if:

static int f_3(int const param) {
  if (A::_1 == param) {
    return A::_1 * param;
  } else if (A::_2 == param) {
    return A::_2 * param;
  } // ... 
}

switch-case:

static int f(int const param) {
  switch (param) {
    case A::_1:
      return param * A::_1;
    // ...
  }
}

goto-label:

static int f_2(int const param) {
  void constexpr* const  _table[] = {
    && L1, // ... 
  };

  goto* _table[param];

L1:
  return A::_1 * param;
// ...
}

And I benchmarked it:

  • The complete code.
  • Compiler: g++ (GCC) 10.2.0
  • Compiler options: -O3 -std=c++17 -lboost_timer
  • OS: ArchLinux 5.10.8-arch1-1
int main(int argc, char* argv[]) {
  using ufast_t = std::uint_fast64_t;

  ufast_t constexpr n = 1'000'000'000ULL;
  ufast_t sum = 0;

  {
    std::cout << "switch-case :";
    boost::timer::auto_cpu_timer _f;
    for (ufast_t i = 0; i < n; ++i) {
      sum += f(argc);
    }
  }
  std::cout << "---------------------------" << std::endl;
  {
    std::cout << "goto        :";
    boost::timer::auto_cpu_timer _f2;
    for (ufast_t i = 0; i < n; ++i) {
      sum += f_2(argc);
    }
  }
  std::cout << "---------------------------" << std::endl;
  {
    std::cout << "if-elseif   :";
    boost::timer::auto_cpu_timer _f3;
    for (ufast_t i = 0; i < n; ++i) {
      sum += f_3(argc);
    }
  }
  std::cout << std::endl;

  return sum;
}

But it has (i think) a strange output:

 > ./a.out 1 2 3 4 5 6
switch-case : 1.056976s wall, 1.050000s user + 0.000000s system = 1.050000s CPU (99.3%)
---------------------------
goto        : 0.000001s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
---------------------------
if-elseif   : 0.645751s wall, 0.640000s user + 0.000000s system = 0.640000s CPU (99.1%)

 > echo $?
0

So is there any (effective) way to do that? or I just have those options in my hand and I have to select one of them based on some benchmarks?

Second Question:

Why goto in the above benchmark doesn't have any overhead? It's Undefined behavior code? or somehow the compiler removes their loops in the optimization stage?

question from:https://stackoverflow.com/questions/65844193/which-is-faster-in-a-tight-loop-swich-cast-if-else-goto-label

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I do not think, that the 3 different methods should differ that much from each other, since the compiler will probably recognize what you are doing and do the best thing. The most readable variant is the switch, but that is opinion based. It should also definitely emit a jump table as assembler, which should be the fastest option. But just trust the compiler for that.

It would be best to move the check out of the loop. I can imagine that the compiler will be able to recognize that param does not change and move the check out, itself, but that is much less certain. To make sure, you can do something like this:

template<A param>
auto foo() {
    // This function will be generated once for each enumerator of A that is included
    // in the switch below. In each variant the runtime code will not contain these
    // checks against param.
    if constexpr(param == A::_1) { 
    } else if constexpr(param == A::_2) {
    } ...
}

template<A param>
auto loop() {
    for(auto i = 0ul; i < 1'000'000; ++i) {
        foo<param>();
    }
}

int main(int argc, char** argv) {
    switch(argc) {
        case A::_1: 
            loop<A::_1>();
            break; 
        ....
    }
}

Sadly, there is to my knowledge no nice way to switch on argc.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...