Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c++ - std::remove_if GCC implementation isn't efficient?

From another question here there seems to be evidence that GCC's implementation of std::remove_if is less efficient than the following implementation:

'raw homebrew' solution:

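static char str1[100] = "str,, ing";
size_t size = sizeof(str1)/sizeof(str1[0]);

int bad = 0;  // next position to write a kept character
int cur = 0;  // current read position
while (str1[cur] != '\0') {
    // copy a kept character down only once a gap has opened
    if (bad < cur && !ispunct(str1[cur]) && !isspace(str1[cur])) {
        str1[bad] = str1[cur];
    }
    if (ispunct(str1[cur]) || isspace(str1[cur])) {
        cur++;
    } else {
        cur++;
        bad++;
    }
}
str1[bad] = '\0';  // re-terminate the shortened string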

Timing outputs:

0.106860

Sample benchmarking code for std::remove_if for a solution of the same problem:

bool is_char_category_in_question(const char& c) {
    return std::ispunct(c) || std::isspace(c);
}

std::remove_if(&str1[0], &str1[size-1], is_char_category_in_question);

Timing outputs:

1.986838

Please run the code at the ideone links above to get actual runtime results (giving the full code here would obscure the question!).

The execution times from the samples seem to confirm that the first implementation performs much better.

Can anyone explain why the std::remove_if() algorithm doesn't (or can't) provide a similarly efficient solution for the given problem?



1 Reply


Looks to me as though you're running remove_if on a range of about 100 characters, since size is 100 and the range ends at &str1[size-1], but the "homebrew" loop only runs until it finds the nul terminator (which is only 10 characters in).
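To compare like with like, remove_if can be limited to the characters before the terminator. The following is only a minimal sketch, assuming the buffer is nul-terminated; strip_punct_space is a hypothetical helper name that does not appear in the original code, and the unsigned char cast is an addition to keep the <cctype> calls well-defined for negative char values.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstring>

// Predicate from the question. The cast to unsigned char (an addition here)
// avoids undefined behaviour when char is signed and holds a negative value.
bool is_char_category_in_question(char c) {
    unsigned char uc = static_cast<unsigned char>(c);
    return std::ispunct(uc) || std::isspace(uc);
}

// Hypothetical helper: strips punctuation and whitespace from a
// nul-terminated buffer in place and returns the new string length.
std::size_t strip_punct_space(char* buf) {
    assert(buf != nullptr);
    char* end = buf + std::strlen(buf);  // stop at the terminator, not at the array end
    char* new_end = std::remove_if(buf, end, is_char_category_in_question);
    *new_end = '\0';                     // remove_if neither shrinks nor re-terminates
    return static_cast<std::size_t>(new_end - buf);
}
```

With this, both versions touch the same 9 characters of "str,, ing" rather than one of them scanning the whole array.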

Dealing with that using the change in your comment below, on GCC with -O2 I still see a difference of about a factor of 2, with remove_if being slower. Changing the predicate from a plain function to a function object:

struct is_char_category_in_question {
    bool operator()(const char& c) const {
        return std::ispunct(c) || std::isspace(c);
    }
};

gets rid of almost all of this difference, although there may still be a <10% difference. So that looks to me like a quality-of-implementation issue, namely whether or not the predicate gets inlined, although I haven't checked the assembly to confirm.
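For reference, here is a rough sketch of the two ways of passing the predicate, side by side. The names is_punct_or_space_fn, is_punct_or_space_obj, stripped and check_equivalence are hypothetical and chosen for illustration only, and std::string is used instead of the char array purely to keep the example short. Whether the function-pointer version actually gets inlined depends on the compiler and flags, so treat the comments as the likely explanation rather than a measured fact.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Plain function: remove_if receives a function pointer, so each call is
// indirect unless the optimiser manages to see through the pointer.
bool is_punct_or_space_fn(char c) {
    unsigned char uc = static_cast<unsigned char>(c);
    return std::ispunct(uc) || std::isspace(uc);
}

// Function object: the predicate's type fixes which operator() is called,
// which generally makes inlining easier for the compiler.
struct is_punct_or_space_obj {
    bool operator()(char c) const {
        unsigned char uc = static_cast<unsigned char>(c);
        return std::ispunct(uc) || std::isspace(uc);
    }
};

// Hypothetical helper: erase-remove idiom with an arbitrary predicate.
template <typename Pred>
std::string stripped(std::string s, Pred pred) {
    s.erase(std::remove_if(s.begin(), s.end(), pred), s.end());
    return s;
}

// Both predicates produce the same result; only the generated code may differ.
inline void check_equivalence() {
    assert(stripped("str,, ing", is_punct_or_space_fn) ==
           stripped("str,, ing", is_punct_or_space_obj()));
}
```

The algorithm and its result are identical in both cases, which is why any timing gap points at code generation rather than at remove_if itself.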

Since your test harness means that no characters are actually removed after the first pass, I'm not troubled by a 10% difference. I'm a bit surprised, but not enough to really get into it. YMMV :-)
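If you do want every pass to remove something, the input has to be restored before each iteration. The sketch below is one possible way to do that, not the harness from the question: time_remove_if and its parameters are hypothetical, it uses std::chrono and therefore assumes C++11, and the copy that resets the buffer is included in the measured time.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <chrono>
#include <cstring>

struct is_char_category_in_question {
    bool operator()(char c) const {
        unsigned char uc = static_cast<unsigned char>(c);
        return std::ispunct(uc) || std::isspace(uc);
    }
};

// Hypothetical harness: copies `input` into `buf` before every pass so that
// each iteration really removes characters. Returns the elapsed seconds and
// stores the length of the final stripped string in *out_len.
double time_remove_if(const char* input, char* buf, std::size_t cap,
                      int iterations, std::size_t* out_len) {
    const std::size_t len = std::strlen(input);
    assert(len < cap);  // the buffer must hold the input plus its terminator
    std::size_t result = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::memcpy(buf, input, len + 1);  // restore the unstripped input
        char* new_end = std::remove_if(buf, buf + len,
                                       is_char_category_in_question());
        *new_end = '\0';
        result = static_cast<std::size_t>(new_end - buf);
    }
    const auto stop = std::chrono::steady_clock::now();
    *out_len = result;
    return std::chrono::duration<double>(stop - start).count();
}
```

The same reset would need to be applied to the hand-written loop for the comparison to stay fair.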

