Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c++ regex extract all substrings using regex_search()

I am new to C++ regex. I have a string "{1,2,3}" and I want to extract the numbers 1, 2 and 3. I thought I should use regex_search, but it only returns the first number.

#include<iostream>
#include<regex>
#include<string>
using namespace std;
int main()
{
        string s1("{1,2,3}");
        string s2("{}");
        smatch sm;
        regex e(R"(\d+)");
        cout << s1 << endl;
        if (regex_search(s1,sm,e)){
                cout << "size: " << sm.size() << endl;
                for (int i = 0 ; i < sm.size(); ++i){
                        cout << "the " << i+1 << "th match" <<": "<< sm[i] <<  endl;
                }
        }
}

The result:

{1,2,3}
size: 1
the 1th match: 1

1 Reply

std::regex_search stops and returns as soon as it has found the first match.

What std::smatch gives you is the full match (at index 0) followed by one entry for each capture group in the regular expression. Your regular expression has no capture groups, so the std::smatch only has one item in it: the full match.
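
If you would rather stay with std::regex_search, one option is to call it in a loop and restart each search where the previous match ended. The following is only a minimal sketch: find_all_with_search is a hypothetical helper name, not part of the standard library, and it assumes the pattern never produces an empty match (an empty match would make this loop spin forever).

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not a standard function): collects every match by
// calling std::regex_search repeatedly on the part not yet searched.
// Assumes the pattern cannot match an empty string.
std::vector<std::string> find_all_with_search(const std::string& text,
                                              const std::regex& pattern)
{
    std::vector<std::string> found;
    std::smatch sm;
    auto first = text.cbegin();
    const auto last = text.cend();

    while (std::regex_search(first, last, sm, pattern))
    {
        found.push_back(sm.str());  // sm[0] is the full match
        first = sm.suffix().first;  // resume right after this match
    }
    return found;
}
```

With std::regex e(R"(\d+)"), calling find_all_with_search("{1,2,3}", e) should give the strings "1", "2" and "3".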

If you want to find all matches, the idiomatic tool is std::sregex_iterator.

#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string s1("{1,2,3}");
    std::regex e(R"(\d+)"); // \d+ matches one or more digits

    std::cout << s1 << std::endl;

    std::sregex_iterator iter(s1.begin(), s1.end(), e);
    std::sregex_iterator end;

    while(iter != end)
    {
        std::cout << "size: " << iter->size() << std::endl;

        for(unsigned i = 0; i < iter->size(); ++i)
        {
            std::cout << "the " << i + 1 << "th match" << ": " << (*iter)[i] << std::endl;
        }
        ++iter;
    }
}

Output:

{1,2,3}
size: 1
the 1th match: 1
size: 1
the 1th match: 2
size: 1
the 1th match: 3

The end iterator is default-constructed by design: a default-constructed std::sregex_iterator is the end-of-sequence iterator, so iter compares equal to it once iter has run out of matches. Notice the ++iter at the bottom of the loop. That moves iter on to the next match. When there are no more matches, iter has the same value as the default-constructed end.
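
The same begin/end pattern can be used to collect the numbers instead of printing them. This is just a sketch: extract_numbers is a hypothetical helper name, and it assumes every run of digits fits in an int (std::stoi throws std::out_of_range otherwise).

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: converts every run of digits in `text` to an int.
std::vector<int> extract_numbers(const std::string& text)
{
    static const std::regex digits(R"(\d+)");
    std::vector<int> numbers;

    // `end` is default-constructed, so it is the end-of-sequence iterator.
    for (std::sregex_iterator it(text.begin(), text.end(), digits), end;
         it != end; ++it)
    {
        numbers.push_back(std::stoi(it->str())); // it->str() is the full match
    }
    return numbers;
}
```

For the string from the question, extract_numbers("{1,2,3}") should return a vector holding 1, 2 and 3, and extract_numbers("{}") an empty vector.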

Another example to show the submatching (capture groups):

#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string s1("{1,2,3}{4,5,6}{7,8,9}");
    std::regex e(R"~((\d+),(\d+),(\d+))~"); // three capture groups

    std::cout << s1 << std::endl;

    std::sregex_iterator iter(s1.begin(), s1.end(), e);
    std::sregex_iterator end;

    while(iter != end)
    {
        std::cout << "size: " << iter->size() << std::endl;

        std::cout << "expression match #" << 0 << ": " << (*iter)[0] << std::endl;
        for(unsigned i = 1; i < iter->size(); ++i)
        {
            std::cout << "capture submatch #" << i << ": " << (*iter)[i] << std::endl;
        }
        ++iter;
    }
}

Output:

{1,2,3}{4,5,6}{7,8,9}
size: 4
expression match #0: 1,2,3
capture submatch #1: 1
capture submatch #2: 2
capture submatch #3: 3
size: 4
expression match #0: 4,5,6
capture submatch #1: 4
capture submatch #2: 5
capture submatch #3: 6
size: 4
expression match #0: 7,8,9
capture submatch #1: 7
capture submatch #2: 8
capture submatch #3: 9
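
If you only need the captured values and not the full match, std::sregex_token_iterator can walk over selected submatches directly. The sketch below is an alternative to the loop above, not a replacement for it; capture_values is a hypothetical helper name, and the group indices 1, 2, 3 are chosen to match the three capture groups of the example expression.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns capture groups 1, 2 and 3 of every match,
// flattened into one list in the order they appear in `text`.
std::vector<std::string> capture_values(const std::string& text)
{
    static const std::regex triple(R"~((\d+),(\d+),(\d+))~");
    const std::vector<int> groups{1, 2, 3}; // submatch indices to visit

    std::vector<std::string> values;
    std::sregex_token_iterator it(text.begin(), text.end(), triple, groups);
    const std::sregex_token_iterator end; // default-constructed: end marker

    for (; it != end; ++it)
    {
        values.push_back(it->str());
    }
    return values;
}
```

For example, capture_values("{1,2,3}{4,5,6}") should yield "1", "2", "3", "4", "5", "6".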
