Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How to get an expression between balanced parentheses

Suppose I am given the following kind of string:

"(this is (haha) a string(()and it's sneaky)) ipsom (lorem) bla"

and I want to extract the substrings contained within the topmost layer of parentheses, i.e. I want to obtain the strings "this is (haha) a string(()and it's sneaky)" and "lorem".

Is there a nice Pythonic way to do this? Regular expressions are not obviously up to the task, but maybe there is a way to get an XML parser to do the job? For my application I can assume the parentheses are well formed, i.e. not something like (()(().


1 Reply


This is a standard use case for a stack: you read the string character by character, and whenever you encounter an opening parenthesis, you push it onto the stack; whenever you encounter a closing parenthesis, you pop the top element off the stack.
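As a rough illustration of the stack idea, here is a minimal sketch that pushes the index of each opening parenthesis and pops it on the matching closing one. The function name `top_level_groups_with_stack` is made up for this example, and the code assumes well-formed input, as the question allows.

```python
def top_level_groups_with_stack(text):
    """Return the substrings enclosed by outermost parentheses.

    Hypothetical helper; assumes the parentheses in `text` are balanced.
    """
    stack = []    # indices of the '(' characters that are currently open
    results = []
    for i, c in enumerate(text):
        if c == '(':
            stack.append(i)        # push
        elif c == ')':
            start = stack.pop()    # pop the matching '('
            if not stack:          # stack is empty again: top-level group closed
                results.append(text[start + 1:i])
    return results
```

When the stack becomes empty after a pop, the parenthesis that was just closed was on the first level, so the text between the two indices is one of the wanted substrings.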

Since you only have a single type of parentheses, you don’t actually need a stack; instead, it’s enough to just remember how many open parentheses there are.

In addition, to extract the text, we remember the index at which a first-level parenthesis opens and collect the resulting substring when we encounter its matching closing parenthesis.

It could look like this:

string = "(this is (haha) a string(()and it's sneaky)) ipsom (lorem) bla"

stack = 0
startIndex = None
results = []

for i, c in enumerate(string):
    if c == '(':
        if stack == 0:
            startIndex = i + 1 # string to extract starts one index later

        # push to stack
        stack += 1
    elif c == ')':
        # pop stack
        stack -= 1

        if stack == 0:
            results.append(string[startIndex:i])

print(results)
# ["this is (haha) a string(()and it's sneaky)", 'lorem']
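If the input cannot be trusted to be well formed, the same counter approach can be wrapped in a function that reports unbalanced parentheses instead of silently returning odd results. This is only a sketch; the name `extract_top_level` and the choice of `ValueError` are assumptions of this example, not part of the original answer.

```python
def extract_top_level(text):
    """Return substrings inside outermost parentheses; raise on unbalanced input.

    Hypothetical helper built on the counter approach shown above.
    """
    depth = 0        # number of currently open parentheses
    start = None     # index just after the current top-level '('
    results = []
    for i, c in enumerate(text):
        if c == '(':
            if depth == 0:
                start = i + 1
            depth += 1
        elif c == ')':
            if depth == 0:
                raise ValueError(f"unmatched ')' at index {i}")
            depth -= 1
            if depth == 0:
                results.append(text[start:i])
    if depth != 0:
        raise ValueError("unclosed '(' in input")
    return results
```

For example, `extract_top_level("a (b (c)) d (e)")` gives `['b (c)', 'e']`, while the malformed `"(()(()"` from the question raises a `ValueError`.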
