Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
295 views
in Technique[技术] by (71.8m points)

python - Regular expression to return all characters between two special characters

How would I go about using regx to return all characters between two brackets. Here is an example:

foobar['infoNeededHere']ddd
needs to return infoNeededHere

I found a regex to do it between curly brackets but all attempts at making it work with square brackets have failed. Here is that regex: (?<={)[^}]*(?=}) and here is my attempt to hack it

(?<=[)[^}]*(?=])

Final Solution:

import re

str = "foobar['InfoNeeded'],"
match = re.match(r"^.*['(.*)'].*$",str)
print match.group(1)
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

If you're new to REG(gular) EX(pressions) you learn about them at Python Docs. Or, if you want a gentler introduction, you can check out the HOWTO. They use Perl-style syntax.

Regex

The expression that you need is .*?[(.*)].*. The group that you want will be 1.
- .*?: . matches any character but a newline. * is a meta-character and means Repeat this 0 or more times. ? makes the * non-greedy, i.e., . will match up as few chars as possible before hitting a '['.
- [: escapes special meta-characters, which in this case, is [. If we didn't do that, [ would do something very weird instead.
- (.*): Parenthesis 'groups' whatever is inside it and you can later retrieve the groups by their numeric IDs or names (if they're given one).
- ].*: You should know enough by now to know what this means.

Implementation

First, import the re module -- it's not a built-in -- to where-ever you want to use the expression.

Then, use re.search(regex_pattern, string_to_be_tested) to search for the pattern in the string to be tested. This will return a MatchObject which you can store to a temporary variable. You should then call it's group() method and pass 1 as an argument (to see the 'Group 1' we captured using parenthesis earlier). I should now look like:

>>> import re
>>> pat = r'.*?[(.*)].*'             #See Note at the bottom of the answer
>>> s = "foobar['infoNeededHere']ddd"
>>> match = re.search(pat, s)
>>> match.group(1)
"'infoNeededHere'"

An Alternative

You can also use findall() to find all the non-overlapping matches by modifying the regex to (?>=[).+?(?=]).
- (?<=[): (?<=) is called a look-behind assertion and checks for an expression preceding the actual match.
- .+?: + is just like * except that it matches one or more repititions. It is made non-greedy by ?.
- (?=]): (?=) is a look-ahead assertion and checks for an expression following the match w/o capturing it.
Your code should now look like:

>>> import re
>>> pat = r'(?<=[).+?(?=])'  #See Note at the bottom of the answer
>>> s = "foobar['infoNeededHere']ddd[andHere] [andOverHereToo[]"
>>> re.findall(pat, s)
["'infoNeededHere'", 'andHere', 'andOverHereToo['] 

Note: Always use raw Python strings by adding an 'r' before the string (E.g.: r'blah blah blah').

10x for reading! I wrote this answer when there were no accepted ones yet, but by the time I finished it, 2 ore came up and one got accepted. :( x<


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...