python - regex search and findall

Question

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

I need to find all matches in a string for a given regex. I've been using findall() to do that until I came across a case where it wasn't doing what I expected. For example:

import re

regex = re.compile(r'(\d+,?)+')
s = 'There are 9,000,000 bicycles in Beijing.'

print(regex.search(s).group(0))
> 9,000,000

print(regex.findall(s))
> ['000']

In this case search() returns what I need (the longest match) but findall() behaves differently, although the docs imply it should be the same:

findall() matches all occurrences of a pattern, not just the first one as search() does.

Why is the behaviour different?
How can I achieve the result of search() with findall() (or something else)?

1 Reply

深蓝 · Answer 1 · 2021-10-17T01:13:33+0000

Ok, I see what's going on... from the docs:

If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.

As it turns out, you do have a group, "(\d+,?)", so findall() returns what that group captured rather than the whole match. Because the group is repeated, it keeps only its last iteration, which is 000.
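To make the difference concrete, here is a small sketch comparing findall() with and without a capturing group on the sentence from the question. The variable names are only illustrative.

```python
import re

s = 'There are 9,000,000 bicycles in Beijing.'

# No capturing group: findall returns the full text of each match.
no_group = re.findall(r'\d+', s)

# One capturing group: findall returns only what that group captured.
# A repeated group keeps just its last iteration.
one_group = re.findall(r'(\d+,?)+', s)

# search() exposes both views on the same match object.
m = re.search(r'(\d+,?)+', s)
whole, last_iteration = m.group(0), m.group(1)

print(no_group)               # ['9', '000', '000']
print(one_group)              # ['000']
print(whole, last_iteration)  # 9,000,000 000
```

So search() and findall() find the same match; they only differ in which part of it they hand back.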

One solution is to surround the entire regex with a group, like this:

regex = re.compile(r'((\d+,?)+)')

Then findall() returns [('9,000,000', '000')], a list with one tuple containing both matched groups. Of course, you only care about the first one.
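Another option, not mentioned above, is a non-capturing group, (?:...). With no capturing groups left in the pattern, findall() returns whole matches and there are no tuples to unpack. A minimal sketch:

```python
import re

s = 'There are 9,000,000 bicycles in Beijing.'

# (?:...) groups without capturing, so findall falls back to whole matches.
regex = re.compile(r'(?:\d+,?)+')
matches = regex.findall(s)

print(matches)  # ['9,000,000']
```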

Personally, I would use the following regex:

regex = re.compile(r'((\d+,)*\d+)')

to avoid matching stuff like the trailing comma in "this is a bad number 9,123,".
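The effect of the stricter pattern can be checked on that bad input. This sketch swaps in non-capturing groups (an assumption of the example, not part of the patterns above) so that findall() returns plain strings:

```python
import re

s = 'this is a bad number 9,123,'

# Loose form: every run of digits may be followed by a comma,
# so the trailing comma is swallowed.
loose = re.compile(r'(?:\d+,?)+')

# Strict form: commas are only allowed between digit runs,
# so the match has to end on a digit.
strict = re.compile(r'(?:\d+,)*\d+')

loose_matches = loose.findall(s)
strict_matches = strict.findall(s)

print(loose_matches)   # ['9,123,']
print(strict_matches)  # ['9,123']
```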

Edit.

Here's a way to avoid having to surround the expression with parentheses or deal with tuples:

import re

s = "..."
regex = re.compile(r'(\d+,?)+')
it = regex.finditer(s)

# Each item is a match object, just like the one re.search returns.
for match in it:
    print(match.group(0))

finditer() returns an iterator over all the matches found. These match objects are the same kind that re.search() returns, so group(0) gives the full matched text you expect.
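If a list is wanted, as findall() would give, the iterator can be consumed in a list comprehension. The sentence below is made up for illustration and contains two numbers; since each item is a match object, positions are available too.

```python
import re

# Hypothetical input with two numbers, not from the original question.
s = 'From 9,000,000 bicycles to 12,500 cars.'
regex = re.compile(r'(\d+,?)+')

# group(0) is always the whole match, regardless of capturing groups.
whole_matches = [m.group(0) for m in regex.finditer(s)]

# span() gives the (start, end) offsets of each match in s.
spans = [m.span() for m in regex.finditer(s)]

print(whole_matches)  # ['9,000,000', '12,500']
print(spans)          # [(5, 14), (27, 33)]
```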
