Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

regex - Python - Find sequence of same characters

I'm trying to use a regex to match each run of one or more consecutive occurrences of the same character in a string.

Example:

string = "55544355"
# The regex should retrieve sequences "555", "44", "3", "55"

Can I have a few tips?


1 Reply

You can use re.findall() with the regular expression ((.)\2*):

>>> import re
>>> [item[0] for item in re.findall(r"((.)\2*)", string)]
['555', '44', '3', '55']

The key part is inside the outer capturing group: (.)\2*. Here we capture a single character via (.), then refer back to that character with the backreference \2. The group number is 2 because the outer capturing group is number 1. The * means 0 or more times, so a run of length one still matches.
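
To see why the list comprehension takes item[0], it may help to look at what re.findall() returns when the pattern has more than one capturing group: one tuple per match, with one entry per group. A minimal sketch using the string from the question (the names pairs and runs are only for illustration):

```python
import re

string = "55544355"

# With two capturing groups, re.findall() returns one tuple per match:
# (group 1 = the whole run, group 2 = the single repeated character).
pairs = re.findall(r"((.)\2*)", string)
print(pairs)  # [('555', '5'), ('44', '4'), ('3', '3'), ('55', '5')]

# Keep only group 1, the full run.
runs = [whole for whole, _char in pairs]
print(runs)  # ['555', '44', '3', '55']
```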

You could've also solved it with a single capturing group and re.finditer():

>>> [item.group(0) for item in re.finditer(r"(.)\1*", string)]
['555', '44', '3', '55']
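
If you need this in more than one place, the single-group version can be wrapped in a small function. The sketch below is only one way to do it: the function names are hypothetical, and the re.DOTALL flag is an addition not in the answer above (by default "." does not match a newline, so newline characters would be skipped). For comparison it also shows a non-regex alternative based on itertools.groupby(), which groups consecutive equal items.

```python
import re
from itertools import groupby


def same_char_runs(text):
    """Return the maximal runs of identical characters (hypothetical helper)."""
    # (.) captures one character; \1* matches further copies of it.
    # re.DOTALL is an assumption here: it lets "." match newlines as well.
    return [m.group(0) for m in re.finditer(r"(.)\1*", text, flags=re.DOTALL)]


def same_char_runs_groupby(text):
    """Same result without a regex, using itertools.groupby()."""
    return ["".join(group) for _key, group in groupby(text)]


print(same_char_runs("55544355"))          # ['555', '44', '3', '55']
print(same_char_runs_groupby("55544355"))  # ['555', '44', '3', '55']
```

Both functions return the runs in order of appearance; the groupby() version also works on any iterable of characters, not just strings that a regex can scan.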
