Let me explain what your regex is doing:
regex = re.compile("(aa|bb)+")
You are creating a regex that looks for aa or bb, then checks whether more aa or bb follow immediately after, and keeps going until it finds neither. That whole run counts as one match. A capturing group can hold only one value, though, so when the group repeats inside a match you get only the last aa or bb it captured.
However, if you have a string like this: aaxaabbxaa, you will get aa, bb, aa. The regex first finds aa, looks for more, and finds only an x, so the first match ends there and its group is aa. Then it finds another aa followed by a bb, and then an x, so the second match is aabb and its group holds the last thing captured, which is bb. Then it finds the final aa. So your final result is aa, bb, aa.
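Here is a small sketch of that walkthrough. It assumes the results are being collected with re.findall (a guess on my part); the names text, groups and spans are just for illustration. It prints the captured group for each match next to the full text each match covered.

```python
import re

regex = re.compile("(aa|bb)+")
text = "aaxaabbxaa"

# Assumption: findall is how the results are collected. With exactly one
# capturing group, findall returns that group's value for each match.
groups = regex.findall(text)

# The full text covered by each match, for comparison.
spans = [m.group(0) for m in regex.finditer(text)]

print(groups)  # ['aa', 'bb', 'aa']
print(spans)   # ['aa', 'aabb', 'aa']
```

The second match covers aabb, but its group keeps only bb, the last repetition.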
I hope this explains what your regex is DOING, and that the result is as expected. To get EVERY aa or bb, you need to remove the +, which makes the group repeat within a single match so that only its last capture is kept. Without it, the regex returns each aa or bb as its own match.
So your regex should be:
regex = re.compile("(aa|bb)")
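A quick check of the version without the +, again assuming re.findall is used to collect the results and using the same sample string as above (text and matches are illustrative names):

```python
import re

regex = re.compile("(aa|bb)")
text = "aaxaabbxaa"

# Each aa or bb is now its own match, so nothing gets overwritten.
matches = regex.findall(text)

print(matches)  # ['aa', 'aa', 'bb', 'aa']
```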
cheers.