I am trying to match an HTML a tag with a regex, as shown below:
import re

string = r'<a class="article_txt" href="/store">abcd</a>'
pattern = r'<[\s]*a[\s]*(class[\s]*=[\s]*"article_txt")[\s]*.*?</a>'
tags = re.search(pattern, string)
print(tags.group())
This prints the expected output: <a class="article_txt" href="/store">abcd</a>
While this gives me the correct answer, I need to find multiple matches in a file, so I tried using re.findall. To my surprise, it doesn't return the full match.
The code for this specific example is:
string = r'<a class="article_txt" href="/store">abcd</a>'
pattern = r'<[\s]*a[\s]*(class[\s]*=[\s]*"article_txt")[\s]*.*?</a>'
tags = re.findall(pattern, string)
print(tags)
This returns the mysterious output: ['class="article_txt"']
Why did findall return only the group in my regex, rather than the whole match?
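For reference, this looks consistent with the documented behavior of re.findall: when the pattern contains exactly one capturing group, it returns the text of that group instead of the whole match. The sketch below is only an illustration using the same string and pattern as above. The variable names (capturing, non_capturing, full_matches) are mine, and the non-capturing (?:...) variant and the re.finditer variant are two possible workarounds, not necessarily the best fit for a real file.

```python
import re

string = r'<a class="article_txt" href="/store">abcd</a>'

# Original pattern: one capturing group, so findall returns only that group's text.
capturing = r'<[\s]*a[\s]*(class[\s]*=[\s]*"article_txt")[\s]*.*?</a>'
print(re.findall(capturing, string))

# Same pattern with a non-capturing group (?:...): findall returns whole matches.
non_capturing = r'<[\s]*a[\s]*(?:class[\s]*=[\s]*"article_txt")[\s]*.*?</a>'
print(re.findall(non_capturing, string))

# Alternative: keep the capturing group and take group(0) from each match object.
full_matches = [m.group(0) for m in re.finditer(capturing, string)]
print(full_matches)
```

With the non-capturing group, the group is still used for matching but is not reported, so findall falls back to returning the full matched text.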