Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


How can I remove all characters inside angle brackets in Python?

How can I remove all characters inside angle brackets, including the brackets themselves, from a string? How can I also remove all the text between a line break ("\n") and a period followed by any three characters ("." + any 3 characters)? Is this possible? I am currently using the solution by @xkcdjerry.

For example:

```python
import re

body = """Dear Students roads etc. you place a tree take a snapshot, then when you place a
building, take a snapshot. Place at least 5-6 objects and then have 5-6
snapshots. Please keep these snapshots with you as everyone will be asked
to share them during the class.

I am attaching one PowerPoint containing instructions and one video of
explanation for your reference.

Kind regards,
Teacher Name
 zoom_0.mp4
<https://drive.google.com/file/d/1UX-klOfVhbefvbhZvIWijaBdQuLgh_-Uru4_1QTkth/view?usp=drive_web>"""
d = re.compile("\n.+?\....")
body = d.sub('', body)
a = re.compile("<.*?>")
body = a.sub('', body)
print(body)
```

For some reason the output is fine, except that it has:

```
gle.com/file/d/1UX-klOfVhbefvbhZvIWijaBdQuLgh_-Uru4_1QTkth/view?usp=drive_web>
```

randomly attached to the end. How can I fix it?

Question from: https://stackoverflow.com/questions/65932541/how-can-i-remove-all-characters-inside-angle-brackets-python


1 Reply

Answer

Your problem can be solved with a regular expression.
Run this in the Python shell:

import re
a = re.compile("<.*?>")
a.sub('', "Keep this part of the string< Remove this part>Keep This part as well")

Output:

'Keep this part of the stringKeep This part as well'

Second question:

import re
a = re.compile(r"\n.*?\..{3}")
a.sub('', "Hello\nFilename.png")

Output:

'Hello'
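
As for the leftover text in the question, it seems to come from the order of the two substitutions rather than from either pattern on its own. The sketch below uses only the tail of the question's text (shortening it is my assumption about what matters) and the variable names are made up for illustration. Run first, the filename pattern also matches the start of the URL line ("\n<https://drive" + "." + "goo"), which removes the "<" and leaves "gle.com/...>" behind. Removing the bracketed text first avoids that.

```python
import re

# Tail of the question's text only; the rest is left out to keep this short.
body = (
    "Kind regards,\nTeacher Name\n zoom_0.mp4\n"
    "<https://drive.google.com/file/d/1UX-klOfVhbefvbhZvIWijaBdQuLgh_-Uru4_1QTkth/view?usp=drive_web>"
)

angle = re.compile(r"<.*?>")            # text inside angle brackets, brackets included
filename = re.compile(r"\n.*?\..{3}")   # a line break up to a dot plus three characters

# Order used in the question: the filename pattern runs first and swallows
# "\n<https://drive.goo", so no "<" is left for the angle-bracket pattern.
wrong = angle.sub("", filename.sub("", body))

# Swapped order: strip the bracketed URL first, then the filename line.
right = filename.sub("", angle.sub("", body))

print(repr(wrong))
print(repr(right))
```

If the filename pattern has to run first for some reason, a narrower pattern that cannot match "<" would be another option, but that depends on what the real messages look like.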

Breakdown

Regular expressions are a robust way of finding, replacing, and modifying small strings inside bigger ones. For further reading, consult https://docs.python.org/3/library/re.html. Meanwhile, here is a breakdown of the regex syntax used in this answer:

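. matches any character except a line break (by default).
*? means zero or more of the preceding element, but as few as possible (a non-greedy match).
So .*? means any number of characters, but as few as possible.
.{3} means exactly three characters.
Note: the second regex contains \. because a literal . must be escaped with a backslash; an unescaped . would match any character. In an ordinary Python string the backslash itself should be written as \\, or the whole pattern can be written as a raw string (r"...").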
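
To make the greedy versus non-greedy difference and the escaped dot concrete, here is a small sketch. The sample strings are invented for illustration and are not from the question.

```python
import re

text = "Keep <drop one> this <drop two> too"

# Greedy: .* runs on to the LAST ">" in the string.
greedy = re.sub(r"<.*>", "", text)

# Non-greedy: .*? stops at the FIRST ">" after each "<".
lazy = re.sub(r"<.*?>", "", text)

print(repr(greedy))  # 'Keep  too'
print(repr(lazy))    # 'Keep  this  too'

# Escaped vs unescaped dot.
escaped = re.findall(r"\.png", "a.png xpng")    # only a literal ".png"
unescaped = re.findall(r".png", "a.png xpng")   # any character followed by "png"

print(escaped)    # ['.png']
print(unescaped)  # ['.png', 'xpng']
```

Note that . does not cross line breaks unless re.DOTALL is passed, so <.*?> as written only removes bracketed text that sits on a single line.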

The methods: re.compile(pattern: str) compiles a regex for further use. regex.sub(repl: str, string: str) replaces every match of regex in string with repl and returns the new string.
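
A short usage sketch of those two calls, with made-up input strings. It also shows the module-level re.sub shortcut and the optional count argument, neither of which is needed for the answer above but both are part of the standard re module.

```python
import re

# Compile once, then reuse the pattern object on many strings.
tag = re.compile(r"<.*?>")

lines = ["a <x> b", "<y>c", "no tags"]
cleaned = [tag.sub("", line) for line in lines]
print(cleaned)  # ['a  b', 'c', 'no tags']

# Module-level shortcut: fine for a one-off substitution.
same = re.sub(r"<.*?>", "", "a <x> b")
print(repr(same))  # 'a  b'

# count limits how many matches are replaced (0, the default, means all).
first_only = tag.sub("", "<1><2><3>", count=1)
print(first_only)  # <2><3>
```

sub never changes the original string, so the result has to be assigned back, as in body = a.sub('', body) in the question.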

Hope it helps.

