What's the best way, using re, to remove brackets and their contents, along with any trailing whitespace, from a string? Note that not every string is formatted the same way.
Script:
import pandas as pd
import re

df = pd.DataFrame({'name':
                   ['University of Southampton (UK)',
                    'The College of William and Mary',
                    'University of Reading (UK)',
                    'Queensland University (Australia)']})

def cleaning(text):
    cleaned = re.findall(re.compile('^([^,]+).+'), text)
    cleaned = re.findall(re.compile('(.*)'), str(cleaned))  # Why do I have to str() here btw?
    return cleaned

df['name'].apply(lambda x: cleaning(x))
Returns:
0 []
1 []
2 []
3 []
Desired output (no whitespace at the end):
0 University of Southampton
1 The College of William and Mary
2 University of Reading
3 Queensland University
Thanks for your help!
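
One possible approach, offered as a minimal sketch rather than a definitive answer: replace each bracketed group with re.sub instead of collecting matches with re.findall. This sketch assumes the brackets are plain round parentheses that are not nested, and that any whitespace directly before them should also go. The pattern and the sample list below are illustrative choices, not part of the original script. On the str() question: re.findall returns a list, and the re functions expect a string, so the second call fails unless the list is converted first. Using re.sub avoids that, because it returns a string.

```python
import re

# Assumed pattern: optional whitespace, "(", anything except ")", then ")".
BRACKETS = re.compile(r'\s*\([^)]*\)')

def cleaning(text):
    # Remove every bracketed group, then trim leftover whitespace at both ends.
    return BRACKETS.sub('', text).strip()

names = ['University of Southampton (UK)',
         'The College of William and Mary',
         'University of Reading (UK)',
         'Queensland University (Australia)']

cleaned = [cleaning(name) for name in names]
print(cleaned)
```

On the DataFrame from the question, the same function could be applied with df['name'].apply(cleaning). Strings without brackets pass through unchanged apart from the trimming.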