If you are looking for efficiency, the translate method is about as fast as you can get.
It can be used to quickly replace characters and/or delete them.
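import string
# maketrans returns a 256-character table. Here every lowercase letter is
# replaced by a space, so the table holds every character except a-z.
delete_table = string.maketrans(
    string.ascii_lowercase, ' ' * len(string.ascii_lowercase)
)
# Identity table: nothing is remapped, characters are only deleted.
table = string.maketrans('', '')
# Delete every character found in delete_table, which leaves only a-z.
"Agh#$%#%2341- -!zdrkfd".translate(table, delete_table)  # 'ghzdrkfd'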
In Python 2.6 and later you no longer need the second (identity) table; pass None as the first argument instead:
import string
delete_table = string.maketrans(
    string.ascii_lowercase, ' ' * len(string.ascii_lowercase)
)
# None means "do not remap anything"; only the deletion is applied.
"Agh#$%#%2341- -!zdrkfd".translate(None, delete_table)
This method is considerably faster than the others suggested so far. Ideally you build delete_table once, store it somewhere, and reuse it. But even if you rebuild it on every call, it is still faster than the other methods.
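For readers on Python 3, where string.maketrans is gone and str.translate takes only a table, the same "build the table once, then reuse it" idea might look roughly like the sketch below. It is not part of the measurements in this answer, and every name in it (DELETE_BYTES, _KeepOnly, keep_lowercase_bytes, keep_lowercase_text) is made up for illustration. The bytes variant assumes ASCII input and is the closest match to the Python 2 code. The str variant relies on str.translate deleting any character whose table lookup returns None.

```python
import string

# Python 3 sketch; all names below are illustrative assumptions.

# Every byte value except a-z, built once at import time and reused.
LOWERCASE_BYTES = string.ascii_lowercase.encode("ascii")
DELETE_BYTES = bytes(b for b in range(256) if b not in LOWERCASE_BYTES)


def keep_lowercase_bytes(data):
    """Drop every byte that is not an ASCII lowercase letter."""
    # With None as the table, bytes.translate only deletes, it remaps nothing.
    return data.translate(None, DELETE_BYTES)


class _KeepOnly(dict):
    """Table for str.translate that deletes code points it does not contain."""

    def __missing__(self, codepoint):
        # Returning None tells str.translate to delete the character.
        return None


# Also built once: each lowercase letter maps to itself, anything else to None.
KEEP_LOWERCASE = _KeepOnly((ord(c), c) for c in string.ascii_lowercase)


def keep_lowercase_text(text):
    """Drop every character that is not an ASCII lowercase letter."""
    return text.translate(KEEP_LOWERCASE)


print(keep_lowercase_bytes(b"Agh#$%#%2341- -!zdrkfd"))  # b'ghzdrkfd'
print(keep_lowercase_text("Agh#$%#%2341- -!zdrkfd"))    # ghzdrkfd
```

Keeping the tables as module-level constants means the cost of building them is paid once per process, not on every call.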
To back up these claims, here are the timings. First, join with a generator expression:
for i in xrange(10000):
    ''.join(c for c in s if c.islower())
real 0m0.189s
user 0m0.176s
sys 0m0.012s
Running the regular expression solution:
import re
for i in xrange(10000):
    re.sub(r'[^a-z]', '', s)
real 0m0.172s
user 0m0.164s
sys 0m0.004s
[Upon request] If you pre-compile the regular expression:
import re
r = re.compile(r'[^a-z]')
for i in xrange(10000):
    r.sub('', s)
real 0m0.166s
user 0m0.144s
sys 0m0.008s
Running the translate method the same number of times took:
real 0m0.075s
user 0m0.064s
sys 0m0.012s
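If you want to repeat the comparison yourself, a rough Python 3 sketch using timeit could look like this. The input string is an assumption (the s used for the timings above is not shown), the function names are hypothetical, and the translate variant works on bytes because that is the closest Python 3 equivalent of the code in this answer. Expect the absolute numbers to differ from the ones above.

```python
import re
import string
import timeit

# Hypothetical input: the string `s` behind the timings above is not shown.
s = "Agh#$%#%2341- -!zdrkfd"
raw = s.encode("ascii")

# Tables and patterns built once, outside the timed code.
LOWERCASE = string.ascii_lowercase.encode("ascii")
DELETE_BYTES = bytes(b for b in range(256) if b not in LOWERCASE)
NOT_LOWERCASE = re.compile(r"[^a-z]")


def with_join(text):
    return "".join(c for c in text if c.islower())


def with_regex(text):
    return re.sub(r"[^a-z]", "", text)


def with_compiled_regex(text):
    return NOT_LOWERCASE.sub("", text)


def with_translate(data):
    return data.translate(None, DELETE_BYTES)


candidates = {
    "join": lambda: with_join(s),
    "regex": lambda: with_regex(s),
    "compiled regex": lambda: with_compiled_regex(s),
    "translate": lambda: with_translate(raw),
}

# Same iteration count as the loops above.
timings = {
    name: timeit.timeit(func, number=10000)
    for name, func in candidates.items()
}

for name, seconds in timings.items():
    print("%-15s %.4fs" % (name, seconds))
```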