You can remove comments by parsing the Python code with tokenize.generate_tokens. The following is a slightly modified version of this example from the docs:

import tokenize
import io
import sys

# generate_tokens expects a readline that returns str:
# text on Python 3, byte strings on Python 2.
if sys.version_info[0] == 3:
    StringIO = io.StringIO
else:
    StringIO = io.BytesIO

def nocomment(s):
    result = []
    g = tokenize.generate_tokens(StringIO(s).readline)
    for toknum, tokval, _, _, _ in g:
        # print(toknum,tokval)
        # Keep every token except comments; docstrings are STRING tokens.
        if toknum != tokenize.COMMENT:
            result.append((toknum, tokval))
    return tokenize.untokenize(result)

with open('script.py','r') as f:
    content=f.read()

print(nocomment(content))

For example, if script.py contains

def foo(): # Remove this comment
    ''' But do not remove this #1 docstring
    '''
    # Another comment
    pass

then the output of nocomment is

def foo ():
    ''' But do not remove this #1 docstring
    '''
    pass
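
Because untokenize is given only (type, string) pairs, it does not reproduce the original spacing (note the `foo ()` above), so comparing text is a poor way to check the result. A rough sanity check, assuming Python 3 only, is to re-tokenize the output and confirm that no COMMENT tokens remain while the code still runs. The names strip_comments and has_comments below are hypothetical helpers for this sketch, and the sample returns 42 instead of using pass so that there is something to call:

```python
import io
import tokenize

def strip_comments(source):
    """Drop COMMENT tokens and rebuild the source (Python 3 sketch)."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    kept = [(tok.type, tok.string) for tok in tokens
            if tok.type != tokenize.COMMENT]
    return tokenize.untokenize(kept)

def has_comments(source):
    """Return True if the source still contains any COMMENT token."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return any(tok.type == tokenize.COMMENT for tok in tokens)

sample = (
    "def foo(): # Remove this comment\n"
    "    ''' But do not remove this #1 docstring\n"
    "    '''\n"
    "    # Another comment\n"
    "    return 42\n"
)

cleaned = strip_comments(sample)
print(has_comments(sample))   # the original contains comments
print(has_comments(cleaned))  # the cleaned version should not
```

The `#1` inside the docstring survives because the tokenizer reports the whole docstring as a single STRING token, not as a comment.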