Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


How to strip whitespace in text using built-in string methods in Python?

I have some text in an input file and want to clean it by removing the whitespace in it. A sample of the text looks as follows:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras mattis purus nec aliquam placerat. Donec efficitur ex vel ante mattis fermentum. Fusce consequat placerat lectus a volutpat. Nulla vitae feugiat ex. Ut at sollicitudin felis. Curabitur efficitur ligula molestie lorem sagittis, eu blandit mi sagittis. Duis scelerisque blandit porta. In vel nunc quam. Phasellus aliquet nunc et nibh ullamcorper, at ullamcorper odio cursus. Suspendisse gravida erat ac urna luctus, nec fermentum nulla tincidunt. Etiam sollicitudin bibendum tristique. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Maecenas auctor nulla eu faucibus maximus.

Which built-in string methods from the Python Standard Library can I use to strip the whitespace characters from the sample text above, instead of writing my own function?

I'm using Python 3.6.

Question from: https://stackoverflow.com/questions/65914543/how-to-strip-whitespace-in-text-using-built-in-string-methods-in-python


1 Reply

  • The strip method removes leading and trailing whitespace. Whitespace inside the string is left untouched.
  • The replace method removes every space (or any other substring you pass to it).
  • The split and join methods together remove all whitespace.
    • The split method splits a string into a list. You can specify a separator; with no argument it splits on any run of whitespace.
    • The join method takes all items in an iterable and joins them into one string.
  • The translate method, combined with str.maketrans and string.whitespace from the built-in string module, also removes all whitespace.
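As a quick illustration of the first bullet, here is a minimal sketch of how strip compares with its one-sided variants lstrip and rstrip. The sample string is made up for this example and is not taken from the question.

```python
# Hypothetical sample string, used only for illustration.
sample = "  \tLorem ipsum \n"

both_ends = sample.strip()    # leading and trailing whitespace removed
left_only = sample.lstrip()   # only leading whitespace removed
right_only = sample.rstrip()  # only trailing whitespace removed

print(repr(both_ends))   # 'Lorem ipsum'
print(repr(left_only))   # 'Lorem ipsum \n'
print(repr(right_only))  # '  \tLorem ipsum'
```

The space between the two words survives in every case, because none of these methods look inside the string.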

Code:

import string

input_text = """
Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
    Cras mattis purus nec aliquam placerat. 
Donec efficitur ex vel ante mattis fermentum."""

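print("ORIGINAL:\n{}".format(input_text))
# strip() only removes whitespace at the start and end of the whole string.
print("\nWITHOUT TRAILING WHITESPACES:\n{}".format(input_text.strip()))
# replace(" ", "") removes only the space character; newlines remain.
print("\nWITHOUT SPACES:\n{}".format(input_text.replace(" ", "")))
# split() with no arguments splits on any run of whitespace.
print("\nWITHOUT ANY WHITESPACES:\n{}".format("".join(input_text.split())))
# The three-argument form of str.maketrans() works only in Python 3.
print(
    "\nWITHOUT ANY WHITESPACES:\n{}".format(
        input_text.translate(str.maketrans("", "", string.whitespace))
    )
)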

Output:

$ python3 test.py
ORIGINAL:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
    Cras mattis purus nec aliquam placerat. 
Donec efficitur ex vel ante mattis fermentum.

WITHOUT TRAILING WHITESPACES:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
    Cras mattis purus nec aliquam placerat. 
Donec efficitur ex vel ante mattis fermentum.

WITHOUT SPACES:

Loremipsumdolorsitamet,consecteturadipiscingelit.
Crasmattispurusnecaliquamplacerat.
Donecefficiturexvelantemattisfermentum.

WITHOUT ANY WHITESPACES:
Loremipsumdolorsitamet,consecteturadipiscingelit.Crasmattispurusnecaliquamplacerat.Donecefficiturexvelantemattisfermentum.

WITHOUT ANY WHITESPACES:
Loremipsumdolorsitamet,consecteturadipiscingelit.Crasmattispurusnecaliquamplacerat.Donecefficiturexvelantemattisfermentum.
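
If the goal is to clean the text while keeping the words separated, the same built-in methods can be combined differently. The following is only a sketch under that assumption, since the question does not say which result is wanted; the names single_spaced and cleaned_lines are introduced here for illustration.

```python
# Same sample as above, written with explicit escapes so the
# trailing spaces and the indentation are visible.
input_text = (
    "\nLorem ipsum dolor sit amet, consectetur adipiscing elit. \n"
    "    Cras mattis purus nec aliquam placerat. \n"
    "Donec efficitur ex vel ante mattis fermentum."
)

# Collapse every run of whitespace (spaces, tabs, newlines) into one space.
single_spaced = " ".join(input_text.split())

# Strip each line separately and drop empty lines, keeping the line breaks.
cleaned_lines = "\n".join(
    line.strip() for line in input_text.splitlines() if line.strip()
)

print(single_spaced)
print(cleaned_lines)
```

The first variant puts the whole text on one line with single spaces between words; the second keeps one sentence per line but removes the indentation and the trailing spaces.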
