Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
319 views
in Technique[技术] by (71.8m points)

python - How to remove strings present in a list from a column in pandas

I have a dataframe df,

import pandas as pd

df = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4, 5],
        "name": [
            "Hello Kitty",
            "Hello Puppy",
            "It is an Helloexample",
            "for stackoverflow",
            "Hello World",
        ],
    }
)

which looks like:

   ID               name
0   1        Hello Kitty
1   2        Hello Puppy
2   3   It is an Helloexample
3   4  for stackoverflow
4   5        Hello World

I have a list of strings To_remove_list

To_remove_lst = ["Hello", "for", "an", "It"]

I need to remove all the strings present in the list from the column name of df. How can I do this in pandas ?

My expected answer is:

   ID               name
0   1              Kitty
1   2              Puppy
2   3              is example
3   4              stackoverflow
4   5              World
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I think need str.replace if want remove also substrings:

df['name'] = df['name'].str.replace('|'.join(To_remove_lst), '')

If possible some regex characters:

import re
df['name'] = df['name'].str.replace('|'.join(map(re.escape, To_remove_lst)), '')

print (df)
   ID            name
0   1           Kitty
1   2           Puppy
2   3     is  example
3   4   stackoverflow
4   5           World

But if want remove only words use nested list comprehension:

df['name'] = [' '.join([y for y in x.split() if y not in To_remove_lst]) for x in df['name']]

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...