Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - What should be passed to the copy function so that changes do not affect the original DataFrame?

Error:

/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:3: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  This is separate from the ipykernel package so we can avoid doing imports until

import pandas as pd

d = {'Name':pd.Series(['A','B','C','D','E','F','G','H','I','J']),
   'Value':pd.Series([45,37,59,'?',47,39,'?',43,52,'?'])}

df = pd.DataFrame(d)
print(df)

def replace_NAN(df):
    index = df['Value']=='?'
    df ['Value'][index] = float("NaN")
    print(df)

def ignore_missing_value(df):
    df [df.loc[:,'Value']=='?'].loc[:,'Value'] = float("NaN")
    df.dropna()
    print(df)

def replace_with_mean(df):
    df [df.loc[:,'Value']=='?'].loc[:,'Value'] = float("NaN")
    m = df.loc[:,'Value'].mean()
    df.fillna(m)
    print(df)

df1=df.copy()
replace_NAN(df1)
#ignore_missing_value(df1)
#replace_with_mean(df1)
Question from: https://stackoverflow.com/questions/65952372/what-is-to-be-done-in-argument-of-copy-function-so-that-it-does-not-affect-origi

1 Reply

pandas does not treat the string '?' as a missing value, so your functions need to be changed.

Use DataFrame.loc to set values by condition:

def replace_val(df):
    df.loc[df['Value']=='?', 'Value']= float("NaN")
    #alternative
    #df['Value'] = df['Value'].replace('?',float("NaN"))
    print(df)
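
To tie this back to the question title, here is a minimal sketch of why the copy already protects the original. DataFrame.copy() defaults to deep=True, so no extra argument is needed; the warning comes from the chained assignment df['Value'][index] = ..., which a single .loc call avoids. The names original and working and the shortened data are made up for this illustration.

```python
import pandas as pd

# shortened, hypothetical version of the data from the question
original = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D'],
    'Value': [45, 37, '?', 52],
})

# copy() defaults to deep=True, so the data is duplicated
working = original.copy()

# one .loc call with (row mask, column label) instead of chained indexing
working.loc[working['Value'] == '?', 'Value'] = float('nan')

print(working)   # the '?' entry is now NaN
print(original)  # the '?' entry is still there
```

Because the assignment goes through one indexing operation on working itself, pandas has no intermediate object to warn about.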

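For filtering, use boolean indexing:

def filter_value(df):
    # build the mask from the column itself; 'Value' == '?' is just False
    # this selects the rows whose Value is '?'
    df = df.loc[df['Value'] == '?']
    print(df)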
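
The question's ignore_missing_value also calls df.dropna() without keeping the result, although dropna returns a new DataFrame by default. The following is only a sketch, assuming the goal is to discard the rows that hold '?'. It converts the column with pd.to_numeric(errors='coerce') first; drop_missing is a hypothetical helper name.

```python
import pandas as pd

def drop_missing(df):
    # hypothetical helper: works on a copy, so the input is left untouched
    out = df.copy()
    # errors='coerce' turns anything non-numeric (here '?') into NaN
    out['Value'] = pd.to_numeric(out['Value'], errors='coerce')
    # dropna returns a new DataFrame, so the result has to be kept
    return out.dropna(subset=['Value'])

df = pd.DataFrame({'Name': ['A', 'B', 'C'], 'Value': [45, '?', 59]})
cleaned = drop_missing(df)
print(cleaned)
```

Returning the cleaned frame instead of printing inside the function lets the caller decide whether to rebind the name or keep both versions.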

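Finally, change the mask to != to select all values except '?':

def replace_with_mean(df):
    mask = df['Value'] == '?'
    # mean of the valid values only; cast to float because the column has object dtype
    df.loc[mask, 'Value'] = df.loc[df['Value'] != '?', 'Value'].astype(float).mean()
    print(df)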
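
An alternative sketch for the same step, assuming it is acceptable to convert the column to a numeric dtype: Series.mean() skips NaN by default, and fillna returns a new Series, which is why the question's bare df.fillna(m) had no visible effect. fill_with_mean is a hypothetical name.

```python
import pandas as pd

def fill_with_mean(df):
    # hypothetical helper: works on a copy and returns it
    out = df.copy()
    # '?' becomes NaN and the column becomes float64
    out['Value'] = pd.to_numeric(out['Value'], errors='coerce')
    # mean() ignores NaN by default; assign the filled Series back
    out['Value'] = out['Value'].fillna(out['Value'].mean())
    return out

df = pd.DataFrame({'Name': ['A', 'B', 'C'], 'Value': [40, '?', 60]})
filled = fill_with_mean(df)
print(filled)
```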
