Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
121 views
in Technique[技术] by (71.8m points)

python - Can't seem to Merge Corectly on Index

I have two dataframes with these indices

Index(['RCFD1410', 'RCFD1420', 'RCFD1460', 'RCFD1563', 'RCFD1590', 'RCFD1616',
       'RCFD1616', 'RCFD1763', 'RCFD1764', 'RCFD1797',
       ...
       'Unnamed: 163_level_0', 'Unnamed: 195_level_0', 'Unnamed: 197_level_0',
       'Unnamed: 204_level_0', 'Unnamed: 79_level_0', 'Unnamed: 81_level_0',
       'Unnamed: 85_level_0', 'Unnamed: 92_level_0', 'IDRSSD', 'qyear'],
      dtype='object', length=240)

df_list.columns
Index(['IDRSSD', 'Line_Item'], dtype='object')

I am trying to select a subset of columns, so I don't get duplicating column names.

cols_to_use = df_list.columns.difference(grouped_and_summed.columns)
df_merged = pd.merge(grouped_and_summed, df_list[cols_to_use], on='IDRSSD')

This gives me an error: KeyError: 'IDRSSD'

That seems weird, because I thought the index names would match, based on 'IDRSSD' in both dataframes.

Now, if I do this.

cols_to_use = df_list.columns.difference(grouped_and_summed.columns)
df_merged = pd.merge(grouped_and_summed, df_list[cols_to_use], left_index=True, right_index=True, how='inner')

I get df_merged, but the size of df_merged is [239 rows x 241 columns]. The size of grouped_and_summed is [523454 rows x 240 columns]. Over 500k rows are chopped off!

Finally, if I do this.

cols_to_use = df_list.columns.difference(grouped_and_summed.columns)
df_merged = pd.merge(grouped_and_summed, df_list[cols_to_use], left_index=True, right_index=True, how='left')

I get a result like this.

         IDRSSD   qyear                 Line_Item  
0       1000052  2001Q1                       NaN  
1       1000052  2001Q2  REAL ESTATE LOANS, TOTAL  
2       1000052  2001Q3       R.E. LOANS-FARMLAND  
3       1000052  2001Q4   R.E. LOANS MULTI-FAMILY  
4       1000052  2002Q1               OTHER LOANS  

Shouldn't Line_Item be repeating if IDRSSD is repeating?! My question is, how can I match Line_Item to IDRSSD? There may be a simple fix for this, but I'm just not seeing it here.

question from:https://stackoverflow.com/questions/65849445/cant-seem-to-merge-corectly-on-index

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
Waitting for answers

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...