Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Append dataframes with different column names - Pandas

I have three dataframes, which can be generated with the code below:

import pandas as pd

df1 = pd.DataFrame({'person_id': [1, 2, 3], 'gender': ['Male', 'Female', 'Not disclosed'], 'ethn': ['Chinese', 'Indian', 'European']})
df2 = pd.DataFrame({'pers_id': [4, 5, 6], 'gen': ['Male', 'Female', 'Not disclosed'], 'ethnicity': ['Chinese', 'Indian', 'European']})
df3 = pd.DataFrame({'son_id': [7, 8, 9], 'sex': ['Male', 'Female', 'Not disclosed'], 'ethnici': ['Chinese', 'Indian', 'European']})

I would like to do two things

a) Append all these 3 dataframes into one large result dataframe

When I attempted this using the code below, the output wasn't what I expected:

df1.append(df2)
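
For reference, a minimal sketch of the misaligned result. It assumes pandas 2.x, where `DataFrame.append` has been removed and `pd.concat` is the replacement; both align rows on column labels, so columns with different names are not stacked on top of each other.

```python
import pandas as pd

df1 = pd.DataFrame({'person_id': [1, 2, 3],
                    'gender': ['Male', 'Female', 'Not disclosed'],
                    'ethn': ['Chinese', 'Indian', 'European']})
df2 = pd.DataFrame({'pers_id': [4, 5, 6],
                    'gen': ['Male', 'Female', 'Not disclosed'],
                    'ethnicity': ['Chinese', 'Indian', 'European']})

# Concatenation aligns on column labels, so the differently named
# columns end up side by side and the gaps are filled with NaN.
stacked = pd.concat([df1, df2], ignore_index=True)

print(stacked.shape)                 # (6, 6): six columns instead of three
print(stacked.columns.tolist())
print(int(stacked.isna().sum().sum()))  # 18 missing cells
```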


So, to resolve this, I understand that the columns have to be renamed, which leads to objective b) below.

b) Rename the columns of these n dataframes to be uniform, in an elegant way

Please note that in practice I might have dataframes with different column names that I do not know in advance, but the values in them will always be the same and will belong to the columns Ethnicity, Gender and Person_id. There can also be several other columns, such as Age, Date, bp reading, etc.

Currently, I do this manually by reading the column names and renaming them with the code below:

df2.columns
df2.rename(columns={'ethnicity': 'ethn', 'gen': 'gender', 'pers_id': 'person_id'},
           inplace=True)

How can I set the column names of all the dataframes to be the same (gender, ethnicity, person_id, etc.), irrespective of their original column names?
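
One possible reading of this requirement, sketched under assumptions: if the values are stable even when the names are not, a column can be recognised by its contents. The vocabularies in `KNOWN_VALUES` and the helper `infer_mapping` are hypothetical and would need to be adapted to the real data. Id-like columns have no fixed vocabulary, so this sketch leaves them (and any extra columns such as age) untouched.

```python
import pandas as pd

# Assumed vocabularies for the value-based lookup; extend as needed.
KNOWN_VALUES = {
    'gender': {'Male', 'Female', 'Not disclosed'},
    'ethnicity': {'Chinese', 'Indian', 'European'},
}

def infer_mapping(df):
    """Map each column whose non-null values all fall inside a known
    vocabulary to that vocabulary's canonical name."""
    mapping = {}
    for col in df.columns:
        values = set(df[col].dropna().unique())
        for canonical, vocab in KNOWN_VALUES.items():
            if values and values <= vocab:
                mapping[col] = canonical
                break
    return mapping

df3 = pd.DataFrame({'son_id': [7, 8, 9],
                    'sex': ['Male', 'Female', 'Not disclosed'],
                    'ethnici': ['Chinese', 'Indian', 'European'],
                    'age': [30, 41, 52]})  # extra column, left as is

mapping = infer_mapping(df3)
renamed = df3.rename(columns=mapping)
print(mapping)
print(renamed.columns.tolist())
```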


1 Reply


As per the pandas documentation, you can do this by creating a mapping from the old column names to the new ones:

df2.rename(columns={column1:'ethn', column2:'gen', column3:'pers_id'}, inplace=True)

Now, you clearly stated that you have to do this at runtime. If you know that the number of columns and their respective positions won't change, you can collect the actual column names with list(df2.columns) (columns is an attribute, not a method), which should output something like this:

['ethnicity', 'gender', 'person_id']
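
As a quick check, assuming the `df2` from the question: `columns` holds a pandas `Index`, and wrapping it in `list()` gives a plain list. The real order for this frame differs from the illustrative list above, which matters for any mapping built by position.

```python
import pandas as pd

df2 = pd.DataFrame({'pers_id': [4, 5, 6],
                    'gen': ['Male', 'Female', 'Not disclosed'],
                    'ethnicity': ['Chinese', 'Indian', 'European']})

# `columns` is an attribute; calling it like a method raises TypeError.
print(type(df2.columns).__name__)  # Index
print(list(df2.columns))           # ['pers_id', 'gen', 'ethnicity']
```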

At this point, you can create the mapping as:

final_columns = ['ethn', 'gen', 'pers_id']
previous_columns = df2.columns  # an attribute, not a method
# Pair old and new names by position; zip stops at the shorter sequence.
mapping = dict(zip(previous_columns, final_columns))

And then just call

df2.rename(columns=mapping, inplace=True)  # without columns=, rename targets the row index
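
Putting the pieces together, a minimal end-to-end sketch. It assumes every frame lists its columns in the same positional order (id, gender, ethnicity), as the three frames in the question do. The helper `rename_by_position` and the target names in `final_columns` are hypothetical choices for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'person_id': [1, 2, 3],
                    'gender': ['Male', 'Female', 'Not disclosed'],
                    'ethn': ['Chinese', 'Indian', 'European']})
df2 = pd.DataFrame({'pers_id': [4, 5, 6],
                    'gen': ['Male', 'Female', 'Not disclosed'],
                    'ethnicity': ['Chinese', 'Indian', 'European']})
df3 = pd.DataFrame({'son_id': [7, 8, 9],
                    'sex': ['Male', 'Female', 'Not disclosed'],
                    'ethnici': ['Chinese', 'Indian', 'European']})

# Assumed target names, in the same positional order as the inputs.
final_columns = ['person_id', 'gender', 'ethnicity']

def rename_by_position(df, new_names):
    """Return a copy of df with its leading columns renamed by position.
    Columns beyond len(new_names) keep their original names."""
    mapping = dict(zip(df.columns, new_names))
    return df.rename(columns=mapping)

frames = [rename_by_position(df, final_columns) for df in (df1, df2, df3)]

# With uniform names, the rows now stack into three columns.
result = pd.concat(frames, ignore_index=True)
print(result)
```

Returning a renamed copy instead of using `inplace=True` leaves the original frames untouched, which makes the step easy to rerun.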
