I have 3 dataframes which can be generated from the code shown below
df1= pd.DataFrame({'person_id':[1,2,3],'gender': ['Male','Female','Not disclosed'],'ethn': ['Chinese','Indian','European']})
df2= pd.DataFrame({'pers_id':[4,5,6],'gen': ['Male','Female','Not disclosed'],'ethnicity': ['Chinese','Indian','European']})
df3= pd.DataFrame({'son_id':[7,8,9],'sex': ['Male','Female','Not disclosed'],'ethnici': ['Chinese','Indian','European']})
I would like to do two things
a) Append all these 3 dataframes into one large result
dataframe
When I attempted this using the below code, the output isn't as expected
df1.append(df2)
So, to resolve this, I understand we have to rename the column names which leads to objective b below
b) Rename the column of these n dataframes to be uniform in a elegant way
Please note that in real time I might have dataframe with different column names which I may not know in advance but the values in them will always be the same belonging to columns Ethnicity
, Gender
and Person_id
. But note there can be several other columns as well like Age
, Date
,bp reading
etc
Currently, I do this by manually reading the column names using below code
df2.columns
df2.rename(columns={ethnicity:'ethn',gender = 'gen',person_id='pers_id},
inplace=True)
How can I set the column names for all dataframe to be the same (gender
,ethnicity
,person_id
and etc) irrespective of their original column values
See Question&Answers more detail:
os