Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Pandas group-by error duplicate axis but no duplicate values

I wrote this code:

import pandas as pd

# Split 'index' at the first space into the GL code and its label
df[['GL', 'Libelle']] = df['index'].str.split(' ', n=1, expand=True)

# Sort by GL, Class, month
df.sort_values(by=['GL', 'Class', 'month'], inplace=True)

# Add a column with the diff by month
df['value'] = pd.to_numeric(df['value'])
df["diff"] = df.groupby(['GL','Class','month'])['value'].diff().fillna(df['value'])

My pandas DataFrame has these dtypes:

index      object
Class      object
value      float64
glid       object
month      object
GL         object
Libelle    object

A sample of the data was attached as an image (not reproduced here).

Could you explain why I get the error "cannot reindex from a duplicate axis" on this line? df["diff"] = df.groupby(['GL','Class','month'])['value'].diff().fillna(df['value'])

question from:https://stackoverflow.com/questions/65943678/pandas-group-by-error-duplicate-axis-but-no-duplicate-values


1 Reply


Generally, this error arises when you join or assign to a column and pandas has to align the data against an index (row labels or column names) that contains duplicate values. Note that the error refers to duplicate labels, not duplicate data: the values in your columns can all be distinct while the index still repeats. If I understand correctly, you are assigning a new column, so check whether your row index has duplicate labels. Also check your original dataframe, since the duplicates may already be present there.

To find the duplicates in the original index, do this: df[df.index.duplicated()]

If you have accidentally created a duplicate column, remove it. That should resolve the error.
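The checks above can be sketched as follows. This is only an illustration: the small DataFrame is made-up data (the question's sample was an image), it groups by `GL` alone for brevity, and `reset_index(drop=True)` is just one possible remedy, assuming the row labels carry no meaning you need to keep.

```python
import pandas as pd

# Hypothetical data: the values are all distinct, but row label 0 appears twice.
df = pd.DataFrame(
    {"GL": ["100", "100", "200"], "value": [1.0, 3.0, 5.0]},
    index=[0, 0, 1],
)

# Rows whose index label occurs more than once (keep=False marks every copy).
dup_rows = df[df.index.duplicated(keep=False)]

# Column names that occur more than once (none in this example).
dup_cols = df.columns[df.columns.duplicated()]

# One possible fix: replace the index with a fresh, unique RangeIndex,
# then compute the per-group diff as in the question.
clean = df.reset_index(drop=True)
clean["diff"] = clean.groupby("GL")["value"].diff().fillna(clean["value"])
```

If duplicate column names turn up instead, something like `df.loc[:, ~df.columns.duplicated()]` keeps only the first column of each name.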

