python - Combine duplicated columns within a DataFrame

Question

Welcome To Ask or Share your Answers For Others

python - Combine duplicated columns within a DataFrame

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

python - Combine duplicated columns within a DataFrame

If I have a dataframe that has columns that include the same name, is there a way to combine the columns that have the same name with some sort of function (i.e. sum)?

For instance with:

In [186]:

df["NY-WEB01"].head()
Out[186]:
                NY-WEB01    NY-WEB01
DateTime        
2012-10-18 16:00:00  5.6     2.8
2012-10-18 17:00:00  18.6    12.0
2012-10-18 18:00:00  18.4    12.0
2012-10-18 19:00:00  18.2    12.0
2012-10-18 20:00:00  19.2    12.0

How might I collapse the NY-WEB01 columns (there are a bunch of duplicate columns, not just NY-WEB01) by summing each row where the column name is the same?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-23T17:46:40+0000

I believe this does what you are after:

df.groupby(lambda x:x, axis=1).sum()

Alternatively, between 3% and 15% faster depending on the length of the df:

df.groupby(df.columns, axis=1).sum()

EDIT: To extend this beyond sums, use .agg() (short for .aggregate()):

df.groupby(df.columns, axis=1).agg(numpy.max)

Categories

python - Combine duplicated columns within a DataFrame

python - Combine duplicated columns within a DataFrame

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags