Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Conditional aggregation on pandas dataframe columns with combining 'n' rows into 1 row

I have the following pandas dataframe:

START   NAME
5.11    name1
9.1     name1
10.86   name1
12.61   name2
14.86   name2
23.11   name2
25.36   name1
26.61   name1
28.36   name2
31.61   name2
32.86   name1
35.61   name1
44.61   name1
46.36   name2

I would like this merged by name as follows:

START   END     NAME
5.11    12.61   name1
12.61   25.36   name2
26.61   28.36   name1
28.36   32.86   name2
32.86   46.36   name1
46.36   total   name2

I tried something like this:

df2 = df.copy()
df2 = df2.rename({"name": "temp"}).reset_index()
grp = (df2['name'] != df2['name'].shift()).cumsum().rename('group')
df2 = df2.groupby(['name', grp], sort=False)

But this does not produce the desired output. Any help is appreciated.

thanks

Question from: https://stackoverflow.com/questions/65879064/conditional-aggregation-on-pandas-dataframe-columns-with-combining-n-rows-into


1 Reply

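  1. Use shift(1) to check whether each row's NAME is the same as the previous row's NAME.
  2. Keep only the rows whose NAME differs from the previous row's NAME, i.e. the first row of each run.
  3. Use shift(-1) on the kept rows to take the next run's START as the current run's END.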
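# True on the first row of each run of identical NAME values
cond = (df['NAME'] != df['NAME'].shift(1))
dfn = df[cond].copy()
# Each run ends where the next run starts; the last run gets 'total'
dfn['END'] = dfn['START'].shift(-1).fillna('total')
dfn[['START', 'END', 'NAME']]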
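
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the three steps above, using the sample data from the question. The function name `merge_runs` and the `last_end` parameter are my own illustrative names, not something from the thread. I also cast to `object` before `fillna`, on the assumption that a float column should not be filled with a string directly; that is a precaution rather than a requirement of the approach.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "START": [5.11, 9.1, 10.86, 12.61, 14.86, 23.11, 25.36,
              26.61, 28.36, 31.61, 32.86, 35.61, 44.61, 46.36],
    "NAME": ["name1", "name1", "name1", "name2", "name2", "name2", "name1",
             "name1", "name2", "name2", "name1", "name1", "name1", "name2"],
})


def merge_runs(frame, last_end="total"):
    """Collapse consecutive rows sharing a NAME into one START/END row.

    `merge_runs` and `last_end` are hypothetical names used for illustration.
    """
    # True wherever NAME differs from the previous row, i.e. a new run starts.
    starts_run = frame["NAME"] != frame["NAME"].shift(1)
    out = frame[starts_run].copy()
    # Each run ends where the next one starts; the last run has no successor.
    # Cast to object so the column can hold the string placeholder as well.
    out["END"] = out["START"].shift(-1).astype(object).fillna(last_end)
    return out[["START", "END", "NAME"]].reset_index(drop=True)


result = merge_runs(df)
print(result)
```

Note that on the question's data the third merged row starts at 25.36, since that is the first row of that name1 run, whereas the desired table in the question lists 26.61 there.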


...