Using the data you shared, a simpler solution is possible:
Group by student_id
, and pass False
argument to the as_index
parameter (this works for a dataframe, and returns a dataframe);
Next, use a named aggregation to get the `max week for end week and the min week for st_week for each group
Get the difference between max_wk
and end_wk
Finally, keep only the required columns
(
df.groupby("student_id", as_index=False)
.agg(max_wk=("end_week", "max"), min_wk=("st_week", "min"))
.assign(Diff=lambda x: x["max_wk"] - x["min_wk"])
.loc[:, ["student_id", "Diff"]]
)
student_id Diff
0 1 1
1 2 2
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…