pandas - How to group by a df in Python by a column with the difference between the max value of one column and the min of another column?

Question


posted Feb 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have a data frame which looks like this:

student_id	session_id	reading_level_id	st_week	end_week
1	3334	3	3	3
1	3335	2	4	4
2	3335	2	2	2
2	3336	2	2	3
2	3337	2	3	3
2	3339	2	3	4


1 Reply

深蓝 · Answer 1 · 2021-02-06T01:02:18+0000

Using the data you shared, a simpler solution is possible:

Group by student_id and pass as_index=False, so that student_id stays a regular column and the result is a dataframe.

Next, use a named aggregation to get the max of end_week (max_wk) and the min of st_week (min_wk) for each group.

Get the difference between max_wk and min_wk.

Finally, keep only the required columns.

(
    df.groupby("student_id", as_index=False)
    .agg(max_wk=("end_week", "max"), min_wk=("st_week", "min"))
    .assign(Diff=lambda x: x["max_wk"] - x["min_wk"])
    .loc[:, ["student_id", "Diff"]]
)

   student_id  Diff
0           1     1
1           2     2
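
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the sample rows from the question are typed in by hand as a dataframe named df, and that the pandas version supports named aggregation (0.25 or later). The names result and alt are only illustrative. The second part is an alternative that should give the same table without the intermediate max_wk and min_wk columns.

```python
import pandas as pd

# Sample data copied by hand from the question.
df = pd.DataFrame(
    {
        "student_id": [1, 1, 2, 2, 2, 2],
        "session_id": [3334, 3335, 3335, 3336, 3337, 3339],
        "reading_level_id": [3, 2, 2, 2, 2, 2],
        "st_week": [3, 4, 2, 2, 3, 3],
        "end_week": [3, 4, 2, 3, 3, 4],
    }
)

# Same chain as in the answer: named aggregation, then the difference.
result = (
    df.groupby("student_id", as_index=False)
    .agg(max_wk=("end_week", "max"), min_wk=("st_week", "min"))
    .assign(Diff=lambda x: x["max_wk"] - x["min_wk"])
    .loc[:, ["student_id", "Diff"]]
)

# Alternative: subtract the two per-group Series directly.
# Both Series are indexed by student_id, so they align on subtraction.
grouped = df.groupby("student_id")
alt = (
    (grouped["end_week"].max() - grouped["st_week"].min())
    .rename("Diff")
    .reset_index()
)

print(result)
print(alt)
```

The alternative relies on index alignment between the two grouped Series, which is a little shorter but hides the intermediate values; the named aggregation version is easier to inspect if the numbers look wrong.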
