This might be a trivial question but I'm still trying to figure out pandas/numpy.
So, suppose I have a table with the following structure:
group_id | col1 | col2 | col3 | "A" | "B"
x | 1 | 2 | 3 | NaN | 1
x | 3 | 2 | 3 | 1 | 1
x | 4 | 2 | 3 | 2 | 1
y | 1 | 2 | 3 | NaN | 3
y | 3 | 2 | 3 | 3 | 3
z | 3 | 2 | 3 | 10 | 2
z | 2 | 2 | 3 | 6 | 2
z | 4 | 2 | 3 | 4 | 2
z | 4 | 2 | 3 | 2 | 2
Note that the group_id column assigns each row to a group.
So at the beginning, I have the values for columns group_id and col1-col3.
Then for each row, if any of col1, col2, or col3 equals 1, "A" is NaN; otherwise the value comes from a formula (irrelevant here, so I put some numbers in place).
That part I know how to do, using:
df["A"] = np.where((df['col1'] == 1) | (df['col2'] == 1) | (df['col3'] == 1), np.nan, value)
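For reference, here is a self-contained sketch of that step using the table above. The `value` array is only a stand-in for the real formula (it holds the placeholder numbers from the table), and `has_one` is a name made up for this example; `DataFrame.eq(1).any(axis=1)` is just a more compact way to write the three-way "or".

```python
import numpy as np
import pandas as pd

# The starting data: group_id and col1-col3 from the table above.
df = pd.DataFrame({
    "group_id": list("xxxyyzzzz"),
    "col1": [1, 3, 4, 1, 3, 3, 2, 4, 4],
    "col2": [2] * 9,
    "col3": [3] * 9,
})

# Hypothetical stand-in for the real formula: the placeholder numbers
# from the table (entries for rows that end up NaN are arbitrary).
value = np.array([0, 1, 2, 0, 3, 10, 6, 4, 2], dtype=float)

# True for every row where at least one of col1, col2, col3 equals 1.
has_one = df[["col1", "col2", "col3"]].eq(1).any(axis=1)

# NaN where the condition holds, the formula value otherwise.
df["A"] = np.where(has_one, np.nan, value)
```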
But for column "B", I need to fill it in with the minimum of values from column A for a specific group.
So for example, "B" is equal to 1 for all rows in group "x" because the minimum value in column A across the group "x" rows is 1.
Similarly, for rows in group "y" the minimum value is 3, and for group "z" the minimum value is 2. How exactly do I do that using pandas? It's confusing me a little more because the number of rows per group varies.
If they were all the same size I could just say fill it with the minimum of values in a pre-set range.
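To make the desired result concrete, here is a minimal sketch of one possible approach, assuming `groupby(...).transform("min")` fits the case. It is not necessarily the best or only way. `transform` returns a result aligned to the original rows, so groups of different sizes are handled without any fixed range, and the `min` aggregation skips NaN by default.

```python
import numpy as np
import pandas as pd

# Column A as in the table above (placeholder numbers, NaN where a col is 1).
df = pd.DataFrame({
    "group_id": list("xxxyyzzzz"),
    "A": [np.nan, 1, 2, np.nan, 3, 10, 6, 4, 2],
})

# Per-group minimum of A, broadcast back to every row of that group.
df["B"] = df.groupby("group_id")["A"].transform("min")
```

With the sample data this gives B = 1 for group "x", 3 for group "y", and 2 for group "z", matching the table.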
I hope that made sense; please let me know if I should provide a clearer example or clarify anything!