I'm starting to learn Pandas and am trying to find the most Pythonic (or panda-thonic?) ways to do certain tasks.
Suppose we have a DataFrame with columns A, B, and C.
- Column A contains boolean values: each row's A value is either true or false.
- Column B has some important values we want to plot.
What we want to discover is the subtle distinctions between B values for rows that have A set to false, vs. B values for rows that have A is true.
In other words, how can I group by the value of column A (either true or false), then plot the values of column B for both groups on the same graph? The two datasets should be colored differently to be able to distinguish the points.
Next, let's add another feature to this program: before graphing, we want to compute another value for each row and store it in column D. This value is the mean of all data stored in B for the entire five minutes before a record - but we only include rows that have the same boolean value stored in A.
In other words, if I have a row where A=True
and time=t
, I want to compute a value for column D that is the mean of B for all records from time t-5
to t
that have the same A=True
.
In this case, how can we execute the groupby on values of A, then apply this computation to each individual group, and finally plot the D values for the two groups?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…