I am really not sure about my terminology here so please feel free to correct my title.
Suppose I have a pandas dataframe D with columns X and Y, both holding the discrete values [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], a column R with only the two possible values 0.1 and 0.2, and a column V with various (in this example random) values.
X and Y are like coordinates, so for each value of R every possible value of X occurs exactly once for every value of Y and vice versa.
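To make the setup concrete, here is a minimal sketch of how such a dataframe could be built. The random integers in V, their range, and the seed are assumptions for illustration only; they are not my real data.

```python
import itertools

import numpy as np
import pandas as pd

# Coordinates and the two R values described above.
coords = range(1, 11)
r_values = [0.1, 0.2]

# One row per (X, Y, R) combination: 10 * 10 * 2 = 200 rows.
D = pd.DataFrame(
    list(itertools.product(coords, coords, r_values)),
    columns=["X", "Y", "R"],
)

# Assumption: V holds random integers; range and seed are arbitrary.
rng = np.random.default_rng(0)
D["V"] = rng.integers(0, 100, size=len(D))
```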
Now I want to "halve" the resolution of X and Y by averaging the values in V and reducing the steps in X and Y to [1, 3, 5, 7, 9].
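The coordinate reduction I have in mind could be sketched like this: each pair of neighbouring steps (1 and 2, 3 and 4, ...) collapses onto the lower, odd value. The name X_coarse is only a placeholder for this illustration.

```python
import pandas as pd

X = pd.Series(range(1, 11))

# Map each coordinate to the lower (odd) value of its pair:
# 1, 2 -> 1; 3, 4 -> 3; ...; 9, 10 -> 9.
X_coarse = X - (X - 1) % 2

print(X_coarse.tolist())
# [1, 1, 3, 3, 5, 5, 7, 7, 9, 9]
```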
Effectively, in the new dataframe Dnew I want the slice (example output below):
Dnew.loc[(Dnew.X == 1) & (Dnew.Y == 1)]
>> X Y R V
0 1 1 0.1 35
1 1 1 0.2 31
to return a dataframe containing only one value of V per value of R, namely the mean of the four V values you get for that R when taking the following slice of the previous dataframe D (example output below):
D.loc[(D.X >= 1) & (D.X <= 2) & (D.Y >= 1) & (D.Y <= 2)]
>> X Y R V
0 1 1 0.1 10
1 1 2 0.1 50
2 2 1 0.1 35
3 2 2 0.1 45
4 1 1 0.2 33
5 1 2 0.2 19
6 2 1 0.2 60
7 2 2 0.2 12
What would be a pythonic way, making use of the special characteristics of pandas dataframes, to calculate what I am looking for?
Also, if this works, I would like to make the calculation explicit by distinguishing between all values with D.R == 0.1 and D.R == 0.2, which means the "mean groups" I just described would exist twice, with different values in V for the two possible values of R.
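To illustrate the result I am after, here is a rough sketch using only the eight rows from the slice of D shown above. Snapping the coordinates with a modulo and then using groupby is just one possible approach and an assumption on my side, not necessarily the best way to do it.

```python
import pandas as pd

# The eight rows from the slice of D shown above.
D = pd.DataFrame({
    "X": [1, 1, 2, 2, 1, 1, 2, 2],
    "Y": [1, 2, 1, 2, 1, 2, 1, 2],
    "R": [0.1, 0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.2],
    "V": [10, 50, 35, 45, 33, 19, 60, 12],
})

# Snap X and Y to the lower value of each pair (1, 2 -> 1; 3, 4 -> 3; ...).
coarse = D.assign(X=D.X - (D.X - 1) % 2, Y=D.Y - (D.Y - 1) % 2)

# Average V within each coarse cell, separately for each value of R.
Dnew = coarse.groupby(["X", "Y", "R"], as_index=False)["V"].mean()

print(Dnew)
```

For these rows the two resulting values of V are 35.0 (R == 0.1) and 31.0 (R == 0.2), matching the example output of the first slice.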
I really hope I was able to get my point across. The topic is fairly abstract and my knowledge of pandas dataframes still has to grow. I am also not a native English speaker, so please point out any mistakes I made or things I could explain better.
In my example output, row 0 of the first example holds the mean of the V values from rows 0 - 3 of the second example.
Row 1 of the first example holds the mean of the V values from rows 4 - 7 of the second example.
[edit: I added example output to the mentioned slices as suggested]
question from:
https://stackoverflow.com/questions/65660587/calculate-means-in-a-pandas-dataframe-over-certain-discrete-dimensional-ranges