I have a pandas dataframe that looks like the following and holds groups of data identified by an id column:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'))
df['id'] = ['w', 'w', 'w', 'x', 'x', 'y', 'y', 'y', 'z', 'z']
print(df)
          A         B         C         D id
0  0.347501 -1.152416  1.441144 -0.144545  w
1  0.775828 -1.176764  0.203049 -0.305332  w
2  1.036246 -0.467927  0.088138 -0.438207  w
3 -0.737092 -0.231706  0.268403  0.464026  x
4 -1.857346 -1.420284 -0.515517 -0.231774  x
5 -0.970731  0.217890  0.193814 -0.078838  y
6 -0.318314 -0.244348  0.162103  1.204386  y
7  0.340199  1.074977  1.201068 -0.431473  y
8  0.202050  0.790434  0.643458 -0.068620  z
9 -0.882865  0.687325 -0.008771 -0.066912  z
Now I want to create new dataframes (named df_w, df_x, df_y, df_z), each holding only the rows of its own group from the original dataframe, ideally collected in some iterable, e.g. a list:
df_w
          A         B         C         D id
0  0.347501 -1.152416  1.441144 -0.144545  w
1  0.775828 -1.176764  0.203049 -0.305332  w
2  1.036246 -0.467927  0.088138 -0.438207  w
df_x
          A         B         C         D id
0 -0.737092 -0.231706  0.268403  0.464026  x
1 -1.857346 -1.420284 -0.515517 -0.231774  x
df_y
          A         B         C         D id
0 -0.970731  0.217890  0.193814 -0.078838  y
1 -0.318314 -0.244348  0.162103  1.204386  y
2  0.340199  1.074977  1.201068 -0.431473  y
df_z
          A         B         C         D id
0  0.202050  0.790434  0.643458 -0.068620  z
1 -0.882865  0.687325 -0.008771 -0.066912  z
Is there any smart (vectorized pandas) way to achieve this using groupby, apply and/or applymap and a function?
I was thinking about iterating over the dataframe, but that doesn't seem very elegant.
Thanks in advance for any hints!
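To make the target concrete, here is a rough sketch of the kind of result I'm after. It assumes that iterating over df.groupby('id') is acceptable, and the names frames and frame_list are just placeholders I made up. The index is reset so that each sub-frame starts at 0, as in the desired output above. This is one possible approach, not necessarily the best one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'))
df['id'] = ['w', 'w', 'w', 'x', 'x', 'y', 'y', 'y', 'z', 'z']

# Iterating over a GroupBy object yields (key, sub-frame) pairs,
# one pair per distinct value in the 'id' column.
# reset_index(drop=True) renumbers each sub-frame from 0.
# `frames` is a hypothetical name: a dict mapping each id to its dataframe.
frames = {key: group.reset_index(drop=True) for key, group in df.groupby('id')}

# A plain list, if the keys are not needed (groupby sorts the keys by default).
frame_list = list(frames.values())

print(frames['x'])
```

With this, frames['w'] plays the role of df_w, frames['x'] of df_x, and so on, which avoids creating separately named variables.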