python - Normalize DataFrame by group

Question

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

Let's say that I have some data generated as follows:

import numpy as np
N = 20
m = 3
data = np.random.normal(size=(N,m)) + np.random.normal(size=(N,m))**3

and then I create some categorization variable:

indx = np.random.randint(0,3,size=N).astype(np.int32)

and generate a DataFrame:

import pandas as pd
df = pd.DataFrame(np.hstack((data, indx[:,None])), 
             columns=['a%s' % k for k in range(m)] + [ 'indx'])

I can get the mean value, per group as:

df.groupby('indx').mean()

What I'm unsure how to do is subtract each group's mean from the original data, column by column, so that every column is normalized by its within-group mean. Any suggestions would be appreciated.

1 Reply

深蓝 · Answer 1 · 2021-10-23T17:51:29+0000

replied Oct 24, 2021 by 深蓝 (71.8m points)

In [10]: df.groupby('indx').transform(lambda x: (x - x.mean()) / x.std())

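should do it. Note that this also divides by the per-group standard deviation; if you only want to subtract the group mean, as the question asks, drop the "/ x.std()" part.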
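For anyone who wants to check the result, here is a minimal sketch of the same idea. It assumes a fixed seed via np.random.default_rng (not in the original post) and selects the value columns explicitly, so treat the names cols, demeaned and standardized as illustrative only. It uses the built-in 'mean' and 'std' transforms instead of a lambda, which should give the same numbers as the one-liner above.

```python
import numpy as np
import pandas as pd

# Assumption: a seeded generator so the example is reproducible.
rng = np.random.default_rng(0)
N, m = 20, 3
data = rng.normal(size=(N, m)) + rng.normal(size=(N, m)) ** 3
indx = rng.integers(0, 3, size=N)

cols = ['a%s' % k for k in range(m)]
df = pd.DataFrame(data, columns=cols)
df['indx'] = indx

# Group once and restrict to the value columns.
grouped = df.groupby('indx')[cols]

# transform() returns a frame shaped like the original rows,
# so the per-group statistics line up with df row by row.
group_mean = grouped.transform('mean')
group_std = grouped.transform('std')

# Subtract the group mean only (what the question asks for).
demeaned = df[cols] - group_mean

# Subtract the mean and divide by the group standard deviation
# (what the lambda in the answer does). A group with a single row
# would produce NaN here, since its sample std is undefined.
standardized = (df[cols] - group_mean) / group_std

# Sanity check: within each group, the demeaned columns average to ~0.
print(demeaned.groupby(df['indx']).mean().round(12))
```

The string-based transforms avoid calling a Python lambda once per group and column, which tends to matter on larger frames.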
