Current method, using transform
In [44]: grp = df["signal"].groupby(g)
In [45]: result2 = df["signal"].groupby(g).transform(np.mean)
In [47]: %timeit df["signal"].groupby(g).transform(np.mean)
1 loops, best of 3: 535 ms per loop
Using 'broadcasting' of the results
In [43]: result = pd.concat([ Series([r]*len(grp.groups[i])) for i, r in enumerate(grp.mean().values) ],ignore_index=True)
In [42]: %timeit pd.concat([ Series([r]*len(grp.groups[i])) for i, r in enumerate(grp.mean().values) ],ignore_index=True)
10 loops, best of 3: 119 ms per loop
In [46]: result.equals(result2)
Out[46]: True
I think you might need to set the index on the broadcast result (it happens to work here because it's a default index):
result = pd.concat([ Series([r]*len(grp.groups[i])) for i, r in enumerate(grp.mean().values) ],ignore_index=True)
result.index = df.index
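For what it's worth, here is a minimal sketch of the same broadcasting idea that does not depend on the rows being sorted by group or on the group keys being 0..n-1, which the `enumerate` version above quietly assumes. The original `df` and `g` are not shown, so the data below are hypothetical stand-ins, and the lookup via `.loc` is just one way to do it, not necessarily the fastest.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins: the real df and g are not shown in this thread.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"signal": rng.standard_normal(12)},
    index=pd.Index(list("abcdefghijkl")),  # deliberately not a default index
)
g = np.array([2, 0, 1, 0, 2, 1, 1, 0, 2, 2, 0, 1])  # unsorted group keys

grp = df["signal"].groupby(g)

# Reference result from transform, for comparison.
expected = grp.transform("mean")

# Broadcast: one mean per group, then look up each row's key in that result.
means = grp.mean()  # Series indexed by group key
broadcast = pd.Series(means.loc[g].to_numpy(), index=df.index, name="signal")

print(np.allclose(broadcast.to_numpy(), expected.to_numpy()))
```

Because the result is built directly on `df.index`, there is no separate `result.index = df.index` step to forget.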