Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to duplicate rows based on a counter column

Let's say I have a data frame called df:

x count 
d 2
e 3
f 2

The count column is the counter: the number of times I want each row to be repeated.

How would I expand it into this?

x count
d 2
d 2
e 3
e 3
e 3
f 2
f 2

I've already tried numpy.repeat(df, df.iloc['count']), but it raises an error.

1 Reply

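You can use np.repeat() on the underlying array, passing the count column as the per-row repeat counts and axis=0 so that whole rows are repeated. (Your attempt fails because df.iloc only accepts integer positions, so df.iloc['count'] raises an error; use df['count'] instead.)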

import pandas as pd
import numpy as np

# your data
# ========================
df = pd.DataFrame({'x': ['d', 'e', 'f'], 'count': [2, 3, 2]})
df

   x  count
0  d      2
1  e      3
2  f      2

# processing
# ==================================
np.repeat(df.values, df['count'].values, axis=0)


array([['d', 2],
       ['d', 2],
       ['e', 3],
       ['e', 3],
       ['e', 3],
       ['f', 2],
       ['f', 2]], dtype=object)

pd.DataFrame(np.repeat(df.values, df['count'].values, axis=0), columns=['x', 'count'])

   x count
0  d     2
1  d     2
2  e     3
3  e     3
4  e     3
5  f     2
6  f     2
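
One thing to be aware of: because df.values here is an object array (it mixes strings and integers), the rebuilt frame's count column comes back as object dtype rather than int. If that matters, a possible alternative is to repeat the index labels and select them with df.loc, which keeps the original column dtypes. The following is a minimal sketch, assuming the frame has a unique index (such as the default RangeIndex); the variable name expanded is only illustrative.

```python
import pandas as pd

# Same example data as in the question.
df = pd.DataFrame({'x': ['d', 'e', 'f'], 'count': [2, 3, 2]})

# Repeat each index label `count` times, then select those labels with .loc.
# reset_index(drop=True) renumbers the resulting rows from 0.
expanded = df.loc[df.index.repeat(df['count'])].reset_index(drop=True)

print(expanded)
print(expanded.dtypes)  # the count column keeps its integer dtype
```

With this approach there is no need to pass the column names again, since the result is built from df itself rather than from a bare array.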
