Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
291 views
in Technique[技术] by (71.8m points)

python - splitting a column into multiple columns with specific name in pandas dataframe

I have following dataframe:

pri    sec
TOM    AB,CD,EF
JACK   XY,YZ
HARRY  FG
NICK   KY,NY,SD,EF,FR

I need following output with column names as following(based on how many , separated fields exists in column 'sec'):

pri    sec             sec0  sec1  sec2  sec3 sec4
TOM    AB,CD,EF        AB    CD    EF    NaN  NaN
JACK   XY,YZ           XY    YZ    NaN   NaN  NaN
HARRY  FG              FG    NaN   NaN   NaN  NaN
NICK   KY,NY,SD,EF,FR  KY    NY    SD    EF   ER

Can I get any suggestions?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Use join + split + add_prefix:

df = df.join(df['sec'].str.split(',', expand=True).add_prefix('sec'))
print (df)
     pri             sec sec0  sec1  sec2  sec3  sec4
0    TOM        AB,CD,EF   AB    CD    EF  None  None
1   JACK           XY,YZ   XY    YZ  None  None  None
2  HARRY              FG   FG  None  None  None  None
3   NICK  KY,NY,SD,EF,FR   KY    NY    SD    EF    FR

And if need NaNs add fillna:

df = df.join(df['sec'].str.split(',', expand=True).add_prefix('sec').fillna(np.nan))
print (df)
     pri             sec sec0 sec1 sec2 sec3 sec4
0    TOM        AB,CD,EF   AB   CD   EF  NaN  NaN
1   JACK           XY,YZ   XY   YZ  NaN  NaN  NaN
2  HARRY              FG   FG  NaN  NaN  NaN  NaN
3   NICK  KY,NY,SD,EF,FR   KY   NY   SD   EF   FR

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...