Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Create a column value based on whether another column's value is in a list (Python pandas)

I want to create a new column called Playercategory from the following data:

Name    Country
Kobe    United States
John    Italy
Charly  Japan
Braven  Japan / United States 
Rocky   Germany / United States
Bran    Lithuania
Nick    United States / Ukraine
Jonas   Nigeria

If the player's nationality is 'United States', or United States combined with any non-European country, then Playercategory == "American". (European here means a country in the euroCountries list below, so 'United States / Ukraine' counts as American.)

If the player's nationality is a European country, alone or combined with any other country, then Playercategory == "Europe" (e.g. 'Italy', 'Italy / United States', 'Germany / United States', 'Lithuania / Australia', 'Belgium').

For all other players, Playercategory == "Non".

Expected Output:

Name    Country                   Playercategory
Kobe    United States             American
John    Italy                     Europe
Charly  Japan                     Non
Braven  Japan / United States     American
Rocky   Germany / United States   Europe
Bran    Lithuania                 Europe
Nick    United States / Ukraine   American
Jonas   Nigeria                   Non             

What I tried: first, I created a list of European countries:

euroCountries=['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czechia', 'Denmark',
   'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Hungary', 'Ireland',
   'Italy', 'Latvia', 'Lithuania', 'Luxembourg', 'Malta', 'Netherlands',
   'Poland', 'Portugal', 'Romania', 'Slovakia', 'Slovenia', 'Spain', 'Sweden']

I know how to check a single condition, like this:

df["Playercategory"] = np.where(df["Country"].isin(euroCountries), "Europe", "Non")

But I don't know how to combine the three conditions above and create Playercategory correctly.

I really appreciate your support!



1 Reply


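Use numpy.select. First test whether Country matches any of euroCountries with Series.str.contains, then test whether it matches United States:

import numpy as np

# m1: a European country appears in the string; m2: United States appears
m1 = df['Country'].str.contains('|'.join(euroCountries))
m2 = df['Country'].str.contains('United States')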
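One caveat with the str.contains approach, for data messier than the sample: the joined string is treated as a regular expression and matches anywhere in the value, and missing values give NaN instead of False. A hedged variant, sketched below, escapes each name with re.escape, wraps the alternatives in word boundaries so a name cannot match inside a longer word, and passes na=False. The shortened list and the sample Series (including the missing value) are made up for illustration.

```python
import re
import pandas as pd

euroCountries = ["Germany", "Ireland", "Italy"]  # shortened list, for illustration only

# Hypothetical values, including a missing one.
country = pd.Series(["Italy", "Germany / United States", "Japan", None])

# Escape each name and require word boundaries around it.
# The non-capturing group (?:...) avoids the pandas warning about match groups.
pattern = r"\b(?:" + "|".join(re.escape(c) for c in euroCountries) + r")\b"

m1 = country.str.contains(pattern, na=False)                        # Europe mask
m2 = country.str.contains("United States", regex=False, na=False)   # United States mask

print(m1.tolist())
print(m2.tolist())
```

With na=False both masks are plain booleans, so they can be passed to numpy.select without extra cleaning.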

Alternatively, split the values with Series.str.split, compare the pieces with DataFrame.isin or DataFrame.eq, and check for at least one match per row with DataFrame.any. Keep the same mask order as above (m1 for Europe, m2 for United States):

df1 = df['Country'].str.split(' / ', expand=True)
m1 = df1.isin(euroCountries).any(axis=1)
m2 = df1.eq('United States').any(axis=1)

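numpy.select takes the first condition that is True for each row, so the order [m1, m2] gives the first choice priority when both masks match:

df["Playercategory"] = np.select([m1, m2], ['Europe', 'American'], default='Non')
print(df)
     Name                  Country Playercategory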
0    Kobe            United States       American
1    John                    Italy         Europe
2  Charly                    Japan            Non
3  Braven    Japan / United States       American
4   Rocky  Germany / United States         Europe
5    Bran                Lithuania         Europe
6    Nick  United States / Ukraine       American
7   Jonas                  Nigeria            Non
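The pieces above can be put together into one self-contained script. This is only a sketch under a few assumptions: the DataFrame is rebuilt by hand from the question's sample, the separator is exactly ' / ', and the mask names is_europe and is_us are chosen here for readability. The Europe mask is listed first because numpy.select returns the choice for the first condition that is true.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question (an assumption about the real df).
df = pd.DataFrame({
    "Name": ["Kobe", "John", "Charly", "Braven", "Rocky", "Bran", "Nick", "Jonas"],
    "Country": ["United States", "Italy", "Japan", "Japan / United States",
                "Germany / United States", "Lithuania",
                "United States / Ukraine", "Nigeria"],
})

euroCountries = ['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czechia', 'Denmark',
                 'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Hungary', 'Ireland',
                 'Italy', 'Latvia', 'Lithuania', 'Luxembourg', 'Malta', 'Netherlands',
                 'Poland', 'Portugal', 'Romania', 'Slovakia', 'Slovenia', 'Spain', 'Sweden']

# One column per nationality; rows with fewer nationalities are padded with missing values.
parts = df["Country"].str.split(" / ", expand=True)

# True if any nationality in the row is in the list / equals United States.
is_europe = parts.isin(euroCountries).any(axis=1)
is_us = parts.eq("United States").any(axis=1)

# Europe is checked first, so 'Germany / United States' becomes Europe.
df["Playercategory"] = np.select([is_europe, is_us], ["Europe", "American"], default="Non")
print(df)
```

Splitting before comparing means each nationality is matched as a whole value, which avoids the substring matches that str.contains allows.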
