
python - Extract the YYYY year from two string columns and put it in a new column, keeping NaN values

In a dataframe I have two columns with information about when some football players made their debut. The columns are called 'Debut' and 'Debut Deportivo'. I have to write a function that creates a new column with the YYYY year taken from both columns, keeping the NaN values where neither column has a year.

With the code I have written so far, I am able to get the value from one column and put it in the new one, but I haven't found a way to combine both.

The result should be something like this:

               Debut                   Debut Deportivo fecha_debut
  27 de mayo de 2006               2006(UD Vecindario)        2006
21 de agosto de 2010  11 de agosto de 2010(Portuguesa)        2010
21 de agosto de 2010                               NaN        2010
                 NaN                               NaN         NaN
question from:https://stackoverflow.com/questions/65626460/extract-the-yyyy-year-from-two-string-columns-and-put-it-in-a-new-column-keepin


1 Reply


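I suggest you use str.extract + combine_first:

# \d{4} matches four consecutive digits; expand=False makes str.extract return a Series
df['fecha_debut'] = df['Debut'].str.extract(r'(\d{4})', expand=False).combine_first(df['Debut Deportivo'].str.extract(r'(\d{4})', expand=False))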
print(df)

Output

                  Debut                   Debut Deportivo fecha_debut
0    27 de mayo de 2006               2006(UD Vecindario)        2006
1  21 de agosto de 2010  11 de agosto de 2010(Portuguesa)        2010
2  21 de agosto de 2010                               NaN        2010
3                   NaN                               NaN         NaN

For more on how to work with strings in pandas, see the pandas user guide on working with text data.
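Since the question asks for a function, here is a minimal sketch that wraps the same str.extract + combine_first idea. The function name add_debut_year and its parameter names are made up for illustration; the sample data is taken from the table in the question.

```python
import numpy as np
import pandas as pd


def add_debut_year(df, first="Debut", second="Debut Deportivo", new_col="fecha_debut"):
    """Return a copy of df with new_col holding the first 4-digit year found.

    Hypothetical helper: the name and parameters are assumptions, not part
    of the original answer.
    """
    pattern = r"(\d{4})"  # one capture group: four consecutive digits
    # expand=False returns a Series instead of a one-column DataFrame
    primary = df[first].str.extract(pattern, expand=False)
    fallback = df[second].str.extract(pattern, expand=False)
    out = df.copy()
    # Take the year from `first`; where it is missing, fall back to `second`.
    # Rows with no year in either column stay NaN.
    out[new_col] = primary.combine_first(fallback)
    return out


df = pd.DataFrame({
    "Debut": ["27 de mayo de 2006", "21 de agosto de 2010",
              "21 de agosto de 2010", np.nan],
    "Debut Deportivo": ["2006(UD Vecindario)", "11 de agosto de 2010(Portuguesa)",
                        np.nan, np.nan],
})
result = add_debut_year(df)
print(result)
```

The function works on a copy, so the original dataframe is left untouched. The years are still strings at this point.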

UPDATE

If you need the column to be numeric you could do:

df['fecha_debut'] = pd.to_numeric(df['fecha_debut']).astype(pd.Int32Dtype())

Note that because the column has missing values, it cannot be of the plain NumPy type int32. It can be either a nullable integer (Int32) or a float. For more on working with missing data, see the pandas user guide on missing data.
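As a small illustration of the dtype point, the sketch below converts a column of year strings that contains one missing value. The Series name years and its values are assumed sample data mirroring the fecha_debut column above.

```python
import numpy as np
import pandas as pd

# Assumed sample data: extracted years as strings, with one missing value
years = pd.Series(["2006", "2010", "2010", np.nan], dtype=object)

as_float = pd.to_numeric(years)         # float64; the missing value stays NaN
as_nullable = as_float.astype("Int32")  # nullable integer; the missing value becomes <NA>

# A plain NumPy int32 cannot hold a missing value, so this conversion fails
try:
    as_float.astype("int32")
    plain_int_failed = False
except ValueError:
    plain_int_failed = True

print(as_float.dtype, as_nullable.dtype, plain_int_failed)
```

The string "Int32" (capital I) is the alias for pd.Int32Dtype(), so this is the same conversion as in the line above.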

