python - Update value for every row based on either of two previous columns

Question

posted Jan 31, 2022 in Technique by 深蓝 (71.8m points)

I am researching ATP Tour men's tennis data. I have a pandas DataFrame with about 60,000 matches, sorted by date. Each row holds information and statistics about one match, split between the winner and the loser. I am trying to calculate the Elo rating of both the winner and the loser for every match (that is, for every row). Calculating it requires the Elo rating each player held after their previous match. The difficulty is that the winner of the current match may have lost his previous match, so the current 'winner_player_id' value may appear in the 'loser_player_id' column of that earlier row.

I am not sure how to efficiently select the previous ELO-ratings for both players per row, as this entails a search across multiple columns.

Every row includes the following columns:

array(['match_id', 'tourney_dates', 'round_order', 'tourney_name',
   'tourney_year_id', 'tourney_round_name', 'winner_player_id',
   'winner_slug', 'loser_player_id', 'loser_slug', 'elo_player_1', 'elo_player_2'])

Your time is appreciated!



1 Reply

深蓝 · Answer 1 · 2022-01-31T07:21:33+0000

One approach is to sort the winner and loser within each row by player name or ID, so the order of the pair is stable regardless of who won. Here is an example:

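import numpy as np
import pandas as pd

# Sort the two names within each row so the pair has a stable order.
# join() returns a new frame, so assign the result back to df.
df = df.join(pd.DataFrame(
    np.sort(df[['winner_name', 'loser_name']].values, axis=1),
    columns=['name1', 'name2'], index=df.index))

df.head(10)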

Output:

      winner_name         loser_name              name1          name2
0   Nicklas Kulti      Michael Stich      Michael Stich  Nicklas Kulti
1   Michael Stich        Jim Courier        Jim Courier  Michael Stich
2   Nicklas Kulti     Magnus Larsson     Magnus Larsson  Nicklas Kulti
3     Jim Courier      Martin Sinner        Jim Courier  Martin Sinner
4   Michael Stich        Jimmy Arias        Jimmy Arias  Michael Stich
5   Nicklas Kulti    Fabrice Santoro    Fabrice Santoro  Nicklas Kulti
6  Magnus Larsson      Patrik Kuhnen     Magnus Larsson  Patrik Kuhnen
7     Jim Courier      Paul Haarhuis        Jim Courier  Paul Haarhuis
8   Nicklas Kulti  Magnus Gustafsson  Magnus Gustafsson  Nicklas Kulti
9   Michael Stich        Gilad Bloom        Gilad Bloom  Michael Stich
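
For the Elo lookup itself, a different technique from the sorting shown above may be simpler: walk the date-sorted rows once and keep each player's latest rating in a dictionary, so it does not matter whether a player was the winner or the loser last time. The following is only a minimal sketch under stated assumptions. The function name `add_elo_columns`, the output columns `winner_elo_before` and `loser_elo_before`, the starting rating of 1500 and the K-factor of 32 are all illustrative choices, not taken from the question.

```python
import pandas as pd

K = 32            # assumed update factor, not from the question
START_ELO = 1500  # assumed rating for a player's first recorded match


def add_elo_columns(matches: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `matches` with each player's pre-match Elo rating.

    `matches` must already be sorted chronologically.
    """
    current = {}  # player_id -> rating after that player's latest match
    winner_before, loser_before = [], []

    pairs = zip(matches["winner_player_id"], matches["loser_player_id"])
    for winner, loser in pairs:
        # One lookup per player, whichever column held them previously.
        r_w = current.get(winner, START_ELO)
        r_l = current.get(loser, START_ELO)
        winner_before.append(r_w)
        loser_before.append(r_l)

        # Expected score of the winner under the usual logistic Elo curve.
        expected_w = 1.0 / (1.0 + 10 ** ((r_l - r_w) / 400))
        delta = K * (1.0 - expected_w)
        current[winner] = r_w + delta
        current[loser] = r_l - delta

    out = matches.copy()
    out["winner_elo_before"] = winner_before  # hypothetical column name
    out["loser_elo_before"] = loser_before    # hypothetical column name
    return out


# Tiny made-up example: player "a" wins twice, "b" wins once in between.
demo = pd.DataFrame({
    "winner_player_id": ["a", "b", "a"],
    "loser_player_id": ["b", "c", "b"],
})
result = add_elo_columns(demo)
```

A single pass with a dictionary avoids searching earlier rows across two columns, which should stay fast at roughly 60,000 matches. Same-day ordering (for example by 'round_order') would need to be settled in the sort before calling it.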
