I am researching ATP Tour men's tennis data. I have a pandas DataFrame that contains ~60,000 matches. Each row holds information and statistics about one match, split between the winner and the loser, and the DataFrame is sorted by date. I am trying to calculate the Elo rating of both the winner and the loser for every match (that is, for every row).
To calculate the Elo ratings for a match, one needs each player's Elo rating coming out of his own previous match, and the two players' previous matches are generally different rows. A further difficulty is that the winner of the current match might have lost his previous match, so the 'winner_player_id' value of the current row might appear in the 'loser_player_id' column of the row for that earlier match.
I am not sure how to efficiently select the previous ELO-ratings for both players per row, as this entails a search across multiple columns.
Every row includes the following columns:
array(['match_id', 'tourney_dates', 'round_order', 'tourney_name',
'tourney_year_id', 'tourney_round_name', 'winner_player_id',
'winner_slug', 'loser_player_id', 'loser_slug', 'elo_player_1', 'elo_player_2'])
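To make the goal concrete, here is a rough sketch of the per-row update I have in mind. It is only an illustration under assumptions that are not part of my data: the standard Elo update with a K-factor of 32, a starting rating of 1500 for a player's first match, 'elo_player_1' holding the winner's pre-match rating and 'elo_player_2' the loser's. The function name add_pre_match_elo is made up for this example. It keeps the latest rating per player in a dict, so the lookup does not care whether the player was in the winner or loser column last time.

```python
import pandas as pd

K = 32            # assumed K-factor, not taken from the data
START_ELO = 1500  # assumed rating for a player's first recorded match


def add_pre_match_elo(df, k=K, start=START_ELO):
    """Return a copy of df with both players' pre-match Elo ratings.

    Assumes df is already sorted chronologically.
    """
    ratings = {}  # player_id -> latest rating, whichever column he was in
    winner_elo, loser_elo = [], []
    for w, l in zip(df["winner_player_id"], df["loser_player_id"]):
        r_w = ratings.get(w, start)
        r_l = ratings.get(l, start)
        winner_elo.append(r_w)
        loser_elo.append(r_l)
        # Expected score of the winner under the standard Elo formula.
        expected_w = 1.0 / (1.0 + 10 ** ((r_l - r_w) / 400))
        delta = k * (1.0 - expected_w)
        ratings[w] = r_w + delta
        ratings[l] = r_l - delta
    out = df.copy()
    out["elo_player_1"] = winner_elo  # assumption: player 1 is the winner
    out["elo_player_2"] = loser_elo   # assumption: player 2 is the loser
    return out


# Tiny made-up example: player "b" loses, then wins, then loses again.
matches = pd.DataFrame({
    "winner_player_id": ["a", "b", "a"],
    "loser_player_id": ["b", "c", "a2"],
})
rated = add_pre_match_elo(matches)
print(rated)
```

This is a plain Python loop over two columns rather than anything vectorised, which may or may not be fast enough for ~60,000 rows; I would welcome a more idiomatic pandas approach if there is one.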
Your time is appreciated!