I have a large data frame with key IDs, states, start dates and other characteristics. I have another data frame with states, a start date and a "1" to signify a flag.
I want to join the two, based on the state and the date in df1 being greater than or equal to the date in df2.
Take the example below. df1 is the table of states, dates, and a 1 for a flag. df2 is a dataframe that needs those flags wherever the date in df1 is >= the start_date in df2. The end result is df3: an observation gets the flag only if its state matches and the df1 date is >= its df2 start_date; otherwise the flag is 0.
import pandas as pd
dict1 = {'date': ['2020-01-01', '2020-02-15', '2020-02-04', '2020-03-17',
                  '2020-06-15'],
         'state': ['AL', 'FL', 'MD', 'NC', 'SC'],
         'flag': [1, 1, 1, 1, 1]}
df1 = pd.DataFrame(dict1)
df1['date'] = pd.to_datetime(df1['date'])
dict2 = {'state': ['AL', 'FL', 'MD', 'NC', 'SC'],
         'keyid': ['001', '002', '003', '004', '005'],
         'start_date': ['2020-02-01', '2020-01-15', '2020-01-30', '2020-05-18',
                        '2020-05-16']}
df2 = pd.DataFrame(dict2)
df2['start_date'] = pd.to_datetime(df2['start_date'])
# Expected result; copy so that adding the flag column does not also modify df2
df3 = df2.copy()
df3['flag'] = [0, 1, 1, 0, 1]
How do I get to df3 programmatically? My actual df1 has a row for each state. My actual df2 has over a million observations with different dates.
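
For reference, one possible approach is sketched below. It is not necessarily the only or the fastest way. It assumes df1 really has exactly one row per state, as described above, and the names `merged` and `keep` are just illustrative. The idea is to left-merge df1 onto df2 by state and then keep the flag only where the df1 date is on or after the df2 start_date.

```python
import pandas as pd

# Same sample data as in the question
df1 = pd.DataFrame({
    'date': pd.to_datetime(['2020-01-01', '2020-02-15', '2020-02-04',
                            '2020-03-17', '2020-06-15']),
    'state': ['AL', 'FL', 'MD', 'NC', 'SC'],
    'flag': [1, 1, 1, 1, 1],
})
df2 = pd.DataFrame({
    'state': ['AL', 'FL', 'MD', 'NC', 'SC'],
    'keyid': ['001', '002', '003', '004', '005'],
    'start_date': pd.to_datetime(['2020-02-01', '2020-01-15', '2020-01-30',
                                  '2020-05-18', '2020-05-16']),
})

# Left merge keeps every df2 row. validate='many_to_one' raises an error
# if df1 ever contains more than one row for the same state (assumption check).
merged = df2.merge(df1, on='state', how='left', validate='many_to_one')

# True where the df1 date is on or after the df2 start_date.
# States missing from df1 produce NaT, which compares as False.
keep = merged['date'] >= merged['start_date']

# Keep the flag where the condition holds, otherwise 0.
df3 = merged[['state', 'keyid', 'start_date']].copy()
df3['flag'] = merged['flag'].where(keep, 0).fillna(0).astype(int)

print(df3)
```

Because the merge is many-to-one on state, the result has the same number of rows as df2, so this should scale to a df2 with over a million observations without duplicating rows.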
question from:
https://stackoverflow.com/questions/65843716/how-to-join-two-pandas-dataframes-based-on-a-date-in-df1-being-date-in-df2