I have a dataframe (df) with a single column of dates and a second dataframe (df_value) with three columns: a start date ('From'), an end date ('To') and an associated value. I want to create a second column in df with the correct value which has been looked up from df_value:
import pandas as pd
df = pd.DataFrame(['30/03/2018', '01/10/2019','03/07/2020', '05/08/2020', '06/08/2020', '10/10/2020'], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y', dayfirst=True).dt.date
df_value = pd.DataFrame([['01/01/2018','31/12/2018',1.286], ['01/01/2019','30/06/2019',1.555], ['01/07/2019','31/12/2019',1.632], ['01/01/2020','31/12/2020',1.864]], columns =['From', 'To', 'Value'])
df_value['From'] = pd.to_datetime(df_value['From'], format='%d/%m/%Y', dayfirst=True).dt.date
df_value['To'] = pd.to_datetime(df_value['To'], format='%d/%m/%Y', dayfirst=True).dt.date
At the moment I do this by applying the function below to df row by row. Although this works, I feel that there must be a far more efficient way of doing it:
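def fixed_func(row):
    # Scan df_value from the top until an interval containing row['Date'] is found.
    # Note: raises IndexError if no interval contains the date.
    value = 0
    row_counter = 0
    while value == 0:
        if df_value.iloc[row_counter, 0] <= row['Date'] <= df_value.iloc[row_counter, 1]:
            value = df_value.iloc[row_counter, 2]
        else:
            row_counter += 1
    return value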
df['Value'] = df.apply(fixed_func, axis=1)
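For comparison, one possible vectorized version of the same lookup is sketched below. It is not taken from the question: it assumes the dates are kept as datetime64 values (no `.dt.date` conversion) and that the From/To ranges in df_value do not overlap, and the names `intervals`, `positions` and `values` are made up for illustration. Dates that fall in no interval get NaN instead of raising an error.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    ['30/03/2018', '01/10/2019', '03/07/2020', '05/08/2020', '06/08/2020', '10/10/2020'],
    columns=['Date'],
)
# Assumption: keep datetime64 values rather than Python date objects.
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')

df_value = pd.DataFrame(
    [['01/01/2018', '31/12/2018', 1.286],
     ['01/01/2019', '30/06/2019', 1.555],
     ['01/07/2019', '31/12/2019', 1.632],
     ['01/01/2020', '31/12/2020', 1.864]],
    columns=['From', 'To', 'Value'],
)
df_value['From'] = pd.to_datetime(df_value['From'], format='%d/%m/%Y')
df_value['To'] = pd.to_datetime(df_value['To'], format='%d/%m/%Y')

# Build one closed interval [From, To] per row of df_value.
# Assumption: the intervals do not overlap.
intervals = pd.IntervalIndex.from_arrays(df_value['From'], df_value['To'], closed='both')

# For each date, find the position of the interval containing it (-1 if none).
positions = intervals.get_indexer(df['Date'])

# Map positions to values; unmatched dates become NaN.
values = df_value['Value'].to_numpy()
df['Value'] = np.where(positions >= 0, values[positions], np.nan)

print(df)
```

The lookup is done for the whole column at once, so no Python-level loop runs per row.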
question from:
https://stackoverflow.com/questions/65843211/assign-values-to-a-pandas-dataframe-column-based-on-intervals