python - Assign values to a pandas dataframe column based on intervals

Question

Welcome To Ask or Share your Answers For Others

python - Assign values to a pandas dataframe column based on intervals

posted Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

python - Assign values to a pandas dataframe column based on intervals

I have a dataframe (df) with a single column of dates and a second dataframe (df_value) with three columns: a start date ('From'), an end date ('To') and an associated value. I want to create a second column in df with the correct value which has been looked up from df_value:

import pandas as pd
df = pd.DataFrame(['30/03/2018', '01/10/2019','03/07/2020', '05/08/2020', '06/08/2020', '10/10/2020'], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y', dayfirst=True).dt.date

df_value = pd.DataFrame([['01/01/2018','31/12/2018',1.286], ['01/01/2019','30/06/2019',1.555], ['01/07/2019','31/12/2019',1.632], ['01/01/2020','31/12/2020',1.864]], columns =['From', 'To', 'Value'])
df_value['From'] = pd.to_datetime(df_value['From'], format='%d/%m/%Y', dayfirst=True).dt.date
df_value['To'] = pd.to_datetime(df_value['To'], format='%d/%m/%Y', dayfirst=True).dt.date

At the moment I have done this through applying the function below to df row-by-row. Although this works I feel that there must be a far more efficient way of doing this:

def fixed_func(df):    
        value = 0
        row_counter = 0
                    
        while value == 0:
            if (df['Date']>= df_value.iloc[row_counter, 0]) & (df['Date']<= df_value.iloc[row_counter, 1]):
                value = df_value.iloc[row_counter, 2]
            else:
                row_counter += 1
            
        return value

df['Value'] = df.apply(fixed_func, axis=1)

question from:https://stackoverflow.com/questions/65843211/assign-values-to-a-pandas-dataframe-column-based-on-intervals

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-06T19:31:28+0000

You can use numpy.where:

import numpy as np

date_col = df.Date.values[:, np.newaxis]
x,y = np.where((date_col >= df_value.From.values) & 
               (date_col <= df_value.To.values))

df["Value"] = df_value.loc[y,"Value"].reset_index(drop = True)
#        Date  Value
#0 2018-03-30  1.286
#1 2019-10-01  1.632
#2 2020-07-03  1.864
#3 2020-08-05  1.864
#4 2020-08-06  1.864
#5 2020-10-10  1.864

Categories

python - Assign values to a pandas dataframe column based on intervals

python - Assign values to a pandas dataframe column based on intervals

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags