I've defined a function to calculate a CAGR (compound annual growth rate). I have several columns of annual data, some of which are incomplete for the full period (0s where data is not available). For the columns with incomplete data, I want to find where the data becomes 0 and calculate the CAGR over the last n years before that point, with n given per column in df_periods. I'm quite new to Python, so I'm not sure how to do this.
df=
col1 col2 col3 col4 col5
8 9 6 7 1
8 9 6 7 1
8 9 6 7 1
8 9 6 7 1
8 9 6 7 1
8 0 6 7 0
8 0 6 7 0
8 0 6 7 0
df_periods=
col1 col2 col3 col4 col5
4 3 5 4 4
def cagr(startval, endval, periods):
    return (endval / startval) ** (1 / (periods - 1)) - 1
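A quick sanity check of this formula, using made-up numbers that are not from the data above. Note that with the `periods - 1` exponent, `periods` counts data points rather than compounding intervals, so three annual observations mean two years of growth.

```python
def cagr(startval, endval, periods):
    # Same formula as in the question: `periods` is the number of data points,
    # so the number of compounding intervals is periods - 1.
    return (endval / startval) ** (1 / (periods - 1)) - 1

# Hypothetical values: 100 growing to 121 across 3 annual observations
# (two compounding intervals), i.e. 10% growth per year.
growth = cagr(100, 121, 3)
print(round(growth, 4))  # prints 0.1
```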
I've built a boolean mask that is True wherever the dataframe holds a 0 (where the data is incomplete), but I can't figure out how to get the position of the first 0 in each column and apply the cagr function only to the n years before it.
find_zero = df.apply(lambda x: x == 0)
# CAGR values: how can I get these to reference the first True value in each column? I don't think these work.
endval = df.iloc[lambda x: x.index == 0] -1
startval = df.iloc[]
periods = df_periods.iloc[]
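One possible way to approach this is sketched below; it is not a definitive answer. It assumes that df_periods has a single row holding n for each column, and that a column with no zeros should simply use its last n values. The helper name `cagr_before_first_zero` is hypothetical. It returns NaN when there are fewer than n values before the first zero.

```python
import numpy as np
import pandas as pd

def cagr(startval, endval, periods):
    return (endval / startval) ** (1 / (periods - 1)) - 1

# Sample data from the question.
df = pd.DataFrame({
    "col1": [8, 8, 8, 8, 8, 8, 8, 8],
    "col2": [9, 9, 9, 9, 9, 0, 0, 0],
    "col3": [6, 6, 6, 6, 6, 6, 6, 6],
    "col4": [7, 7, 7, 7, 7, 7, 7, 7],
    "col5": [1, 1, 1, 1, 1, 0, 0, 0],
})
df_periods = pd.DataFrame(
    {"col1": [4], "col2": [3], "col3": [5], "col4": [4], "col5": [4]}
)

def cagr_before_first_zero(column, n):
    """CAGR over the n values just before the first 0 in `column` (hypothetical helper)."""
    is_zero = (column == 0).to_numpy()
    # Position of the first zero, or the column length if there is none (assumption).
    end = int(is_zero.argmax()) if is_zero.any() else len(column)
    if n < 2 or end < n:
        return np.nan  # too few data points to compute a growth rate
    window = column.iloc[end - n:end]
    return cagr(window.iloc[0], window.iloc[-1], n)

result = pd.Series({
    col: cagr_before_first_zero(df[col], int(df_periods[col].iloc[0]))
    for col in df.columns
})
print(result)
```

Slicing by position with `iloc` avoids relying on the index labels, so this also works if the index holds years rather than 0, 1, 2, and so on. In the sample data every column is constant before its zeros, so each CAGR comes out as 0.0.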
question from:
https://stackoverflow.com/questions/65877138/iterate-through-a-dataframe-and-match-criteria