I have a data frame of donation data. I sort it in time order from the oldest to the most recent gift, then add a column containing a cumulative sum of the gifts over time. The data spans multiple years, and I am looking for a good way to reset the cumulative sum to 0 at the start of each year (for fiscal purposes, the year runs from July 1st through June 30th).
This is how it currently looks:
id date giftamt cumsum()
005 01-05-2001 20.00 20.00
007 06-05-2001 25.00 45.00
009 12-05-2001 20.00 65.00
012 02-05-2002 30.00 95.00
015 08-05-2002 50.00 145.00
025 12-05-2002 25.00 170.00
... ... ... ...
This is how I would like it to look:
id date giftamt cumsum()
005 01-05-2001 20.00 20.00
007 06-05-2001 25.00 45.00
009 12-05-2001 20.00 20.00
012 02-05-2002 30.00 50.00
015 08-05-2002 50.00 50.00
025 12-05-2002 25.00 75.00
... ... ... ...
Any suggestions?
UPDATE:
Here's the code that finally worked, courtesy of Seb:
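# Convert the calendar year to a fiscal year (July through June).
# Note: as.Date() with no format argument expects ISO-style dates such as "2001-01-05".
df$year <- as.numeric(format(as.Date(df$giftdate), format = "%Y"))
df$month <- as.numeric(format(as.Date(df$giftdate), format = "%m"))
# Gifts from July onward count toward the fiscal year ending the following June.
df$year <- ifelse(df$month <= 6, df$year, df$year + 1)
# Cumulative sum of gifts within each fiscal year.
library(plyr)
finalDf <- ddply(df, .(year), summarize, cumsum(as.numeric(as.character(giftamt))))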
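For comparison, here is a minimal base R sketch of the same idea that keeps the other columns alongside the running total. It uses the six sample rows from the question. The column names `fiscal_year` and `cumsum_fy` are made up for illustration, and the `%m-%d-%Y` date format is an assumption based on how the dates are displayed above; adjust it if the real data is stored differently.

```r
# Sample rows copied from the question; dates are assumed to be month-day-year.
df <- data.frame(
  id       = c("005", "007", "009", "012", "015", "025"),
  giftdate = c("01-05-2001", "06-05-2001", "12-05-2001",
               "02-05-2002", "08-05-2002", "12-05-2002"),
  giftamt  = c(20, 25, 20, 30, 50, 25),
  stringsAsFactors = FALSE
)

# Parse with an explicit format so month and day are not misread.
d  <- as.Date(df$giftdate, format = "%m-%d-%Y")
yr <- as.numeric(format(d, "%Y"))
mo <- as.numeric(format(d, "%m"))

# Same labelling as the code above: January to June keep the calendar year,
# July to December roll over to the next one.
df$fiscal_year <- ifelse(mo <= 6, yr, yr + 1)

# Sort oldest first, then restart the running total within each fiscal year.
df <- df[order(d), ]
df$cumsum_fy <- ave(df$giftamt, df$fiscal_year, FUN = cumsum)

print(df)
```

Here `ave()` applies `cumsum` separately to each fiscal-year group and returns the results in the original row order, so no rows or columns are dropped.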