I have noticed a couple of times already that working with dates doesn't allow the usual tricks in R. Say I have a data frame Data containing date strings (see below), and I want to convert every column to the Date class. The only solution I have come up with so far is:
for (i in seq_len(ncol(Data))) {
  Data[, i] <- as.Date(Data[, i], format = "%d %B %Y")
}
This gives a data frame with the correct structure:
> str(Data)
'data.frame': 6 obs. of 4 variables:
$ Rep1:Class 'Date' num [1:6] 12898 12898 13907 13907 13907 ...
$ Rep2:Class 'Date' num [1:6] 13278 13278 14217 14217 14217 ...
$ Rep3:Class 'Date' num [1:6] 13600 13600 14340 14340 14340 ...
$ Rep4:Class 'Date' num [1:6] 13831 13831 14669 14669 14669 ...
Using a classic apply approach gives something completely different. Although all columns start in the same class and are converted to the same class, I can't get a data frame or matrix of the correct class as output:
> str(sapply(Data,as.Date,format="%d %B %Y"))
num [1:6, 1:4] 12898 12898 13907 13907 13907 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:4] "Rep1" "Rep2" "Rep3" "Rep4"
> str(apply(Data,2,as.Date,format="%d %B %Y"))
num [1:6, 1:4] 12898 12898 13907 13907 13907 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:4] "Rep1" "Rep2" "Rep3" "Rep4"
If you want to turn these matrices back into Date objects, you need an origin. That origin can differ from system to system, so calling as.Date or another function after the apply() doesn't help much either. And if you do supply the origin, you get a plain vector back rather than a matrix.
Does anybody have a clean solution for this kind of data? Below is the data frame I used in the examples.
Data <- structure(list(Rep1 = c(" 25 April 2005 ", " 25 April 2005 ",
" 29 January 2008 ", " 29 January 2008 ", " 29 January 2008 ",
" 29 January 2008 "), Rep2 = c(" 10 May 2006 ", " 10 May 2006 ",
" 4 December 2008 ", " 4 December 2008 ", " 4 December 2008 ",
" 4 December 2008 "), Rep3 = c(" 28 March 2007 ", " 28 March 2007 ",
" 6 April 2009 ", " 6 April 2009 ", " 6 April 2009 ", " 6 April 2009 "
), Rep4 = c(" 14 November 2007 ", " 14 November 2007 ", " 1 March 2010 ",
" 1 March 2010 ", " 1 March 2010 ", " 1 March 2010 ")), .Names = c("Rep1",
"Rep2", "Rep3", "Rep4"), row.names = c("1", "2", "3", "4", "5",
"6"), class = "data.frame")
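For reference, one approach that might sidestep the matrix conversion is to use lapply() and assign the result into Data[]. The following is only a minimal sketch, not a definitive answer. It assumes every column uses the same "%d %B %Y" format and that the session locale has English month names (otherwise %B may fail to parse and give NA). The two-column Data here is a shortened stand-in for the full data frame above.

```r
# Shortened stand-in for the Data frame above (two columns, two rows).
Data <- data.frame(
  Rep1 = c(" 25 April 2005 ", " 29 January 2008 "),
  Rep2 = c(" 10 May 2006 ", " 4 December 2008 "),
  stringsAsFactors = FALSE
)

# lapply() returns a list, so each element keeps its Date class.
# Assigning into Data[] replaces the columns in place while keeping
# the data.frame class, the row names and the column names.
Data[] <- lapply(Data, as.Date, format = "%d %B %Y")

# Inspect the result; each column should now be of class Date.
str(Data)
```

The difference from sapply() and apply() is that both of those simplify their result to a matrix, and a matrix cannot carry the Date class per column, so only the underlying numbers survive.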