Enjoy the power of lubridate:
library(lubridate)
mdy(ord_dates)
[1] "2016-09-01" "2016-09-02" "2016-09-03" "2016-09-04"
Internally, lubridate doesn't have any special conversion specifications which enable this. Rather, lubridate first uses (by smart guessing) the format "%B %dst, %Y". This gets the first element of ord_dates.
It then checks for NAs and repeats its smart guessing on the remaining elements, settling on "%B %dnd, %Y" to get the second element. It continues in this way until there are no NAs left (which happens in this case after 4 iterations), or until its smart guessing fails to turn up a likely format candidate.
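The loop described above can be sketched in base R. This is only an illustration of the idea, not lubridate's actual code: parse_iteratively is a hypothetical helper, and the candidate formats are supplied by hand instead of being guessed. It also assumes an English locale so that %B matches "September".

```r
# Hypothetical helper (not part of lubridate): try each candidate format in
# turn, filling in only the elements that are still NA from earlier attempts.
parse_iteratively <- function(x, formats) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)
    if (!any(todo)) break  # stop as soon as every element has been parsed
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

ord_dates <- c("September 1st, 2016", "September 2nd, 2016",
               "September 3rd, 2016", "September 4th, 2016")

# Hand-written stand-ins for the formats lubridate would guess
formats <- c("%B %dst, %Y", "%B %dnd, %Y", "%B %drd, %Y", "%B %dth, %Y")

parse_iteratively(ord_dates, formats)
```

Each pass parses one more element here, so the loop needs all four formats, which mirrors the 4 iterations mentioned above.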
You can imagine this makes lubridate slower, and it does -- about half the speed of just using the smart regex suggested by @alistaire above:
set.seed(109123)
# one million dates sampled with replacement from the four examples
ord_dates <- sample(
  c("September 1st, 2016", "September 2nd, 2016",
    "September 3rd, 2016", "September 4th, 2016"),
  1e6, TRUE
)
library(microbenchmark)
microbenchmark(times = 10L,
               lubridate = mdy(ord_dates),
               # the backslash must be doubled inside an R string literal
               base = as.Date(sub("\\D+,", "", ord_dates),
                              format = "%B %e %Y"))
# Unit: seconds
#       expr      min       lq     mean   median       uq      max neval cld
#  lubridate 2.167957 2.219463 2.290950 2.252565 2.301725 2.587724    10   b
#       base 1.183970 1.224824 1.218642 1.227034 1.228324 1.229095    10   a
The obvious advantage in lubridate's favor is its conciseness and flexibility.
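For comparison, the base approach used in the benchmark can be broken into its two steps. This is a small sketch on the four example strings; the variable names are illustrative, and it again assumes an English locale for %B.

```r
x <- c("September 1st, 2016", "September 2nd, 2016",
       "September 3rd, 2016", "September 4th, 2016")

# Step 1: drop the ordinal suffix together with the comma. "\\D+," matches a
# run of non-digits followed by a comma, and the first such run is the suffix.
stripped <- sub("\\D+,", "", x)
stripped

# Step 2: parse the cleaned strings with a single fixed format
as.Date(stripped, format = "%B %e %Y")
```

The regex step leaves strings such as "September 1 2016", so one format covers every element and no retrying is needed.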