Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
668 views
in Technique[技术] by (71.8m points)

r - Difference between as.POSIXct/as.POSIXlt and strptime for converting character vectors to POSIXct/POSIXlt

I have followed a number of questions here that asks about how to convert character vectors to datetime classes. I often see 2 methods, the strptime and the as.POSIXct/as.POSIXlt methods. I looked at the 2 functions but am unclear what the difference is.

strptime

function (x, format, tz = "") 
{
    y <- .Internal(strptime(as.character(x), format, tz))
    names(y$year) <- names(x)
    y
}
<bytecode: 0x045fcea8>
<environment: namespace:base>

as.POSIXct

function (x, tz = "", ...) 
UseMethod("as.POSIXct")
<bytecode: 0x069efeb8>
<environment: namespace:base>

as.POSIXlt

function (x, tz = "", ...) 
UseMethod("as.POSIXlt")
<bytecode: 0x03ac029c>
<environment: namespace:base>

Doing a microbenchmark to see if there are performance differences:

library(microbenchmark)
Dates <- sample(c(dates = format(seq(ISOdate(2010,1,1), by='day', length=365), format='%d-%m-%Y')), 5000, replace = TRUE)
df <- microbenchmark(strptime(Dates, "%d-%m-%Y"), as.POSIXlt(Dates, format = "%d-%m-%Y"), times = 1000)

Unit: milliseconds
                                    expr      min       lq   median       uq      max
1 as.POSIXlt(Dates, format = "%d-%m-%Y") 32.38596 33.81324 34.78487 35.52183 61.80171
2            strptime(Dates, "%d-%m-%Y") 31.73224 33.22964 34.20407 34.88167 52.12422

strptime seems slightly faster. so what gives? why would there be 2 similar functions or are there differences between them that I missed?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Well, the functions do different things.

First, there are two internal implementations of date/time: POSIXct, which stores seconds since UNIX epoch (+some other data), and POSIXlt, which stores a list of day, month, year, hour, minute, second, etc.

strptime is a function to directly convert character vectors (of a variety of formats) to POSIXlt format.

as.POSIXlt converts a variety of data types to POSIXlt. It tries to be intelligent and do the sensible thing - in the case of character, it acts as a wrapper to strptime.

as.POSIXct converts a variety of data types to POSIXct. It also tries to be intelligent and do the sensible thing - in the case of character, it runs strptime first, then does the conversion from POSIXlt to POSIXct.

It makes sense that strptime is faster, because strptime only handles character input whilst the others try to determine which method to use from input type. It should also be a bit safer in that being handed unexpected data would just give an error, instead of trying to do the intelligent thing that might not be what you want.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...