Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

sql - RODBC loses time values of datetime when result set is large

So this is VERY strange. RODBC seems to drop the time portion of DateTime SQL columns if the result set is large enough. (The queries are running against an SQL Server 2012 machine, and, yes, when I run them on the SQL Server side they produce identical and proper results, regardless of how many rows are returned.)

For example, the following works perfectly:

myconn <- odbcConnect(dsnName, uid, pwd)
results <- sqlQuery(myconn, "SELECT TOP 100 MyID, MyDateTimeColumn from MyTable ORDER BY MyDateTimeColumn DESC")
close(myconn)

In R, the following works as expected:

> results$MyDateTimeColumn[3]
[1] "2013-07-01 00:01:22 PDT"

which is a valid POSIXct date time. However, when somewhere between 10,000 and 100,000 rows are returned, suddenly the time portion disappears:

myconn <- odbcConnect(dsnName, uid, pwd)
bigResults <- sqlQuery(myconn, "SELECT TOP 100000 MyID, MyDateTimeColumn from MyTable ORDER BY MyDateTimeColumn DESC")
close(myconn)

This is the same code, only with a larger number of rows returned. For the exact same row, R now responds:

> bigResults$MyDateTimeColumn[3]
[1] "2013-07-01 PDT"

The time is now missing, even though this is the exact same row as before, as the following shows:

> strptime(results$MyDateTimeColumn[3], "%Y-%m-%d %H:%M:%S")
[1] "2013-07-01 00:01:22"

> strptime(bigResults$MyDateTimeColumn[3], "%Y-%m-%d %H:%M:%S")
[1] NA

Obviously the work-around is either incremental query-with-append or export-to-CSV-and-import-to-R, but this seems very odd. Anyone ever seen anything like this?

Config: I'm using the latest version of RODBC (1.3-10) and can duplicate the behavior on both an R installation running on Windows x64 and an R installation running on Mac OS X 10.9 (Mavericks).

EDIT #2 Adding output of dput() to compare the objects, per request:

> dput(results[1:10,]$MyDateTimeColumn)
structure(c(1396909903.347, 1396909894.587, 1396909430.903, 1396907996.9, 1396907590.02, 1396906077.887, 1396906071.99, 1396905537.36, 1396905531.413, 1396905231.787), class = c("POSIXct", "POSIXt"), tzone = "")

> dput(bigResults[1:10,]$MyDateTimeColumn)
structure(c(1396854000, 1396854000, 1396854000, 1396854000, 1396854000, 1396854000, 1396854000, 1396854000, 1396854000, 1396854000), class = c("POSIXct", "POSIXt"), tzone = "")

It would appear that the underlying data are actually changing as a result of the number of rows returned by the query, which is downright strange.
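One way to see what changed is to decode the first epoch value from each dput() output. This is only a minimal sketch: the time zone America/Los_Angeles is an assumption based on the "PDT" in the printed values, and the variable names are made up for illustration.

```r
# Assumed time zone, inferred from the "PDT" in the printed output.
tz <- "America/Los_Angeles"

# First value from each dput() output, rebuilt the same way dput() shows them.
small_ts <- structure(1396909903.347, class = c("POSIXct", "POSIXt"), tzone = tz)
big_ts   <- structure(1396854000,     class = c("POSIXct", "POSIXt"), tzone = tz)

format(small_ts, "%Y-%m-%d %H:%M:%S")  # "2014-04-07 15:31:43"
format(big_ts,   "%Y-%m-%d %H:%M:%S")  # "2014-04-07 00:00:00"

# The large-result value equals the small-result value truncated to local midnight.
as.numeric(as.POSIXct(trunc(small_ts, "days"))) == as.numeric(big_ts)  # TRUE
```

Under that assumption, the large result set holds the same calendar day with the time of day cut off at local midnight, which matches the date-only printout.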


1 Reply


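sqlQuery() has an argument called as.is. Setting it to TRUE skips RODBC's type conversion, so the datetime values come back as character strings, as they would appear in, for example, Microsoft SQL Server Management Studio. The time portion is then preserved, and you can convert the column to POSIXct yourself.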
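A minimal sketch of that approach, not a definitive fix. The helper names fetch_as_is and parse_sql_datetime are hypothetical, and the sample strings only assume what a SQL Server datetime might look like when returned as text; check what your driver actually returns. The explicit format means a value that does not match becomes NA instead of silently losing its time.

```r
# Hypothetical helper: run a query without RODBC's type conversion.
# as.is is passed through sqlQuery() to sqlGetResults().
fetch_as_is <- function(channel, query) {
  RODBC::sqlQuery(channel, query, as.is = TRUE)
}

# Hypothetical helper: convert character timestamps with an explicit format.
# %OS accepts fractional seconds on input.
parse_sql_datetime <- function(x, tz = "") {
  as.POSIXct(x, format = "%Y-%m-%d %H:%M:%OS", tz = tz)
}

# Stand-in for a character column returned with as.is = TRUE (assumed layout).
raw <- c("2013-07-01 00:01:22.000", "2013-07-01 00:00:00.000", "2013-07-01")
parsed <- parse_sql_datetime(raw, tz = "UTC")

format(parsed, "%Y-%m-%d %H:%M:%S")
# "2013-07-01 00:01:22" "2013-07-01 00:00:00" NA
```

With a live connection, the usage would be something like bigResults <- fetch_as_is(myconn, "SELECT TOP 100000 ...") followed by bigResults$MyDateTimeColumn <- parse_sql_datetime(bigResults$MyDateTimeColumn).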

