In R, why do I get a one-millisecond difference between POSIXct and POSIXlt?

This snippet

options(digits.secs=3)
s<-"12:00:00.188"
fmt<-"%I:%M:%OS"
print(strptime(s,fmt))
print(as.POSIXct(strptime(s,fmt)))

gives this textual output:

[1] "2017-09-12 00:00:00.188 CEST"
[1] "2017-09-12 00:00:00.187 CEST"

while I expected the two results to be the same. What am I missing?

My session info:

print(sessionInfo())

gives:

R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252    LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C                  
[5] LC_TIME=Italian_Italy.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.4.0 tools_3.4.0   

Same result in Linux:

R version 3.3.3 (2017-03-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.3.3

Edit (after Roland's comment)

Maybe I am wrong, but it seems to me that 0.002 is not exactly representable in floating point either, yet with 0.002 there is no difference between POSIXct and POSIXlt:

options(digits.secs=3)
s<-"12:00:00.002"
fmt<-"%I:%M:%OS"
print(strptime(s,fmt))
print(as.POSIXct(strptime(s,fmt)))

gives:

[1] "2017-09-12 00:00:00.002 CEST"
[1] "2017-09-12 00:00:00.002 CEST"

Answer by Kelli-Jean

You can read about this in the documentation for the date-time classes: https://stat.ethz.ch/R-manual/R-devel/library/base/html/DateTimeClasses.html

In particular:

Class "POSIXct" represents the (signed) number of seconds since the beginning of 1970 (in the UTC time zone) as a numeric vector.

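strptime returns the other class, POSIXlt, which stores the date-time differently: as a list of broken-down fields (sec, min, hour, mday, ...), with the fractional part kept in the sec field.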
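To make the difference in storage concrete, here is a minimal sketch that looks inside both objects. It assumes the same timestamp as in the question, but parsed with a full date and tz = "UTC" so the result does not depend on the local time zone; the variable names lt and ct are only illustrative.

```r
# Parse in UTC so the example does not depend on the local time zone.
lt <- strptime("2017-09-12 00:00:00.188", "%Y-%m-%d %H:%M:%OS", tz = "UTC")
ct <- as.POSIXct(lt)

# POSIXlt is a list of broken-down fields; the sec field holds only the
# seconds within the minute, fraction included.
print(lt$sec, digits = 17)

# POSIXct is a single double: seconds since 1970-01-01 00:00:00 UTC.
# The same fraction now has to share the mantissa with ~1.5e9 whole seconds.
print(as.numeric(ct), digits = 17)
```

Printing with 17 significant digits shows the value that is actually stored, rather than the rounded value that the default print would display.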

So there are issues with sub-second accuracy:

Sub-second Accuracy

Classes "POSIXct" and "POSIXlt" are able to express fractions of a second. (Conversion of fractions between the two forms may not be exact, but will have better than microsecond accuracy.)

So, you'll see POSIXlt and strptime print these accurately:

strptime(s,fmt)
as.POSIXlt(strptime(s,fmt), format = "%Y-%m-%d %H:%M:%OS")

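But POSIXct stores the whole date-time as a single double (seconds since 1970). For a value around 1.5e9 the spacing between adjacent doubles is roughly 2e-7 seconds, so a fraction such as .188 generally cannot be stored exactly. If the nearest double falls slightly below .188, the printed value becomes .187, because fractional seconds are truncated, not rounded, on output (see %OSn in ?strptime). With .002 the nearest double evidently falls on the other side, so nothing visible changes.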
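If the goal is simply to print a POSIXct to the millisecond without this artifact, one possible workaround is to add half a millisecond before formatting, which turns the truncation into rounding. The sketch below assumes a UTC timestamp for reproducibility; format_ms is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper (not part of base R): format a POSIXct to the
# millisecond by rounding instead of truncating.
format_ms <- function(x, fmt = "%Y-%m-%d %H:%M:%OS3") {
  # %OS3 is documented to truncate; adding half a millisecond first
  # makes the truncated result equal to the rounded one.
  format(x + 0.0005, fmt)
}

ct <- as.POSIXct(strptime("2017-09-12 00:00:00.188",
                          "%Y-%m-%d %H:%M:%OS", tz = "UTC"))

# Fractional part of the stored double; it may sit slightly above or
# slightly below 0.188 depending on the nearest representable value.
print(as.numeric(ct) %% 1, digits = 17)

# Truncating format, as used by print() when digits.secs = 3.
print(format(ct, "%Y-%m-%d %H:%M:%OS3"))

# Rounding format.
print(format_ms(ct))
```

This only changes how the value is displayed; the stored double is untouched, so comparisons and arithmetic on ct behave as before.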