lme() different results each run under Revolution R (MKL to blame?)

575 views Asked by At

Update (Aug 2014): I never got to the bottom of this, and never got any feedback on Revolution's forum. This issue, however, seems to have been fixed in Revolution R 7.2 (with R 3.0.3, again the academic version). I ran the lme() test below a few hundred times, all produced equal results, as expected.[end of update]

I just installed the academic version of Revolution R 7.0 (R 3.0.2) on a new PC and am getting strange results for the code below. Every time the code is run, it gives different results. Under CRAN-R the result is always the same (as I think it should be). The code snippet is from test 527 from test.data.table() version 1.8.10, which pointed me to the error.

library(nlme)
all.equal(lme(distance ~ age, data=Orthodont), lme(distance ~ age, data=Orthodont))

I get something like below, but different every time.

> all.equal(lme(distance ~ age, data=Orthodont), lme(distance ~ age, data=Orthodont))
[1] "Component 4: Component 2: Component 1: Mean relative difference: 1.774149e-08"
[2] "Component 7: Mean relative difference: 0.0003335902"

The 'fun' thing is that the nlme package (of which lme() is part) itself is identical, I uninstalled and reinstalled to be sure (the nlme_3.1-113.zip file of the package is bit-for-bit identical).

I do not know enough yet to go under the hood. Any pointers or ideas would be appreciated. I have also posted on Revolutions's forum but it seems much less populated than here...

This is under 64-bit Windows 8.1, 64-bit R as well, Intel i7-4770 CPU if it matters. Both the current version of Revolution R (R 3.0.2) and the previous (2.15.3) produce the unexpected (for me) behavior. CRAN-R 3.0.1 and 3.0.2 produce identical results.

sessionInfo() output for Revolution R:

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] nlme_3.1-113     Revobase_7.0.0   RevoMods_7.0.0   RevoScaleR_7.0.0
[5] lattice_0.20-24  rpart_4.1-3     

loaded via a namespace (and not attached):
[1] codetools_0.2-8   foreach_1.4.1     grid_3.0.2        iterators_1.0.6  
[5] pkgXMLBuilder_1.0 revoIpe_1.0       tools_3.0.2       XML_3.98-1.1   

UPDATE 1: I have traced the issue (following some of the pointers from the answer & comments below) to the fact that Revolution R uses the Intel MKL BLAS library. If I switch to the BLAS library supplied by CRAN, the issues go away. (Note: I do not know enough to compile R myself, so I have not tested OpenBLAS and the other alternatives. In Revolution R it is just a matter of renaming two dll-s.).

It would seem that other people are getting inconsistent results with MKL as well. The differences there are within machine tollerance, i.e. all.equal() is TRUE while identical() is FALSE. The different results in my case seem to be meaningfully large.

I have posted this issue on Revolution R's forum and will update here if I get a response. I suppose at this point my question should be modified as "when to use MKL BLAS and when CRAN-R BLAS". It is not an issue of speed(*) but of consistent and correct results. I will spend some more time looking for a standard test suite (not sure of the terminology here?) to check the output of R against known-to-be-correct output. This is one of the things I love about data.table, it has its own test visible to the end-user. I am aware than I should not expect a single test encompassing all (or even most) packages, but something covering the base functionality at least.

(*) Speed is dependent on the concrete workflow. In this particular case the CRAN BLAS is faster than MKL (both run single-threaded). In other work, Revolution R has been substantially faster, that's why I am looking into it.

1

There are 1 answers

4
Spacedman On

At a guess, Revo R is parallelising it over CPU cores and the arithmetic in recombining parallel things isn't always associative. In other words, it depends on the order of operations. If the threads finish in different orders, which can happen if the cores have to do something else, then results get added up in a different order and (a+b)+c is not always equal to a+(b+c) in floating point...

To check, is there some way you can tell Revo R to only use one CPU core?