R "Packages" vs Linux packages for R


I'm a little confused about what is going on when I install packages within R.

I start R (from the terminal, via "R"), and then at the R prompt, I do this:

> install.packages("devtools")

R goes off, and downloads quite a few things, launches a C/C++ compiler against dozens of source files, appears to build stuff successfully, but eventually fails with a message like this:

ERROR: dependencies ‘usethis’, ‘pkgdown’, ‘rcmdcheck’, ‘rversions’, ‘urlchecker’ are not available for package ‘devtools’

This is quite confusing, because I expect R to handle the dependencies.

Then I wonder, have I failed to install R correctly? Looking at the list of R packages available (for openSUSE Tumbleweed via zypper), I can see that there are about 50 packages with names like "R-base-devel", "R-graphics", "R-tools", etc. These are somehow distinct from R's internal conception of a package. It's totally unclear how these Linux packages are related to the R internal packages, and whether I am perhaps missing one of them.

I installed R via zypper by way of "sudo zypper install R-recommended-packages", which of course pulled in quite a few dependencies, so I guess I have a valid environment. But, clearly, I am missing something.

How does this all hang together? How do I know which Linux package for R contains which R package?

I wasn't expecting this to be quite so confusing and complicated.


There are 2 answers

Zé Loff (best answer)

All R packages are available in source format. That source code can be plain R, or it can include code in other languages such as C, C++, or Fortran. While R code does not need to be compiled before it runs, code in those other languages does. So when a package containing C or C++ source is installed, that code has to be compiled into binaries, and any environment installing R packages from source needs the appropriate toolchain (compiler, linker, etc.) and libraries, including dependencies such as Boost. To avoid this, and to speed up installation, R packages are also made available in binary form for some platforms (Windows and macOS being two notable cases).

Linux distributions run on multiple architectures and aren't consistent in how they make their libraries available, so it is not practical to distribute binary R packages (except perhaps for some well known and widely used distributions, and this happens with a lot of software, not just R packages). So in most cases, R packages must be installed from source, and compiled upon installation. To ease things, some Linux distributions pack sets of R packages in their own package system, so that's why you find things like R-graphics and R-recommended-packages on the repository of your OS. And you are absolutely right in your understanding that these OS packages are not the same thing as R packages.
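To see the distinction from inside R, one option is to list the R packages that R itself knows about, together with the library directory each one lives in. This is a minimal sketch using only base R. The names overview and base_pkgs are just illustrative, and the directories you get depend on how your distribution laid things out.

```r
# List installed R packages with the library directory they were found in.
pkgs <- installed.packages()
overview <- data.frame(
  package  = unname(pkgs[, "Package"]),
  library  = unname(pkgs[, "LibPath"]),
  priority = unname(pkgs[, "Priority"]),
  stringsAsFactors = FALSE
)

# Library directories R searches, in order.
print(.libPaths())

# Packages marked as part of base R (Priority "base").
base_pkgs <- overview$package[!is.na(overview$priority) &
                              overview$priority == "base"]
print(base_pkgs)
```

Packages you install yourself with install.packages() as a regular user typically end up in a personal library directory, so the library column is a quick hint about where a given package came from.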

R packages can be found either on centralized repositories, such as CRAN (the official repository for R packages) or Bioconductor, or as standalone files. Nowadays you can also pull a copy of a git repository and build the package from there. These R repositories are not to be confused with the repository for OS packages.

When you issue the command install.packages("foo") in R, it will search R repositories for package foo. If it's available in binary form for your platform and OS, it will install that. Otherwise, it will download the source code, compile it if necessary, and install it. By default, R will search the CRAN repository indicated by options('repos'), but you can add others.
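As a rough illustration of the repository setting, the following checks which repositories install.packages() would search and points the CRAN entry at a specific mirror for the current session. The mirror URL is only one possible choice. If the CRAN entry shows the placeholder "@CRAN@", no mirror has been chosen yet and R will ask for one in an interactive session.

```r
# Show the repositories install.packages() will currently search.
print(getOption("repos"))

# Set (or override) the CRAN mirror for this session only.
# The URL below is one possible mirror; any CRAN mirror would do.
repos <- getOption("repos")
repos["CRAN"] <- "https://cloud.r-project.org"
options(repos = repos)

print(getOption("repos")["CRAN"])
```

Other repositories can be added to the same named vector under their own names; this sketch changes only the CRAN entry.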

Now if the package you are compiling has some external dependencies (such as libxml or libcurl), you'll need to make them available on your system, as @MichaelChirico noted in his comments. So you'll need to find out the requirements of those packages and install them via your OS's package management system (e.g. zypper). You will need the development version of each library (on Linux these are often the -dev or -devel packages). This is another advantage of the OS distributing its own sets of pre-built R packages, such as R-recommended-packages: their dependencies are automatically installed by the OS's package management system.

The errors you are seeing are probably because some of the R packages being installed (perhaps dependencies of the packages you want) failed to compile, and so were never installed. You'll need to find out their external requirements and install those first.
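One way to narrow this down is to check which of the packages named in the error message are really absent, and then install those one at a time so that each package's compiler output can be read on its own. A minimal sketch, where missing_packages is a hypothetical helper and not part of R:

```r
# Return the subset of `pkgs` that is not installed in any library
# on .libPaths(). (missing_packages is a made-up helper name.)
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# The dependencies named in the error message from the question.
wanted <- c("usethis", "pkgdown", "rcmdcheck", "rversions", "urlchecker")
still_missing <- missing_packages(wanted)
print(still_missing)

# Installing one package at a time keeps the build output separate,
# which makes the first real failure easier to spot. Uncomment to run:
# for (p in still_missing) install.packages(p)
```

The failure that matters is usually further up in the output than the final "dependencies ... are not available" line, in the build log of whichever package failed first.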

user2554330

@MichaelChirico gave good advice about how to diagnose the issues here. I'll try to answer the question about what "packages" are.

Both R and Linux share the idea of packages as collections of files (code and other things) that depend on each other. For the R devtools package to work, other R packages need to be installed, and install.packages() tries to ensure they are. Similarly, each Linux package may need other Linux packages in order to work.

However, R packages can also depend on having system files present, for example system libraries to handle graphics, etc. When you try to install an R package with such a "system dependency", it will fail if the dependency is missing. R doesn't try to install the system dependency because the method to install system libraries varies so much from platform to platform. Some R packages will try to give an informative message about how to install the system library yourself. On Linux this would typically be instructions to install a Linux package: you saw an example of this. Other R packages just fail in the installation, with messages like "libcurl not found".
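R packages can declare such system dependencies in the free-text SystemRequirements field of their DESCRIPTION file. As a sketch, assuming the package in question is already installed, the field can be read like this (system_requirements is a hypothetical helper name):

```r
# Read the free-text SystemRequirements field of an installed package.
# Returns NA if the package is not installed or declares nothing.
system_requirements <- function(pkg) {
  desc_file <- system.file("DESCRIPTION", package = pkg)
  if (!nzchar(desc_file)) return(NA_character_)
  unname(read.dcf(desc_file, fields = "SystemRequirements")[1, 1])
}

print(system_requirements("stats"))  # a base package
print(system_requirements("curl"))   # only informative if curl is installed
```

For a package that has not installed yet, the same field can be found in the DESCRIPTION file inside its source tarball. Because the field is free text, mapping it to the name of a zypper package is still a manual step.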