Can someone please help me get a list of the built-in data sets and their dependency packages?
How do I get a list of built-in data sets in R?
45.6k views. Asked by mockash
There are 4 answers
I often also need to know the structure of the available datasets, so I created dataStr in my misc package.
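dataStr <- function(package="datasets", ...)
{
  # Names of all datasets listed for the package
  d <- data(package=package, envir=new.env(), ...)$results[,"Item"]
  # Entries like "beaver1 (beavers)" carry a source-file suffix; keep the object name
  d <- sapply(strsplit(d, split=" ", fixed=TRUE), "[", 1)
  d <- d[order(tolower(d))]
  # str() prints by itself; toString() handles objects with several classes
  for(x in d){ message(x, ": ", toString(class(get(x)))); str(get(x)) }
}
dataStr()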
Please mind that the output in the console is quite long.
This is the type of output:
[...]
warpbreaks: data.frame
'data.frame': 54 obs. of 3 variables:
$ breaks : num 26 30 54 25 70 52 51 26 67 18 ...
$ wool : Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
$ tension: Factor w/ 3 levels "L","M","H": 1 1 1 1 1 1 1 1 1 2 ...
WorldPhones: matrix
num [1:7, 1:7] 45939 60423 64721 68484 71799 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:7] "1951" "1956" "1957" "1958" ...
..$ : chr [1:7] "N.Amer" "Europe" "Asia" "S.Amer" ...
WWWusage: ts
Time-Series [1:100] from 1 to 100: 88 84 85 85 84 85 83 85 88 89 ...
Edit: To get more informative output and use it for unloaded packages or all the packages on the search path, please use the revised online version with
source("https://raw.githubusercontent.com/brry/berryFunctions/master/R/dataStr.R")
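If the full str() output is too long to scan, a compact overview can be easier to read. The following is only a minimal sketch, not part of the misc package: dataClasses is a hypothetical name, and it assumes the package is attached (as datasets is by default) so that get() can find its objects. It collects the class of every dataset into a data frame instead of printing to the console.

```r
# Hypothetical companion to dataStr(): returns a data frame instead of printing.
# Assumption: `package` is attached (e.g. via library()), so its datasets
# are visible in the "package:<name>" environment.
dataClasses <- function(package = "datasets") {
  items <- data(package = package)$results[, "Item"]
  # Entries such as "beaver1 (beavers)" carry a source-file suffix; drop it
  objs <- sort(unique(sub(" .*$", "", items)))
  pkg_env <- as.environment(paste0("package:", package))
  data.frame(
    dataset = objs,
    class = vapply(objs,
                   function(x) toString(class(get(x, envir = pkg_env))),
                   character(1)),
    stringsAsFactors = FALSE,
    row.names = NULL
  )
}

head(dataClasses())
```

The result can then be filtered like any other data frame, for example to keep only the rows whose class is data.frame.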
Here is a comprehensive list of datasets from R packages, maintained by Prof. Vincent Arel-Bundock: https://vincentarelbundock.github.io/Rdatasets/
Rdatasets is a collection of 1892 datasets that were originally distributed alongside the statistical software environment R and some of its add-on packages. The goal is to make these data more broadly accessible for teaching and statistical software development.
There are several ways to find the included datasets in R:
1: Using data() will give you a list of the datasets of all loaded packages (and not only the ones from the datasets package); the datasets are ordered by package
2: Using data(package = .packages(all.available = TRUE)) will give you a list of all datasets in the available packages on your computer (i.e. also the not-loaded ones)
3: Using data(package = "packagename") will give you the datasets of that specific package, so data(package = "plyr") will give the datasets in the plyr package
If you want to know in which package a dataset is located (e.g. the acme dataset), you can do:
which gives:
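The snippet for this lookup is missing from the text above. As a rough sketch of one way to do it, the results matrix returned by data() can be converted to a data frame and filtered by item name. Note that find_dataset is a hypothetical helper name, and mtcars is used in the example only because it ships with the datasets package in every R installation.

```r
# Hypothetical helper (not from the original answer): find the package(s)
# that ship a dataset with the given name.
find_dataset <- function(name, packages = .packages(all.available = TRUE)) {
  # $results is a character matrix with columns Package, LibPath, Item, Title
  res <- as.data.frame(data(package = packages)$results,
                       stringsAsFactors = FALSE)
  # Items can look like "beaver1 (beavers)"; compare on the object name only
  item <- sub(" .*$", "", res$Item)
  res[item == name, c("Package", "Item", "Title")]
}

# Restricting the search to one package keeps the example fast
find_dataset("mtcars", packages = "datasets")
```

Calling find_dataset("acme") with the default packages argument would search every installed package, which can take a moment on a large library.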