Is it possible to select a variable and then some other variables by condition in one call in R?


I'm playing around with the flights dataset from the nycflights13 package, and I thought it would be interesting to select one (or any number of) variables by hand and then some others by condition (for example, "carrier" and then all numeric variables). This can be done in three steps, for example:

library(nycflights13)
data(flights)
flights
a <- flights[,"carrier", drop=FALSE]
b <- flights[, lapply(flights,is.numeric) == TRUE, drop=FALSE]
ab <- cbind(a,b)
str(ab) # 'data.frame':   336776 obs. of  15 variables:

But this doesn't work:

flights[, "carrier" & c(lapply(flights,is.numeric)) == TRUE, drop=FALSE]
flights[, "carrier" & lapply(flights,is.numeric) == TRUE, drop=FALSE]

Error in "carrier" & lapply(flights, is.numeric) == TRUE :
  operations are possible only for numeric, logical or complex types

I should add that select_if from the tidyverse did not help either.

So my question is: is it possible to do this in one call, and if so, how? Thanks for any comments or suggestions.


1 answer

akrun, best answer

Instead of lapply, we can use sapply, which returns a logical vector rather than a list. Use that vector to extract the column names (or use which to get their positions), and then concatenate the result with "carrier":

flights[, c("carrier", names(flights)[sapply(flights, is.numeric)]), drop = FALSE]

The drop = FALSE argument is not needed for a tibble, because a tibble does not drop to a vector by default.
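
To make the difference between lapply and sapply concrete, here is a minimal sketch on a small made-up data frame. The object df and its values are illustrative stand-ins for flights, not data from nycflights13.

```r
# Toy data frame standing in for flights (hypothetical values)
df <- data.frame(
  carrier  = c("UA", "AA", "B6"),
  origin   = c("EWR", "LGA", "JFK"),
  distance = c(1400, 1416, 1089),
  air_time = c(227, 227, 160),
  stringsAsFactors = FALSE
)

# lapply returns a list with one element per column;
# sapply simplifies that to a named logical vector
is_num_list <- lapply(df, is.numeric)
is_num      <- sapply(df, is.numeric)

# Combine the hand-picked column with the names of the numeric columns
keep   <- c("carrier", names(df)[is_num])
result <- df[, keep, drop = FALSE]
names(result)
```

If a guaranteed logical result is preferred, vapply(df, is.numeric, logical(1)) is a stricter alternative to sapply and can be used in the same place.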


With dplyr, one option is

library(dplyr)
flights %>%
  select(carrier, select_if(., is.numeric) %>% names)
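
In dplyr 1.0.0 and later, select_if is superseded, and a condition can be passed to select() directly through where(). The following is a short sketch of that variant, assuming a recent dplyr is installed. It uses a made-up data frame (df is hypothetical) so that it runs without nycflights13.

```r
library(dplyr)

# Toy data frame standing in for flights (hypothetical values)
df <- data.frame(
  carrier  = c("UA", "AA"),
  origin   = c("EWR", "LGA"),
  distance = c(1400, 1416),
  stringsAsFactors = FALSE
)

# One call: a column named by hand plus every column that satisfies is.numeric
result <- df %>%
  select(carrier, where(is.numeric))
names(result)
```

On the real data, the same call would read flights %>% select(carrier, where(is.numeric)).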