Using read.csv.sql to select multiple values from a single column

2.4k views Asked by At

I am using read.csv.sql from the package sqldf to try and read in a subset of rows, where the subset selects from multiple values - these values are stored in another vector.

I have hacked a way to a form that works but I would like to see the correct way to pass the sql statement.

Code below gives minimum example.

library(sqldf)

# some data
write.csv(mtcars, "mtcars.csv", quote = FALSE, row.names = FALSE)

# values to select from variable 'carb'
cc <- c(1, 2)

# This only selects last value from 'cc' vector
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb = ", cc ))

# So try using the 'in' operator - this works
read.csv.sql("mtcars.csv", sql = "select * from file where carb in (1,2)" ) 

# but this doesn't
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb in ", cc ))

# Finally this works
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb in ", 
                                       paste("(", paste(cc, collapse=",") ,")")))

The final line above works, but is there are cleaner way to pass this statement, thanks.

2

There are 2 answers

2
G. Grothendieck On BEST ANSWER

1) fn$ Substitution can be done with fn$ of gsubfn (which is automatically pulled in by sqldf). See the fn$ examples on the sqldf home page. In this case we have:

fn$read.csv.sql("mtcars.csv", 
  sql = "select * from file where carb in ( `toString(cc)` )")

2) join Another approach would be to create a data.frame of the carb values desired and perform a join with it:

Carbs <- data.frame(carb = cc)
read.csv.sql("mtcars.csv", sql = "select * from Carbs join file using (carb)")
4
Thomas On

You could use deparse, but I'm not sure it's much cleaner than what you already have:

read.csv.sql("mtcars.csv",
             sql = paste("select * from file where carb in ", gsub("c","",deparse(cc)) ))

And note that this is not really a general solution, because deparse will not always give you the right character string. It just happens to work in this instance.