I am using read.csv.sql from the sqldf package to read in a subset of rows, where the subset matches any of several values that are stored in another vector.
I have hacked my way to a form that works, but I would like to see the correct way to pass the SQL statement.
The code below gives a minimal example.
library(sqldf)
# some data
write.csv(mtcars, "mtcars.csv", quote = FALSE, row.names = FALSE)
# values to select from variable 'carb'
cc <- c(1, 2)
# This only selects last value from 'cc' vector
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb = ", cc ))
# So try using the 'in' operator - this works
read.csv.sql("mtcars.csv", sql = "select * from file where carb in (1,2)" )
# but this doesn't
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb in ", cc ))
# Finally this works
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb in ",
paste("(", paste(cc, collapse=",") ,")")))
The final call above works, but is there a cleaner way to pass this statement? Thanks.
1) fn$ Substitution can be done with fn$ of gsubfn (which is automatically pulled in by sqldf). See the fn$ examples on the sqldf home page. In this case we have:

2) join Another approach would be to create a data.frame of the carb values desired and perform a join with it:
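
As a minimal sketch of both approaches, reusing mtcars.csv and cc from the question: the names ccStr, Carbs, resFn and resJoin are illustrative and not taken from the original post. fn$ interpolates a single string, so the vector is first collapsed with toString. The variable name is kept purely alphanumeric on the assumption that fn$ may not match names containing underscores.

```r
library(sqldf)  # also attaches gsubfn, which provides fn$

# Same setup as in the question
write.csv(mtcars, "mtcars.csv", quote = FALSE, row.names = FALSE)
cc <- c(1, 2)

# 1) fn$: $ccStr inside the SQL string is replaced by the value of ccStr
ccStr <- toString(cc)  # collapses the vector into the string "1, 2"
resFn <- fn$read.csv.sql("mtcars.csv",
                         sql = "select * from file where carb in ($ccStr)")

# 2) join: put the wanted values in a data frame and join the file against it
Carbs <- data.frame(carb = cc)
resJoin <- read.csv.sql("mtcars.csv",
                        sql = "select * from Carbs join file using (carb)")
```

Both calls should return only the rows whose carb is in cc. The join variant avoids building the SQL string by hand, which can be convenient when the list of values is long.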