Is there a way to use read.table() to read all or part of the file in, use the class function to get the column types, modify the column types, and then re-read the file?
Basically, I have columns of zero-padded integers that I'd like to treat as strings. If I let read.table() do its thing, it of course assumes these are numbers, strips off the leading zeros, and makes the column type integer. The thing is, I have a fair number of columns, so while I could create a character vector specifying each one, I only want to change a couple from R's best guess. What I'd like to do is read the first few lines:
myTable <- read.table("//myFile.txt", sep="\t", quote="\"", header=TRUE, stringsAsFactors=FALSE, nrows = 5)
Then get the column classes:
colTypes <- sapply(myTable, class)
Change a couple of column types, e.g.:
colTypes[1] <- "character"
And then re-read the file in using the modified column types:
myTable <- read.table("//myFile.txt", sep="\t", quote="\"", colClasses=colTypes, header=TRUE, stringsAsFactors=FALSE, nrows = 5)
While this seems like an infinitely reasonable thing to do, and colTypes = c("character") works fine, when I actually try it I get:
scan() expected 'an integer', got '"000001"'
class(colTypes) and class(c("character")) both return "character", so what's the problem?
You use read.table's colClasses = argument to specify the columns you want classified as characters. For example: [updated...] Or, you could set and re-set colClasses with a vector...
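Since the example code did not survive here, the following is a minimal sketch of both approaches rather than the original answer's code. The sample file and its column names (id, code, value) are made up for illustration. It assumes the zero-padded fields are quoted in the file, as the error message in the question suggests. ?read.table notes that quoting is only considered for columns read as character, so a likely cause of the scan() error is a quoted column that is still labelled "integer" in colTypes; the sketch therefore switches every quoted, zero-padded column to "character".

```r
# Hypothetical sample file: two quoted, zero-padded columns and one numeric column.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "id\tcode\tvalue",
  "\"000001\"\t\"0042\"\t1.5",
  "\"000002\"\t\"0007\"\t2.5"
), path)

# Option 1: a named colClasses vector. Columns that are not named
# are left for read.table() to guess as usual.
direct <- read.table(path, sep = "\t", quote = "\"", header = TRUE,
                     stringsAsFactors = FALSE,
                     colClasses = c(id = "character", code = "character"))

# Option 2: peek at the first rows, adjust the guessed classes, then re-read.
peek <- read.table(path, sep = "\t", quote = "\"", header = TRUE,
                   stringsAsFactors = FALSE, nrows = 5)
colTypes <- sapply(peek, class)  # named character vector of guessed classes

# Override every quoted, zero-padded column, not just the first one.
colTypes[c("id", "code")] <- "character"

reread <- read.table(path, sep = "\t", quote = "\"", header = TRUE,
                     stringsAsFactors = FALSE, colClasses = colTypes)

str(reread)
```

Option 1 is usually less work when only a few columns need overriding, because the remaining columns never have to be listed. Option 2 is closer to the workflow in the question and is handy when you want to inspect the guessed classes before deciding what to change.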