I'm new to R and am trying to read a zipped .tsv file for analysis, but I can't get the whole file to load. The script was written by the data providers, so I assume it is correct, but I get this warning:
In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :EOF within quoted string
and only 6,017,494 of the 7,430,874 lines of data are read.
The script is as follows:
knitr::opts_chunk$set(echo = TRUE)
library(data.table)
#TODO: change directory as needed
setwd("directory")
data <- read.delim(unz("patent.tsv.zip", "patent.tsv"), header = TRUE, sep = "\t", comment.char = "#", stringsAsFactors = FALSE, quote = "\"", fill = TRUE)
I looked up other questions about the EOF warning, and some suggested setting quote="". I tried that, but then reading the data just takes forever. I guess it doesn't work in my case because quote is already set to "\""? I'm not sure what that means, though. Can someone help me out?
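
In case it helps, here is a minimal sketch of what I understand the quote="" suggestion to mean. The toy data is made up for illustration (the column names id and title and the values are hypothetical, not taken from patent.tsv); the second row contains a single unmatched double quote, which is my guess at the kind of value that causes the warning.

```r
# Hypothetical toy data, NOT from patent.tsv: row 2 has an unmatched double quote.
tsv_text <- "id\ttitle\n1\tWidget\n2\t5\" bracket\n3\tGadget\n"

# quote = "" disables quote handling entirely, so the stray " is read
# as an ordinary character instead of opening a quoted string.
no_quotes <- read.delim(text = tsv_text, header = TRUE, sep = "\t",
                        quote = "", comment.char = "#",
                        stringsAsFactors = FALSE)

nrow(no_quotes)     # number of data rows read
no_quotes$title     # the titles, including the one with the stray quote
```

On this toy input the call returns immediately and keeps all three rows. On the real file, this same change is the one that takes forever for me.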