I have about 1,500 CSV files that I want to load into RStudio. I am going to rbind() them one by one in a for loop. I estimate the total at about 1.6 million rows. Then I want to load the completed data frame into a MySQL server. Is it possible to have 1.6 million rows of data in a data frame?
This is a bad idea because growing objects with iterative calls to rbind is very slow in R (see the second circle of The R Inferno for details). You will probably find it more efficient to read in all the files and combine them in a single call to rbind.
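The code that originally accompanied this answer is not preserved here, so the following is only a minimal sketch of the single-call approach. The function name `read_all_csv` and the example path are hypothetical, and it assumes every file has the same column names.

```r
# Read every CSV file in `csv_dir` and combine them with one rbind() call.
# Assumption: all files share the same columns (names and order).
read_all_csv <- function(csv_dir) {
  # Full paths of all files ending in .csv
  files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

  # Read each file into its own data frame, collected in a list
  frames <- lapply(files, read.csv, stringsAsFactors = FALSE)

  # Bind all data frames at once instead of growing one inside a loop
  do.call(rbind, frames)
}

# Usage (hypothetical path):
# combined <- read_all_csv("path/to/csv/files")
```

Collecting the data frames in a list first means the result is allocated once, rather than being copied and enlarged on each of the roughly 1,500 iterations.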
You can find out pretty easily.
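The original snippet is missing from this copy, but a check along these lines would do it. This is a sketch only: the column names `id` and `value` are made up for illustration, and the timing will vary by machine.

```r
n <- 1600000  # 1.6 million rows, as in the question

# Time the creation of a data frame with n rows and two illustrative columns
timing <- system.time(
  df <- data.frame(id = seq_len(n), value = rnorm(n))
)

print(timing)   # user, system, and elapsed time in seconds
print(dim(df))  # number of rows and columns
```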
Not only can you initialize a data frame with 1.6 million rows, but you can do it in under 0.1 seconds (on my machine).