I've had a look around and can't quite get a grasp of what is going on with this. I'm using R in Eclipse. The file I'm trying to import is 700 MB, with around 15 million rows and 6 columns. As I was having problems loading it, I have started using the ff package.
library(ff)
FDF = read.csv.ffdf(file='C:\\Users\\William\\Desktop\\R Data\\GBPUSD.1986.2014.txt', header = FALSE, colClasses=c('factor','factor','numeric','numeric','numeric','numeric'), sep=',')
names(FDF)= c('Date','Time','Open','High','Low','Close')
#names the columns in the ffdf file
dim(FDF)
# produces dimensions of the file
I then want to create a POSIXct sequence, which will later be joined against the imported file. I tried:
tm1 = seq(as.POSIXct("1986/12/1 00:00"), as.POSIXct("2014/09/04 23:59"),"mins")
tm1 = data.frame (DateTime=strftime(tm1,format='%Y.%m.%d %H:%M'))
However, R kept crashing. I then tested this in RStudio and saw that there were constraints on the vector. It did, however, produce the correct output for:
dim(tm1)
names(tm1)
So I went back into Eclipse, thinking this had something to do with memory allocation. I attempted the following:
library(ff)
tm1 = as.ffdf(seq(as.POSIXct("1986/12/1 00:00"), as.POSIXct("2014/09/04 23:59"),"mins"))
tm1 = as.ffdf(DateTime=strftime(tm1,format='%Y.%m.%d %H:%M'))
names(tm1) = c('DateTime')
dim(tm1)
names(tm1)
This gives the error:
no applicable method for 'as.ffdf' applied to an object of class "c('POSIXct', 'POSIXt')"
I can't seem to work around this. I then tried ...
library(ff)
tm1 = as.ff(seq(as.POSIXct("1986/12/1 00:00"), as.POSIXct("2014/09/04 23:59"),"mins"))
tm1 = as.ff(DateTime=strftime(tm1,format='%Y.%m.%d %H:%M'))
This produces the dates, but not in the correct format. In addition to this, when ...
dim(tm1)
names(tm1)
were executed, they both returned NULL.
Question
- How can I produce a POSIXct sequence in the format I require above?
Well, we got there in the end.
I believe the problem was the RAM available while the full vector was being created. Because of this, I broke the vector into 3 pieces, converted each piece to ffdf format to free up RAM, and then used rbind to bind them together. The problem with formatting the vector once it had been created was, I believe, also down to RAM. Every time I tried this, R crashed.
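The chunking idea can be sketched in base R as follows. This is an illustrative sketch rather than the exact code used here: the helper name format_in_chunks, its arguments, and the fixed tz = "UTC" are assumptions made for the example, and it is run on a single day so that it finishes instantly.

```r
# Hypothetical helper (not from any package): build and format a
# minute-by-minute sequence in n_chunks pieces instead of all at once.
format_in_chunks <- function(start, end, n_chunks = 3, tz = "UTC") {
  start <- as.POSIXct(start, tz = tz)
  end   <- as.POSIXct(end, tz = tz)

  # Total number of minutes in the range, both end points included
  total <- as.numeric(difftime(end, start, units = "mins")) + 1

  # Chunk boundaries expressed as minute offsets from 'start'
  breaks <- floor(seq(0, total, length.out = n_chunks + 1))

  pieces <- lapply(seq_len(n_chunks), function(i) {
    # Minute offsets covered by chunk i (empty if the chunk has no rows)
    offs <- breaks[i] + seq_len(breaks[i + 1] - breaks[i]) - 1
    tm   <- start + offs * 60
    # Only this chunk's strings exist in memory at this point.
    # Assumption: this is where each piece could be handed to
    # ff::as.ffdf() to move it out of RAM, as described above.
    data.frame(DateTime = format(tm, "%Y.%m.%d %H:%M", tz = tz),
               stringsAsFactors = TRUE)
  })

  # Bind the pieces back together
  do.call(rbind, pieces)
}

res <- format_in_chunks("2014-09-04 00:00", "2014-09-04 23:59")
dim(res)
names(res)
```

Because each piece is wrapped in a data.frame, dim() and names() return real values here, unlike on a bare vector. The factor column mirrors the 'factor' colClasses used in the import.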
Even with the workaround below, my machine is slowing down (4 GB of RAM). I've ordered some more RAM and hope this will smooth future operations.
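A rough size estimate, computed without building the full vector, suggests why 4 GB is tight. This is a back-of-the-envelope sketch: tz = "UTC" is an assumption added so that daylight-saving changes do not affect the count, and the per-string cost is measured on a small sample and scaled up, so the character figure is only approximate.

```r
# Assumption: UTC, so every day in the range has exactly 1440 minutes
start <- as.POSIXct("1986-12-01 00:00", tz = "UTC")
end   <- as.POSIXct("2014-09-04 23:59", tz = "UTC")

# Number of elements in the minute sequence, both end points included
n_minutes <- as.numeric(difftime(end, start, units = "mins")) + 1

# A POSIXct vector stores one 8-byte double per element
posixct_mb <- n_minutes * 8 / 1024^2

# Measure the cost of the formatted strings on a small sample and scale up.
# Every timestamp is unique, so R cannot share strings between elements.
sample_tm  <- seq(start, by = "min", length.out = 10000)
sample_chr <- format(sample_tm, "%Y.%m.%d %H:%M")
bytes_per_string <- as.numeric(object.size(sample_chr)) / length(sample_chr)
character_mb <- n_minutes * bytes_per_string / 1024^2

cat("elements:        ", n_minutes, "\n")
cat("POSIXct vector:  ", round(posixct_mb), "MB\n")
cat("character vector:", round(character_mb), "MB (approximate)\n")
```

The character vector is several times larger than the POSIXct vector it was formatted from, and temporary copies made while formatting and while building the data frame come on top of that.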
Below is the working code: