I have a block of code that is supposed to build an RNN model using 5 lag variables as features for a time-series observation. Here is the code:
library(Quandl)
key<-"*******************"
Quandl.api_key(key)
sh_stock_ex <- Quandl("YAHOO/SS_600292", type="xts")
library(xts)
data <- scale(sh_stock_ex[-1,5])
feat <- merge(na.trim(lag(data,1)), na.trim(lag(data,2)), na.trim(lag(data,3)), na.trim(lag(data,4)),
              na.trim(lag(data,5)), all=FALSE)
dataset <- merge(feat, data, all = FALSE)
colnames(dataset) <- c("lag.1", "lag.2","lag.3","lag.4","lag.5", "obj")
index <- 1:4000
training <- as.data.frame(dataset[index,])
testing <- as.data.frame(dataset[-index,])
library(mxnet)
train.x <- data.matrix(training[,-6])
train.y <- training[,6]
test.x <- data.matrix(testing[,-6])
test.y <- testing[,6]
# Builds the label array: each element's label is the next element of X in
# column-major order, wrapping around to X[1] at the very end.
get.label <- function(X) {
  label <- array(0, dim=dim(X))
  d <- dim(X)[1]   # rows (sequence length)
  w <- dim(X)[2]   # columns (number of samples)
  for (i in 0:(w-1)) {
    for (j in 1:d) {
      label[i*d+j] <- X[(i*d+j) %% (w*d) + 1]
    }
  }
  return(label)
}
X.train.label <- get.label(t(train.x))
X.val.label <- get.label(t(test.x))
X.train <- list(data=t(train.x), label=X.train.label)
X.val <- list(data=t(test.x), label=X.val.label)
batch.size = 5
seq.len = 5
num.hidden = 3
num.embed = 3
num.rnn.layer = 1
num.lstm.layer = 1
num.round = 1
update.period = 1
learning.rate = 0.1
wd = 0.00001
clip_gradient = 1
mx.set.seed(0)
model <- mx.rnn(X.train, X.val, num.rnn.layer = num.rnn.layer, seq.len = seq.len, num.hidden = num.hidden,
                num.embed = num.embed, num.label = 5, batch.size = batch.size, input.size = 5, ctx = mx.cpu(),
                num.round = num.round, update.period = update.period, initializer = mx.init.uniform(0.01),
                dropout = 0, optimizer = "sgd", batch.norm = FALSE,
                learning.rate = learning.rate, wd = wd, clip_gradient = clip_gradient)
#preds = predict(model,t(test.x))
mx.rnn.inference(num.rnn.layer = num.rnn.layer, input.size = 5, num.hidden = num.hidden,
                 num.embed = num.embed, num.label = 5, batch.size = batch.size, ctx = mx.cpu(),
                 dropout = 0, batch.norm = FALSE, arg.params = model$arg.params)
The call to mx.rnn raises the following error:
[15:36:29] src/operator/./reshape-inl.h:311: Using target_shape will be deprecated.
[15:36:29] src/operator/./reshape-inl.h:311: Using target_shape will be deprecated.
[15:36:29] src/operator/./reshape-inl.h:311: Using target_shape will be deprecated.
[15:36:29] src/operator/./reshape-inl.h:311: Using target_shape will be deprecated.
[15:36:29] C:/Users/qkou/mxnet/dmlc-core/include/dmlc/logging.h:235: [15:36:29] src/ndarray/ndarray.cc:231: Check failed: from.shape() == to->shape() operands shape mismatch
Error in exec$update.arg.arrays(arg.arrays, match.name, skip.null) :
[15:36:29] src/ndarray/ndarray.cc:231: Check failed: from.shape() == to->shape() operands shape mismatch
It's not that I get this every time; a couple of runs earlier this code actually ran. Could you please help me figure out what's happening?
Most probably the issue is with the data you receive from Quandl, or with how you are processing it. NAs remain in the array after na.trim() when the NA sits in the middle of the series rather than at the ends, and that may be what breaks the shape matching in some situations.
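As a quick illustration (a made-up toy vector, not your Quandl data), zoo's na.trim() only strips leading and trailing NAs:

library(zoo)
x <- zoo(c(NA, 1, 2, NA, 4, 5, NA))   # NAs at both ends and one in the middle
na.trim(x)                            # leading/trailing NAs are dropped,
                                      # but the interior NA survives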
I would recommend inspecting the state of the input (for example with anyNA(dataset) and dim(dataset)) the next time you see the failure. Otherwise, after adding a few extra required callbacks, your code is valid. Here it is with the parameters added inline and using synthetic data:
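The block below is a minimal sketch of that rework rather than an exact reproduction: a synthetic sine series (its length and noise level are arbitrary) stands in for the Quandl data, get.label() is the function from the question, the mx.rnn call itself is unchanged, and the extra callbacks are not shown here.

library(xts)
library(mxnet)

set.seed(0)
n    <- 4500
idx  <- seq(as.Date("2000-01-01"), by = "day", length.out = n)
# synthetic series in place of the Quandl close prices, already scaled
data <- xts(scale(sin(seq_len(n) / 20) + rnorm(n, sd = 0.1)), order.by = idx)

feat <- merge(na.trim(lag(data, 1)), na.trim(lag(data, 2)), na.trim(lag(data, 3)),
              na.trim(lag(data, 4)), na.trim(lag(data, 5)), all = FALSE)
dataset <- merge(feat, data, all = FALSE)
colnames(dataset) <- c("lag.1", "lag.2", "lag.3", "lag.4", "lag.5", "obj")
stopifnot(!anyNA(dataset))             # interior NAs would show up here

index    <- 1:4000
training <- as.data.frame(dataset[index, ])
testing  <- as.data.frame(dataset[-index, ])
train.x  <- data.matrix(training[, -6])
test.x   <- data.matrix(testing[, -6])

# get.label() as defined in the question above
X.train <- list(data = t(train.x), label = get.label(t(train.x)))
X.val   <- list(data = t(test.x),  label = get.label(t(test.x)))

mx.set.seed(0)
model <- mx.rnn(X.train, X.val, num.rnn.layer = 1, seq.len = 5, num.hidden = 3,
                num.embed = 3, num.label = 5, batch.size = 5, input.size = 5,
                ctx = mx.cpu(), num.round = 1, update.period = 1,
                initializer = mx.init.uniform(0.01), dropout = 0, optimizer = "sgd",
                batch.norm = FALSE, learning.rate = 0.1, wd = 0.00001,
                clip_gradient = 1)

The same stopifnot(!anyNA(dataset)) check can be dropped into your Quandl-based pipeline to catch the problematic runs right after the merge.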
If I run it, I get:
And if I execute a prediction, I receive: