Cannot train KSVM in R


I have been stuck on this all day. Suppose I have training data like the following:

1.0000000 0.8260869 0
0.7333333 0.4666667 0
0.0000000 0.0000000 0
0.3076923 0.3076923 0
0.2307692 0.4615385 0
0.9333333 0.4666667 1
0.3157895 0.4210526 1
1.0000000 0.7000000 1
0.3157895 0.2631579 1
0.6666667 0.4444444 1

The first two columns are the feature set, and the last column of each row is the label that we are trying to learn and predict.

But when I try to train an SVM on the above data with the following script that I wrote:

library(kernlab)
library(Matrix)

kp = function(d, e){
    gama = 0.25

    DA = d[,1]
    DB = d[,2]
    DE = e[,1]
    DF = e[,2]


    q1 = (norm(as.matrix(DA-DE)))^2
    q2 = (norm(as.matrix(DB-DF)))^2
    q3 = (norm(as.matrix(DA-DF)))^2
    q4 = (norm(as.matrix(DB-DE)))^2

    s1 = min((q1+q2),(q3+q4))

    s = (norm(as.matrix(s1)))^2

    exp(-gama*s)
}

data    <- read.csv(file = "dataset.dat", stringsAsFactors = TRUE, nrows = 10)

xtrain  <- as.matrix(data[,1:2])

ytrain  <- as.matrix(data[,687])

class(kp)<-"kernel"

ksvm(x = xtrain, y = ytrain, type = "C-svc", kernel = kp, C = 128, scale = FALSE)

I get the following error:

Error in indexes[[j]] : subscript out of bounds
Calls: ksvm -> ksvm -> .local
Execution halted

I have googled it, but I couldn't find a solution.

Question:

What am I doing wrong, and how can I make it work?


Edit

The result of traceback() is as follows:

3: .local(x, ...)
2: ksvm(x = xtrain, y = ytrain, type = "C-svc", kernel = kp, C = 128)
1: ksvm(x = xtrain, y = ytrain, type = "C-svc", kernel = kp, C = 128)

And here is the output of dput(data):

structure(list(X0.8 = c(1, 0.7333333, 0, 0.3076923, 0.2307692), X0.7 = c(0.8260869, 0.4666667, 0, 0.3076923, .4615385)), .Names = c("X0.8",  "X0.7"), row.names = c(NA, 5L), class = "data.frame")

There are 2 answers

valeriux

In my case the problem was NA values. I removed them and that solved it.
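A minimal sketch of that check in base R, assuming a data frame with two feature columns and a label column shaped like the sample in the question. The column names f1, f2 and label, and the NA injected into the labels, are made up for illustration:

```r
# Hypothetical data frame shaped like the sample in the question;
# one label is set to NA on purpose to show the check.
data <- data.frame(
  f1    = c(1.0000000, 0.7333333, 0.0000000, 0.9333333, 0.3157895),
  f2    = c(0.8260869, 0.4666667, 0.0000000, 0.4666667, 0.4210526),
  label = c(0, 0, NA, 1, 1)
)

# Count missing values per column before training.
na_counts <- colSums(is.na(data))
print(na_counts)

# Keep only the rows with no NA in any column.
clean <- data[complete.cases(data), ]

# Split into a feature matrix and a label vector for training.
xtrain <- as.matrix(clean[, c("f1", "f2")])
ytrain <- factor(clean$label)  # a factor makes the class labels explicit
```

Checking the label column as well as the features matters here, since a missing label is as easy to overlook as a missing feature value.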

Yoni DAHAN

I've faced the same issue and the problem was actually the presence of some infinite elements in my dataset.

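You can check for them using something like apply(xtrain,2,range) and then set them arbitrarily to zero with xtrain$your_var[xtrain$your_var==Inf]<-0. The $ form assumes xtrain is a data frame; for a matrix, as in the question, xtrain[is.infinite(xtrain)] <- 0 works and also catches -Inf.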
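As a rough illustration of that check on a matrix, here is a base R sketch. The matrix values and the injected Inf and -Inf are hypothetical, and dropping the affected rows is shown as an alternative to zeroing, not as something the answer prescribes:

```r
# Hypothetical feature matrix with one Inf and one -Inf injected on purpose.
xtrain <- matrix(
  c(1.0000000, 0.8260869,
    Inf,       0.4666667,
    0.0000000, -Inf,
    0.3076923, 0.3076923),
  ncol = 2, byrow = TRUE
)

# Per-column min and max; an Inf or -Inf shows up in the range.
print(apply(xtrain, 2, range))

# Flag rows that contain anything non-finite (Inf, -Inf, NaN or NA).
bad_rows <- rowSums(!is.finite(xtrain)) > 0

# Option 1: overwrite infinite entries with zero, as suggested above.
x_zeroed <- xtrain
x_zeroed[is.infinite(x_zeroed)] <- 0

# Option 2: drop the affected rows instead of substituting a value.
x_dropped <- xtrain[!bad_rows, , drop = FALSE]
```

If rows are dropped from the features, the same rows have to be dropped from the label vector (for example ytrain[!bad_rows]) so the two stay aligned.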