R Weka J48 with snowfall for parallel execution


I tried using the snowfall package to run R code in parallel on all my CPU cores. This is the code I used for testing:

library(snow)
library(snowfall)
sfInit(parallel = TRUE, cpus = 16, type = "SOCK")
data <- array(1:1000000, dim=c(1000000,1))
system.time(x <- sfLapply(data, fun=function(x){return (x*x) }))

This effectively runs 16 times faster, because it uses all available CPU cores. But when I try this:

library(RWeka)  # J48 is provided by the RWeka package
system.time(m2 <- J48(CHURNED_F~., data = data[, -c(1)]))

It takes about 50 seconds as a test, using only about 1% of the whole data set. The following runs correctly, but it takes the same time and only uses one CPU:

library(snow)
library(snowfall)
sfInit(parallel = TRUE, cpus = 16, type = "SOCK")
system.time(m2 <- sfLapply("CHURNED_F~.", J48, data[, -c(1)]))

Am I just using the wrong syntax? How can I make this run in parallel?
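
A minimal sketch of one likely reason the second attempt stays on a single core, assuming sfLapply distributes work per element of its first argument: a length-one input such as "CHURNED_F~." produces a single task, so only one worker has anything to do. The 2-worker cluster and the Sys.getpid() probe are illustrative choices and are not part of the original code.

```r
library(snowfall)

# Small illustrative cluster; the question uses cpus = 16.
sfInit(parallel = TRUE, cpus = 2, type = "SOCK")

# One input element means one task: a single worker process handles it.
one_task <- sfLapply(list("CHURNED_F~."), function(f) Sys.getpid())

# Several input elements can be spread over the available workers.
many_tasks <- sfLapply(1:8, function(i) Sys.getpid())

# Count the distinct worker processes that did any work in each case.
print(length(unique(unlist(one_task))))
print(length(unique(unlist(many_tasks))))

sfStop()
```

If this is what is happening, changing the call syntax alone would not speed up a single J48 fit. The work would have to be split into several independent tasks first, for example one fit per data subset or per parameter setting.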
