Exception in thread "main" java.lang.NoClassDefFoundError: org/deeplearning4j/nn/conf/layers/Layer


I am trying to build an application on Spark using the Deeplearning4j library. I have a cluster where I am going to run my jar (built using IntelliJ) with the spark-submit command. Here's my code:

package Com.Spark.Examples

import scala.collection.mutable.ListBuffer
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.canova.api.records.reader.RecordReader
import org.canova.api.records.reader.impl.CSVRecordReader
import org.deeplearning4j.nn.api.OptimizationAlgorithm
import org.deeplearning4j.nn.conf.MultiLayerConfiguration
import org.deeplearning4j.nn.conf.NeuralNetConfiguration
import org.deeplearning4j.nn.conf.layers.DenseLayer
import org.deeplearning4j.nn.conf.layers.OutputLayer
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork
import org.deeplearning4j.nn.weights.WeightInit
import org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer
import org.nd4j.linalg.lossfunctions.LossFunctions

object FeedForwardNetworkWithSpark {
  def main(args:Array[String]): Unit ={
    val recordReader: RecordReader = new CSVRecordReader(0, ",")
    val conf = new SparkConf()
      .setAppName("FeedForwardNetwork-Iris")
    val sc = new SparkContext(conf)
    val numInputs:Int = 4
    val outputNum = 3
    val iterations = 1
    val multiLayerConfig:MultiLayerConfiguration = new NeuralNetConfiguration.Builder()
      .seed(12345)
      .iterations(iterations)
      .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
      .learningRate(1e-1)
      .l1(0.01).regularization(true).l2(1e-3)
      .list(3)
      .layer(0, new DenseLayer.Builder().nIn(numInputs).nOut(3).activation("tanh").weightInit(WeightInit.XAVIER).build())
      .layer(1, new DenseLayer.Builder().nIn(3).nOut(2).activation("tanh").weightInit(WeightInit.XAVIER).build())
      .layer(2, new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT).weightInit(WeightInit.XAVIER)
        .activation("softmax")
        .nIn(2).nOut(outputNum).build())
      .backprop(true).pretrain(false)
      .build
    val network:MultiLayerNetwork = new MultiLayerNetwork(multiLayerConfig)
    network.init
    network.setUpdater(null)
    val sparkNetwork: SparkDl4jMultiLayer = new SparkDl4jMultiLayer(sc, network)
    val nEpochs:Int = 6
    val listBuffer = new ListBuffer[Array[Float]]()
    (0 until nEpochs).foreach { i =>
      val net: MultiLayerNetwork = sparkNetwork.fit("/user/iris.txt", 4, recordReader)
      listBuffer += net.params.data.asFloat().clone()
    }
    println("Parameters vs. iteration Output: ")
    (0 until listBuffer.size).foreach { i =>
      println(i + "\t" + listBuffer(i).mkString)
    }
  }
}

Here is my build.sbt file:

name := "HWApp"

version := "0.1"

scalaVersion := "2.12.3"

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.0" % "provided"
libraryDependencies += "org.apache.spark" % "spark-mllib_2.10" % "1.6.0" % "provided"
libraryDependencies += "org.deeplearning4j" % "deeplearning4j-nlp" % "0.4-rc3.8"
libraryDependencies += "org.deeplearning4j" % "dl4j-spark" % "0.4-rc3.8"
libraryDependencies += "org.deeplearning4j" % "deeplearning4j-core" % "0.4-rc3.8"
libraryDependencies += "org.nd4j" % "nd4j-x86" % "0.4-rc3.8" % "test"
libraryDependencies += "org.nd4j" % "nd4j-api" % "0.4-rc3.8"
libraryDependencies += "org.nd4j" % "nd4j-jcublas-7.0" % "0.4-rc3.8"
libraryDependencies += "org.nd4j" % "canova-api" % "0.0.0.14"

When I view my code in IntelliJ, it does not show any errors, but when I execute the application on the cluster I get something like this:

[screenshot of the stack trace: Exception in thread "main" java.lang.NoClassDefFoundError: org/deeplearning4j/nn/conf/layers/Layer]

I don't know what it wants from me. Even a little help would be appreciated. Thanks.

Accepted answer (Adam Gibson):

I'm not sure how you came up with this list of versions (I'm assuming you just picked them at random? Please don't do that).

You are using a 1.5-year-old version of dl4j, with dependencies that are a year older than that and no longer exist.

Start from scratch and follow our getting started guide and examples, like you would with any other open source project.

Those can be found here: https://deeplearning4j.org/quickstart

with example projects here: https://github.com/deeplearning4j/dl4j-examples

A few more things: Canova doesn't exist anymore; it was renamed to DataVec more than a year ago.
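
For example, the CSVRecordReader import in the question moves from the old Canova package to its DataVec equivalent. A minimal sketch, assuming a 0.9.x DataVec release (the delimiter argument changed to a char in later releases):

// Old Canova import (no longer published):
// import org.canova.api.records.reader.impl.CSVRecordReader

// DataVec replacement:
import org.datavec.api.records.reader.impl.csv.CSVRecordReader

val recordReader = new CSVRecordReader(0, ",") // skip 0 header lines, comma-delimited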

All dl4j, datavec, and nd4j versions must be the same.
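
In build.sbt, the easiest way to enforce that is a single version value. A minimal sketch, assuming the 0.9.1 release (substitute whatever is current):

// One version for the whole stack; assumption: 0.9.1 is the release you want
val dl4jVersion = "0.9.1"

libraryDependencies ++= Seq(
  "org.deeplearning4j" % "deeplearning4j-core"  % dl4jVersion,
  "org.datavec"        % "datavec-api"          % dl4jVersion,
  "org.nd4j"           % "nd4j-native-platform" % dl4jVersion
)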

If you are using any of our Scala modules, like Spark, those must also always use the same Scala version.

So you are mixing Scala 2.12 with Scala 2.10 dependencies, which is a Scala no-no (that's not even dl4j-specific).

Dl4j only supports Scala 2.11 at most. This is mainly because Hadoop distros like CDH and Hortonworks don't support Scala 2.12 yet.
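
One way to keep the Scala suffixes consistent is to set scalaVersion once and let sbt's %% operator append it to every artifact id. A sketch assuming Scala 2.11 and a Spark 1.6.x build:

scalaVersion := "2.11.8"

// %% resolves this to spark-core_2.11, matching scalaVersion automatically
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.3" % "provided"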

Edit: Another thing to watch out for, which is dl4j-specific, is how we do Spark versions. Spark 1 and 2 are both supported. Your artifact id should be:

dl4j-spark_${your scala version} (usually 2.10 or 2.11), with a version like: 0.9.1_spark_${your spark version}

The same naming scheme applies to our NLP modules as well.
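
In sbt, that naming scheme comes out to something like this (a sketch assuming Scala 2.11, Spark 2, and the 0.9.1 release):

// artifact id suffix = Scala version, version suffix = Spark major version
libraryDependencies += "org.deeplearning4j" % "dl4j-spark_2.11" % "0.9.1_spark_2"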

Edit for more folks who haven't followed our getting started guide (please do that, we keep it up to date): you also always need an nd4j backend. Usually this is nd4j-native-platform, but it may be cuda if you are using GPUs, with: nd4j-cuda-${YOUR CUDA VERSION}-platform
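
As a sketch, again assuming the 0.9.1 release, and CUDA 8.0 for the GPU case:

// CPU backend:
libraryDependencies += "org.nd4j" % "nd4j-native-platform" % "0.9.1"

// GPU backend (assumption: CUDA 8.0 toolkit installed; substitute your CUDA version):
// libraryDependencies += "org.nd4j" % "nd4j-cuda-8.0-platform" % "0.9.1"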