Is there a tool to convert Excel files into csv using Spark 1.X ? got this issue when executing this tuto https://github.com/ZuInnoTe/hadoopoffice/wiki/Read-Excel-document-using-Spark-1.x
Exception in thread "main" java.lang.NoClassDefFoundError: org/zuinnote/hadoop/office/format/mapreduce/ExcelFileInputFormat
at org.zuinnote.spark.office.example.excel.SparkScalaExcelIn$.convertToCSV(SparkScalaExcelIn.scala:63)
at org.zuinnote.spark.office.example.excel.SparkScalaExcelIn$.main(SparkScalaExcelIn.scala:56)
at org.zuinnote.spark.office.example.excel.SparkScalaExcelIn.main(SparkScalaExcelIn.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.zuinnote.hadoop.office.format.mapreduce.ExcelFileInputFormat
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
You have to build a fat jar that contains all the necessary dependencies. The example project on the HadoopOffice page shows how you build one. One you build the fat/uber jar you simply use it in Spark summit.