I have written the following code, which throws a "Class not found" exception. I'm not sure what I need to do to load data from a CSV file into SparkSQL.
import org.apache.spark.SparkContext

/**
 * Loading sales csv using DataFrame API
 */
object CsvDataInput {
  def main(args: Array[String]) {
    val sc = new SparkContext(args(0), "Csv loading example")
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val df = sqlContext.load("com.databricks.spark.csv", Map("path" -> args(1), "header" -> "true"))
    df.printSchema()
    df.registerTempTable("data")
    val aggDF = sqlContext.sql("select * from data")
    println(aggDF.collectAsList())
  }
}
Try replacing your import line with this:

import org.apache.spark.sql.SQLContext
You are importing only part of the library but using classes from outside that part. Your import is also misspelled: it should read org.apache.spark.sql.SQLContext, while what you used is some other package, not related to the code presented.
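Note that even with the import fixed, com.databricks.spark.csv lives in the external spark-csv package, which also has to be on the classpath at runtime, or Spark will still report that the data source class cannot be found. The sketch below is one way to wire this up, assuming Spark 1.4+ and the Databricks spark-csv artifact; the version 1.5.0 shown in the submit command is an assumption, so match it to your Spark and Scala versions.

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object CsvDataInput {
  def main(args: Array[String]) {
    val sc = new SparkContext(args(0), "Csv loading example")
    val sqlContext = new SQLContext(sc)

    // DataFrameReader API (Spark 1.4+), equivalent to the older
    // sqlContext.load("com.databricks.spark.csv", ...) call
    val df = sqlContext.read
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .load(args(1))

    df.printSchema()
    df.registerTempTable("data")
    val aggDF = sqlContext.sql("select * from data")
    println(aggDF.collectAsList())
  }
}

Then submit it with the package on the classpath, for example:

spark-submit --packages com.databricks:spark-csv_2.10:1.5.0 --class CsvDataInput your-app.jar local[2] sales.csv

Here your-app.jar, local[2], and sales.csv are placeholders for your own jar, master URL, and input path.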