I am stuck implementing spark-corenlp in Java: I cannot work out how to translate the Scala example below (from the spark-corenlp GitHub repository) into Java.

Scala example:
val input = Seq(
  (1, "<xml>Stanford University is located in California. It is a great university.</xml>")
).toDF("id", "text")

val output = input
  .select(cleanxml('text).as('doc))
  .select(explode(ssplit('doc)).as('sen))
  .select('sen, tokenize('sen).as('words), ner('sen).as('nerTags), sentiment('sen).as('sentiment))
Java (my attempt so far; I replaced the original Tuple2/JavaPairRDD attempt, which failed because createDataFrame needs either a bean class or Rows plus an explicit schema):

SparkSession spark = SparkSession.builder().config(conf).getOrCreate();

// build the DataFrame from Rows with an explicit schema
StructType schema = new StructType()
    .add("id", DataTypes.IntegerType)
    .add("text", DataTypes.StringType);

Dataset<Row> input = spark.createDataFrame(
    Arrays.asList(RowFactory.create(1,
        "<xml>Stanford University is located in California. "
            + "It is a great university.</xml>")),
    schema);

// steps to implement nlp ?
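One possible direction, as an untested sketch rather than a confirmed answer: spark-corenlp defines cleanxml, ssplit, tokenize, ner and sentiment inside the Scala object com.databricks.spark.corenlp.functions, each returning a Spark UserDefinedFunction. Assuming the Scala compiler's static forwarders make those members callable from Java (which is the usual behaviour for top-level Scala objects), the select chain from the Scala example might translate as:

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// assumption: spark-corenlp's Scala object is reachable from Java as a
// class named `functions` with static members returning UserDefinedFunction
import com.databricks.spark.corenlp.functions;

public class CoreNlpPipeline {
    public static Dataset<Row> annotate(Dataset<Row> input) {
        return input
            // strip the XML tags, as cleanxml('text) does in Scala
            .select(functions.cleanxml().apply(col("text")).as("doc"))
            // split the document into sentences and explode to one row each
            .select(explode(functions.ssplit().apply(col("doc"))).as("sen"))
            // per sentence: tokens, named-entity tags, sentiment score
            .select(col("sen"),
                    functions.tokenize().apply(col("sen")).as("words"),
                    functions.ner().apply(col("sen")).as("nerTags"),
                    functions.sentiment().apply(col("sen")).as("sentiment"));
    }
}
```

The key difference from Scala is that Java has no implicit conversion from a UDF to a column expression, so each UDF must be invoked explicitly with .apply(col(...)). Whether the forwarders are named exactly as above depends on the spark-corenlp build, so treat the import and method names as assumptions to verify against the jar.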
Thanks for your help.