how to use spark-corenlp in java


I am stuck implementing spark-corenlp in Java. I cannot figure out how to translate the Scala code below into Java. spark-corenlp git

Scala example:

val input = Seq(
  (1, "<xml>Stanford University is located in California. It is a great university.</xml>")
).toDF("id", "text")
val output = input
  .select(cleanxml('text).as('doc))
  .select(explode(ssplit('doc)).as('sen))
  .select('sen, tokenize('sen).as('words), ner('sen).as('nerTags), sentiment('sen).as('sentiment))

Java:

// Build the same two-column DataFrame as the Scala example, using an
// explicit schema instead of the raw-typed Tuple2/JavaPairRDD approach
// (the original Type.class bean was undefined and would not compile).
SparkSession spark = SparkSession.builder().config(conf).getOrCreate();
List<Row> rows = Collections.singletonList(
        RowFactory.create(1, "<xml>Stanford University is located in California. " +
                "It is a great university.</xml>"));
StructType schema = new StructType()
        .add("id", DataTypes.IntegerType)
        .add("text", DataTypes.StringType);
Dataset<Row> ds = spark.createDataFrame(rows, schema);
// steps to implement nlp ?
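One possible approach, sketched below under stated assumptions: the spark-corenlp functions (`cleanxml`, `ssplit`, `tokenize`, `ner`, `sentiment`) are Scala `UserDefinedFunction`s in the object `com.databricks.spark.corenlp.functions`, so they can be registered with `spark.udf().register(...)` and then invoked from Java through `callUDF`. Whether Scala's static forwarders expose them as plain `functions.cleanxml()` calls (rather than `functions$.MODULE$.cleanxml()`) depends on the spark-corenlp build, so treat that access pattern as an assumption.

```java
import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;

import java.util.Collections;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

// Assumes spark-corenlp is on the classpath; access pattern for the Scala
// object is an assumption (may need functions$.MODULE$ instead).
import com.databricks.spark.corenlp.functions;

public class CoreNlpJavaExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("spark-corenlp-java")
                .master("local[*]")
                .getOrCreate();

        List<Row> rows = Collections.singletonList(
                RowFactory.create(1, "<xml>Stanford University is located in California. " +
                        "It is a great university.</xml>"));
        StructType schema = new StructType()
                .add("id", DataTypes.IntegerType)
                .add("text", DataTypes.StringType);
        Dataset<Row> input = spark.createDataFrame(rows, schema);

        // Register the Scala UDFs under names Java can call via callUDF.
        spark.udf().register("cleanxml", functions.cleanxml());
        spark.udf().register("ssplit", functions.ssplit());
        spark.udf().register("tokenize", functions.tokenize());
        spark.udf().register("ner", functions.ner());
        spark.udf().register("sentiment", functions.sentiment());

        // Same pipeline as the Scala example: strip XML, split into
        // sentences, then tokenize / tag / score each sentence.
        Dataset<Row> output = input
                .select(callUDF("cleanxml", col("text")).as("doc"))
                .select(explode(callUDF("ssplit", col("doc"))).as("sen"))
                .select(col("sen"),
                        callUDF("tokenize", col("sen")).as("words"),
                        callUDF("ner", col("sen")).as("nerTags"),
                        callUDF("sentiment", col("sen")).as("sentiment"));

        output.show(false);
        spark.stop();
    }
}
```

An alternative to `callUDF` is `selectExpr`, e.g. `ds.selectExpr("cleanxml(text) as doc")`, which avoids the static imports at the cost of building expressions as strings.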

Thanks for your help.
