How to set Encoder for Row, LabeledPointData in Spark?

How do I set an Encoder for LabeledPoint data, which is a combination of a Double and a Vector of Doubles? Which Encoder do I need to pass when creating the DataFrame?

public static Dataset<LabeledPoint> convertRDDStringToLabeledPoint(Dataset<String> data,String delimiter) {
    Dataset<LabeledPoint> labeledPointData = data.map(
            (data1)->{
                String[] splitter = data1.split(delimiter);
                double[] arr = new double[splitter.length - 1];
                IntStream.range(0,arr.length).forEach(i->arr[i]=Double.parseDouble(splitter[i+1]));
                return new LabeledPoint(Double.parseDouble(splitter[0]), Vectors.dense(arr));
            },Encoders.???);
    return labeledPointData;
}
There is 1 answer

Jacek Laskowski

LabeledPoint is a case class in Scala, so I think the encoder is Encoders.product[LabeledPoint].

(I don't know how to write it in Java)
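For Java, one option that should work is a Kryo-based encoder, since Encoders.product needs a Scala TypeTag that is awkward to supply from Java. The sketch below rests on a few assumptions: LabeledPoint is org.apache.spark.ml.feature.LabeledPoint (the question does not say which package it uses), and the class name LabeledPointEncoderSketch and the helper labeledPointEncoder are made up for illustration. The trade-off is that Kryo stores each LabeledPoint as a single binary column, so the label and features are not exposed as separate columns.

```java
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.ml.feature.LabeledPoint;
import org.apache.spark.ml.linalg.Vectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;

// Hypothetical helper class, not part of Spark.
class LabeledPointEncoderSketch {

    // Assumption: a Kryo encoder is acceptable, i.e. it is fine for each
    // LabeledPoint to be stored as one opaque binary column.
    static Encoder<LabeledPoint> labeledPointEncoder() {
        return Encoders.kryo(LabeledPoint.class);
    }

    static Dataset<LabeledPoint> convertRDDStringToLabeledPoint(
            Dataset<String> data, String delimiter) {
        return data.map(
            // The explicit cast selects the Java-friendly overload of map.
            (MapFunction<String, LabeledPoint>) line -> {
                // As in the question, the delimiter is treated as a regex.
                String[] parts = line.split(delimiter);
                // First field is the label, the rest are the features.
                double[] features = new double[parts.length - 1];
                for (int i = 0; i < features.length; i++) {
                    features[i] = Double.parseDouble(parts[i + 1]);
                }
                return new LabeledPoint(
                    Double.parseDouble(parts[0]), Vectors.dense(features));
            },
            labeledPointEncoder());
    }
}
```

Encoders.javaSerialization(LabeledPoint.class) could be swapped in the same way if Kryo is not wanted. It also produces a single binary column.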