Spark Java sum is giving incorrect value

56 views Asked by At

Spark Java sum is giving incorrect value

Java sample code is as below

List<Double> points = Arrays.asList(-6221.4, 6380.46);
Dataset<Row> dt  = spark.createDataset(points, Encoders.DOUBLE()).toDF("double_vals");
dt.createOrReplaceTempView("dual_table");
spark.sql("select sum(double_vals) from dual_table").show(false);

The expect result is 159.06 but the result I am getting is as below

+-----------------+
|sum(double_vals) |
+-----------------+
|159.0600000000004|
+-----------------+

Am I doing something wrong ?

Extending the example , what if the decimal is dynamic after sum as below, I will not be able to restrict it to 2 decimal points, is there any solution ?

Tuple3<String,String,Double> val1 = new Tuple3<>("Day1","Ram", -6221.4);
Tuple3<String,String,Double> val2 = new Tuple3<>("Day2","Ram", 6380.46);
Tuple3<String,String,Double> val3 = new Tuple3<>("Day1","Sam", 380.46);
Tuple3<String,String,Double> val4 = new Tuple3<>("Day2","Sam", 6380.462);
List<Tuple3<String,String,Double>> points = Arrays.asList(val1,val2,val3,val4);

Dataset<Row> dt  = spark.createDataset(points, Encoders.tuple(Encoders.STRING(),Encoders.STRING(),Encoders.DOUBLE())).toDF("day","name","profit");

dt.createOrReplaceTempView("dual_table");

Dataset<Row> newDs = spark.sql("select NAME, sum(profit) sum_val from dual_table group by name");

newDs.show();

Result will be

+----+------------------+
|NAME|           sum_val|
+----+------------------+
| Ram| 159.0600000000004| This needs 2 decimal points 
| Sam|6760.9220000000005| This needs 3 decimal points
+----+------------------+
1

There are 1 answers

2
s.polam On

Use round function to truncate precision

spark.sql("select round(sum(double_vals), 2) as sum_value from dual_table").show(false)

+---------+
|sum_value|
+---------+
|159.06   |
+---------+

Use cast( sum(<column name>) AS decimal(10, 2))

spark.sql("select cast(sum(double_vals) as decimal(10, 2)) as sum_value from dual_table").show(false)

+---------+
|sum_value|
+---------+
|159.06   |
+---------+