I'm trying to set up Apache Sedona, but I'm running into strange issues that suggest a version incompatibility. For context, I have Apache Sedona 1.5.1, PySpark 3.2.1, and Scala 2.12.18.
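In case it helps anyone reproduce this, here is a minimal sketch of how the Python-side versions can be read from the environment. It assumes the PyPI distribution names `pyspark` and `apache-sedona`; the helper name `installed_versions` is made up for illustration. It only reports what pip installed, not which JARs Spark ends up loading.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found


if __name__ == "__main__":
    # Assumed PyPI distribution names for PySpark and the Sedona Python package.
    for name, ver in installed_versions(["pyspark", "apache-sedona"]).items():
        print(f"{name}: {ver or 'not installed'}")
```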
I installed the packages listed below using Maven.
I'm trying to run this code:
from sedona.spark import *

spark = SedonaContext.builder().\
    config('spark.jars.packages',
           'org.apache.sedona:sedona-spark-3.4_2.12:1.5.1,'
           'org.datasyslab:geotools-wrapper:1.5.1-28.2,'
           'uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.4,'
           'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.4.1'). \
    config('spark.jars.repositories', 'https://artifacts.unidata.ucar.edu/repository/unidata-all'). \
    getOrCreate()

sedona = SedonaContext.create(spark)
This follows their example notebook, https://github.com/apache/sedona/blob/master/binder/ApacheSedonaSQL.ipynb, except that I also added the Python adapter.
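To make the versions in that package list easier to compare, here is a rough sketch that pulls the Spark and Scala versions out of each Maven coordinate and lists the artifacts whose Spark suffix differs from the installed Spark. It assumes the `<name>-<spark version>_<scala version>` artifact naming that these coordinates appear to follow. The function names are made up for illustration, and it does not look at versions embedded in the version field (such as `2.11.0-3.4`) or at what is actually on the classpath. A flagged artifact is not necessarily the cause of the error.

```python
import re

# Artifact IDs such as "sedona-spark-3.4_2.12" appear to carry an optional
# Spark version just before the "_<scala version>" suffix (assumed convention).
ARTIFACT_RE = re.compile(
    r"^(?P<name>.+?)(?:-(?P<spark>\d+\.\d+))?_(?P<scala>\d+\.\d+)$"
)


def describe_coordinate(coordinate):
    """Split a 'group:artifact:version' coordinate into its parts."""
    group, artifact, version = coordinate.strip().split(":")
    match = ARTIFACT_RE.match(artifact)
    return {
        "group": group,
        "artifact": artifact,
        "version": version,
        "spark": match.group("spark") if match else None,
        "scala": match.group("scala") if match else None,
    }


def spark_mismatches(packages, installed_spark):
    """List artifacts whose Spark suffix differs from the installed major.minor."""
    target = ".".join(installed_spark.split(".")[:2])
    return [
        info["artifact"]
        for info in map(describe_coordinate, packages.split(","))
        if info["spark"] is not None and info["spark"] != target
    ]


# The same package string that is passed to spark.jars.packages above.
packages = (
    "org.apache.sedona:sedona-spark-3.4_2.12:1.5.1,"
    "org.datasyslab:geotools-wrapper:1.5.1-28.2,"
    "uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.4,"
    "org.apache.sedona:sedona-python-adapter-3.0_2.12:1.4.1"
)

for artifact in spark_mismatches(packages, "3.2.1"):
    print(artifact)
```

With my PySpark version (3.2.1) this flags `sedona-spark-3.4_2.12` and `sedona-python-adapter-3.0_2.12`. The two Sedona artifacts are also at different releases (1.5.1 and 1.4.1).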
But I get this error:
Py4JJavaError: An error occurred while calling o206.showString.
: java.lang.NoSuchMethodError: 'double org.locationtech.jts.geom.Coordinate.getZ()'
at org.apache.sedona.common.geometrySerde.GeometrySerializer.getCoordinateType(GeometrySerializer.java:449)
at org.apache.sedona.common.geometrySerde.GeometrySerializer.serializePoint(GeometrySerializer.java:112)
at org.apache.sedona.common.geometrySerde.GeometrySerializer.serialize(GeometrySerializer.java:43)
at org.apache.sedona.sql.utils.GeometrySerializer$.serialize(GeometrySerializer.scala:36)
at org.apache.spark.sql.sedona_sql.expressions.implicits$GeometryEnhancer.toGenericArrayData(implicits.scala:139)
at org.apache.spark.sql.sedona_sql.expressions.InferredTypes$.$anonfun$buildSerializer$1(InferredExpression.scala:155)
at org.apache.spark.sql.sedona_sql.expressions.InferredExpression.eval(InferredExpression.scala:71)
at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:477)
at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:69)
It looks like the GeoTools packages are not working. I can load a regular DataFrame, but I can't run any geospatial operations on it. What's the issue here?