Is PySpark compatible with multiple Spark server versions?


I have an environment with two Spark servers of different versions: Spark 3.3.2 and Spark 3.4.1.

I need to use PySpark from a single Python session to connect to both instances above, even though the servers run different versions. Is that possible? When I tried to connect PySpark 3.4.1 to the Spark 3.3.2 server, or PySpark 3.3.2 to the Spark 3.4.1 server, I got cryptic Java errors that say nothing about a version mismatch.

Which pyspark version should I use?


1 Answer

canaytore

I would highly recommend using the PySpark version that matches each Spark server. PySpark is tightly coupled to the Spark release it is built for, and mixing versions can produce compatibility issues and obscure Java errors whose messages give no hint that a version mismatch is the root cause.
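A minimal sketch of how to verify the match at runtime, assuming a standalone cluster (the master host name below is a placeholder, not your actual address): compare the installed PySpark client version with the version the connected session reports.

```python
# Sketch: check that the PySpark client matches the server it connects to.
# The master URL is a placeholder -- replace it with your own cluster address.
import pyspark
from pyspark.sql import SparkSession

print("PySpark client version:", pyspark.__version__)  # e.g. 3.3.2

spark = (
    SparkSession.builder
    .master("spark://spark-332-host:7077")  # hypothetical Spark 3.3.2 master
    .appName("version-check")
    .getOrCreate()
)

print("Connected server version:", spark.version)  # should match the client
spark.stop()
```

Since a single Python interpreter can only import one pyspark package at a time, a practical setup is one virtual environment per cluster (for example, `pip install pyspark==3.3.2` in one and `pip install pyspark==3.4.1` in the other) and connecting to each server only from its matching environment.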