PySpark freezes when I run very simple code (Spark 3.5.0 on an Apple Silicon M1 Pro chip). It used to work fine with Spark 3.4.2. I suspected a version-compatibility issue and tried many different combinations (Python 3.9 + Spark 3.4.2, Python 3.10 + Spark 3.4.2, Python 3.9 + Spark 3.5.0, etc.), with no luck.
I checked the executor in the Spark UI and then ran jstack. The jstack output shows something blocked between Python and Java. Note that the same code works fine in spark-shell and SparkR.
"Idle Worker Monitor for python3" #64 daemon prio=5 os_prio=31 tid=0x0000000125051800 nid=0xf303 waiting for monitor entry [0x00000001769d2000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.spark.api.python.PythonWorkerFactory$MonitorThread.run(PythonWorkerFactory.scala:328)
- waiting to lock <0x00000007aea31be0> (a org.apache.spark.api.python.PythonWorkerFactory)
"Executor task launch worker for task 0.0 in stage 0.0 (TID 0)" #63 daemon prio=5 os_prio=31 tid=0x0000000123c98800 nid=0x5107 runnable [0x00000001767c5000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.net.SocketInputStream.read(SocketInputStream.java:224)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:127)
at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:143)
at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:142)
- locked <0x00000007aea31be0> (a org.apache.spark.api.python.PythonWorkerFactory)
at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:107)
at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:124)
- locked <0x00000007808be028> (a org.apache.spark.SparkEnv)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:174)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:67)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.executor.Executor$TaskRunner$$Lambda$1352/1057379762.apply(Unknown Source)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)