I am trying to connect to the Spark master from a client machine using a Python script.
I'm encountering an error while running my Spark application from the client side. The error message is:
/usr/local/lib/python3.9/site-packages/pyspark/bin/load-spark-env.sh: line 68: ps: command not found
Here is the Python script, run as python test.py:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
.appName("PySpark v1") \
.master("spark://test-spark-master-0.test-spark-headless.test-spark.svc.cluster.local:7077") \
.getOrCreate()
data = [("John", 25), ("Anna", 30), ("Mike", 35)]
df = spark.createDataFrame(data, ["Name", "Age"])
df.show()
filtered_df = df.filter(df["Age"] > 30)
filtered_df.show()
spark.stop()
I have already installed the procps package (apt-get install procps) on the client, but the error still appears.
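Since load-spark-env.sh shells out to ps, one way to narrow this down is to check whether the ps binary is actually visible on the PATH of the process that launches PySpark. This is a minimal diagnostic sketch, not part of the original script; it assumes you run it in the same environment (container or host) where test.py runs:

```python
import shutil
import subprocess

# Check whether `ps` is on the PATH seen by this Python process.
# If this prints None, the shell spawned by load-spark-env.sh
# will not find `ps` either.
ps_path = shutil.which("ps")
print("ps found at:", ps_path)

if ps_path:
    # Sanity check: run `ps -e` and show the header line.
    out = subprocess.run(["ps", "-e"], capture_output=True, text=True)
    print(out.stdout.splitlines()[0])
```

If ps_path is None even though procps is installed, the package was likely installed into a different image or layer than the one the client actually runs, or PATH is stripped in the environment that invokes the script.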