I'm using the Hortonworks Sandbox HDP 2.6.5 and PuTTY for the Linux shell. My OS is Windows 10.
I put a JSON file on HDFS and I want to open it with PySpark.
I run the Python file below on Linux by typing "spark-submit example.py" in the shell:
```python
from pyspark.sql import SparkSession

if __name__ == "__main__":
    spark = SparkSession.builder.appName('JSONRead').getOrCreate()
    jsonData = spark.read.json('hdfs://localhost/user/maria_dev/example.json')
    jsonData.printSchema()
    jsonData.createOrReplaceTempView('Users')
    userNames = spark.sql('SELECT _id, name, age, email, phone, gender, index FROM Users')
    spark.stop()
```
But I get this error message:
"Call From sandbox-hdp.hortonworks.com/172.18.0.2 to localhost:8020 failed on connection exception"
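If I read the error right, Spark inside the sandbox is resolving `localhost:8020`, where no NameNode is listening. A sketch of what I think I should try instead (assuming HDP's default NameNode host `sandbox-hdp.hortonworks.com` and port 8020, which I haven't verified) is to build the URI against the sandbox hostname explicitly:

```python
# Hypothetical helper: build an HDFS URI that targets the sandbox NameNode
# instead of localhost. The host and port are assumptions (HDP defaults).
def hdfs_uri(path, host="sandbox-hdp.hortonworks.com", port=8020):
    return "hdfs://{}:{}/{}".format(host, port, path.lstrip("/"))

print(hdfs_uri("/user/maria_dev/example.json"))
# hdfs://sandbox-hdp.hortonworks.com:8020/user/maria_dev/example.json
```

Then the read would become `spark.read.json(hdfs_uri("/user/maria_dev/example.json"))` — but I don't know if that host/port is actually correct for my setup.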
I searched for this problem on Stack Overflow, and people usually say the NameNode is running on a different port or is not running at all. But I don't know how to check the NameNode's status or how to restart it.
I typed "sudo service hadoop-hdfs-namenode restart", but PuTTY returned "Unit hadoop-hdfs-namenode.service could not be found."
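My guess (and it is only a guess) is that the HDP sandbox manages Hadoop through Ambari rather than through systemd service units, which would explain why that service is not found. These are the kinds of checks I was trying to run to see whether the NameNode is up:

```shell
# Check whether anything is listening on the NameNode RPC port (8020 is the HDP default)
netstat -tlnp | grep 8020

# Ask HDFS for a cluster report, run as the hdfs user
su - hdfs -c "hdfs dfsadmin -report"
```

I'm not sure these are the right commands for this sandbox, since the output depends on how the services are actually managed.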
What can I do? Can you help me please?