Livy session to submit pyspark from HDFS


I am trying to run a simple Spark SQL query that extracts data from a Hive table using Livy 0.8.0 with the code below. The firewall is already open for the Livy URL.

import requests
from requests_kerberos import HTTPKerberosAuth, DISABLED
from livy import LivySession

# Kerberos auth code
…
…
# Create a Spark session and run the query
spark_code = '''spark.sql("""select * from tables""").count()'''

with LivySession.create(url=livy_url, auth=auth, queue='root-queue', verify=cert_verify) as session:
    print(f'session id: {session.session_id}')
    result = session.run(spark_code)
    print(result)

The code above prints the session id and returns the record count. If I keep the same Spark SQL in an HDFS file instead, how can I run it, given that `session.run` expects the code itself rather than an HDFS file path? My use case is to trigger a Livy client session that runs PySpark files kept on HDFS, which contain complex model algorithms.
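One possible approach, sketched below: since an interactive session's `run` only accepts code text, a script that already lives on HDFS may be better suited to Livy's batch API (`POST /batches`), which takes a `file` field pointing at an HDFS path so the script is never read client-side. The HDFS path, job name, and `livy_url`/`auth`/`cert_verify` variables here are illustrative assumptions, not values from the question.

```python
import json

# Hypothetical HDFS path to the PySpark script (assumption: substitute your own)
script_path = "hdfs:///user/me/jobs/model_job.py"

# Livy batch payload: "file" is the application to execute, resolved on the cluster.
payload = {
    "file": script_path,
    "queue": "root-queue",
    "name": "hdfs-pyspark-job",
}

body = json.dumps(payload)
print(body)

# An actual submission would look roughly like (not executed here):
# requests.post(f"{livy_url}/batches", data=body,
#               headers={"Content-Type": "application/json"},
#               auth=auth, verify=cert_verify)
```

Batches run the script as a standalone Spark application, so this trades the interactive session (and its `session.run` results) for fire-and-forget submission; progress would then be polled via `GET /batches/{id}/state`.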
