Python code for connecting to Hadoop Hive with a Kerberos keytab through Watson Studio


I have a problem connecting to Hadoop Hive from a Jupyter notebook.

`from pyhive import hive
import kerberos

# Define the connection parameters
hive_host = 'XX.XX.XX.XXX'
hive_port = 10000  # Default Hive server port
hive_database = 'XXXX'
keytab_path = '/project_data/data_asset/XXX.XXXx.keytab'
principal = '[email protected]'

# Create a Kerberos transport
# Initialize the Kerberos context
kerberos.ccache = '/project_data/data_asset'  # Specify the credential cache path
kerberos.authGSSClientInit(principal)

# Authenticate using the keytab
kerberos.authGSSClientStep()

# Create a connection to Hive
connection = hive.Connection(
    host=hive_host,
    port=hive_port,
    username=principal,
    database=hive_database,
    auth='KERBEROS',
)

# Create a cursor to execute Hive queries
cursor = connection.cursor()

# Now you can use the cursor to execute Hive queries
cursor.execute('SELECT * FROM your_table')
result = cursor.fetchall()

# Close the cursor and connection when you're done
cursor.close()
connection.close()



The error I receive is below:

TypeError                                 Traceback (most recent call last)
Cell In[21], line 17
     14 kerberos.authGSSClientInit(principal)
     16 # Authenticate using the keytab
---> 17 kerberos.authGSSClientStep()
     19 # Create a connection to Hive
     20 connection = hive.Connection(
     21     host=hive_host,
     22     port=hive_port,
   (...)
     25     auth='KERBEROS',
     26 )

TypeError: function missing required argument 'state' (pos 1)

`

Does anyone have a hint?

I expect to connect to the Hadoop database with a Kerberos keytab from Watson Studio using libraries, not a platform connection.

I have tried many ways to connect using the ready-made platform connection code, but I need to run a lot of SQL queries written for the old connection, which can only be done with a direct connection from the Jupyter notebook. Any idea how to do that?
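
Some context on the error, with a possible direction. In the `kerberos` (pykerberos) module, `authGSSClientInit` takes a service principal of the form `type@fqdn` and returns a `(result, context)` pair, and `authGSSClientStep` requires that context plus a challenge string, which is why calling it with no arguments raises the `TypeError` about `state`. Assigning `kerberos.ccache` has no effect; the credential cache location is normally controlled by the `KRB5CCNAME` environment variable. PyHive with `auth='KERBEROS'` does its own GSSAPI handshake through SASL, so the manual `kerberos` calls may not be needed at all if a ticket already exists. The sketch below assumes that the `kinit` binary is available in the notebook environment, that PyHive is installed with SASL support, and that the HiveServer2 service principal starts with `hive`. The helper names `build_kinit_command` and `connect_to_hive` are hypothetical, and this has not been verified on Watson Studio.

```python
import os
import subprocess


def build_kinit_command(keytab_path, principal):
    """Return the argv that asks kinit for a ticket using a keytab."""
    return ["kinit", "-kt", keytab_path, principal]


def connect_to_hive(host, port, database, keytab_path, principal,
                    service_name="hive", ccache=None):
    """Obtain a Kerberos ticket from the keytab, then open a PyHive connection.

    service_name is an assumption: it must match the first component of the
    HiveServer2 service principal (often "hive").
    """
    if ccache is not None:
        # Point both kinit and the GSSAPI library at the same credential cache.
        os.environ["KRB5CCNAME"] = ccache

    # Raises CalledProcessError if kinit fails (bad keytab, wrong principal...).
    subprocess.run(build_kinit_command(keytab_path, principal), check=True)

    # Imported here so the helper above can be used without PyHive installed.
    from pyhive import hive

    return hive.Connection(
        host=host,
        port=port,
        database=database,
        auth="KERBEROS",
        kerberos_service_name=service_name,
    )
```

With the variables from the question, the call would look like `connection = connect_to_hive(hive_host, hive_port, hive_database, keytab_path, principal)`, after which `connection.cursor()` can be used as in the original code.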
