I have a problem connecting to Hadoop Hive from a Jupyter notebook.
`
from pyhive import hive
import kerberos
# Define the connection parameters
hive_host = 'XX.XX.XX.XXX'
hive_port = 10000 # Default Hive server port
hive_database = 'XXXX'
keytab_path = '/project_data/data_asset/XXX.XXXx.keytab'
principal = '[email protected]'
# Create a Kerberos transport
# Initialize the Kerberos context
kerberos.ccache = '/project_data/data_asset' # Specify the credential cache path
kerberos.authGSSClientInit(principal)
# Authenticate using the keytab
kerberos.authGSSClientStep()
# Create a connection to Hive
connection = hive.Connection(
host=hive_host,
port=hive_port,
username=principal,
database=hive_database,
auth='KERBEROS',
)
# Create a cursor to execute Hive queries
cursor = connection.cursor()
# Now you can use the cursor to execute Hive queries
cursor.execute('SELECT * FROM your_table')
result = cursor.fetchall()
# Close the cursor and connection when you're done
cursor.close()
connection.close()
The error I receive is below:
TypeError Traceback (most recent call last)
Cell In[21], line 17
14 kerberos.authGSSClientInit(principal)
16 # Authenticate using the keytab
---> 17 kerberos.authGSSClientStep()
19 # Create a connection to Hive
20 connection = hive.Connection(
21 host=hive_host,
22 port=hive_port,
(...)
25 auth='KERBEROS',
26 )
TypeError: function missing required argument 'state' (pos 1)
`
Any hints?
I expect to connect to the Hadoop database with a Kerberos keytab from Watson Studio using libraries, not a platform connection.
I have tried many ways of connecting with the ready-made platform connection code, but I need to run a lot of SQL queries from an old connection, which can only be done through a direct connection from the Jupyter notebook. Any idea how to do that?
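
One possible direction, as a minimal sketch rather than a confirmed fix: the TypeError appears to come from `kerberos.authGSSClientStep` being called without the context object that `kerberos.authGSSClientInit` returns (it also expects a challenge string). PyHive performs the GSSAPI handshake itself when `auth='KERBEROS'` is set, so the manual `kerberos` calls may not be needed if a ticket is already in the credential cache. The sketch below assumes that the `kinit` command is available in the notebook environment, that the HiveServer2 service principal starts with `hive`, and that PyHive's SASL dependencies are installed. The helper names `build_kinit_command`, `hive_connection_kwargs`, and `connect_with_keytab` are hypothetical and only for illustration.

```python
import subprocess


def build_kinit_command(keytab_path, principal):
    """Build the argv that asks kinit for a ticket using a keytab.

    Assumption: an MIT Kerberos `kinit` binary is on the PATH.
    """
    return ["kinit", "-k", "-t", keytab_path, principal]


def hive_connection_kwargs(host, port, database, service_name="hive"):
    """Collect keyword arguments for pyhive.hive.Connection.

    Assumption: the HiveServer2 service principal is hive/<host>@<REALM>,
    hence the default service_name of "hive".
    """
    return {
        "host": host,
        "port": port,
        "database": database,
        "auth": "KERBEROS",
        "kerberos_service_name": service_name,
    }


def connect_with_keytab(host, port, database, keytab_path, principal):
    """Obtain a ticket from the keytab, then open a Hive connection."""
    # Imported here so the helpers above work even without PyHive installed.
    from pyhive import hive

    # Raises CalledProcessError if kinit fails (bad keytab, wrong principal).
    subprocess.run(build_kinit_command(keytab_path, principal), check=True)
    return hive.Connection(**hive_connection_kwargs(host, port, database))


# Example call, with the placeholder values from the question:
# connection = connect_with_keytab(
#     hive_host, hive_port, hive_database, keytab_path, principal
# )
# cursor = connection.cursor()
# cursor.execute('SELECT * FROM your_table')
```

Keeping ticket acquisition in `kinit` and leaving the handshake to PyHive avoids driving the `kerberos` module by hand. Whether this works in Watson Studio depends on the environment having Kerberos client tools and a reachable KDC, which I have not been able to confirm.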