I'm using JayDeBeApi in PySpark (the Apache Spark Python API). Here's the beginning of my code (note that I'm running all of this through an interactive PySpark shell).
import jaydebeapi
import jpype
conn = jaydebeapi.connect('org.apache.phoenix.jdbc.PhoenixDriver',
['jdbc:phoenix:hostname', '', ''])
I am querying Apache Phoenix, which is an SQL "front-end" for Apache HBase.
Here's my Python code for the SQL query:
curs = conn.cursor()
curs.execute('select "username",count("username") from "random_data" GROUP BY "username"')
curs.fetchall()
The output I'm getting is like this for all the rows:
(u'Username', <jpype._jclass.java.lang.Long object at 0x25d1e10>)
How can I fix it so that it actually shows the value of that returned column (the count column)?
From the Apache Phoenix datatypes page, the datatype of the count column is BIGINT, which is mapped to java.lang.Long, but for some reason jpype is not displaying the result.
I installed JayDeBeApi 0.1.4 and JPype 0.5.4.2 from the downloaded sources by running python setup.py install.
The object returned by JPype is a Python version of Java's java.lang.Long class. To get the value out of it, use the value attribute.

JayDeBeApi contains a dict (_DEFAULT_CONVERTERS) that maps types it recognises to functions that convert the Java values to Python values. This dict can be found at the bottom of __init__.py in the JayDeBeApi source code. BIGINT is not included in this dict, so objects of that database type don't get mapped out of Java objects into Python values.

It's fairly easy to modify JayDeBeApi to add support for BIGINTs. Edit the __init__.py file that contains most of the JayDeBeApi code and add an entry for BIGINT to the _DEFAULT_CONVERTERS dict.
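
If you'd rather not patch the library, the value attribute can also be applied to the rows after fetching them. Here is a minimal sketch, assuming the wrapped objects expose value as described above. The names unwrap_java_values and FakeJavaLong are hypothetical and introduced only for this example; FakeJavaLong merely stands in for the JPype object so the snippet runs without a JVM.

```python
def unwrap_java_values(rows):
    """Return a copy of rows in which any cell that has a `value`
    attribute (e.g. a JPype-wrapped java.lang.Long) is replaced by it.
    Cells without that attribute, such as plain strings, are kept as-is."""
    return [tuple(getattr(cell, "value", cell) for cell in row)
            for row in rows]


class FakeJavaLong(object):
    """Hypothetical stand-in for a JPype-wrapped java.lang.Long.
    It only mimics the `value` attribute; it is not part of JPype."""

    def __init__(self, value):
        self.value = value


# Rows shaped like the output of curs.fetchall() in the question.
rows = [(u"alice", FakeJavaLong(3)), (u"bob", FakeJavaLong(7))]
clean_rows = unwrap_java_values(rows)
print(clean_rows)
```

With a real connection this would presumably be called as unwrap_java_values(curs.fetchall()). It leaves JayDeBeApi untouched, at the cost of having to remember the extra call for every query.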