Hive external table with JSON SerDe fetching all NULL values

3.1k views Asked by At

My data is stored in HDFS at directory /tmp/kafka/alert in multiple files. Each file contain new-line separated JSON objects like following.

{"alertHistoryId":123456,"entityId":123,"deviceId":"123","alertTypeId":1,"AlertStartDate":"Dec 28, 2016 12:05:48 PM"}
{"alertHistoryId":123456,"entityId":125,"deviceId":"125","alertTypeId":5,"AlertStartDate":"Dec 28, 2016 11:58:48 AM"}

I added hive JSON SerDe jar using below

ADD JAR /usr/local/downloads/hive-serdes-1.0-SNAPSHOT.jar;

I created table with following

CREATE EXTERNAL TABLE IF NOT EXISTS my_alert (
alertHistoryId bigint, entityId bigint, deviceId string, alertTypeId int,  AlertStartDate string
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/tmp/kafka/alert';

table created successfully. But when I fetched data, I got all null values. Anyone got any idea how to resolve this?

2

There are 2 answers

1
Sandeep Singh On BEST ANSWER

You are using old version of JSON Serde. There might be an issue with your JSON Serde and Hadoop Distribution. Please find below link to get new version of Json Serde. Follow the steps from the link to build it according to your Hadoop distribution.

https://github.com/rcongiu/Hive-JSON-Serde

Please see below working example.

hive> add jar /User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;
Added [/User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar] to class path
Added resources: [/User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar]
hive> use default;
OK
Time taken: 0.021 seconds
hive> CREATE EXTERNAL TABLE IF NOT EXISTS json_poc (
    > alertHistoryId bigint, entityId bigint, deviceId string, alertTypeId int,  AlertStartDate string
    > )
    > ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
    > LOCATION '/User/User1/sandeep_poc/hive_json';
OK
Time taken: 0.077 seconds
hive> select * from json_poc;
OK
123456  123     123     1       Dec 28, 2016 12:05:48 PM
123456  125     125     5       Dec 28, 2016 11:58:48 AM
Time taken: 0.052 seconds, Fetched: 2 row(s)
hive>

How to build jar.

Maven should be installed on your PC then run command like this.

C:\Users\User1\Downloads\Hive-JSON-Serde-develop\Hive-JSON-Serde-develop>mvn -Phdp23 clean package

In my case I am using hdp2.3 so I have provided -Phdp23

Hope it will help if you are willing to use Hive JSON Serde.

2
Indrajit Swain On

Dont use Serde Adding Jar and converting those is always overhead.Rather than you can read the JSON using inbuilt get_json_object and json_tuple .if you are looking for an example how to use see this blog querying-json-records-via-hive

If you wanted to use JSON Serde only then have a look on this Hive-JSON-Serde. Before test it out first of all validate the JSON Validator.