I am working on a Hive query that has to count over an enormous amount of data stored in a table like this one:
| column_name       |
| (...)             |
| {"data":"number"} |
| (...)             |
Performance matters for this query because of the volume of data I have to process.
Searching the web, I found three ways to do this. Does anyone know which is fastest, or can anyone suggest another method that is faster than these three?
- SELECT count(*) FROM table WHERE column LIKE TRIM('{"data":"number"}')
- SELECT count(*) FROM table WHERE get_json_object(column, '$.data') = 'number'
- SELECT count(*) FROM ( SELECT json_tuple(column, 'data') AS col1 FROM table ) AS new_column WHERE new_column.col1 = 'number'
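For anyone who wants to try the three queries, here is a minimal sketch of a test table with the layout described above. The names `json_events` and `payload` are hypothetical and only stand in for `table` and `column`; the two rows are made-up sample values, and the `INSERT ... VALUES` form assumes Hive 0.14 or later.

```sql
-- Hypothetical test table: one STRING column holding a JSON document per row.
CREATE TABLE json_events (payload STRING);

-- Made-up sample rows: one that should be counted, one that should not.
INSERT INTO TABLE json_events VALUES
  ('{"data":"number"}'),
  ('{"data":"other"}');

-- One of the three candidates, rewritten against the hypothetical names.
SELECT count(*)
FROM json_events
WHERE get_json_object(payload, '$.data') = 'number';
```

The other two queries can be run against the same table by substituting the names in the same way.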