Result-set inconsistency between hive and hive-llap

Question

Asked by Vinay K L on 30 July 2020 at 17:51 · 299 views

We are using Hive 3.1.x clusters on HDI 4.0: one is an LLAP cluster and the other is plain Hive.

We created a managed table on each cluster; the row count in both is 272409.

Before merge on both clusters

+---------------------+------------+---------------------+------------------------+------------------------+
| order_created_date  | col_count  | col_distinct_count  |        min_lmd         |        max_lmd         |
+---------------------+------------+---------------------+------------------------+------------------------+
| 20200615            | 272409     | 272409              | 2020-06-15 00:00:12.0  | 2020-07-26 23:42:17.0  |
+---------------------+------------+---------------------+------------------------+------------------------+
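For readers trying to reproduce the check, a query of roughly the following shape would yield a summary like the one above. This is only a sketch: the table name `orders`, the key column `order_id`, and the timestamp column `last_modified_date` are hypothetical, since the original post does not show the query or the schema.

```sql
-- Hypothetical names: orders, order_id, last_modified_date.
-- Only order_created_date and the output aliases come from the post.
SELECT order_created_date,
       COUNT(order_id)          AS col_count,
       COUNT(DISTINCT order_id) AS col_distinct_count,
       MIN(last_modified_date)  AS min_lmd,
       MAX(last_modified_date)  AS max_lmd
FROM orders
WHERE order_created_date = 20200615
GROUP BY order_created_date;
```

When the key is unique, `col_count` and `col_distinct_count` should match, which is why a gap between them after the merge stands out.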

Based on the delta, we perform a merge operation, which updates 17 rows.

After merging on the hive-llap cluster (before compaction)

+---------------------+------------+---------------------+------------------------+------------------------+
| order_created_date  | col_count  | col_distinct_count  |        min_lmd         |        max_lmd         |
+---------------------+------------+---------------------+------------------------+------------------------+
| 20200615            | 272409     | 272392              | 2020-06-15 00:00:12.0  | 2020-07-27 22:52:34.0  |
+---------------------+------------+---------------------+------------------------+------------------------+

After merging on the hive-llap cluster (after compaction)

+---------------------+------------+---------------------+------------------------+------------------------+
| order_created_date  | col_count  | col_distinct_count  |        min_lmd         |        max_lmd         |
+---------------------+------------+---------------------+------------------------+------------------------+
| 20200615            | 272409     | 272409              | 2020-06-15 00:00:12.0  | 2020-07-27 22:52:34.0  |
+---------------------+------------+---------------------+------------------------+------------------------+

After merging on just hive cluster (without compacting deltas)

+---------------------+------------+---------------------+------------------------+------------------------+
| order_created_date  | col_count  | col_distinct_count  |        min_lmd         |        max_lmd         |
+---------------------+------------+---------------------+------------------------+------------------------+
| 20200615            | 272409     | 272409              | 2020-06-15 00:00:12.0  | 2020-07-27 22:52:34.0  |
+---------------------+------------+---------------------+------------------------+------------------------+

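This is the inconsistency we observed: before compaction, col_distinct_count on hive-llap is 272392, which is 17 lower than the 272409 returned by the plain Hive cluster, even though col_count is the same.

However, after compacting the table on hive-llap, the inconsistency disappears and both clusters return the same result.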
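For reference, compaction of a Hive ACID table can be requested manually with statements like the following. This is a minimal sketch: the table name `orders` is hypothetical, and the post does not say whether a major or minor compaction was run or whether the table is partitioned.

```sql
-- Hypothetical table name; request a major compaction, which rewrites
-- the base and delta files into a new base.
ALTER TABLE orders COMPACT 'major';

-- If the table is partitioned, the partition must be named, e.g.:
-- ALTER TABLE orders PARTITION (order_created_date = 20200615) COMPACT 'major';

-- Compaction runs asynchronously; check its state before re-running the query.
SHOW COMPACTIONS;
```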

We thought it might be due to either caching or an LLAP issue, so we restarted the HiveServer2 process to clear the cache. The issue persists.

We also created a dummy table with the same schema on the plain Hive cluster and pointed its location at that of the LLAP table; it returns the expected result.

We even queried the table from Spark using the **Qubole spark-acid reader** (a direct reader for Hive managed tables), which also produces the expected result.

This is very strange. Can someone help?

There are 2 answers

**Anushan** · Answer 1 · 2020-08-04T16:12:18+00:00

Qubole does not support Hive LLAP yet. However, we at Qubole are evaluating whether to support it in the future.

**Durga** · Answer 2 · 2020-08-14T06:58:16+00:00

We also faced a similar issue on an HDInsight Hive LLAP cluster. Setting hive.llap.io.enabled to false resolved the issue.
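A quick way to try this is at the session level, for example from Beeline, before re-running the query. This is a sketch that assumes the property is not restricted from session-level changes on your cluster; otherwise it would have to be changed in the cluster's Hive configuration and the service restarted.

```sql
-- Disable the LLAP IO layer (and with it the LLAP data cache) for this session.
SET hive.llap.io.enabled=false;

-- Print the current value of the property to confirm the change.
SET hive.llap.io.enabled;
```

Disabling LLAP IO can slow down reads, so it is probably best treated as a diagnostic step or workaround rather than a permanent setting.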
