Hive with Hadoop high availability
I wanted to understand how Hive knows which of the Hadoop NameNodes is in the active state, and what happens when the active NameNode fails.
Asked by Chang
There are 2 answers
In an HDFS HA environment, the NameNode URL should be a logical name (e.g. hdfs://logicalnamenode). Hive has to be configured to work with HA: update the NameNode location recorded in the Hive metastore with the metatool command.
- List the current NN configuration
~# metatool -listFSRoot
hdfs://namenode:8020/user/hive/warehouse
- The following command will update the old NN configuration with the logical name
~# metatool -updateLocation hdfs://logicalnamenode hdfs://namenode:8020 -tablePropKey avro.schema.url
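To make the effect of the update easier to picture, here is a minimal Python sketch of the prefix rewrite. It assumes the update simply swaps the old NameNode root for the logical name and leaves the path untouched. It is an illustration, not metatool's implementation; the function name rewrite_location and the sample locations are hypothetical.

```python
def rewrite_location(location: str, old_root: str, new_root: str) -> str:
    """Swap old_root for new_root when location sits under old_root.

    Locations on other filesystems are returned unchanged.
    """
    old_root = old_root.rstrip("/")
    new_root = new_root.rstrip("/")
    if location == old_root or location.startswith(old_root + "/"):
        return new_root + location[len(old_root):]
    return location


# Hypothetical locations, standing in for what -listFSRoot might report.
locations = [
    "hdfs://namenode:8020/user/hive/warehouse",
    "hdfs://namenode:8020/user/hive/warehouse/sales.db/orders",
    "hdfs://othercluster:8020/data/raw",
]

updated = [
    rewrite_location(loc, "hdfs://namenode:8020", "hdfs://logicalnamenode")
    for loc in locations
]

for before, after in zip(locations, updated):
    print(before, "->", after)
```

Only the first two locations change; the third belongs to a different filesystem root and is left alone.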
Hive is configured via metatool to point to the configured dfs.nameservices for HA HDFS. See https://cwiki.apache.org/confluence/display/Hive/Hive+MetaTool. dfs.nameservices is a logical address, while the actual NameNodes are configured with dfs.ha.namenodes.[id].

As for which NameNode is active, that state is stored in ZooKeeper. When the active NameNode fails, failover is triggered once its ZooKeeper session times out (ha.zookeeper.session-timeout.ms, 5 seconds by default). A fencing method must also be configured (dfs.ha.fencing.methods); it makes sure the failed NameNode can no longer act as active before the standby NameNode takes over.
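The mapping from the logical name to the actual NameNodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hdfs-site.xml fragment is embedded as a string, a single nameservice is configured, and the NameNode ids (nn1, nn2) and hostnames are made up. The property names are the standard HDFS HA ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml fragment; ids and hostnames are placeholders.
HDFS_SITE = """<configuration>
  <property><name>dfs.nameservices</name><value>logicalnamenode</value></property>
  <property><name>dfs.ha.namenodes.logicalnamenode</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.logicalnamenode.nn1</name><value>namenode1.example.com:8020</value></property>
  <property><name>dfs.namenode.rpc-address.logicalnamenode.nn2</name><value>namenode2.example.com:8020</value></property>
</configuration>"""


def load_properties(xml_text: str) -> dict:
    """Read <property> name/value pairs from a Hadoop-style config document."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def resolve_namenodes(props: dict, nameservice: str) -> dict:
    """Map each NameNode id of the nameservice to its RPC address."""
    ids = [nn.strip() for nn in props[f"dfs.ha.namenodes.{nameservice}"].split(",")]
    return {nn: props[f"dfs.namenode.rpc-address.{nameservice}.{nn}"] for nn in ids}


props = load_properties(HDFS_SITE)
# Assumes exactly one nameservice; dfs.nameservices may hold a comma-separated list.
namenodes = resolve_namenodes(props, props["dfs.nameservices"])
print(namenodes)
```

The logical name alone says nothing about which of the two hosts is active; it only tells a client which candidates exist.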