NiFi PutHiveStreaming processor with Hive: Failed connecting to EndPoint


Could someone help with this issue with NiFi 1.3.0 and Hive? I get the same error with Hive 1.2 and Hive 2.1.1. The Hive table is partitioned, bucketed, and stored in ORC format.

The partition is created on HDFS, but the data fails at the writing stage. Please check the logs below:

[5:07 AM] papesdiop: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='mydb', table='guys', partitionVals=[dev] }
[5:13 AM] papesdiop: I also see the following in the log; I hope it helps:
[5:13 AM] papesdiop: Caused by: org.apache.hive.hcatalog.streaming.TransactionError: Unable to acquire lock on {metaStoreUri='thrift://localhost:9083', database='mydb', table='guys', partitionVals=[dev] }
  at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:578)

FULL TRACE LOGS:

reconnect.
org.apache.thrift.transport.TTransportException: null
  at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
  at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
  at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
  at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
  at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_lock(ThriftHiveMetastore.java:3906)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.lock(ThriftHiveMetastore.java:3893)
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:1863)
  at sun.reflect.GeneratedMethodAccessor380.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:152)
  at com.sun.proxy.$Proxy126.lock(Unknown Source)
  at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:573)
  at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransaction(HiveEndPoint.java:547)
  at org.apache.nifi.util.hive.HiveWriter.nextTxnBatch(HiveWriter.java:261)
  at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:73)
  at org.apache.nifi.util.hive.HiveUtils.makeHiveWriter(HiveUtils.java:46)
  at org.apache.nifi.processors.hive.PutHiveStreaming.makeHiveWriter(PutHiveStreaming.java:964)
  at org.apache.nifi.processors.hive.PutHiveStreaming.getOrCreateWriter(PutHiveStreaming.java:875)
  at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$null$40(PutHiveStreaming.java:676)
  at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
  at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$44(PutHiveStreaming.java:673)
  at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2136)
  at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2106)
  at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:627)
  at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$36(PutHiveStreaming.java:551)
  at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
  at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
  at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:551)
  at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
  at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
  at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
  at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
2017-09-07 06:41:31,015 DEBUG [Timer-4] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Start sending heartbeat on all writers
2017-09-07 06:41:31,890 INFO [Timer-Driven Process Thread-3] hive.metastore Trying to connect to metastore with URI thrift://localhost:9083
2017-09-07 06:41:31,893 INFO [Timer-Driven Process Thread-3] hive.metastore Connected to metastore.
2017-09-07 06:41:31,911 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Failed to create HiveWriter for endpoint: {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }: org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
  at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:79)
  at org.apache.nifi.util.hive.HiveUtils.makeHiveWriter(HiveUtils.java:46)
  at org.apache.nifi.processors.hive.PutHiveStreaming.makeHiveWriter(PutHiveStreaming.java:964)
  at org.apache.nifi.processors.hive.PutHiveStreaming.getOrCreateWriter(PutHiveStreaming.java:875)
  at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$null$40(PutHiveStreaming.java:676)
  at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
  at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$44(PutHiveStreaming.java:673)
  at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2136)
  at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2106)
  at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:627)
  at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$36(PutHiveStreaming.java:551)
  at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
  at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
  at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:551)
  at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
  at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
  at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
  at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.nifi.util.hive.HiveWriter$TxnBatchFailure: Failed acquiring Transaction Batch from EndPoint: {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
  at org.apache.nifi.util.hive.HiveWriter.nextTxnBatch(HiveWriter.java:264)
  at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:73)
  ... 24 common frames omitted
Caused by: org.apache.hive.hcatalog.streaming.TransactionError: Unable to acquire lock on {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
  at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:578)
  at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransaction(HiveEndPoint.java:547)
  at org.apache.nifi.util.hive.HiveWriter.nextTxnBatch(HiveWriter.java:261)
  ... 25 common frames omitted
Caused by: org.apache.thrift.transport.TTransportException: null
  at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
  at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
  at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
  at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
  at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_lock(ThriftHiveMetastore.java:3906)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.lock(ThriftHiveMetastore.java:3893)
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:1863)
  at sun.reflect.GeneratedMethodAccessor380.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:152)
  at com.sun.proxy.$Proxy126.lock(Unknown Source)
  at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:573)
  ... 27 common frames omitted
2017-09-07 06:41:31,911 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Error connecting to Hive endpoint: table guys at thrift://localhost:9083
2017-09-07 06:41:31,911 DEBUG [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] has chosen to yield its resources; will not be scheduled to run again for 1000 milliseconds
2017-09-07 06:41:31,912 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Hive Streaming connect/write error, flow file will be penalized and routed to retry. org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=

The Hive table

CREATE TABLE mydb.guys(
  firstname string,
  lastname string)
PARTITIONED BY (
  job string)
CLUSTERED BY (
  firstname)
INTO 10 BUCKETS
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS ORC
LOCATION
  'hdfs://localhost:9000/user/papesdiop/guys'
TBLPROPERTIES (
  'transactional'='true')

Thanks in advance

1 answer

mattyb

If this is failing during the write to HDFS, perhaps your user does not have permissions to write to the target directory? If you have more information from the full stack trace please add it to your question, as it will help diagnose the problem. When I had this issue a while ago, it was because my NiFi user needed to be created on the target OS and added to the appropriate HDFS group(s) in order to get permission for PutHiveStreaming to write out to the ORC file(s) in HDFS.
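
To make the permission check concrete, here is a minimal sketch in Python of how the owner / group / other write bits on the table directory could be evaluated for the NiFi user. Everything in it is an assumption for illustration: the user name `nifi`, the helper names `parse_ls_line` and `can_write`, and the sample listing line, which is hypothetical and not taken from the question. It only looks at the basic mode string that `hdfs dfs -ls -d <path>` prints, and ignores ACLs, the HDFS superuser, and permissions on parent directories.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class HdfsEntry:
    mode: str   # e.g. "drwxr-xr-x"
    owner: str
    group: str
    path: str


def parse_ls_line(line: str) -> HdfsEntry:
    """Parse one line of `hdfs dfs -ls -d <path>` output."""
    # Columns: mode, replication, owner, group, size, date, time, path
    fields = line.split(None, 7)
    if len(fields) != 8:
        raise ValueError(f"unexpected listing line: {line!r}")
    mode, _repl, owner, group, _size, _date, _time, path = fields
    return HdfsEntry(mode=mode, owner=owner, group=group, path=path)


def can_write(entry: HdfsEntry, user: str, groups: Set[str]) -> bool:
    """Check only the write bit, using the owner / group / other rule."""
    if user == entry.owner:
        return entry.mode[2] == "w"   # owner write bit
    if entry.group in groups:
        return entry.mode[5] == "w"   # group write bit
    return entry.mode[8] == "w"       # other write bit


# Hypothetical listing line; the real owner, group, and mode may differ.
sample = ("drwxr-xr-x   - papesdiop supergroup          0 "
          "2017-09-07 06:40 /user/papesdiop/guys")
entry = parse_ls_line(sample)

# Assumed NiFi service user "nifi" that belongs only to its own group.
print(can_write(entry, "nifi", {"nifi"}))            # False
print(can_write(entry, "papesdiop", {"papesdiop"}))  # True
```

With a listing like the sample, a `nifi` user that is neither the owner nor in the directory's group cannot write, which would be consistent with the situation described above. Adding that user to the directory's group only helps if the group write bit is also set.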