Using Oozie to create a Hive table on HBase causes an error with libthrift?

I'm using an Oozie Hive action on Cloudera (CDH 4) to create an HBase-backed Hive table. Running the create table command on my local dev util box executes without error. When I execute the same command via an Oozie Hive action on the cluster, I get this error:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], main() threw exception, org.apache.thrift.EncodingUtils.setBit(BIZ)B
java.lang.NoSuchMethodError: org.apache.thrift.EncodingUtils.setBit(BIZ)B
at org.apache.hadoop.hive.ql.plan.api.Query.setStartedIsSet(Query.java:487)
at org.apache.hadoop.hive.ql.plan.api.Query.setStarted(Query.java:474)
at org.apache.hadoop.hive.ql.QueryPlan.updateCountersInQueryPlan(QueryPlan.java:309)
at org.apache.hadoop.hive.ql.QueryPlan.getQueryPlan(QueryPlan.java:450)
at org.apache.hadoop.hive.ql.QueryPlan.toString(QueryPlan.java:622)
at org.apache.hadoop.hive.ql.history.HiveHistory.logPlanProgress(HiveHistory.java:504)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1106)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:445)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:455)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:713)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:302)
at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:260)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:495)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:394)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)

Googling around, most answers say this is caused by mismatched versions of Thrift across Hive, HBase, and Hadoop. As far as I can tell (using find -name in a shell action), though, they are all at version 0.9.0, apart from one extra 0.5.0 jar under whirr:

Stdoutput ./lib/flume-ng/lib/libthrift-0.9.0.jar
Stdoutput ./lib/hcatalog/share/webhcat/svr/lib/libthrift-0.9.0.jar
Stdoutput ./lib/whirr/lib/libthrift-0.9.0.jar
Stdoutput ./lib/whirr/lib/libthrift-0.5.0.jar
Stdoutput ./lib/hive/lib/libthrift-0.9.0-cdh4-1.jar
Stdoutput ./lib/oozie/libserver/libthrift-0.9.0.jar
Stdoutput ./lib/oozie/libtools/libthrift-0.9.0.jar
Stdoutput ./lib/hbase/lib/libthrift-0.9.0.jar
Stdoutput ./lib/mahout/lib/libthrift-0.9.0.jar

These same versions are on my dev util box, and the hive command works fine. Any ideas what could be causing this issue?

Thanks in advance!

1 answer

Noah

The issue was with a jar included in the workflow's lib directory. That jar's transitive dependencies pulled in an older version of Thrift, which presumably shadowed the 0.9.0 classes that Hive was built against, hence the NoSuchMethodError.
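A search by file name only finds jars called libthrift, so it can miss a jar that carries Thrift classes under a different name. The following is a minimal sketch, not part of the original fix, that lists every jar in a directory containing a given class. The class name JarClassFinder and the default directory "lib" are hypothetical, and it assumes the workflow's lib directory has first been copied from HDFS to local disk (for example with hadoop fs -get).

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipFile;

// Hypothetical helper: finds jars that bundle a given class.
public class JarClassFinder {

    /** Returns the names of jars directly under dir that contain className. */
    public static List<String> jarsContaining(File dir, String className) throws IOException {
        // A class a.b.C is stored in a jar as the entry a/b/C.class
        String entry = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<String>();
        File[] files = dir.listFiles();
        if (files == null) {
            return hits; // not a directory, or not readable
        }
        for (File f : files) {
            if (!f.isFile() || !f.getName().endsWith(".jar")) {
                continue;
            }
            ZipFile zip = new ZipFile(f);
            try {
                if (zip.getEntry(entry) != null) {
                    hits.add(f.getName());
                }
            } finally {
                zip.close();
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // "lib" is an assumed local copy of the workflow's lib directory
        File dir = new File(args.length > 0 ? args[0] : "lib");
        for (String jar : jarsContaining(dir, "org.apache.thrift.EncodingUtils")) {
            System.out.println(jar);
        }
    }
}
```

Any jar it prints other than the expected libthrift jar is a candidate for the conflicting copy.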

I was able to work around this by moving the Hive action into a sub-workflow, then setting

<global>
  <configuration>
    <property>
      <name>oozie.use.system.libpath</name>
      <value>false</value>
    </property>
    <property>
      <name>oozie.libpath</name>
      <value>${wf:appPath()}/lib</value>
    </property>
  </configuration>
</global>

on the sub-workflow. This essentially tells Oozie to use the lib directory under the sub-workflow's own application path, not the main workflow's lib (which included the bad jar).
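To check which copy of Thrift a given classpath actually resolves, a small reflection probe can help. The descriptor setBit(BIZ)B in the error reads as setBit(byte, int, boolean) returning byte, so the probe looks for exactly that method. This is only a sketch: the class name ClasspathProbe is hypothetical, and running it inside the cluster (for instance from an Oozie java action that shares the same lib directory) is an assumption, not something the original setup did.

```java
import java.security.CodeSource;

// Hypothetical diagnostic: reports where a class comes from and
// whether it exposes a particular public method.
public class ClasspathProbe {

    /** Returns the jar or directory a class was loaded from, if known. */
    public static String locationOf(String className) throws ClassNotFoundException {
        Class<?> cls = Class.forName(className);
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // JDK core classes typically have no code source
        return src == null ? "(bootstrap or unknown source)" : String.valueOf(src.getLocation());
    }

    /** Returns true if the class has a public method with this name and parameter types. */
    public static boolean hasMethod(String className, String method, Class<?>... params)
            throws ClassNotFoundException {
        try {
            Class.forName(className).getMethod(method, params);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String cls = "org.apache.thrift.EncodingUtils";
        try {
            System.out.println(cls + " loaded from " + locationOf(cls));
            System.out.println("setBit(byte, int, boolean) present: "
                    + hasMethod(cls, "setBit", byte.class, int.class, boolean.class));
        } catch (ClassNotFoundException e) {
            System.out.println(cls + " is not on the classpath");
        }
    }
}
```

If the reported location is not the expected libthrift-0.9.0 jar, or the method is reported as absent, an older Thrift is still winning on that classpath.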