MapReduce CombineFileInputFormat java.lang.reflect.InvocationTargetException while two jobs access the same data

Question

Asked by Harshit Mathur on 25 November 2014 at 05:24 · 1k views

Hadoop MapReduce's CombineFileInputFormat works well for reading a large number of small files. However, I have noticed that the job sometimes fails with the following exception:

java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.nextKeyValue(CombineFileRecordReader.java:67)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:483)
    at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.

I have noticed that this happens only when another MapReduce job is running on the same data at the same time; otherwise it works as expected.

The same exception occurs when I run a Hive query under similar conditions.

I have been searching for a solution or the probable cause.
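
The trace above is cut off before any `Caused by:` section, so it shows only the wrappers. `InvocationTargetException` is what Java reflection throws when a reflectively invoked constructor or method itself fails, and the real error sits in its cause. The following is a minimal plain-Java sketch (no Hadoop dependency) of how such a wrapper chain arises and how to walk it. `FailingReader`, `newReader`, `rootCause`, and the path are hypothetical names made up for illustration; they only imitate a record reader being constructed reflectively and are not Hadoop's code.

```java
import java.io.FileNotFoundException;
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;

class CauseChainDemo {

    // Hypothetical stand-in for a per-file reader whose constructor fails
    // because its input file has disappeared.
    static class FailingReader {
        public FailingReader(String path) throws FileNotFoundException {
            throw new FileNotFoundException("File does not exist: " + path);
        }
    }

    // Builds the reader through reflection. Any exception thrown by the
    // constructor arrives wrapped in InvocationTargetException, which is
    // wrapped again in a RuntimeException here.
    static Object newReader(String path) {
        try {
            Constructor<FailingReader> ctor =
                    FailingReader.class.getConstructor(String.class);
            return ctor.newInstance(path);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    // Follows getCause() links until the innermost throwable is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        try {
            newReader("/data/in/part-0001"); // hypothetical path
        } catch (RuntimeException e) {
            boolean reflective = e.getCause() instanceof InvocationTargetException;
            System.out.println("wrapped by reflection: " + reflective);
            System.out.println("root cause: " + rootCause(e));
        }
    }
}
```

In a real job, the equivalent step is to look further down the full task log for the innermost `Caused by:` entry rather than stopping at the `InvocationTargetException` line.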

1 answer

**Harshit Mathur** · Answer 1 · 2014-12-11T14:45:24+00:00

I finally found the cause of this issue. I was using CombineFileInputFormat with gzip files, so the first running job extracted each gzip file into the same folder and deleted the extracted copy when it completed. When I ran another job in parallel, it picked up the file unzipped by the first job as part of its own input.

The first job then deleted the unzipped file while the second job was still running, and that caused the error.

The same thing happens with Hive.
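
One possible way to avoid this kind of collision, offered as a sketch rather than as what was actually done here, is to decompress into a directory private to each job instead of into the shared input folder, so that no other job ever lists the temporary file. The example below uses plain `java.nio` on the local filesystem as a stand-in for HDFS. `gunzipTo`, the directory names, and the file name are assumptions invented for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class PrivateScratchDemo {

    // Decompresses gzFile into scratchDir, never into gzFile's own directory,
    // so the shared input folder keeps containing only the original .gz file.
    static Path gunzipTo(Path gzFile, Path scratchDir) throws IOException {
        String name = gzFile.getFileName().toString().replaceFirst("\\.gz$", "");
        Path out = scratchDir.resolve(name);
        try (InputStream in = new GZIPInputStream(Files.newInputStream(gzFile))) {
            Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical shared input folder holding one gzip file.
        Path shared = Files.createTempDirectory("shared-input");
        Path gz = shared.resolve("events.log.gz");
        try (OutputStream os = new GZIPOutputStream(Files.newOutputStream(gz))) {
            os.write("hello\n".getBytes(StandardCharsets.UTF_8));
        }

        // Each job gets its own scratch directory for extracted data.
        Path scratch = Files.createTempDirectory("job-scratch");
        Path plain = gunzipTo(gz, scratch);

        System.out.println("shared folder: " + Arrays.toString(shared.toFile().list()));
        System.out.println("extracted to:  " + plain);

        // Cleaning up the scratch copy cannot affect any other job's input.
        Files.delete(plain);
        Files.delete(scratch);
    }
}
```

With this layout, deleting the extracted copy at the end of one job leaves the input listing of any parallel job unchanged, because that listing only ever contained the `.gz` file.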
