Why is MapR giving me a null pointer when reading files?

I get the following exception when reading files from a MapR directory:

java.lang.NullPointerException
at com.mapr.fs.MapRFsInStream.read(MapRFsInStream.java:150)
at java.io.DataInputStream.read(DataInputStream.java:83)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:205)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:169)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:203)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:43)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:184)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:167)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:37)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:90)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:90)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:37)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:240)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.scheduler.ShuffleMapTask.run(ShuffleMapTask.scala:149)
at org.apache.spark.scheduler.ShuffleMapTask.run(ShuffleMapTask.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)

When I run it in a local Spark interpreter, I do not get an exception. My guess is that the file type is causing the exception. Any idea what is causing this NPE?

There is 1 answer

Nabeel Moidu (best answer)

Can you give some more context about what you are attempting to run here, such as the versions of the components involved?

The NPE above normally happens if the FileSystem object is closed before a MapReduce job has finished reading its input data and writing its output. Spark might be doing something similar here.
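To illustrate the kind of failure described above, here is a minimal toy model that uses only the JDK. It is not MapR or Hadoop code: `ToyFileSystem`, `ToyStream`, and both helper methods are hypothetical names invented for this sketch, and how `MapRFsInStream` actually fails internally is not something the stack trace tells us. The sketch only shows the general pattern, where a stream reads through a parent handle and that handle is closed while the stream is still in use.

```java
import java.nio.charset.StandardCharsets;

// Toy model only: hypothetical stand-ins, not MapR or Hadoop classes.
class SharedHandleDemo {

    // Stands in for a file system client that readers depend on.
    static final class ToyFileSystem {
        private byte[] data; // released (set to null) on close

        ToyFileSystem(String contents) {
            this.data = contents.getBytes(StandardCharsets.UTF_8);
        }

        ToyStream open() {
            return new ToyStream(this);
        }

        void close() {
            data = null;
        }
    }

    // Stands in for an input stream that reads through its parent client.
    static final class ToyStream {
        private final ToyFileSystem fs;
        private int pos = 0;

        ToyStream(ToyFileSystem fs) {
            this.fs = fs;
        }

        // Returns the next byte as an int, or -1 at end of data.
        int read() {
            byte[] d = fs.data; // null once the client has been closed
            return pos < d.length ? (d[pos++] & 0xff) : -1; // NPE if d is null
        }
    }

    // Closes the client while the stream is still being read.
    // Returns true if the read after close() threw a NullPointerException.
    static boolean readAfterEarlyClose() {
        ToyFileSystem fs = new ToyFileSystem("abc");
        ToyStream in = fs.open();
        in.read();  // first read succeeds
        fs.close(); // client closed too early
        try {
            in.read();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Finishes reading first and only then closes the client.
    static String readThenClose() {
        ToyFileSystem fs = new ToyFileSystem("abc");
        ToyStream in = fs.open();
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            sb.append((char) b);
        }
        fs.close();
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("NPE after early close: " + readAfterEarlyClose());
        System.out.println("Read before close: " + readThenClose());
    }
}
```

In real Hadoop code, `FileSystem.get(conf)` returns a cached instance by default, so a `close()` call in one place can affect other code that holds the same instance. Whether that is what happens in the job above cannot be determined from the trace alone, so it is worth checking whether any user code or library in the job closes a FileSystem it obtained this way.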