I am trying to load data from Redshift into HDFS in Parquet format, using Sqoop (--as-parquetfile).
Has anyone else encountered this same error (see below)? If so, how did you go about fixing the problem?
Error: org.kitesdk.data.DatasetIOException: Cannot decode Avro value
at org.kitesdk.data.spi.SchemaUtil.fromString(SchemaUtil.java:419)
at org.kitesdk.data.spi.predicates.In.fromString(In.java:47)
at org.kitesdk.data.spi.predicates.Predicates.fromString(Predicates.java:85)
at org.kitesdk.data.spi.Constraints.fromQueryMap(Constraints.java:468)
at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.loadOrCreateTaskAttemptView(DatasetKeyOutputFormat.java:577)
at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.getRecordWriter(DatasetKeyOutputFormat.java:426)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:644)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.EOFException
at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:153)
at org.apache.avro.io.BinaryDecoder.readIndex(BinaryDecoder.java:423)
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
at org.kitesdk.data.spi.SchemaUtil.fromString(SchemaUtil.java:417)
... 13 more
Thanks for any suggestions you may have.