I recently ran a Sqoop job, then used AvroTools to extract the schema from the resulting Avro files and compile it into a Java class. Whenever I try to use the Avro object in my mapper, I get the following ClassCastException:
java.lang.Exception: java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to com.xxx.xxx.patient_avro.PatientAvro
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to com.xxx.xxx.patient_avro.PatientAvro
    at com.xxx.xxx.knn_mapreduce.KNNMapper.map(KNNMapper.java:28)
    at com.xxx.xxx.knn_mapreduce.KNNMapper.map(KNNMapper.java:14)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
My Mapper:
import java.io.IOException;

import org.apache.avro.mapred.AvroKey;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Mapper;

import com.xxx.xxx.patient_avro.PatientAvro;

public class KNNMapper extends Mapper<AvroKey<PatientAvro>, NullWritable, LongWritable, PatientWritable> {

    PatientWritable patient;
    LongWritable providerKey;
    Integer patientKey;
    Integer lengthOfStay;
    Integer msDrgGroup;
    Integer age;

    @Override
    public void map(AvroKey<PatientAvro> patientAvro, NullWritable value, Context context) throws IOException, InterruptedException {
        // The ClassCastException is thrown on the first datum() access below.
        patientKey = patientAvro.datum().getPatientKey();
        lengthOfStay = patientAvro.datum().getLengthOfStay();
        msDrgGroup = patientAvro.datum().getMsDrg();
        age = patientAvro.datum().getAge();

        patient = new PatientWritable();
        patient.set(Long.valueOf(patientKey), Double.valueOf((double) lengthOfStay), Double.valueOf((double) msDrgGroup), Double.valueOf((double) age));

        // Emit the patient keyed by provider.
        providerKey = new LongWritable(patientAvro.datum().getProviderKey());
        context.write(providerKey, patient);
    }
}
Any help would be greatly appreciated.
Sqoop automatically creates the Java class and the .avsc file as part of the import. Are you generating them separately? If you just use the Java class that Sqoop generated, you should be fine.
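As background on why the exception shows up inside map() rather than earlier: the type argument in AvroKey<PatientAvro> is erased at runtime, so the framework can hand the mapper a key wrapping any object, and the cast to PatientAvro only happens at the first datum() call. The following is a minimal, library-free sketch of that mechanism. Key, SpecificPatient, GenericPatient and deliver are hypothetical stand-ins invented for illustration; they are not Avro or Hadoop classes.

```java
import java.util.HashMap;
import java.util.Map;

public class ErasureDemo {

    /** Hypothetical stand-in for AvroKey<T>: a thin wrapper around a datum. */
    static final class Key<T> {
        private final T datum;
        Key(T datum) { this.datum = datum; }
        T datum() { return datum; }
    }

    /** Hypothetical stand-in for a compiled record class such as PatientAvro. */
    static final class SpecificPatient {
        private final int patientKey;
        SpecificPatient(int patientKey) { this.patientKey = patientKey; }
        int getPatientKey() { return patientKey; }
    }

    /** Hypothetical stand-in for a generic record: fields are looked up by name. */
    static final class GenericPatient {
        private final Map<String, Object> fields = new HashMap<>();
        GenericPatient put(String name, Object value) {
            fields.put(name, value);
            return this;
        }
        Object get(String name) { return fields.get(name); }
    }

    /** Mimics a framework passing along whatever object the reader produced. */
    @SuppressWarnings("unchecked")
    static Key<SpecificPatient> deliver(Object datum) {
        // Unchecked cast: T is erased, so nothing is verified at this point.
        return (Key<SpecificPatient>) (Key<?>) new Key<Object>(datum);
    }

    /** Mimics the first line of map(): the compiler-inserted cast happens here. */
    static int readPatientKey(Key<SpecificPatient> key) {
        return key.datum().getPatientKey();
    }

    public static void main(String[] args) {
        // A datum of the expected class is read without problems.
        System.out.println(readPatientKey(deliver(new SpecificPatient(42))));

        // A generic datum passes through deliver() and fails only when used.
        try {
            readPatientKey(deliver(new GenericPatient().put("patient_key", 42)));
        } catch (ClassCastException e) {
            System.out.println("ClassCastException raised at the datum() call");
        }
    }
}
```

In the real job, which record class ends up inside the key is decided by the Avro input format from the schema and classes visible to the job, not by the mapper's type parameters. So beyond using the generated class, it may be worth checking how the driver configures the input schema (for example through AvroJob.setInputKeySchema) and whether the record class matching the schema's full name is on the classpath. The driver is not shown in the question, so this is only a guess at where to look.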