How to convert an Avro message deserialized with an older version of the schema into the newer, compiled schema?

Let's say I used avro-tools to generate code for v2 of "mySchema", and a message arrives on a queue that was written with v1 of "mySchema". If I understand correctly, the best way to handle this is to detect that the message was written with v1 of "mySchema", fetch the v1 schema from a schema registry, and deserialize the message using v1.

At this point, is it possible to convert the resulting objects into the corresponding fields and objects of the avro-tools generated code for v2?

So far, it looks like the only way to handle the message results in a "GenericRecord" which requires string keys to access the values of the fields. I'd much prefer to use the generated code for v2 and turn any typos into compile errors instead of runtime errors, know the data type of the value while coding, and avoid setting up a bunch of enums to contain the string keys.

There is 1 answer

Answered by Stephen Kittelson

Scott on the Avro users mailing list answered (https://lists.apache.org/thread.html/r2e77597fd20de1379fdd4287c02fc703a631cd2309f74f33d6a457b8%40%3Cuser.avro.apache.org%3E):

The code below uses SpecificDatumReader and returns an instance of type T.

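import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.specific.SpecificDatumReader;

final BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(...);
// The constructor is SpecificDatumReader(writer, reader): pass the schema the
// message was written with (v1) first, then the compiled schema (v2).
final DatumReader<T> datumReader = new SpecificDatumReader<T>(v1Schema, v2Schema);
try {
    // Decoding resolves the v1 data against v2, so the result is a v2 instance of T.
    return datumReader.read(null, decoder);
} catch (java.io.IOException ioe) {
    // Handle it
}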
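For intuition about what the reader does with two schemas, here is a toy model of Avro's record resolution rules in plain Java. It is not Avro's implementation and uses no Avro classes. The class name ResolutionSketch and the map-based representation are assumptions made only for illustration. The reader's fields drive the result: a value written under the same name is kept, a field the writer did not know about takes the reader's default, and a writer field the reader does not declare is dropped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ResolutionSketch {
    /**
     * Toy model of record resolution (hypothetical helper, not an Avro API).
     * readerDefaults maps every reader (v2) field name to its default value,
     * or to null when the field declares no default.
     */
    static Map<String, Object> resolve(Map<String, Object> written,
                                       Map<String, Object> readerDefaults) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerDefaults.entrySet()) {
            String name = field.getKey();
            if (written.containsKey(name)) {
                // Field exists in both versions: keep the written value.
                result.put(name, written.get(name));
            } else if (field.getValue() != null) {
                // Field is new in the reader schema: fall back to its default.
                result.put(name, field.getValue());
            } else {
                // New field without a default: resolution fails.
                throw new IllegalStateException("no value or default for field " + name);
            }
        }
        // Writer-only fields are never copied, so they are dropped.
        return result;
    }
}
```

In this model, a v1 record holding "id" and "legacyFlag", resolved against v2 fields "id" (no default) and "email" (default "unknown"), yields "id" with its written value and "email" set to "unknown". "legacyFlag" does not appear in the result.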

My mistake was omitting the T on new SpecificDatumReader<T>, so it was returning a GenericRecord instead of an instance of T.
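The snippet above takes the v1 schema as given. How a consumer learns which schema wrote a message depends on how the messages are framed, which the question does not say. As one possibility, the sketch below assumes Avro's single-object encoding: two marker bytes (C3 01), then the 8-byte little-endian CRC-64-AVRO fingerprint of the writer schema, then the binary payload. The class WriterSchemaLookup and the in-memory Map standing in for a schema registry are hypothetical. A registry product may use a different header, such as a numeric schema ID.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Map;

final class WriterSchemaLookup {
    // Assumed framing: marker bytes C3 01 plus an 8-byte fingerprint.
    static final int HEADER_LENGTH = 10;

    /** Returns the writer-schema fingerprint carried in the message header. */
    static long fingerprintOf(byte[] message) {
        if (message.length < HEADER_LENGTH
                || (message[0] & 0xFF) != 0xC3
                || (message[1] & 0xFF) != 0x01) {
            throw new IllegalArgumentException("not a single-object encoded message");
        }
        // The fingerprint is stored little-endian in bytes 2..9.
        return ByteBuffer.wrap(message, 2, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    /** Looks up the writer schema JSON in a hypothetical in-memory registry. */
    static String writerSchemaJson(byte[] message, Map<Long, String> registry) {
        long fingerprint = fingerprintOf(message);
        String json = registry.get(fingerprint);
        if (json == null) {
            throw new IllegalStateException("unknown writer schema fingerprint: " + fingerprint);
        }
        return json;
    }
}
```

The returned JSON can be parsed with new Schema.Parser().parse(json) and passed as the writer schema. The bytes after the 10-byte header are the payload to hand to the binary decoder.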