I have a Hive table whose underlying files are in Avro format, with a schema (xyz.avsc) attached to it. Both the data files and the schema are in HDFS. I would like to read the Avro data directly in Spark, the same way we read an HDFS text file (sc.textFile('hdfs://data/filename')), so that I can generate a few statistics and run a few Spark SQL queries on it.
Can you please guide me on how to read the Avro files directly from HDFS?
Limitation: I have only the Avro library installed (not fastavro or the Databricks Avro package).
PS: I do not want to read the data through Hive, as that would be a performance bottleneck.