I am trying to read Avro files from HDFS. I have verified that they exist on the data nodes, and I can read them with hdfs dfs -cat.
However, when I try to read the data in Scala, I get this exception:
Exception in thread "main" java.io.IOException: Not a data file.
at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
at org.apache.avro.file.DataFileStream.<init>(DataFileStream.java:84)
at spark_test.TestSparkJob$.main(TestSparkJob.scala:55)
at spark_test.TestSparkJob.main(TestSparkJob.scala)
Caused by: java.io.EOFException
at org.apache.avro.io.BinaryDecoder$InputStreamByteSource.readRaw(BinaryDecoder.java:827)
at org.apache.avro.io.BinaryDecoder.doReadBytes(BinaryDecoder.java:349)
at org.apache.avro.io.BinaryDecoder.readFixed(BinaryDecoder.java:302)
at org.apache.avro.io.Decoder.readFixed(Decoder.java:150)
at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:100)
... 3 more
What could be the cause?
Here is the code I use to read the Avro files:
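import java.io.BufferedInputStream
import org.apache.avro.file.DataFileStream
import org.apache.avro.generic.{GenericDatumReader, GenericRecord}
import org.apache.hadoop.fs.Path
// fs is assumed to be an already-initialized org.apache.hadoop.fs.FileSystem
val fsInputStream = fs.open(new Path("/data/avro_static.avro"))
val datumReader = new GenericDatumReader[GenericRecord]()
val inStream = new BufferedInputStream(fsInputStream)
val fileReader = new DataFileStream(inStream, datumReader)
println("Schema " + fileReader.getSchema.toString())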
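The EOFException in the trace is raised inside DataFileStream.initialize, which appears to be where the reader consumes the file header. An Avro object container file starts with the four bytes 'O', 'b', 'j', 1, so one check that might narrow this down is to read those bytes directly from the stream before handing it to Avro. The following is only a minimal sketch: AvroMagicCheck and checkMagic are hypothetical names introduced for illustration, not part of Avro or Hadoop.

```scala
import java.io.{DataInputStream, EOFException, InputStream}

// Hypothetical helper, not part of Avro or Hadoop.
object AvroMagicCheck {
  // Avro object container files begin with the bytes 'O', 'b', 'j', 1.
  val AvroMagic: Array[Byte] =
    Array('O'.toByte, 'b'.toByte, 'j'.toByte, 1.toByte)

  // Reads the first four bytes of the stream and compares them to the magic.
  // Returns Right(()) on a match, otherwise Left with a short description.
  def checkMagic(in: InputStream): Either[String, Unit] = {
    val header = new Array[Byte](AvroMagic.length)
    try {
      // readFully throws EOFException if fewer than four bytes are available.
      new DataInputStream(in).readFully(header)
      if (java.util.Arrays.equals(header, AvroMagic)) Right(())
      else Left("header bytes do not match the Avro magic: " + header.mkString(","))
    } catch {
      case _: EOFException =>
        Left("stream ended before 4 bytes could be read")
    }
  }
}
```

With the code above, this could be called as AvroMagicCheck.checkMagic(fs.open(new Path("/data/avro_static.avro"))). If it reports that the stream ended early, the program is presumably not seeing the same bytes that hdfs dfs -cat prints, for example because fs resolves to a different file system or a different file.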
Output of the hdfs dfs -cat command:
Objavro.schema�{"type":"record","name":"TestData","namespace":"sample","fields":[{"name":"random_pk","type":["null",{"type":"bytes","logicalType":"decimal","precision":38,"scale":0}]},{"name":"random_string","type":["string","null"]},{"name":"code","type":["string","null"]},{"name":"random_bool","type":["boolean","null"]},{"name":"random_int","type":["int","null"]},{"name":"random_float","type":["double","null"]},{"name":"random_double","type":["double","null"]},{"name":"random_enum","type":["null",{"type":"enum","name":"enumType","symbols":["VAL_1","VAL_2","VAL_3"]}]},{"name":"random_date","type":["null",{"type":"int","logicalType":"date"}]},{"name":"random_decimal","type":["null",{"type":"bytes","logicalType":"decimal","precision":4,"scale":2}]},{"name":"update_database_time","type":["null",{"type":"long","logicalType":"timestamp-millis"}]},{"name":"update_database_time_tz","type":["null",{"type":"long","logicalType":"timestamp-millis"}]},{"name":"random_money","type":["null",{"type":"bytes","logicalType":"decimal","precision":19,"scale":4}]}]}avro.codec
g�9���E>����this word7,5,1,4,6@`f�D@= snappy���
g�9���E># ����Z���ײZ���that word2,5,4,8���؆@��Q���@���Л�Z��翲ZV��������