java.io.IOException: Not a data file when reading Avro from HDFS
May 3, 2019

I am trying to read Avro files from HDFS. I have verified that they exist on the data nodes, and I can read them with the hdfs dfs -cat command.

However, when I try to read the data in Scala, I get this exception:

Exception in thread "main" java.io.IOException: Not a data file.
    at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
    at org.apache.avro.file.DataFileStream.<init>(DataFileStream.java:84)
    at spark_test.TestSparkJob$.main(TestSparkJob.scala:55)
    at spark_test.TestSparkJob.main(TestSparkJob.scala)
Caused by: java.io.EOFException
    at org.apache.avro.io.BinaryDecoder$InputStreamByteSource.readRaw(BinaryDecoder.java:827)
    at org.apache.avro.io.BinaryDecoder.doReadBytes(BinaryDecoder.java:349)
    at org.apache.avro.io.BinaryDecoder.readFixed(BinaryDecoder.java:302)
    at org.apache.avro.io.Decoder.readFixed(Decoder.java:150)
    at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:100)
    ... 3 more

What could be the cause?

Here is the code I use to read the Avro files:

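import java.io.BufferedInputStream
import org.apache.avro.file.DataFileStream
import org.apache.avro.generic.{GenericDatumReader, GenericRecord}
import org.apache.hadoop.fs.Path

// `fs` is an org.apache.hadoop.fs.FileSystem created earlier (not shown in the question).
val fsInputStream = fs.open(new Path("/data/avro_static.avro"))
val datumReader = new GenericDatumReader[GenericRecord]()

val inStream = new BufferedInputStream(fsInputStream)
val fileReader = new DataFileStream(inStream, datumReader)

println("Schema " + fileReader.getSchema.toString())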
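
A note on the trace: an Avro object container file starts with the four magic bytes 'O', 'b', 'j', 1, and the frames in DataFileStream.initialize together with the EOFException cause suggest the stream ended before those four bytes could be read. In other words, the stream handed to DataFileStream may be empty or shorter than four bytes. The sketch below is one way to check that on any InputStream using only the JDK; the names AvroMagicCheck and startsWithAvroMagic are hypothetical and are not part of Avro.

```scala
import java.io.{DataInputStream, EOFException, InputStream}

object AvroMagicCheck {
  // The four bytes that open an Avro object container file: 'O', 'b', 'j', 1.
  val Magic: Array[Byte] = Array('O'.toByte, 'b'.toByte, 'j'.toByte, 1.toByte)

  // Returns true only if the stream yields at least four bytes and they equal Magic.
  // Note: this consumes the first four bytes of the stream.
  def startsWithAvroMagic(in: InputStream): Boolean = {
    val buf = new Array[Byte](Magic.length)
    try {
      new DataInputStream(in).readFully(buf) // throws EOFException on a short stream
      java.util.Arrays.equals(buf, Magic)
    } catch {
      case _: EOFException => false
    }
  }
}
```

Assuming the same fs object as in the question, calling AvroMagicCheck.startsWithAvroMagic(fs.open(new Path("/data/avro_static.avro"))) on a separate stream would show whether the program sees the same header as the command line does. Because the check consumes bytes, the stream used for it should not be reused for DataFileStream.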

Output of the hdfs dfs -cat command:

Objavro.schema�{"type":"record","name":"TestData","namespace":"sample","fields":[{"name":"random_pk","type":["null",{"type":"bytes","logicalType":"decimal","precision":38,"scale":0}]},{"name":"random_string","type":["string","null"]},{"name":"code","type":["string","null"]},{"name":"random_bool","type":["boolean","null"]},{"name":"random_int","type":["int","null"]},{"name":"random_float","type":["double","null"]},{"name":"random_double","type":["double","null"]},{"name":"random_enum","type":["null",{"type":"enum","name":"enumType","symbols":["VAL_1","VAL_2","VAL_3"]}]},{"name":"random_date","type":["null",{"type":"int","logicalType":"date"}]},{"name":"random_decimal","type":["null",{"type":"bytes","logicalType":"decimal","precision":4,"scale":2}]},{"name":"update_database_time","type":["null",{"type":"long","logicalType":"timestamp-millis"}]},{"name":"update_database_time_tz","type":["null",{"type":"long","logicalType":"timestamp-millis"}]},{"name":"random_money","type":["null",{"type":"bytes","logicalType":"decimal","precision":19,"scale":4}]}]}avro.codec
g�9���E>����this word7,5,1,4,6@`f�D@=                                   snappy���
g�9���E># ����޲Z���ײZ���that word2,5,4,8���؆@��Q���@���Л�޲Z��翲ZV��������
...
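
The output above begins with Obj followed by avro.schema, which is how an Avro container file header starts, so the file stored on HDFS looks like a valid data file. One thing that may be worth checking is whether the fs object in the program resolves to the same file system as the hdfs command, and how many bytes it reports for the path. The following is a minimal sketch, assuming Hadoop is on the classpath; InspectPath and describe are hypothetical names introduced here for illustration.

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

object InspectPath {
  // Describes what the given FileSystem sees at `path`:
  // the file system URI, whether the path exists, and its length in bytes.
  def describe(fs: FileSystem, path: Path): String = {
    if (!fs.exists(path)) {
      s"${fs.getUri}: $path does not exist"
    } else {
      val status = fs.getFileStatus(path)
      s"${fs.getUri}: $path has ${status.getLen} bytes"
    }
  }
}
```

Printing InspectPath.describe(fs, new Path("/data/avro_static.avro")) just before opening the file would show which file system is in use. If the URI starts with file: rather than hdfs:, the Hadoop configuration visible to the job is probably not the cluster's, and the program would be looking at a local path instead. A reported length of zero would be consistent with the EOFException in the trace.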