The Avro file is about 44 MB.
Below is the error from the YARN logs:
20/03/30 06:55:04 INFO spark.ExecutorAllocationManager: Existing executor 18 has been removed (new total is 0)
20/03/30 06:55:04 INFO cluster.YarnClusterScheduler: Cancelling stage 5
20/03/30 06:55:04 INFO scheduler.DAGScheduler: ResultStage 5 (head at IrdsFIInstrumentEnricher.scala:15) failed in 213.391 s due to Job aborted due to stage failure: Task 0 in stage 5.0 failed 4 times, most recent failure: Lost task 0.3 in stage 5.0 (TID 134, fratlhadooappd30.de.db.com, executor 18): ExecutorLostFailure (executor 18 exited caused by one of the running tasks) Reason: Container marked as failed: container_1585337469684_0037_02_000029 on host: fratlhadooappd30.de.db.com. Exit status: 143. Diagnostics: Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Killed by external signal
Driver stacktrace:
20/03/30 06:55:04 INFO scheduler.DAGScheduler: Job 3 failed: head at IrdsFIInstrumentEnricher.scala:15, took 213.427308 s
20/03/30 06:55:04 ERROR CCOIrdsEnrichmentService: Unexpected error
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)
20/03/30 06:48:19 INFO storage.DiskBlockManager: Shutdown hook called
20/03/30 06:48:19 INFO util.ShutdownHookManager: Shutdown hook called
Log Upload Time:Mon Mar 30 06:55:10 +0200 2020
Log Contents:
java.lang.OutOfMemoryError: Java heap space
-XX:OnOutOfMemoryError="kill %p"
Executing /bin/sh -c "kill 62191"...
Below is the code I am using:
fiDF = spark.read
val tempDF = fiDF.select("payload.identifier.id")
tempDF.show(10) // ******* Error at this line ******