I'm currently running a Structured Streaming application that terminates after about 1.7 hours with final status SUCCEEDED and state FINISHED. Why does this happen? A streaming application is supposed to run indefinitely. I'm using Spark version 2.4.4.

Why does it reach the FINISHED state? Is there a timeout setting I'm missing? A long-running streaming application should NEVER FINISH...
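For context, the query is started in the usual way. This is only a minimal sketch with placeholder source, sink, and paths (the real values are redacted), but it shows the relevant behavior: if the streaming query fails, `awaitTermination()` rethrows, `main()` exits, and YARN can still report the application as FINISHED.

```scala
// Minimal sketch of the job structure (names, paths, and the Kafka
// source are placeholders, not the real job). If the query throws,
// awaitTermination() propagates the exception, the driver exits,
// and the YARN application transitions to FINISHED.
import org.apache.spark.sql.SparkSession

object StreamingJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("events-stream")
      .getOrCreate()

    val query = spark.readStream
      .format("kafka")                                   // assumed source
      .option("kafka.bootstrap.servers", "broker:9092")  // placeholder
      .option("subscribe", "events")                     // placeholder
      .load()
      .writeStream
      .format("parquet")
      .option("path", "s3a://bucket/data/events02/")     // placeholder
      .option("checkpointLocation", "s3a://bucket/chk/") // placeholder
      .start()

    // Blocks until the query stops; rethrows if the query failed.
    query.awaitTermination()
  }
}
```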
20/03/12 18:18:02 INFO Client: Application report for application_1583523441533_0217 **(state: FINISHED)**
20/03/12 18:18:02 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: ip-XXXX
ApplicationMaster RPC port: 40369
queue: default
start time: 1584032065252
**final status: SUCCEEDED**
tracking URL: http://XX-60.ec2.internal:20888/proxy/application_1583523441533_0217/
user: hadoop
20/03/12 18:18:02 INFO Client: Deleted staging directory hdfs://ip-1XXX60.ec2.internal:8020/user/hadoop/.sparkStaging/application_1583523441533_0217
20/03/12 18:18:02 INFO ShutdownHookManager: Shutdown hook called
The only error during this run is a FILE NOT FOUND for a file/directory that ALREADY EXISTS... It looks like this file-not-found issue was NEVER properly resolved: https://issues.apache.org/jira/browse/SPARK-18512 Can someone PLEASE HELP!!
20/03/12 18:17:54 ERROR Utils: Aborting task
java.io.FileNotFoundException: No such file or directory: xxs3a://xusfield/data/events02/date=20200312/time=181424/o6c560.c000.snappy.parquet
at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1642)
at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
at org.apache.spark.sql.execution.streaming.ManifestFileCommitProtocol$$anonfun$4.apply(ManifestFileCommitProtocol.scala:109)
at org.apache.spark.sql.execution.streaming.ManifestFileCommitProtocol$$anonfun$4.apply(ManifestFileCommitProtocol.scala:109)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at org.apache.spark.sql.execution.streaming.ManifestFileCommitProtocol.commitTask(ManifestFileCommitProtocol.scala:109)
at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.commit(FileFormatDataWriter.scala:78)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:247)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:242)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:248)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:170)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:169)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
20/03/12 18:17:54 ERROR FileFormatWriter: Job job_20200312181703_0073 aborted.
20/03/12 18:17:54 ERROR Executor: Exception in task 941.0 in stage 73.0 (TID 75707)
org.apache.spark.SparkException: Task failed while writing rows.
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:257)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:170)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:169)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
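For completeness, the job is submitted roughly like this (a sketch with placeholder class, jar, and cluster values, since the real ones are redacted). In yarn cluster mode, once the aborted write causes the driver's `main()` to return, YARN records the application as FINISHED / SUCCEEDED, which matches the Client report above.

```shell
# Sketch of the submit command (all values are placeholders).
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.StreamingJob \
  streaming-job.jar
```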