Spark application fails to run successfully on EMR with YARN
0 votes
/ September 10, 2018

My Spark application runs fine in client mode with master local[*] on EMR, and also in yarn mode locally.

The spark-submit command:

spark-submit --deploy-mode cluster --master yarn \
    --num-executors 3 --executor-cores 1 --executor-memory 2G \
    --conf spark.driver.memory=4G --class my.APP \
    --packages org.apache.spark:spark-core_2.11:2.3.1,org.apache.spark:spark-sql_2.11:2.3.1,org.elasticsearch:elasticsearch-spark-20_2.11:6.2.3,org.apache.spark:spark-mllib_2.11:2.3.1,org.postgresql:postgresql:42.2.4,mysql:mysql-connector-java:8.0.12,org.json4s:json4s-jackson_2.11:3.6.1,org.scalaj:scalaj-http_2.11:2.4.0,org.apache.commons:commons-math3:3.6.1 s3://spark-akshdiu/spark-sandbox_2.11-0.1.jar
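As a side note, the `--packages` list above pulls in `org.apache.spark:spark-core`, `spark-sql`, and `spark-mllib`, which are already provided on an EMR cluster; shipping them again can lead to classpath conflicts. A trimmed command (hypothetical sketch, same class and jar path as above, keeping only the third-party dependencies) might look like:

```shell
spark-submit --deploy-mode cluster --master yarn \
    --num-executors 3 --executor-cores 1 --executor-memory 2G \
    --conf spark.driver.memory=4G --class my.APP \
    --packages org.elasticsearch:elasticsearch-spark-20_2.11:6.2.3,org.postgresql:postgresql:42.2.4,mysql:mysql-connector-java:8.0.12,org.json4s:json4s-jackson_2.11:3.6.1,org.scalaj:scalaj-http_2.11:2.4.0,org.apache.commons:commons-math3:3.6.1 \
    s3://spark-akshdiu/spark-sandbox_2.11-0.1.jar
```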

The line where it fails: val sc = new SparkContext(conf)

I also tried SparkContext.getOrCreate(conf), but that failed too.
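The `IllegalStateException: Promise already completed` raised in `ApplicationMaster.sparkContextInitialized` is thrown when a SparkContext registers with the YARN ApplicationMaster a second time, which suggests the context is being constructed more than once somewhere in `main` (for example, a `new SparkContext` followed by another creation attempt after a failure). A minimal sketch, assuming a hypothetical `APP` object, that creates the context exactly once:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object APP {
  def main(args: Array[String]): Unit = {
    // Do not hard-code a master here; spark-submit's --master yarn supplies it.
    val conf = new SparkConf().setAppName("my.APP")

    // Create the context exactly once. Mixing `new SparkContext(conf)` with a
    // later `SparkContext.getOrCreate(conf)` (or retrying after a failed
    // constructor) can re-trigger sparkContextInitialized in the YARN
    // ApplicationMaster and complete its internal Promise a second time.
    val sc = SparkContext.getOrCreate(conf)

    try {
      // ... job logic ...
    } finally {
      sc.stop()
    }
  }
}
```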

And here is the exception:

18/09/10 09:15:06 INFO Client:
     client token: N/A
     diagnostics: User class threw exception: java.lang.IllegalStateException: Promise already completed.
    at scala.concurrent.Promise$class.complete(Promise.scala:55)
    at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:153)
    at scala.concurrent.Promise$class.success(Promise.scala:86)
    at scala.concurrent.impl.Promise$DefaultPromise.success(Promise.scala:153)
    at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$sparkContextInitialized(ApplicationMaster.scala:423)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.sparkContextInitialized(ApplicationMaster.scala:843)
    at org.apache.spark.scheduler.cluster.YarnClusterScheduler.postStartHook(YarnClusterScheduler.scala:32)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:559)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2493)
    at my.APP$.main(APP.scala:279)
    at my.APP.main(APP.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:721)

     ApplicationMaster host: 10.0.104.106
     ApplicationMaster RPC port: 0
     queue: default
     start time: 1536570864212
     final status: FAILED
     tracking URL: http://ip-10-0-104-106.us-west-2.compute.internal:20888/proxy/application_1536569833967_0006/
     user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1536569833967_0006 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1165)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1520)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
18/09/10 09:15:06 INFO ShutdownHookManager: Shutdown hook called
18/09/10 09:15:06 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-896ebe22-bf2d-41ba-b89c-8c3ba9e7cbd0
18/09/10 09:15:06 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-c99239b0-880f-49ff-9fb0-b848422ff4fe

I ran it on a single master node, and also with one master plus two slave nodes on m5.xlarge instances, but it failed every time.

...