spark-submit cluster mode does not work for Python Spark, but works for Scala Spark
0 votes
/ February 14, 2019

I have a cluster on which we set up Hadoop with Spark bundled in. The Spark version is v2.0.0, and deploying a Scala Spark job in cluster mode works as expected. Below is the command:

spark-submit --class org.apache.spark.examples.SparkPi --deploy-mode cluster --master yarn /usr/local/spark-bkp-24apr/examples/jars/spark-examples_2.11-2.0.0.jar

However, when I try a spark-submit with PySpark, it fails with the following error:

spark-submit --master yarn --deploy-mode cluster test.py
19/02/14 15:59:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/02/14 15:59:25 INFO client.RMProxy: Connecting to ResourceManager at xxx.xxx.xxx/10.250.36.240:8032
19/02/14 15:59:25 INFO yarn.Client: Requesting a new application from cluster with 19 NodeManagers
19/02/14 15:59:25 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
19/02/14 15:59:25 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
19/02/14 15:59:25 INFO yarn.Client: Setting up container launch context for our AM
19/02/14 15:59:25 INFO yarn.Client: Setting up the launch environment for our AM container
19/02/14 15:59:25 INFO yarn.Client: Preparing resources for our AM container
19/02/14 15:59:25 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
19/02/14 15:59:31 INFO yarn.Client: Uploading resource file:/tmp/spark-8b9d5766-0458-430d-a4a8-43f143c032fe/__spark_libs__8961544755566009185.zip -> hdfs://xxx.xxx.xxx:9000/user/ecomm/.sparkStaging/application_1539572182703_0803/__spark_libs__8961544755566009185.zip
19/02/14 15:59:34 INFO yarn.Client: Uploading resource file:/home/ecomm/test.py -> hdfs://xxx.xxx.xxx:9000/user/ecomm/.sparkStaging/application_1539572182703_0803/test.py
19/02/14 15:59:34 INFO yarn.Client: Uploading resource file:/usr/local/spark/python/lib/pyspark.zip -> hdfs://xxx.xxx.xxx:9000/user/ecomm/.sparkStaging/application_1539572182703_0803/pyspark.zip
19/02/14 15:59:34 INFO yarn.Client: Uploading resource file:/usr/local/spark/python/lib/py4j-0.10.1-src.zip -> hdfs://xxx.xxx.xxx:9000/user/ecomm/.sparkStaging/application_1539572182703_0803/py4j-0.10.1-src.zip
19/02/14 15:59:34 INFO yarn.Client: Uploading resource file:/tmp/spark-8b9d5766-0458-430d-a4a8-43f143c032fe/__spark_conf__1991403966415671421.zip -> hdfs://xxx.xxx.xxx:9000/user/ecomm/.sparkStaging/application_1539572182703_0803/__spark_conf__.zip
19/02/14 15:59:34 INFO spark.SecurityManager: Changing view acls to: ecomm
19/02/14 15:59:34 INFO spark.SecurityManager: Changing modify acls to: ecomm
19/02/14 15:59:34 INFO spark.SecurityManager: Changing view acls groups to:
19/02/14 15:59:34 INFO spark.SecurityManager: Changing modify acls groups to:
19/02/14 15:59:34 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(ecomm); groups with view permissions: Set(); users  with modify permissions: Set(ecomm); groups with modify permissions: Set()
19/02/14 15:59:34 INFO yarn.Client: Submitting application application_1539572182703_0803 to ResourceManager
19/02/14 15:59:34 INFO impl.YarnClientImpl: Submitted application application_1539572182703_0803
19/02/14 15:59:35 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:35 INFO yarn.Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1550131174756
         final status: UNDEFINED
         tracking URL: http://xxx.xxx.xxx:8088/proxy/application_1539572182703_0803/
         user: ecomm
19/02/14 15:59:36 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:37 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:38 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:39 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:40 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:41 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:42 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:43 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:44 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:45 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:46 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:47 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:48 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:49 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:50 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:51 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:52 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:53 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:54 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:55 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:56 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:57 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:58 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 15:59:59 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:00 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:01 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:02 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:03 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:04 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:05 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:06 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:07 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:08 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:09 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:10 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:11 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:12 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:13 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:14 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:15 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:16 INFO yarn.Client: Application report for application_1539572182703_0803 (state: ACCEPTED)
19/02/14 16:00:17 INFO yarn.Client: Application report for application_1539572182703_0803 (state: FAILED)
19/02/14 16:00:17 INFO yarn.Client:
         client token: N/A
         diagnostics: Application application_1539572182703_0803 failed 2 times due to AM Container for appattempt_1539572182703_0803_000002 exited with  exitCode: 127
For more detailed output, check application tracking page:http://xxx.xxx.xxx:8088/cluster/app/application_1539572182703_0803Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1539572182703_0803_02_000001
Exit code: 127
Stack trace: ExitCodeException exitCode=127:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
        at org.apache.hadoop.util.Shell.run(Shell.java:456)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)


Container exited with a non-zero exit code 127
Failing this attempt. Failing the application.
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1550131174756
         final status: FAILED
         tracking URL: http://xxx.xxx.xxx:8088/cluster/app/application_1539572182703_0803
         user: ecomm
Exception in thread "main" org.apache.spark.SparkException: Application application_1539572182703_0803 finished with failed status
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1132)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1175)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
19/02/14 16:00:17 INFO util.ShutdownHookManager: Shutdown hook called
19/02/14 16:00:17 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-8b9d5766-0458-430d-a4a8-43f143c032fe

I still can't figure out what I'm missing; any help is greatly appreciated!

Thanks in advance.

1 Answer

0 votes
/ February 17, 2019

If you landed on this page after setting up your own cluster, this may help: Stack trace: ExitCodeException exitCode=127 usually points to a problem with the launch script or environment. Check the script and remove any unnecessary arguments.
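
For reference, exit code 127 is the shell's "command not found" status, so the quickest way to see what the container actually tried to execute is to pull the aggregated YARN logs for the failed attempt (assuming log aggregation is enabled on your cluster):

yarn logs -applicationId application_1539572182703_0803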

In my case, I hit the following error in my job:

/opt/rh/rh-python35/root/usr/bin/python3.5: error while loading shared libraries: libpython3.5m.so.rh-python35-1.0: cannot open shared object file: No such file or directory

We had recently upgraded the cluster's Python version from 2.7.x to 3.5.x, and I noticed that jobs completed successfully in client mode but not in cluster mode.

The reason is that every node other than the edge/master node was still configured with Python 2.7.x for the same user. In client mode the driver runs on the edge node, where Python 3.5 was set up correctly; in cluster mode the ApplicationMaster and driver start on an arbitrary worker node, which is why only cluster mode failed.

Once the same setup was deployed to the rest of the nodes for the same user, the issue was resolved and every node reported Python 3.5.x.
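
A simple sanity check is to confirm that the configured interpreter actually starts on every worker node, not just on the edge node (a sketch; the host names are placeholders for your own nodes):

# Run the target interpreter on each worker node and verify it starts cleanly
for host in node01 node02 node03; do
    ssh "$host" '/opt/rh/rh-python35/root/usr/bin/python3.5 --version'
done

Any node that prints the shared-library error above instead of a version string still needs the Python 3.5 environment deployed for that user.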

In .bashrc:

export SPARK_HOME=/usr/local/spark
export SPARK_PID_DIR=/data/ecommerce/hadoop-2.7.2/pids/spark#$SPARK_HOME/temp
export PATH=$PATH:$SPARK_HOME/bin
export OOZIE_HOME=/data/ecommerce/oozie-server/oozie-4.2.0
export PATH=$PATH:$OOZIE_HOME/bin

export PYSPARK_PYTHON=/opt/rh/rh-python35/root/usr/bin/python3.5
#export PYSPARK_DRIVER_PYTHON=python3.5
export PATH="/opt/rh/rh-python35/root/usr/bin":$PATH
export LD_LIBRARY_PATH=/opt/rh/rh-python35/root/usr/lib64
#export PYTHONPATH=/usr/local/spark/python/lib/
export PYTHONPATH=/opt/rh/rh-python35/root/usr/bin/python3.5
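
If editing .bashrc on every node is impractical, the interpreter can also be pinned per submission through Spark's YARN environment settings (a sketch; adjust the path to your installation):

# Point both the ApplicationMaster and the executors at the Python 3.5 interpreter
spark-submit --master yarn --deploy-mode cluster \
  --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/opt/rh/rh-python35/root/usr/bin/python3.5 \
  --conf spark.executorEnv.PYSPARK_PYTHON=/opt/rh/rh-python35/root/usr/bin/python3.5 \
  test.py

This only helps if the interpreter and its shared libraries are actually present at that path on every worker node.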

Hope this helps! Cheers!

...