Unable to run a Spark Decision Tree in a RapidMiner process
07 November 2018

I am working on Windows 8.1 with Hadoop 2.6, Spark 1.6, Hive, and RapidMiner 9.0. My process consists of these operators: Retrieve (data from Hive), Set Role, and the Spark Decision Tree. Radoop is configured and I can access the Hive data. When I run the process, the Spark Decision Tree operator runs for a long time without producing a result. The YARN web UI shows that the Spark job failed, and RapidMiner reports this error:

Please verify your Spark Resource Allocation settings on the Advanced Connection Properties window. You can check the logs of the Spark job on the ResourceManager web interface  at http://${yarn.resourcemanager.hostname}:8088.
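
The same ResourceManager that serves the web interface on port 8088 also exposes a REST endpoint for a single application, which returns the state and diagnostics text without clicking through the UI. The following is a minimal sketch, assuming the ResourceManager is reachable from the machine running the script and that the response follows the documented `{"app": {...}}` JSON shape. The function names are hypothetical; the host `asus` and the application ID are the ones that appear in the logs of this question.

```python
import json
from urllib.request import Request, urlopen


def app_status_url(rm_host, app_id, port=8088):
    """Build the ResourceManager REST URL for one YARN application."""
    return f"http://{rm_host}:{port}/ws/v1/cluster/apps/{app_id}"


def summarize_app(payload):
    """Keep only the fields that matter when a job has failed."""
    app = payload["app"]
    return {
        "state": app.get("state"),
        "finalStatus": app.get("finalStatus"),
        "diagnostics": (app.get("diagnostics") or "").strip(),
    }


def fetch_app_status(rm_host, app_id, port=8088, timeout=10):
    """Query the ResourceManager and return the summarized status.

    Assumption: the ResourceManager is reachable and unsecured
    (no Kerberos/SPNEGO), as in a local single-node setup.
    """
    request = Request(
        app_status_url(rm_host, app_id, port),
        headers={"Accept": "application/json"},  # ask explicitly for JSON
    )
    with urlopen(request, timeout=timeout) as response:
        return summarize_app(json.load(response))
```

For example, `fetch_app_status("asus", "application_1541582549574_0006")` would return a small dictionary with the application state and the diagnostics string, provided the ResourceManager is running.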

The YARN ResourceManager logs contain these messages:

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
18/11/07 10:43:50 INFO scheduler.AppSchedulingInfo: Application application_1541582549574_0006 requests cleared
18/11/07 10:43:50 INFO rmapp.RMAppImpl: application_1541582549574_0006 State change from FINAL_SAVING to FAILED
18/11/07 10:43:50 INFO capacity.LeafQueue: Application removed - appId: application_1541582549574_0006 user: user queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
18/11/07 10:43:50 WARN resourcemanager.RMAuditLogger: USER=user OPERATION=Application Finished - Failed TARGET=RMAppManager     RESULT=FAILURE  DESCRIPTION=App failed with state: FAILED       PERMISSIONS=Application application_1541582549574_0006 failed 1 times due to AM Container for appattempt_1541582549574_0006_000001 exited with  exitCode: 1
For more detailed output, check application tracking page:http://asus:8088/proxy/application_1541582549574_0006/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1541582549574_0006_01_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Shell output:         1 fichier(s) déplacé(s).


Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.  APPID=application_1541582549574_0006
18/11/07 10:43:50 INFO capacity.ParentQueue: Application removed - appId: application_1541582549574_0006 user: user leaf-queue of parent: root #applications: 0
18/11/07 10:43:50 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1541582549574_0006,name=Decision Tree,user=user,queue=default,state=FAILED,trackingUrl=http://asus:8088/cluster/app/application_1541582549574_0006,appMasterHost=N/A,startTime=1541583816334,finishTime=1541583830966,finalStatus=FAILED
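
The last `ApplicationSummary` entry packs the key facts into `key=value` pairs, including `startTime` and `finishTime` in epoch milliseconds. As a rough sketch (the helper names are hypothetical, and it assumes the entry has been joined back into a single line and that no value contains a comma), the pairs can be split into a dictionary to see how long the application lived before it failed:

```python
def parse_app_summary(line):
    """Split a ResourceManager ApplicationSummary log line into a dict.

    Assumption: the console line wrapping has been removed and no value
    contains a comma, which holds for the line in this question.
    """
    _, _, fields = line.partition("ApplicationSummary: ")
    summary = {}
    for pair in fields.strip().split(","):
        key, separator, value = pair.partition("=")
        if separator:  # skip fragments that are not key=value
            summary[key] = value
    return summary


def runtime_seconds(summary):
    """Wall-clock time between startTime and finishTime, in seconds."""
    return (int(summary["finishTime"]) - int(summary["startTime"])) / 1000.0


line = (
    "18/11/07 10:43:50 INFO resourcemanager.RMAppManager$ApplicationSummary: "
    "appId=application_1541582549574_0006,name=Decision Tree,user=user,"
    "queue=default,state=FAILED,"
    "trackingUrl=http://asus:8088/cluster/app/application_1541582549574_0006,"
    "appMasterHost=N/A,startTime=1541583816334,finishTime=1541583830966,"
    "finalStatus=FAILED"
)
summary = parse_app_summary(line)
print(summary["name"], summary["state"], runtime_seconds(summary))
```

With the values from this log, the difference between `finishTime` and `startTime` is 14632 ms, so the application was marked as failed roughly 14.6 seconds after it was submitted, and `appMasterHost=N/A` is consistent with the ApplicationMaster container never starting.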

This is the XML description of the process:

<?xml version="1.0" encoding="UTF-8"?><process version="9.0.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.0.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="radoop:radoop_nest" compatibility="9.0.002" expanded="true" height="103" name="Radoop Nest" width="90" x="179" y="85">
        <parameter key="connection" value="hadoop"/>
        <parameter key="change_sample_size" value="true"/>
        <enumeration key="tables_to_reload"/>
        <process expanded="true">
          <operator activated="true" class="radoop:retrieve" compatibility="9.0.002" expanded="true" height="68" name="Retrieve (2)" width="90" x="45" y="340">
            <parameter key="table" value="forum"/>
          </operator>
          <operator activated="true" class="radoop:set_role" compatibility="9.0.002" expanded="true" height="82" name="Set Role" width="90" x="179" y="340">
            <parameter key="name" value="categforum"/>
            <parameter key="target_role" value="label"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="radoop:decision_tree_ml" compatibility="9.0.002" expanded="true" height="82" name="Decision Tree" width="90" x="313" y="340">
            <parameter key="file_format" value="PARQUET"/>
          </operator>
          <operator activated="true" class="radoop:apply_prediction" compatibility="9.0.002" expanded="true" height="82" name="Apply Model" width="90" x="447" y="340">
            <list key="application_parameters"/>
          </operator>
          <connect from_op="Retrieve (2)" from_port="output" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Decision Tree" to_port="training set"/>
          <connect from_op="Decision Tree" from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_op="Decision Tree" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_port="output 1"/>
          <connect from_op="Apply Model" from_port="model" to_port="output 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
          <portSpacing port="sink_output 3" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Radoop Nest" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>

When I run only the Spark test in RapidMiner, I get this error:

SEVERE: java.util.concurrent.TimeoutException
SEVERE: Timeout on the Spark test job. Please verify your Spark Resource Allocation settings on the Advanced Connection Properties window. You can check the logs of the Spark job on the ResourceManager web interface at http://localhost:8088.
SEVERE: Test failed: Spark job
SEVERE: Integration test for 'hadoop' failed.
...