Spark job on a local cluster runs forever
0 votes
/ January 3, 2019

I set up a local Spark standalone cluster on my Windows 7 machine (a master and a worker node). I wrote a simple Scala program, which I build with sbt and try to run with spark-submit. The relevant resources are below.

Scala code:

package example1

import java.io._

import org.apache.spark.sql.SQLContext 
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.expr
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

object HelloWorld {

    def main(args: Array[String]): Unit = {
        println("===============================================")
        println("===============================================") 
        println("Hello, world!")
        val pw = new PrintWriter(new File("d:\\hello.txt" ))
        pw.write("Hello, world")

        println("===============================================")
        println("===============================================")

        val session = SparkSession.builder.getOrCreate()


        var filesmall = "file:///D:/_Work/azurepoc/samplebigdata/ds2.csv"

        //val df  =  session.read.format("csv").option("header", "true").load(filesmall)

        println("===============================================")

        pw.write("Hello, world some more information ")
        pw.close
    }
}
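
A side note on the file writes in the code above: a PrintWriter created from a File buffers its output, so the text may not reach disk until close is called, and close is never reached if an earlier step blocks or throws. A minimal sketch of a more defensive pattern, assuming a hypothetical helper object SafeWrite that is not part of the original program:

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

// Hypothetical helper, for illustration only.
object SafeWrite {
  // Writes each line, flushing after every write so the text is on disk
  // before any long-running step, and closes the writer even on failure.
  def writeLines(path: File, lines: Seq[String]): Unit = {
    val pw = new PrintWriter(path)
    try {
      lines.foreach { line =>
        pw.println(line)
        pw.flush()
      }
    } finally {
      pw.close()
    }
  }

  // Reads the file back so the result can be checked.
  def readLines(path: File): List[String] = {
    val src = Source.fromFile(path)
    try src.getLines().toList finally src.close()
  }
}
```

With this pattern, whatever was written before a blocking call such as the CSV load should already be visible in the file.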

Spark cluster master log:

C:\Windows\system32>spark-class org.apache.spark.deploy.master.Master
2019-01-03 16:49:16 INFO  Master:2612 - Started daemon with process name: 23940@ws-amalhotra
2019-01-03 16:49:16 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 16:49:16 INFO  SecurityManager:54 - Changing view acls to: admin
2019-01-03 16:49:16 INFO  SecurityManager:54 - Changing modify acls to: admin
2019-01-03 16:49:16 INFO  SecurityManager:54 - Changing view acls groups to:
2019-01-03 16:49:16 INFO  SecurityManager:54 - Changing modify acls groups to:
2019-01-03 16:49:16 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(admin); groups with view permissions: Set(); users  with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 16:49:17 INFO  Utils:54 - Successfully started service 'sparkMaster' on port 7077.
2019-01-03 16:49:17 INFO  Master:54 - Starting Spark master at spark://192.168.8.101:7077
2019-01-03 16:49:17 INFO  Master:54 - Running Spark version 2.3.2
2019-01-03 16:49:17 INFO  log:192 - Logging initialized @1412ms
2019-01-03 16:49:17 INFO  Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:17 INFO  Server:419 - Started @1489ms
2019-01-03 16:49:17 INFO  AbstractConnector:278 - Started ServerConnector@16391414{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
2019-01-03 16:49:17 INFO  Utils:54 - Successfully started service 'MasterUI' on port 8080.
2019-01-03 16:49:17 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@204e3825{/app,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@748394e8{/app/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@19b99890{/,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5c0f561c{/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3443bda1{/static,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54541f46{/app/kill,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e8c3d12{/driver/kill,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO  MasterWebUI:54 - Bound MasterWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8080
2019-01-03 16:49:17 INFO  Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:17 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22eb9260{/,null,AVAILABLE}
2019-01-03 16:49:17 INFO  AbstractConnector:278 - Started ServerConnector@636eb125{HTTP/1.1,[http/1.1]}{192.168.8.101:6066}
2019-01-03 16:49:17 INFO  Server:419 - Started @1558ms
2019-01-03 16:49:17 INFO  Utils:54 - Successfully started service on port 6066.
2019-01-03 16:49:17 INFO  StandaloneRestServer:54 - Started REST server for submitting applications on port 6066
2019-01-03 16:49:17 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1a4c3e84{/metrics/master/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5a3b4746{/metrics/applications/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO  Master:54 - I have been elected leader! New state: ALIVE
2019-01-03 16:49:21 INFO  Master:54 - Registering worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM

My worker node:

C:\Windows\system32>spark-class org.apache.spark.deploy.worker.Worker spark://192.168.8.101:7077 -p 8089
2019-01-03 16:49:20 INFO  Worker:2612 - Started daemon with process name: 16264@ws-amalhotra
2019-01-03 16:49:21 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 16:49:21 INFO  SecurityManager:54 - Changing view acls to: admin
2019-01-03 16:49:21 INFO  SecurityManager:54 - Changing modify acls to: admin
2019-01-03 16:49:21 INFO  SecurityManager:54 - Changing view acls groups to:
2019-01-03 16:49:21 INFO  SecurityManager:54 - Changing modify acls groups to:
2019-01-03 16:49:21 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(admin); groups with view permissions: Set(); users  with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 16:49:21 INFO  Utils:54 - Successfully started service 'sparkWorker' on port 8089.
2019-01-03 16:49:21 INFO  Worker:54 - Starting Spark worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
2019-01-03 16:49:21 INFO  Worker:54 - Running Spark version 2.3.2
2019-01-03 16:49:21 INFO  Worker:54 - Spark home: C:\spark
2019-01-03 16:49:21 INFO  log:192 - Logging initialized @1471ms
2019-01-03 16:49:21 INFO  Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:21 INFO  Server:419 - Started @1518ms
2019-01-03 16:49:21 INFO  AbstractConnector:278 - Started ServerConnector@44629c8f{HTTP/1.1,[http/1.1]}{0.0.0.0:8081}
2019-01-03 16:49:21 INFO  Utils:54 - Successfully started service 'WorkerUI' on port 8081.
2019-01-03 16:49:21 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36f34cce{/logPage,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@447fb46{/logPage/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3b027ba{/,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5396b0bb{/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6830ec44{/static,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5eb28ff8{/log,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO  WorkerWebUI:54 - Bound WorkerWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8081
2019-01-03 16:49:21 INFO  Worker:54 - Connecting to master 192.168.8.101:7077...
2019-01-03 16:49:21 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36cc352{/metrics/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO  TransportClientFactory:267 - Successfully created connection to /192.168.8.101:7077 after 26 ms (0 ms spent in bootstraps)
2019-01-03 16:49:21 INFO  Worker:54 - Successfully registered with master spark://192.168.8.101:7077

Next I compile and package the Scala code with sbt, which produces a JAR. My build.sbt file looks like this:

version := "1.0" 
scalaVersion := "2.11.8" 
val sparkVersion = "2.0.0" 

libraryDependencies ++= Seq( 
    "org.apache.spark" %% "spark-core" % sparkVersion, 
    "org.apache.spark" %% "spark-streaming" % sparkVersion, 
    "org.apache.spark" %% "spark-sql" % sparkVersion 
    ) 
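
One thing that may be worth checking, although it is not established as the cause of the problem: the master and worker logs report Spark 2.3.2, while the build above compiles against 2.0.0. A sketch of the same build aligned with the cluster version follows. Marking the dependencies as "provided" is an assumption, based on spark-submit already supplying the Spark jars from C:\spark\jars at run time.

```scala
version := "1.0"
scalaVersion := "2.11.8"

// Assumption: match the version reported in the master and worker logs.
val sparkVersion = "2.3.2"

libraryDependencies ++= Seq(
  // "provided": compile against Spark, but use the cluster's own jars at run time.
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"
)
```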

It builds the JAR, and I submit it with the spark-submit command shown below:

C:\Users\amalhotra>spark-submit  --deploy-mode cluster --master spark://192.168.8.101:6066 --class "example1.HelloWorld"  "D:\_Work\azurepoc\sbtexample\target\scala-2.11\sbtexample_2.11-1.0.jar"

Everything works fine. Now I change just one line of code in my program and repeat the same steps: compile -> sbt package -> spark-submit (as described above). The only change is that I uncomment the following line:

 //val df  =  session.read.format("csv").option("header", "true").load(filesmall)

When I run it again with spark-submit, the worker runs forever. In addition, the file on my D drive is not written. The worker logs are below:

C:\Windows\system32>spark-class org.apache.spark.deploy.worker.Worker spark://192.168.8.101:7077 -p 8089
2019-01-03 17:24:38 INFO  Worker:2612 - Started daemon with process name: 24952@ws-amalhotra
2019-01-03 17:24:39 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 17:24:39 INFO  SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:24:39 INFO  SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:24:39 INFO  SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:24:39 INFO  SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:24:39 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(admin); groups with view permissions: Set(); users  with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:24:39 INFO  Utils:54 - Successfully started service 'sparkWorker' on port 8089.
2019-01-03 17:24:39 INFO  Worker:54 - Starting Spark worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
2019-01-03 17:24:39 INFO  Worker:54 - Running Spark version 2.3.2
2019-01-03 17:24:39 INFO  Worker:54 - Spark home: C:\spark
2019-01-03 17:24:39 INFO  log:192 - Logging initialized @1512ms
2019-01-03 17:24:39 INFO  Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 17:24:39 INFO  Server:419 - Started @1561ms
2019-01-03 17:24:39 INFO  AbstractConnector:278 - Started ServerConnector@51e2ccae{HTTP/1.1,[http/1.1]}{0.0.0.0:8081}
2019-01-03 17:24:39 INFO  Utils:54 - Successfully started service 'WorkerUI' on port 8081.
2019-01-03 17:24:39 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3d96670b{/logPage,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@48e02860{/logPage/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@758918a3{/,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1643bea5{/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5f293725{/static,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@339a8612{/log,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO  WorkerWebUI:54 - Bound WorkerWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8081
2019-01-03 17:24:39 INFO  Worker:54 - Connecting to master 192.168.8.101:7077...
2019-01-03 17:24:39 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@196e9c2a{/metrics/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO  TransportClientFactory:267 - Successfully created connection to /192.168.8.101:7077 after 29 ms (0 ms spent in bootstraps)
2019-01-03 17:24:40 INFO  Worker:54 - Successfully registered with master spark://192.168.8.101:7077
2019-01-03 17:25:17 INFO  Worker:54 - Asked to launch driver driver-20190103172517-0000
2019-01-03 17:25:17 INFO  DriverRunner:54 - Copying user jar file:/D:/_Work/azurepoc/sbtexample/target/scala-2.11/sbtexample_2.11-1.0.jar to C:\spark\work\driver-20190103172517-0000\sbtexample_2.11-1.0.jar
2019-01-03 17:25:17 INFO  Utils:54 - Copying D:\_Work\azurepoc\sbtexample\target\scala-2.11\sbtexample_2.11-1.0.jar to C:\spark\work\driver-20190103172517-0000\sbtexample_2.11-1.0.jar
2019-01-03 17:25:17 INFO  DriverRunner:54 - Launch Command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf\;C:\spark\jars\*" "-Xmx1024M" "-Dspark.master=spark://192.168.8.101:7077" "-Dspark.driver.supervise=false" "-Dspark.submit.deployMode=cluster" "-Dspark.jars=file:/D:/_Work/azurepoc/sbtexample/target/scala-2.11/sbtexample_2.11-1.0.jar" "-Dspark.app.name=example1.HelloWorld" "org.apache.spark.deploy.worker.DriverWrapper" "spark://Worker@192.168.8.101:8089" "C:\spark\work\driver-20190103172517-0000\sbtexample_2.11-1.0.jar" "example1.HelloWorld"
2019-01-03 17:25:19 INFO  Worker:54 - Asked to launch executor app-20190103172519-0000/0 for example1.HelloWorld
2019-01-03 17:25:19 INFO  SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:25:19 INFO  SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:25:19 INFO  SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:25:19 INFO  SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:25:19 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(admin); groups with view permissions: Set(); users  with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:25:19 INFO  ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf\;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "0" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:25:43 INFO  Worker:54 - Executor app-20190103172519-0000/0 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:25:43 INFO  Worker:54 - Asked to launch executor app-20190103172519-0000/1 for example1.HelloWorld
2019-01-03 17:25:43 INFO  SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:25:43 INFO  SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:25:43 INFO  SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:25:43 INFO  SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:25:43 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(admin); groups with view permissions: Set(); users  with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:25:43 INFO  ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf\;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "1" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:26:05 INFO  Worker:54 - Executor app-20190103172519-0000/1 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:26:05 INFO  Worker:54 - Asked to launch executor app-20190103172519-0000/2 for example1.HelloWorld
2019-01-03 17:26:05 INFO  SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:26:05 INFO  SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:26:05 INFO  SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:26:05 INFO  SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:26:05 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(admin); groups with view permissions: Set(); users  with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:26:05 INFO  ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf\;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "2" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:26:28 INFO  Worker:54 - Executor app-20190103172519-0000/2 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:26:28 INFO  Worker:54 - Asked to launch executor app-20190103172519-0000/3 for example1.HelloWorld
2019-01-03 17:26:28 INFO  SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:26:28 INFO  SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:26:28 INFO  SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:26:28 INFO  SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:26:28 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(admin); groups with view permissions: Set(); users  with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:26:28 INFO  ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf\;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "3" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"

This goes on forever; the same log entries repeat every few seconds. It is unclear what is happening, and the logs don't say much. There don't seem to be any complete examples showing how to run such jobs on a local standalone cluster.
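
One detail the worker log does show: every executor is launched with the driver URL spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786, while the master and worker are addressed as 192.168.8.101, and each executor exits with code 1 about twenty seconds later. Whether that host name resolves to a reachable address on this machine is only a guess, but it can be checked with a few lines of plain JVM code. ResolveCheck is a hypothetical helper name used here for illustration:

```scala
import java.net.InetAddress
import scala.util.Try

// Hypothetical diagnostic helper, not part of Spark.
object ResolveCheck {
  // Returns the IP addresses a host name resolves to,
  // or an empty list if the name cannot be resolved.
  def resolve(host: String): List[String] =
    Try(InetAddress.getAllByName(host).map(_.getHostAddress).toList).getOrElse(Nil)
}
```

Calling ResolveCheck.resolve("ws-amalhotra.domain.co.in") on the cluster machine and comparing the result with 192.168.8.101 would show whether the executors are being pointed at an address they can actually reach.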

1 Answer

0 votes
/ July 17, 2019

Set spark.driver.host in conf/spark-defaults.conf!
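
In file form that would be a line such as `spark.driver.host 192.168.8.101` in conf/spark-defaults.conf, where the address is an assumption taken from the master and worker logs in the question. The same property can presumably also be set in code before the session is created. A minimal sketch, assuming the Spark SQL dependency is on the classpath and using a hypothetical object name:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical variant of the question's HelloWorld, for illustration only.
object HelloWorldWithDriverHost {
  def main(args: Array[String]): Unit = {
    val session = SparkSession.builder
      // Assumption: 192.168.8.101 is the address the executors can reach,
      // as used by the master and worker in the logs.
      .config("spark.driver.host", "192.168.8.101")
      .getOrCreate()

    // Same CSV path as in the question.
    val filesmall = "file:///D:/_Work/azurepoc/samplebigdata/ds2.csv"
    val df = session.read.format("csv").option("header", "true").load(filesmall)
    df.printSchema()

    session.stop()
  }
}
```

Setting the value in spark-defaults.conf, as suggested above, keeps the machine-specific address out of the application code.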

...