Запуск задания потоковой передачи искры на узле, но получение сбоя?
пробовал другую версию kafka, и искра все еще не работает.пробовал разные jar-файлы при отправке задания spark
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils
sc = SparkContext(appName="PythonSparkStreamingKafka")
sc.setLogLevel("WARN")
ssc = StreamingContext(sc,20)
kafkaStream=KafkaUtils.createStream(ssc,"kmnn1.hdp.com:9092",
"spark-streaming-consumer", {'test': 1})
lines = kafkaStream.map(lambda x: x[1])
counts = lines.flatMap(lambda line: line.split('')).map(lambda word:
(word, 1)).reduceByKey(lambda a, b: a+b)
counts.pprint()
ssc.start()
ssc.awaitTermination()
Используемая команда:
spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.2.0 --jars ./spark-streaming-kafka-0-8-assembly_2.11-2.2.0.jar spark-stream.py
Ожидаемые результаты
Input: hello world
Expected output : 'hello' 1
'world' 1
Сообщение об ошибке:
Exception in thread "Thread-6" java.lang.NoClassDefFoundError:
kafka/common/TopicAndPartitionat java.lang.Class.getDeclaredMethods0(Native Method)at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)at java.lang.Class.privateGetPublicMethods(Class.java:2902)at java.lang.Class.getMethods(Class.java:1615)at py4j.reflection.ReflectionEngine.getMethodsByNameAndLength(ReflectionEngine.java:345)at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:305)at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)at py4j.Gateway.invoke(Gateway.java:274)at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)at py4j.commands.CallCommand.execute(CallCommand.java:79)at py4j.GatewayConnection.run(GatewayConnection.java:238)at java.lang.Thread.run(Thread.java:748)Caused by: java.lang.ClassNotFoundException: kafka.common.TopicAndPartitionat java.net.URLClassLoader.findClass(URLClassLoader.java:381)at java.lang.ClassLoader.loadClass(ClassLoader.java:424)at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 12 moreERROR:root:Exception while sending command.Traceback (most recent call last): File "/usr/hdp/3.1.0.0-78/spark2/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command response = connection.send_command(command) File "/usr/hdp/3.1.0.0-78/spark2/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1164, in send_command "Error while receiving", e, proto.ERROR_ON_RECEIVE)Py4JNetworkError: Error while receivingTraceback (most recent call last): File "/root/spark-code/spark-stream.py", line 13, in <module> kafkaStream=KafkaUtils.createStream(ssc,"kmnn1.hdp.com:9092", "spark-streaming-consumer", {'test': 1}) File "/usr/hdp/3.1.0.0-78/spark2/python/lib/pyspark.zip/pyspark/streaming/kafka.py", line 79, in createStream File "/usr/hdp/3.1.0.0-78/spark2/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__ File "/usr/hdp/3.1.0.0-78/spark2/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 336, in get_return_valuepy4j.protocol.Py4JError: An error occurred while calling o58.createStream