When connecting to the Kafka topic imageinstring
with spark.readStream,
I ran into this error:
Py4JJavaError: An error occurred while calling o49.load.
: java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/reader/SupportsScanUnsafeRow
I am using:
- kafka_2.11-0.8.2.2
- spark-2.4.5-bin-hadoop2.7
with the following dependencies:
org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.5
org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.5
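
For reference, here is a minimal sketch of one way such dependencies can be handed to PySpark when it is launched from a plain Python process: the PYSPARK_SUBMIT_ARGS environment variable. How the packages are actually supplied in this setup is not stated, so the choice of this mechanism and the helper names (kafka_sql_package, submit_args) are assumptions made for illustration only.

```python
import os

# Assumed values, copied from the versions listed above; adjust to the actual install.
SPARK_VERSION = "2.4.5"
SCALA_VERSION = "2.11"


def kafka_sql_package(spark_version, scala_version):
    """Build the Maven coordinate of the Structured Streaming Kafka connector."""
    return "org.apache.spark:spark-sql-kafka-0-10_{}:{}".format(
        scala_version, spark_version
    )


def submit_args(packages):
    """Format a PYSPARK_SUBMIT_ARGS value that requests the given packages."""
    return "--packages {} pyspark-shell".format(",".join(packages))


packages = [kafka_sql_package(SPARK_VERSION, SCALA_VERSION)]

# This has to be set before the SparkSession (and its JVM) is created.
os.environ["PYSPARK_SUBMIT_ARGS"] = submit_args(packages)
print(os.environ["PYSPARK_SUBMIT_ARGS"])
# --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.5 pyspark-shell
```

Deriving the coordinate from a single version constant keeps the connector version tied to the Spark version, which is the property that matters for this kind of error.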
This is my code:
import findspark
findspark.init()
# Spark
from pyspark.sql import SparkSession
# Spark Streaming
from pyspark.streaming import StreamingContext
# Kafka
from pyspark.streaming.kafka import KafkaUtils
# json parsing
import json
spark = SparkSession.builder.appName("StructuredNetwork").getOrCreate()
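df = spark.readStream.format("kafka") \
    .option("kafka.bootstrap.servers", "localhost:9092") \
    .option("subscribe", "imageinstring") \
    .load()
# "complete" output mode requires a streaming aggregation; this query has none, so "append" is used
query = df.writeStream.outputMode("append").format("console").start()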
query.awaitTermination()
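
The class named in the trace, SupportsScanUnsafeRow, is to my knowledge part of the data source v2 reader API in Spark 2.3.x and absent from 2.4.x. One plausible but unconfirmed explanation is therefore that the Kafka connector jar actually loaded was built for a different Spark release than the one running. The sketch below compares package coordinates against a runtime version; the helper names (parse_coordinate, mismatched) are hypothetical, and the version strings are passed in by hand rather than read from a live session.

```python
def parse_coordinate(coordinate):
    """Split 'group:artifact_scala:version' into (artifact, scala_version, version)."""
    _group, artifact, version = coordinate.split(":")
    name, _, scala_version = artifact.rpartition("_")
    return name, scala_version, version


def mismatched(coordinates, spark_version, scala_version):
    """Return the coordinates whose Scala suffix or version differs from the runtime."""
    bad = []
    for coordinate in coordinates:
        _name, scala, version = parse_coordinate(coordinate)
        if scala != scala_version or version != spark_version:
            bad.append(coordinate)
    return bad


declared = [
    "org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.5",
    "org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.5",
]

# Runtime 2.4.5 / Scala 2.11: nothing is flagged.
print(mismatched(declared, "2.4.5", "2.11"))  # []

# A hypothetical 2.3.0 runtime: both coordinates are flagged.
print(len(mismatched(declared, "2.3.0", "2.11")))  # 2
```

In a live session the runtime version can be read from spark.version. If the declared coordinates match it, the next thing worth checking is whether an older connector jar is still on the classpath from some other location.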