I have a standalone cluster with 3 nodes. I am trying to get the pyspark example here working, running it step by step in Python. I suspect this has more to do with the version of pyspark I installed than with the version of Spark I am running.
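One quick sanity check for that suspicion (a sketch, not part of PySpark): the Py4J error below typically shows up when the pip-installed pyspark and the Spark distribution disagree on their major.minor version, so comparing the two version strings is a useful first step. The helper and the version strings here are illustrative; substitute the output of `pip show pyspark` and the version in your Spark install path.

```python
def versions_compatible(pip_pyspark: str, spark_dist: str) -> bool:
    """Return True when the pip-installed pyspark and the Spark
    distribution agree on the major.minor version (e.g. 3.0)."""
    def major_minor(version: str):
        # Strip suffixes like "-preview2" before splitting the digits.
        core = version.split("-")[0]
        parts = core.split(".")
        return (int(parts[0]), int(parts[1]))
    return major_minor(pip_pyspark) == major_minor(spark_dist)

# A pip-installed pyspark 2.4.x against the 3.0.0-preview2 jars
# on disk would be flagged as incompatible:
print(versions_compatible("2.4.5", "3.0.0-preview2"))  # False
print(versions_compatible("3.0.0", "3.0.0-preview2"))  # True
```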
from pyspark import SparkConf, SparkContext
conf = SparkConf()
conf.setMaster("spark://192.168.122.54:7077")
conf.setAppName("My application")
conf.set("spark.executor.memory", "1g")
sc = SparkContext(conf=conf)
Output:
20/05/05 20:20:15 WARN Utils: Your hostname, ronald-Standard-PC-i440FX-PIIX-1996 resolves to a loopback address: 127.0.1.1; using 192.168.122.54 instead (on interface ens3)
20/05/05 20:20:15 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/usr/local/spark-3.0.0-preview2-bin-hadoop2.7/jars/spark-unsafe_2.12-3.0.0-preview2.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
20/05/05 20:20:16 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ronald/.local/lib/python3.7/site-packages/pyspark/context.py", line 136, in __init__
    conf, jsc, profiler_cls)
  File "/home/ronald/.local/lib/python3.7/site-packages/pyspark/context.py", line 213, in _do_init
    self._encryption_enabled = self._jvm.PythonUtils.getEncryptionEnabled(self._jsc)
  File "/home/ronald/.local/lib/python3.7/site-packages/py4j/java_gateway.py", line 1487, in __getattr__
    "{0}.{1} does not exist in the JVM".format(self._fqn, name))
py4j.protocol.Py4JError: org.apache.spark.api.python.PythonUtils.getEncryptionEnabled does not exist in the JVM
If I just run it on the local node:
from pyspark import SparkConf, SparkContext
conf = SparkConf()
conf.setMaster("local")
conf.setAppName("My application")
conf.set("spark.executor.memory", "1g")
sc = SparkContext(conf=conf)
the error is still the same:
20/05/05 20:16:29 WARN Utils: Your hostname, ronald-Standard-PC-i440FX-PIIX-1996 resolves to a loopback address: 127.0.1.1; using 192.168.122.54 instead (on interface ens3)
20/05/05 20:16:29 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/usr/local/spark-3.0.0-preview2-bin-hadoop2.7/jars/spark-unsafe_2.12-3.0.0-preview2.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
20/05/05 20:16:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ronald/.local/lib/python3.7/site-packages/pyspark/context.py", line 136, in __init__
    conf, jsc, profiler_cls)
  File "/home/ronald/.local/lib/python3.7/site-packages/pyspark/context.py", line 213, in _do_init
    self._encryption_enabled = self._jvm.PythonUtils.getEncryptionEnabled(self._jsc)
  File "/home/ronald/.local/lib/python3.7/site-packages/py4j/java_gateway.py", line 1487, in __getattr__
    "{0}.{1} does not exist in the JVM".format(self._fqn, name))
py4j.protocol.Py4JError: org.apache.spark.api.python.PythonUtils.getEncryptionEnabled does not exist in the JVM
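The tracebacks resolve pyspark from `~/.local/lib/python3.7/site-packages` while the jars come from `/usr/local/spark-3.0.0-preview2-bin-hadoop2.7`, which is consistent with a mismatch between the pip-installed copy and the distribution. A hedged workaround sketch (paths taken from the logs above; the py4j zip file name is an assumption, so check what actually sits in `$SPARK_HOME/python/lib`): point the Python process at the distribution's own pyspark before importing it, so it shadows the pip copy.

```python
import os
import sys

# Path from the log output above; adjust to your installation.
SPARK_HOME = "/usr/local/spark-3.0.0-preview2-bin-hadoop2.7"
os.environ["SPARK_HOME"] = SPARK_HOME

# Prepend the distribution's pyspark and py4j so they take priority
# over the pip-installed copies in ~/.local. The py4j zip name varies
# per release, so verify it in $SPARK_HOME/python/lib first.
sys.path.insert(0, os.path.join(SPARK_HOME, "python"))
sys.path.insert(0, os.path.join(SPARK_HOME, "python", "lib",
                                "py4j-0.10.8.1-src.zip"))

# After this, "import pyspark" resolves to the cluster's own version.
```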