I created a new virtual environment and then installed the following packages with pip on my Windows system:
pytest==4.4.0
pyspark==2.4.0
pytest-spark==0.4.5
The code I am trying to run is as follows:
from pyspark.sql import SparkSession
spark = SparkSession\
    .builder\
    .getOrCreate()
And the stack trace in my console looks like this:
C:\Users\stkrs\OneDrive - dump\Desktop\test>"c:/Users/stkrs/OneDrive - dump/Desktop/test/env/Scripts/activate.bat"
(env) C:\Users\stkrs\OneDrive - dump\Desktop\test>"c:/Users/stkrs/OneDrive - dump/Desktop/test/env/Scripts/python.exe" "c:/Users/stkrs/OneDrive - dump/Desktop/test/hello.py"
Failed to find Spark jars directory.
You need to build Spark before running this program.
Traceback (most recent call last):
File "c:/Users/stkrs/OneDrive - dump/Desktop/test/hello.py", line 2, in <module>
spark = SparkSession\
File "c:\Users\stkrs\OneDrive - dump\Desktop\test\env\lib\site-packages\pyspark\sql\session.py", line 173, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "c:\Users\stkrs\OneDrive - dump\Desktop\test\env\lib\site-packages\pyspark\context.py", line 349, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "c:\Users\stkrs\OneDrive - dump\Desktop\test\env\lib\site-packages\pyspark\context.py", line 115, in __init__
SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
File "c:\Users\stkrs\OneDrive - dump\Desktop\test\env\lib\site-packages\pyspark\context.py", line 298, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway(conf)
File "c:\Users\stkrs\OneDrive - dump\Desktop\test\env\lib\site-packages\pyspark\java_gateway.py", line 94, in launch_gateway
raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number
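To help narrow this down, here is a minimal diagnostic sketch that could be run inside the same virtual environment. It only collects settings that may be relevant to the "Failed to find Spark jars directory" message: the JAVA_HOME and SPARK_HOME variables, whether a java executable is on PATH, whether the pip-installed pyspark package contains a jars folder, and whether its install path contains spaces (as "OneDrive - dump" does here). The helper name describe_spark_environment is hypothetical, and the checks are guesses about likely causes, not a confirmed diagnosis.

```python
import os
import shutil


def describe_spark_environment(environ=None):
    """Collect settings that may affect launching the Spark JVM (hypothetical helper)."""
    env = os.environ if environ is None else environ
    report = {
        "JAVA_HOME": env.get("JAVA_HOME"),
        "SPARK_HOME": env.get("SPARK_HOME"),
        # Full path of the java executable found on PATH, or None if there is none.
        "java_on_path": shutil.which("java", path=env.get("PATH")),
    }
    try:
        import pyspark
    except ImportError:
        # pyspark is not installed in this interpreter's environment.
        report["pyspark_dir"] = None
        return report

    pkg_dir = os.path.dirname(pyspark.__file__)
    report["pyspark_dir"] = pkg_dir
    # Assumption: a pip-installed pyspark keeps its jars in a "jars" subfolder.
    report["jars_dir_exists"] = os.path.isdir(os.path.join(pkg_dir, "jars"))
    # Spaces in the install path are one possible source of launcher problems.
    report["space_in_path"] = " " in pkg_dir
    return report


if __name__ == "__main__":
    for key, value in describe_spark_environment().items():
        print(f"{key}: {value}")
```

If SPARK_HOME points at a different Spark installation, or the reported pyspark path contains spaces, those would be the first things to rule out, for example by recreating the environment in a folder without spaces.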