Accessing a HIVE table from a pyspark .py file
0 votes
asked 23 January 2019

I am pulling data from a Hive table with the following code, which I run in the pyspark shell on a GCP machine:

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("appName").getOrCreate()
sc = spark.sparkContext

from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)

df = sqlContext.sql('select * from mytable limit 100')
print 'number of rows = ', df.count()

This works when the code is copied and pasted into the pyspark shell. But it throws the error below when the same file is run as a .py script from the terminal.
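
For comparison, a standalone script usually has to request Hive support on the session builder explicitly, whereas the pyspark shell already provides a Hive-enabled session. The sketch below is only an assumption about the standalone setup, not a confirmed fix, and 'mytable' is just a placeholder:

from pyspark.sql import SparkSession

# Sketch: enable Hive support so table lookups go through the Hive
# metastore (hive-site.xml) rather than the default in-memory catalog.
spark = (SparkSession.builder
         .appName("appName")
         .enableHiveSupport()
         .getOrCreate())

df = spark.sql('select * from mytable limit 100')  # placeholder table name
print('number of rows =', df.count())

Whether this difference is actually what separates the shell run from the .py run here is only a guess.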

19/01/21 03:38:43 INFO spark.SparkContext: Running Spark version 2.2.1
19/01/21 03:38:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/01/21 03:38:43 INFO spark.SparkContext: Submitted application: appName
19/01/21 03:38:43 INFO spark.SecurityManager: Changing view acls to: xxxxxxx
19/01/21 03:38:43 INFO spark.SecurityManager: Changing modify acls to: xxxxxxx
19/01/21 03:38:43 INFO spark.SecurityManager: Changing view acls groups to: 
19/01/21 03:38:43 INFO spark.SecurityManager: Changing modify acls groups to: 
19/01/21 03:38:43 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(xxxxxxx); groups with view permissions: Set(); users  with modify permissions: Set(xxxxxxx); groups with modify permissions: Set()
19/01/21 03:38:44 INFO util.Utils: Successfully started service 'sparkDriver' on port 00000.
19/01/21 03:38:44 INFO spark.SparkEnv: Registering MapOutputTracker
19/01/21 03:38:44 INFO spark.SparkEnv: Registering BlockManagerMaster
19/01/21 03:38:44 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
19/01/21 03:38:44 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
19/01/21 03:38:44 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-bdcf00db-e6fc-4a6f-a64d-59def40ca89c
19/01/21 03:38:44 INFO memory.MemoryStore: MemoryStore started with capacity 4.3 GB
19/01/21 03:38:44 INFO spark.SparkEnv: Registering OutputCommitCoordinator
19/01/21 03:38:44 INFO util.log: Logging initialized @3180ms
19/01/21 03:38:44 INFO server.Server: jetty-9.3.z-SNAPSHOT
19/01/21 03:38:44 INFO server.Server: Started @3277ms
19/01/21 03:38:44 WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
19/01/21 03:38:44 WARN util.Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
19/01/21 03:38:44 WARN util.Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
19/01/21 03:38:44 WARN util.Utils: Service 'SparkUI' could not bind on port 4043. Attempting port 4044.
19/01/21 03:38:44 WARN util.Utils: Service 'SparkUI' could not bind on port 4044. Attempting port 4045.
19/01/21 03:38:44 WARN util.Utils: Service 'SparkUI' could not bind on port 4045. Attempting port 4046.
19/01/21 03:38:44 WARN util.Utils: Service 'SparkUI' could not bind on port 4046. Attempting port 4047.
19/01/21 03:38:44 WARN util.Utils: Service 'SparkUI' could not bind on port 4047. Attempting port 4048.
19/01/21 03:38:44 WARN util.Utils: Service 'SparkUI' could not bind on port 4048. Attempting port 4049.
19/01/21 03:38:44 INFO server.AbstractConnector: Started ServerConnector@aaa850a{HTTP/1.1,[http/1.1]}{0.0.0.0:0000}
19/01/21 03:38:44 INFO util.Utils: Successfully started service 'SparkUI' on port 0000.
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/jobs,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/jobs/json,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/jobs/job,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/jobs/job/json,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/stages,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/stages/json,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/stages/stage,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/stages/stage/json,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/stages/pool,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/stages/pool/json,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/storage,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/storage/json,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/storage/rdd,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/storage/rdd/json,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/environment,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/environment/json,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/executors,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/executors/json,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/executors/threadDump,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/executors/threadDump/json,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/static,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/api,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/jobs/job/kill,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@eqqwe23231q2w{/stages/stage/kill,null,AVAILABLE,@Spark}
19/01/21 03:38:44 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://00.00.00.00:0000
19/01/21 03:38:44 INFO util.Utils: Using initial executors = 8, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
19/01/21 03:38:44 INFO gcs.GoogleHadoopFileSystemBase: GHFS version: 1.6.10-hadoop2
19/01/21 03:38:45 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
19/01/21 03:38:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm1
19/01/21 03:38:46 INFO retry.RetryInvocationHandler: Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm1 after 1 fail over attempts. Trying to fail over after sleeping for 829ms.
java.net.ConnectException: Call From mytable/ipaddress to mytable:0000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
    at org.apache.hadoop.ipc.Client.call(Client.java:1479)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy15.getClusterMetrics(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:206)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy16.getClusterMetrics(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:487)
    at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:156)
    at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:156)
    at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
    at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:59)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:155)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:236)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
    at org.apache.hadoop.ipc.Client.call(Client.java:1451)
    ... 32 more
19/01/21 03:38:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
19/01/21 03:38:46 INFO yarn.Client: Requesting a new application from cluster with 80 NodeManagers
19/01/21 03:38:46 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (45056 MB per container)
19/01/21 03:38:46 INFO yarn.Client: Will allocate AM container, with 24576 MB memory including 2234 MB overhead
19/01/21 03:38:46 INFO yarn.Client: Setting up container launch context for our AM
19/01/21 03:38:46 INFO yarn.Client: Setting up the launch environment for our AM container
19/01/21 03:38:46 INFO yarn.Client: Preparing resources for our AM container
19/01/21 03:38:48 INFO yarn.Client: Uploading resource file:/opt/hadoop/spark/python/lib/pyspark.zip -> hdfs://name-dataproc/user/xxxxxxx/.sparkStaging/application_1547596846411_1167/pyspark.zip
19/01/21 03:38:48 INFO yarn.Client: Uploading resource file:/opt/hadoop/spark/python/lib/py4j-0.10.4-src.zip -> hdfs://name-dataproc/user/xxxxxxx/.sparkStaging/application_1547596846411_1167/py4j-0.10.4-src.zip
19/01/21 03:38:48 INFO yarn.Client: Uploading resource file:/tmp/spark-1c0d417f-4fd6-411a-9480-0fc147d7c9a8/__spark_conf__2865868052747382300.zip -> hdfs://name-dataproc/user/xxxxxxx/.sparkStaging/application_1547596846411_1167/__spark_conf__.zip
19/01/21 03:38:48 INFO spark.SecurityManager: Changing view acls to: xxxxxxx
19/01/21 03:38:48 INFO spark.SecurityManager: Changing modify acls to: xxxxxxx
19/01/21 03:38:48 INFO spark.SecurityManager: Changing view acls groups to: 
19/01/21 03:38:48 INFO spark.SecurityManager: Changing modify acls groups to: 
19/01/21 03:38:48 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(xxxxxxx); groups with view permissions: Set(); users  with modify permissions: Set(xxxxxxx); groups with modify permissions: Set()
19/01/21 03:38:48 INFO yarn.Client: Submitting application application_1547596846411_1167 to ResourceManager
19/01/21 03:38:48 INFO impl.YarnClientImpl: Submitted application application_1547596846411_1167
19/01/21 03:38:48 INFO cluster.SchedulerExtensionServices: Starting Yarn extension services with app application_1547596846411_1167 and attemptId None
19/01/21 03:38:49 INFO yarn.Client: Application report for application_1547596846411_1167 (state: ACCEPTED)
19/01/21 03:38:49 INFO yarn.Client: 
     client token: N/A
     diagnostics: AM container is launched, waiting for AM container to Register with RM
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: long_running
     start time: 1548063528733
     final status: UNDEFINED
     tracking URL: http://name-dataproc-.:0000/proxy/application_1547596846411_1167/
     user: xxxxxxx
19/01/21 03:38:50 INFO yarn.Client: Application report for application_1547596846411_1167 (state: ACCEPTED)
19/01/21 03:38:51 INFO yarn.Client: Application report for application_1547596846411_1167 (state: ACCEPTED)
19/01/21 03:38:52 INFO yarn.Client: Application report for application_1547596846411_1167 (state: ACCEPTED)
19/01/21 03:38:52 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
19/01/21 03:38:52 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, 
19/01/21 03:38:53 INFO cluster.YarnClientSchedulerBackend: Application application_1547596846411_1167 has started running.
19/01/21 03:38:53 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34040.
19/01/21 03:38:53 INFO netty.NettyBlockTransferService: Server created on 00.000.00.00:23930
19/01/21 03:38:53 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
19/01/21 03:38:53 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, ip-address, port, None)
19/01/21 03:38:53 INFO storage.BlockManagerMasterEndpoint: Registering block manager ip-address:port with 4.3 GB RAM, BlockManagerId(driver, 10.206.52.22, 46766, None)
19/01/21 03:38:53 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, ip-address, port, None)
19/01/21 03:38:53 INFO storage.BlockManager: external shuffle service port = 0000
19/01/21 03:38:53 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, ip-address, port, None)
19/01/21 03:38:54 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@dfsdfsdfgs{/metrics/json,null,AVAILABLE,@Spark}
19/01/21 03:38:54 INFO scheduler.EventLoggingListener: Logging events to hdfs://name-dataproc/user/spark/eventlog/application_1547596846411_1167
19/01/21 03:38:54 INFO util.Utils: Using initial executors = 8, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
19/01/21 03:38:54 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
19/01/21 03:38:54 INFO internal.SharedState: loading hive config file: file:/opt/hadoop/conf/hive-site.xml
19/01/21 03:38:54 INFO internal.SharedState: spark.sql.warehouse.dir is not set, but hive.metastore.warehouse.dir is set. Setting spark.sql.warehouse.dir to the value of hive.metastore.warehouse.dir ('gs://place/place/path').
19/01/21 03:38:54 INFO internal.SharedState: Warehouse path is 'gs://place/place/path'.
19/01/21 03:38:54 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@sdfsdgs{/SQL,null,AVAILABLE,@Spark}
19/01/21 03:38:54 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@sdfsdfs{/SQL/json,null,AVAILABLE,@Spark}
19/01/21 03:38:54 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@sdfsdf{/SQL/execution,null,AVAILABLE,@Spark}
19/01/21 03:38:54 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@sdfsdf{/SQL/execution/json,null,AVAILABLE,@Spark}
19/01/21 03:38:54 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@dsfgsdgd{/static/sql,null,AVAILABLE,@Spark}
19/01/21 03:38:55 INFO gcs.GoogleHadoopFileSystemBase: GCS Metadata Cache is enabled: this isn't necessary and in fact is probably detrimental to your job!
19/01/21 03:38:55 INFO state.StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
19/01/21 03:38:55 INFO execution.SparkSqlParser: Parsing command: select * from mytable limit 100
Traceback (most recent call last):
  File "/home/xxxxxx/spark_job_example.py", line 8, in <module>
    df= sqlContext.sql('select * from mytable limit 100')
  File "/opt/hadoop/spark/python/lib/pyspark.zip/pyspark/sql/context.py", line 384, in sql
  File "/opt/hadoop/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 603, in sql
  File "/opt/hadoop/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/opt/hadoop/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
pyspark.sql.utils.AnalysisException: u"Table or view not found: `mytable`.`myttable`; line 1 pos 14;\n'GlobalLimit 100\n+- 'LocalLimit 100\n   +- 'Project [*]\n      +- 'UnresolvedRelation `mytable`.`table`\n"
19/01/21 03:38:56 INFO spark.SparkContext: Invoking stop() from shutdown hook
19/01/21 03:38:56 INFO server.AbstractConnector: Stopped Spark@fec850a{HTTP/1.1,[http/1.1]}{0.0.0.0:4049}
19/01/21 03:38:56 INFO ui.SparkUI: Stopped Spark web UI at http://10.206.52.22:4049
19/01/21 03:38:56 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor thread
19/01/21 03:38:56 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
19/01/21 03:38:56 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
19/01/21 03:38:56 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
(serviceOption=None,
 services=List(),
 started=false)
19/01/21 03:38:56 INFO cluster.YarnClientSchedulerBackend: Stopped
19/01/21 03:38:56 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
19/01/21 03:38:56 INFO memory.MemoryStore: MemoryStore cleared
19/01/21 03:38:56 INFO storage.BlockManager: BlockManager stopped
19/01/21 03:38:56 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
19/01/21 03:38:56 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
19/01/21 03:38:56 INFO spark.SparkContext: Successfully stopped SparkContext
19/01/21 03:38:56 INFO util.ShutdownHookManager: Shutdown hook called
19/01/21 03:38:56 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-1c0d417f-4fd6-411a-9480-0fc147d7c9a8
19/01/21 03:38:56 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-1c0d417f-4fd6-411a-9480-0fc147d7c9a8/pyspark-82d123ce-18ce-43ce-b631-8638bf5ffbfb

Any help is appreciated.
