PySpark поднял Ошибка, показывая что-то не так с Hive metastore_db - PullRequest
0 голосов
/ 01 апреля 2020

Я просто запускал две строки на консоли pyspark:

data = [('alice', 30), ('luis', 33), ('david', 39)]
df = spark.createDataFrame(data)

Затем консоль выдавала ошибки ниже:

Caused by: java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@1ffd6cd7, see the next exception for details.
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver$1.run(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.jdbc.InternalDriver.getNewEmbedConnection(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
    at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
    at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:416)
    ... 97 more
Caused by: ERROR XJ040: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@1ffd6cd7, see the next exception for details.
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
    ... 113 more
Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /home/dennis/metastore_db.
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.privGetJBMSLockOnDB(Unknown Source)
    at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.getJBMSLockOnDB(Unknown Source)
    at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
    at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.store.raw.RawStore$6.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.store.raw.RawStore.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
    at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.store.access.RAMAccessManager$5.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.store.access.RAMAccessManager.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
    at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.db.BasicDatabase$5.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.db.BasicDatabase.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
    at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source)
    at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection$4.run(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection$4.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.jdbc.EmbedConnection.startPersistentService(Unknown Source)
    ... 110 more
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
/software/spark-2.3.1-bin-hadoop2.7/python/pyspark/sql/utils.py in deco(*a, **kw)
     62         try:
---> 63             return f(*a, **kw)
     64         except py4j.protocol.Py4JJavaError as e:

/software/spark-2.3.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:

Py4JJavaError: An error occurred while calling o24.applySchemaToPythonRDD.

Казалось, из-за метастора в Hive. но Hive в CDH работает нормально с MySQL metastore.

Среда:

CDH 6.2.0
spark-2.3.1-bin-hadoop2.7
Добро пожаловать на сайт PullRequest, где вы можете задавать вопросы и получать ответы от других членов сообщества.
...