What is the problem with SQL over a pandas DataFrame (NullPointerException error)?
October 23, 2018

I am trying to run SQL over a pandas DataFrame using the Zeppelin Python interpreter, but I get the error message below. I don't know why this error occurs.

Paragraph 1. pandas DataFrame

[code]

%python
import pandas as pd
import numpy as np
import sys

sys.version_info
print(sys.version)
print(sys.path)

s = pd.Series(np.random.randn(100))
s.head()

df = pd.DataFrame(np.random.randn(110, 4), columns=list('ABCD'))
df.head()

[result]

          A         B         C         D
0 -1.104620 -0.203555 -0.708837  0.811160
1  0.755126  0.060209 -0.206536 -0.442819
2  0.056334  0.953871 -1.441647 -0.262722
3 -0.399785  1.195350  0.500972  1.028257
4 -1.738896  0.198309  0.986380  0.042211

Paragraph 2. Run the query

[code]

%python.sql
select * from df

[result (error message)]

java.lang.NullPointerException
    at org.apache.zeppelin.python.IPythonInterpreter.interpret(IPythonInterpreter.java:331)
    at org.apache.zeppelin.python.PythonInterpreter.interpret(PythonInterpreter.java:371)
    at org.apache.zeppelin.python.PythonInterpreter.bootStrapInterpreter(PythonInterpreter.java:557)
    at org.apache.zeppelin.python.PythonInterpreterPandasSql.open(PythonInterpreterPandasSql.java:73)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Here are the Zeppelin logs (server and interpreter).

[server log]

INFO [2018-10-19 15:25:22,244] ({qtp2054574951-13} VFSNotebookRepo.java[save]:196) - Saving note:2DVU2DMPM
INFO [2018-10-19 15:25:22,248] ({pool-2-thread-3} SchedulerFactory.java[jobStarted]:109) - Job 20181018-023104_253425563 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-python:k91159664:-shared_session
INFO [2018-10-19 15:25:22,249] ({pool-2-thread-3} Paragraph.java[jobRun]:380) - Run paragraph [paragraph_id: 20181018-023104_253425563, interpreter: python.sql, note_id: 2DVU2DMPM, user: k91159664]
WARN [2018-10-19 15:25:22,258] ({pool-2-thread-3} NotebookServer.java[afterStatusChange]:2344) - Job 20181018-023104_253425563 is finished, status: ERROR, exception: null, result: %text java.lang.NullPointerException
    at org.apache.zeppelin.python.IPythonInterpreter.interpret(IPythonInterpreter.java:331)
    at org.apache.zeppelin.python.PythonInterpreter.interpret(PythonInterpreter.java:371)
    at org.apache.zeppelin.python.PythonInterpreter.bootStrapInterpreter(PythonInterpreter.java:557)
    at org.apache.zeppelin.python.PythonInterpreterPandasSql.open(PythonInterpreterPandasSql.java:73)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

INFO [2018-10-19 15:25:22,278] ({pool-2-thread-3} VFSNotebookRepo.java[save]:196) - Saving note:2DVU2DMPM
INFO [2018-10-19 15:25:22,280] ({pool-2-thread-3} SchedulerFactory.java[jobFinished]:115) - Job 20181018-023104_253425563 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-python:k91159664:-shared_session

------------------------------------------------------------

[interpreter-python.log]

INFO [2018-10-19 15:25:22,250] ({pool-2-thread-7} SchedulerFactory.java[jobStarted]:109) - Job 20181018-023104_253425563 started by scheduler interpreter_128873761
INFO [2018-10-19 15:25:22,250] ({pool-2-thread-7} PythonInterpreterPandasSql.java[open]:67) - Open Python SQL interpreter instance: org.apache.zeppelin.python.PythonInterpreterPandasSql@7ae7521
INFO [2018-10-19 15:25:22,250] ({pool-2-thread-7} PythonInterpreterPandasSql.java[open]:70) - Bootstrap org.apache.zeppelin.python.PythonInterpreterPandasSql@7ae7521 interpreter with /python/bootstrap_sql.py
ERROR [2018-10-19 15:25:22,251] ({pool-2-thread-7} Job.java[run]:190) - Job failed
java.lang.NullPointerException
    at org.apache.zeppelin.python.IPythonInterpreter.interpret(IPythonInterpreter.java:331)
    at org.apache.zeppelin.python.PythonInterpreter.interpret(PythonInterpreter.java:371)
    at org.apache.zeppelin.python.PythonInterpreter.bootStrapInterpreter(PythonInterpreter.java:557)
    at org.apache.zeppelin.python.PythonInterpreterPandasSql.open(PythonInterpreterPandasSql.java:73)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
INFO [2018-10-19 15:25:22,252] ({pool-2-thread-7} SchedulerFactory.java[jobFinished]:115) - Job 20181018-023104_253425563 finished by scheduler interpreter_128873761

In addition, the plain Python 2.7.5 and 3.6.5 environments work fine, but Anaconda 3.6.5 does not.
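
Since the behavior differs between Python environments, it may help to compare what each one exposes to Zeppelin. Below is a minimal diagnostic sketch (Python 3 only) that could be run in a %python paragraph under each environment. The module list is an assumption: pandas and pandasql are what %python.sql is documented to need, while IPython, jupyter_client and grpc are my guess at what makes Zeppelin switch to its IPython-based interpreter, which the stack trace passes through. The helper names check_modules and describe_environment are hypothetical, not part of Zeppelin.

```python
import importlib.util
import sys

# Assumed list of relevant modules (not taken from the original post):
# pandas/pandasql for %python.sql, the rest for Zeppelin's IPython mode.
MODULES = ["pandas", "pandasql", "IPython", "jupyter_client", "grpc"]


def check_modules(names):
    """Return {module name: bool} telling whether each module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def describe_environment(names=MODULES):
    """Collect the interpreter path, its version and module availability."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "modules": check_modules(names),
    }


report = describe_environment()
print("python executable:", report["executable"])
print("python version:", report["version"])
for name, found in sorted(report["modules"].items()):
    print("%-15s %s" % (name, "found" if found else "MISSING"))
```

If the Anaconda environment reports the IPython-related modules as found while the plain Python environments do not, one thing that might be worth trying is setting the interpreter property zeppelin.python.useIPython to false and restarting the python interpreter. This is only a hypothesis; nothing in the logs above confirms it.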

Thank you for your help.

...