For multi-index DataFrames, Pandas writes the column names to the database in the form ('index1[i]', 'index2[i]').
Trying to save such a frame to the database leads to the following error:
Traceback:
File "/usr/local/lib/python3.7/site-packages/django/core/handlers/exception.py" in inner
34. response = get_response(request)
File "/usr/local/lib/python3.7/site-packages/django/core/handlers/base.py" in _get_response
126. response = self.process_exception_by_middleware(e, request)
File "/usr/local/lib/python3.7/site-packages/django/core/handlers/base.py" in _get_response
124. response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/usr/local/lib/python3.7/site-packages/django/views/decorators/csrf.py" in wrapped_view
54. return view_func(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/rest_framework/viewsets.py" in view
116. return self.dispatch(request, *args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/rest_framework/views.py" in dispatch
495. response = self.handle_exception(exc)
File "/usr/local/lib/python3.7/site-packages/rest_framework/views.py" in handle_exception
455. self.raise_uncaught_exception(exc)
File "/usr/local/lib/python3.7/site-packages/rest_framework/views.py" in dispatch
492. response = handler(request, *args, **kwargs)
File "/code/core/store/views.py" in from_reducer
50. serializer.save(self.get_object(), background_job)
File "/code/core/store/serializers.py" in save
275. self.context.get("request").tenant,
File "/code/core/store/serializers.py" in _XXXX_ingestion_serializer_async_save
71. XXXX.dataframe = dataframe
File "/code/core/store/models.py" in dataframe
88. con=engine,
File "/usr/local/lib/python3.7/site-packages/pandas/core/generic.py" in to_sql
2130. dtype=dtype)
File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py" in to_sql
450. chunksize=chunksize, dtype=dtype)
File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py" in to_sql
1127. table.insert(chunksize)
File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py" in insert
641. self._execute_insert(conn, keys, chunk_iter)
File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py" in _execute_insert
616. conn.execute(self.insert_statement(), data)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py" in execute
948. return meth(self, multiparams, params)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py" in _execute_on_connection
269. return connection._execute_clauseelement(self, multiparams, params)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py" in _execute_clauseelement
1060. compiled_sql, distilled_params
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py" in _execute_context
1200. context)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py" in _handle_dbapi_exception
1416. util.reraise(*exc_info)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py" in reraise
249. raise value
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py" in _execute_context
1170. context)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/dialects/postgresql/psycopg2.py" in do_executemany
683. cursor.executemany(statement, parameters)
Exception Type: KeyError at /xxxxx/94feec54-e74e-4596-b42d-f52a53eb631a/yyyyy/
Exception Value: "('first', 'second'"
Here SF is the first index level and monkeys is the second. The closing parenthesis is truncated.
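The truncated key in the KeyError is consistent with a placeholder scanner that ends a `%(name)s` key at the first closing parenthesis it meets. The sketch below only illustrates that failure mode in plain Python. `naive_placeholder_key`, the `INSERT` statement and the parameter dictionary are hypothetical, and whether the database driver really scans placeholders this way is an assumption, not something the traceback proves.

```python
def naive_placeholder_key(statement):
    """Return the key of the first %(key)s placeholder.

    Hypothetical scanner: it stops at the FIRST ")" and ignores nesting,
    which is the suspected (unconfirmed) cause of the truncated key.
    """
    start = statement.index("%(") + 2
    end = statement.index(")", start)
    return statement[start:end]


# A tuple column label turned into a string, as in the CREATE TABLE statement.
column = str(("SF", "monkeys"))  # "('SF', 'monkeys')"
statement = "INSERT INTO data VALUES (%({})s)".format(column)
params = {column: -1.017560}

key = naive_placeholder_key(statement)  # the closing parenthesis is lost

try:
    params[key]
except KeyError as exc:
    # Same shape as the reported exception value: a key without its final ")".
    error = str(exc)
```

If the assumption holds, any column name containing `)` would hit the same lookup failure, regardless of whether it came from a MultiIndex.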
I am using the following df to test it:
first          SF                               LA
second    monkeys  orangutans    girafes   monkeys  orangutans   girafes
A       -1.017560    0.354499  -0.446993  1.219814    0.098890  0.717737
B        0.130512   -0.374556   0.536788  0.896989    2.266275  1.539214
C        0.444351    0.155903  -0.238987  1.971802    0.702577  1.215963
I tried to trace this through the stack, and I can confirm that Pandas generates the following statement:
CREATE TABLE "data" (
"('SF', 'monkeys')" REAL,
"('SF', 'orangutans')" REAL,
"('SF', 'girafes')" REAL,
"('LA', 'monkeys')" REAL,
"('LA', 'orangutans')" REAL,
"('LA', 'girafes')" REAL
)
But it does not seem to work with SQLAlchemy.
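One way to sidestep the problem, offered as a sketch and not as a confirmed fix, is to flatten the tuple labels into plain strings before calling `to_sql`, so that no column name contains parentheses. `flatten_labels` and the `_` separator are hypothetical choices made for illustration; the labels are written out as plain tuples so the example runs without Pandas.

```python
def flatten_labels(labels, sep="_"):
    """Join each tuple label into one string; other labels become str."""
    flat = []
    for label in labels:
        if isinstance(label, tuple):
            flat.append(sep.join(str(part) for part in label))
        else:
            flat.append(str(label))
    return flat


# The column labels of the test frame, written out as plain tuples.
labels = [
    (city, animal)
    for city in ("SF", "LA")
    for animal in ("monkeys", "orangutans", "girafes")
]

flat = flatten_labels(labels)
```

With a real frame this would be applied as `dataframe.columns = flatten_labels(dataframe.columns)` before the `to_sql` call, since iterating over a MultiIndex yields tuples. The separator must not produce duplicate names after joining.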
Pandas code:
dataframe.to_sql(
    '"{}"'.format(self._table_name),
    index=False,
    chunksize=50000,
    con=engine,
)
SQLAlchemy engine:
engine = sqlalchemy.create_engine(
    "postgresql://{}:{}@{}:{}/{}".format(
        _warehouse_db_settings["USER"],
        _warehouse_db_settings["PASSWORD"],
        _warehouse_db_settings["HOST"],
        _warehouse_db_settings["PORT"],
        _warehouse_db_settings["NAME"],
    )
)