I am having trouble switching from HDF5 to NetCDF as the storage format for a dictionary of pandas DataFrames that holds the input data and results of a pyomo model.
The current HDF5 save script, which works without problems, looks like this:
import pandas as pd

def save(prob, filename):
    with pd.HDFStore(filename, mode='w') as store:
        for name in prob._data.keys():
            store['data/'+name] = prob._data[name]
        for name in prob._result.keys():
            store['result/'+name] = prob._result[name]
where prob is a solved pyomo model instance.
We are migrating our project to PyPy for run-time performance reasons, and PyPy does not currently support h5py, so we also want to switch from HDF5 to NetCDF for storing our model instances.
For this I am using xarray Datasets, which appear to be compatible with the NetCDF format:
import xarray as xr

def save(prob, filename):
    ds = xr.Dataset()
    for name in prob._data.keys():
        ds['data/'+name] = prob._data[name]
    for name in prob._result.keys():
        ds['result/'+name] = prob._result[name]
    ds.to_netcdf(filename)
Although this looks closely analogous to the HDF5 script above, I get the following error:
    urbs.save(prob, os.path.join(result_dir, '{}.nc'.format(sce)))
  File "/home/scandas/nas/pypy_for_asinus/urbs_pypy/urbs/saveload.py", line 63, in save
    ds['data/'+name] = prob._data[name]
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataset.py", line 899, in __setitem__
    self.update({key: value})
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataset.py", line 2305, in update
    variables, coord_names, dims = dataset_update_method(self, other)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/merge.py", line 580, in dataset_update_method
    indexes=dataset.indexes)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/merge.py", line 434, in merge_core
    aligned = deep_align(coerced, join=join, copy=False, indexes=indexes)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/alignment.py", line 213, in deep_align
    exclude=exclude)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/alignment.py", line 164, in align
    new_obj = obj.reindex(copy=copy, **valid_indexers)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataarray.py", line 906, in reindex
    indexers=indexers, method=method, tolerance=tolerance, copy=copy)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataset.py", line 1812, in reindex
    tolerance, copy=copy)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/alignment.py", line 324, in reindex_variables
    int_indexer = get_indexer_nd(index, target, method, tolerance)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/indexing.py", line 117, in get_indexer_nd
    flat_indexer = index.get_indexer(flat_labels, **kwargs)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/pandas/core/indexes/multi.py", line 2042, in get_indexer
    indexer = self._engine.get_indexer(target)
  File "pandas/_libs/index.pyx", line 654, in pandas._libs.index.BaseMultiIndexCodesEngine.get_indexer
ValueError: operands could not be broadcast together with shapes (244,3) (4,) (244,3)
It looks as if, for some keys in prob._data, there is a shape mismatch (during concatenation?) that triggers the error when the item is assigned to the xarray Dataset. The HDF5 save routine, with analogous assignments, runs without any errors.
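One way to probe this hypothesis is to compare the index structure of the individual DataFrames. The operand shapes in the error, (244,3) and (4,), would be consistent with a three-level index being matched against a four-level key, but that reading is an assumption and not something the traceback confirms. The sketch below uses only pandas; the helper name index_summary and the two small frames are hypothetical stand-ins for entries of prob._data.

```python
import pandas as pd

def index_summary(frames):
    """Return {name: (index levels, column levels, shape)} for a dict of DataFrames."""
    return {
        name: (df.index.nlevels, df.columns.nlevels, df.shape)
        for name, df in frames.items()
    }

# Hypothetical stand-in for prob._data['commodity']: three index levels
commodity = pd.DataFrame(
    {"price": [6.0, 0.0]},
    index=pd.MultiIndex.from_tuples(
        [("Mid", "Biomass", "Stock"), ("Mid", "CO2", "Env")],
        names=["Site", "Commodity", "Type"],
    ),
)

# Hypothetical stand-in for prob._data['transmission']: four index levels
transmission = pd.DataFrame(
    {"eff": [0.9]},
    index=pd.MultiIndex.from_tuples(
        [("Mid", "South", "hvac", "Elec")],
        names=["Site In", "Site Out", "Transmission", "Commodity"],
    ),
)

summary = index_summary({"commodity": commodity, "transmission": transmission})
for name, info in summary.items():
    print(name, info)
```

Running the same helper on the real prob._data would show which tables differ in the number of index levels from the ones assigned before them.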
Edit: the prob._data dictionary looks as follows:
{'global_prop': value description
Property
CO2 limit 150000000 Limits the sum of all created (as calculated b..., 'site': area
Name
Mid 280000000
South 5000000000,
'commodity': price max maxperhour
Site Commodity Type
Mid Biomass Stock 6.0 inf inf
CO2 Env 0.0 inf inf
Coal Stock 7.0 inf inf
Elec Demand NaN NaN NaN
Gas Stock 27.0 inf inf
Hydro SupIm NaN NaN NaN
Lignite Stock 4.0 inf inf
Slack Stock 999.0 inf inf
Solar SupIm NaN NaN NaN
Wind SupIm NaN NaN NaN
South Biomass Stock 6.0 inf inf
CO2 Env 0.0 inf inf
Coal Stock 7.0 inf inf
Elec Demand NaN NaN NaN
Elec buy Buy 1.0 inf inf
Elec sell Sell 3.0 inf inf
Gas Stock 27.0 inf inf
Hydro SupIm NaN NaN NaN
Lignite Stock 4.0 inf inf
Slack Stock 999.0 inf inf
Solar SupIm NaN NaN NaN
Wind SupIm NaN NaN NaN,
'process': inst-cap cap-lo cap-up max-grad min-fraction inv-cost fix-cost var-cost wacc depreciation area-per-cap annuity-factor
Site Process
Mid Biomass plant 0 0 5000 1.200000 0.00 875000 28000 1.40 0.07 25 NaN 0.085811
Gas plant 0 0 80000 4.800000 0.25 450000 6000 1.62 0.07 30 NaN 0.080586
Hydro plant 0 0 1400 inf 0.00 1600000 20000 0.00 0.07 50 NaN 0.072460
Lignite plant 0 0 60000 0.900000 0.65 600000 18000 0.60 0.07 40 NaN 0.075009
Photovoltaics 0 15000 160000 inf 0.00 600000 12000 0.00 0.07 25 14000.0 0.085811
Slack powerplant 999999 999999 999999 inf 0.00 0 0 100.00 0.07 1 NaN 1.070000
Wind park 0 0 13000 inf 0.00 1500000 30000 0.00 0.07 25 NaN 0.085811
South Biomass plant 0 0 2000 1.200000 0.00 875000 28000 1.40 0.07 25 NaN 0.085811
Coal plant 0 0 100000 0.600000 0.50 600000 18000 0.60 0.07 40 NaN 0.075009
Feed-in 0 0 1500 inf 0.00 0 0 0.00 0.07 1 NaN 1.070000
Gas plant 0 0 100000 4.800000 0.25 450000 6000 1.62 0.07 30 NaN 0.080586
Hydro plant 0 0 0 inf 0.00 1600000 20000 0.00 0.07 50 NaN 0.072460
Photovoltaics 0 20000 600000 inf 0.00 600000 12000 0.00 0.07 25 14000.0 0.085811
Purchase 0 0 1500 inf 0.00 0 80 0.00 0.07 1 NaN 1.070000
Slack powerplant 999999 999999 999999 inf 0.00 0 0 999.00 0.07 1 NaN 1.070000
Wind park 0 0 200000 inf 0.00 1500000 30000 0.00 0.07 25 NaN 0.085811,
'process_commodity': ratio ratio-min
Process Commodity Direction
Biomass plant Biomass In 1.0000 NaN
CO2 Out 0.0000 NaN
Elec Out 0.3500 NaN
Coal plant CO2 Out 0.3000 NaN
Coal In 1.0000 1.4
Elec Out 0.4000 NaN
Feed-in Elec In 1.0000 NaN
Elec sell Out 1.0000 NaN
Gas plant CO2 Out 0.2000 NaN
Elec Out 0.6000 NaN
Gas In 1.0000 1.2
Hydro plant Elec Out 1.0000 NaN
Hydro In 1.0000 NaN
Lignite plant CO2 Out 0.4000 NaN
Elec Out 0.4000 NaN
Lignite In 1.0000 2.0
Photovoltaics Elec Out 1.0000 NaN
Solar In 1.0000 NaN
Purchase CO2 Out 0.0005 NaN
Elec Out 1.0000 NaN
Elec buy In 1.0000 NaN
Slack powerplant CO2 Out 0.0000 NaN
Elec Out 1.0000 NaN
Slack In 1.0000 NaN
Wind park Elec Out 1.0000 NaN
Wind In 1.0000 NaN,
'transmission': eff inv-cost fix-cost var-cost inst-cap cap-lo cap-up wacc depreciation annuity-factor
Site In Site Out Transmission Commodity
Mid South hvac Elec 0.9 1650000 16500 0 0 0 inf 0.07 40 0.075009
South Mid hvac Elec 0.9 1650000 16500 0 0 0 inf 0.07 40 0.075009,
'storage': inst-cap-c cap-lo-c cap-up-c inst-cap-p cap-lo-p cap-up-p eff-in eff-out inv-cost-p inv-cost-c fix-cost-p fix-cost-c var-cost-p var-cost-c wacc depreciation init discharge annuity-factor
Site Storage Commodity
Mid Hydrogen Elec 0 0 inf 0 0 inf 0.64 0.64 42000 6.54 0 0.327 0.02 0 0.07 50 0.5 0.000003 0.07246
Pump storage Elec 0 60000 inf 0 8000 inf 0.94 0.94 100000 0.00 20000 0.000 0.02 0 0.07 50 0.5 0.000000 0.07246
South Hydrogen Elec 0 0 inf 0 0 inf 0.64 0.64 42000 6.54 0 0.327 0.02 0 0.07 50 0.5 0.000003 0.07246
Pump storage Elec 0 163000 inf 0 500 inf 0.94 0.94 100000 0.00 20000 0.000 0.02 0 0.07 50 0.5 0.000000 0.07246,
'demand': Mid South North
Elec Elec Elec
t
0 0.000000 0.00000 0.00000
1 43102.490062 4877.39981 11001.19176,
'supim': Mid South North
Wind Solar Hydro Wind Solar Hydro Wind Solar Hydro
t
0 0.000000 0 0.000000 0.000000 0 0.000000 0.000000 0 0.000000
1 0.935265 0 0.416194 0.457772 0 0.353497 0.602583 0 0.651799,
'buy_sell_price': Elec buy Elec sell
t
0 0.00 0.00000
1 0.08 -0.02106,
'dsm': Empty DataFrame
Columns: [delay, eff, recov, cap-max-do, cap-max-up]
Index: []}
where ['global_prop', 'site', 'commodity', 'process', 'process_commodity', 'transmission', 'storage', 'demand', 'supim', 'buy_sell_price', 'dsm'] is the list of dict keys that I iterate over to build the xarray Dataset (which I then want to save to a NetCDF file). To be precise, the error occurs at the step name='transmission'.
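A possible direction, sketched here only as an untested assumption, is to flatten each table before handing it to xarray, so that no two tables share a MultiIndex that xarray would try to align. The helper flatten and the sample frames below are hypothetical; the sketch uses only pandas and does not itself write a NetCDF file.

```python
import pandas as pd

def flatten(df, name):
    """Return a copy of df with plain string columns and a table-specific row index."""
    flat = df.copy()
    # Join MultiIndex column labels such as ('Mid', 'Elec') into 'Mid.Elec'
    if flat.columns.nlevels > 1:
        flat.columns = [".".join(map(str, col)) for col in flat.columns]
    # Move all index levels into ordinary columns
    flat = flat.reset_index()
    # Give the new row index a name unique to this table
    flat.index.name = name + "_row"
    return flat

# Hypothetical stand-in for prob._data['demand']: MultiIndex columns, index named 't'
demand = pd.DataFrame(
    [[0.0, 0.0], [43102.49, 4877.40]],
    index=pd.Index([0, 1], name="t"),
    columns=pd.MultiIndex.from_tuples([("Mid", "Elec"), ("South", "Elec")]),
)

# Hypothetical stand-in for prob._data['transmission']: four index levels
transmission = pd.DataFrame(
    {"eff": [0.9]},
    index=pd.MultiIndex.from_tuples(
        [("Mid", "South", "hvac", "Elec")],
        names=["Site In", "Site Out", "Transmission", "Commodity"],
    ),
)

flat_demand = flatten(demand, "demand")
flat_transmission = flatten(transmission, "transmission")
print(list(flat_demand.columns))
print(list(flat_transmission.columns))
```

Each flattened frame could then presumably be converted with xr.Dataset.from_dataframe and written to its own group through the group argument of to_netcdf. Whether a backend that supports groups is available under PyPy is not something I have verified.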