Time series data - "only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices"
/ September 24, 2019

I am learning time series analysis. The data comes from www.quandl.com.

I am using an ARIMA model to forecast future values. However, I think I made a mistake in my code, and I cannot tell what I missed. I have checked the code, and as far as I can tell it is correct.

This is my code:

import quandl
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.arima_model import ARIMA

data = quandl.get('EIA/PET_RWTC_D')

# converting the dataframe into series for Time Series Analysis
dataset = pd.Series(data = data['Value'], index = data.index)

# I am creating a hold_out_data. This data is removed from the training data. This is the test data.
hold_out_data = dataset['2019-09']
train_data = dataset.drop(dataset['2019-09'].index, axis = 0)

# ARIMA model to predict the hold out data
model = ARIMA(endog = train_data, order = (5, 1, 0)).fit(transparams = False)

# predicting the values for the dates 2019-09-02 to 2019-09-16. 
prediction = model.predict('2019-09-02', '2019-09-16')

The error occurs when I run the last line:

prediction = model.predict('2019-09-02', '2019-09-16')

Full output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~\Anaconda3\envs\tensorflow\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine._date_check_type()

KeyError: '2019-09-02'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~\Anaconda3\envs\tensorflow\lib\site-packages\pandas\core\indexes\datetimes.py in get_loc(self, key, method, tolerance)
   1056         try:
-> 1057             return Index.get_loc(self, key, method, tolerance)
   1058         except (KeyError, ValueError, TypeError):

~\Anaconda3\envs\tensorflow\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2898             except KeyError:
-> 2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2900         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine._date_check_type()

KeyError: '2019-09-02'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 1567382400000000000

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~\Anaconda3\envs\tensorflow\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

KeyError: Timestamp('2019-09-02 00:00:00')

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 1567382400000000000

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~\Anaconda3\envs\tensorflow\lib\site-packages\pandas\core\indexes\datetimes.py in get_loc(self, key, method, tolerance)
   1069                     stamp = stamp.tz_localize(self.tz)
-> 1070                 return Index.get_loc(self, stamp, method, tolerance)
   1071             except KeyError:

~\Anaconda3\envs\tensorflow\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2898             except KeyError:
-> 2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2900         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

KeyError: Timestamp('2019-09-02 00:00:00')

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~\Anaconda3\envs\tensorflow\lib\site-packages\statsmodels\tsa\base\tsa_model.py in _get_index_label_loc(self, key, base_index)
    425                 if not isinstance(key, (int, long, np.integer)):
--> 426                     loc = self.data.row_labels.get_loc(key)
    427                 else:

~\Anaconda3\envs\tensorflow\lib\site-packages\pandas\core\indexes\datetimes.py in get_loc(self, key, method, tolerance)
   1071             except KeyError:
-> 1072                 raise KeyError(key)
   1073             except ValueError as e:

KeyError: '2019-09-02'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-366-2fff9dcc06e2> in <module>
----> 1 prediction = model.predict('2019-09-02', '2019-09-16')

~\Anaconda3\envs\tensorflow\lib\site-packages\statsmodels\base\wrapper.py in wrapper(self, *args, **kwargs)
     93             obj = data.wrap_output(func(results, *args, **kwargs), how[0], how[1:])
     94         elif how:
---> 95             obj = data.wrap_output(func(results, *args, **kwargs), how)
     96         return obj
     97 

~\Anaconda3\envs\tensorflow\lib\site-packages\statsmodels\tsa\arima_model.py in predict(self, start, end, exog, typ, dynamic)
   1816     def predict(self, start=None, end=None, exog=None, typ='linear',
   1817                 dynamic=False):
-> 1818         return self.model.predict(self.params, start, end, exog, typ, dynamic)
   1819     predict.__doc__ = _arima_results_predict
   1820 

~\Anaconda3\envs\tensorflow\lib\site-packages\statsmodels\tsa\arima_model.py in predict(self, params, start, end, exog, typ, dynamic)
   1163         if isinstance(start, (string_types, datetime)):
   1164             # start = _index_date(start, self.data.dates)
-> 1165             start, _, _ = self._get_index_label_loc(start)
   1166             if isinstance(start, slice):
   1167                 start = start.start

~\Anaconda3\envs\tensorflow\lib\site-packages\statsmodels\tsa\base\tsa_model.py in _get_index_label_loc(self, key, base_index)
    456                 index_was_expanded = False
    457             except:
--> 458                 raise e
    459         return loc, index, index_was_expanded
    460 

~\Anaconda3\envs\tensorflow\lib\site-packages\statsmodels\tsa\base\tsa_model.py in _get_index_label_loc(self, key, base_index)
    420         try:
    421             loc, index, index_was_expanded = (
--> 422                 self._get_index_loc(key, base_index))
    423         except KeyError as e:
    424             try:

~\Anaconda3\envs\tensorflow\lib\site-packages\statsmodels\tsa\base\tsa_model.py in _get_index_loc(self, key, base_index)
    370             #   (as of Pandas 0.22)
    371             except (IndexError, ValueError) as e:
--> 372                 raise KeyError(str(e))
    373             loc = key
    374         else:

KeyError: 'only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices'
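The wording of this final KeyError is NumPy's standard message for indexing an array with something that is not a valid index, such as a string. Here is a minimal sketch that reproduces only the message. It assumes, based on the traceback, that the date string ends up being used as a positional index once the label lookup fails. The array `values` is a hypothetical stand-in, not part of the original code.

```python
import numpy as np

# Hypothetical stand-in for the array that ends up being indexed positionally.
values = np.arange(5)

try:
    values['2019-09-02']  # a string is not a valid NumPy index
except IndexError as err:
    message = str(err)

print(message)
```

In the traceback, `_get_index_loc` catches an `IndexError` or `ValueError` and re-raises it as `KeyError(str(e))`, which would explain why a NumPy indexing message appears inside a KeyError.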

I searched and found many answers on StackOverflow, but none of them deal with time series analysis. Most of them suggest converting the index from float64 to int32. However, my index is a DateTime index, and I cannot convert it to integers.
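Because the index is a DateTime index, two things may be worth checking: whether the training index carries a frequency, and whether the requested start date exists in it. The sketch below uses a synthetic business-day series, not the Quandl data, and assumes two market holidays are missing (2019-07-04 and 2019-09-02). The names `series`, `train` and `regular` are made up for illustration. Whether a missing frequency is the actual cause of the error above is an assumption, not something confirmed here.

```python
import pandas as pd

# Synthetic stand-in for the Quandl series: business days with two
# holidays removed (an assumption that mimics gaps in market data).
all_days = pd.bdate_range('2019-06-03', '2019-09-16')
holidays = pd.to_datetime(['2019-07-04', '2019-09-02'])
index = all_days[~all_days.isin(holidays)]
series = pd.Series(range(len(index)), index=index, dtype='float64')

# Same split as in the question: September 2019 is held out.
hold_out = series.loc['2019-09']
train = series.drop(hold_out.index)

start = pd.Timestamp('2019-09-02')
print(train.index.freq)      # frequency attached to the training index
print(start in train.index)  # is the requested start date a known label?

# One possible way to regularise the index: reindex to business days
# and forward-fill the gaps that the reindexing introduces.
regular = train.asfreq('B').ffill()
print(regular.index.freq)
```

If the training index has no frequency, requesting out-of-sample dates by label can fail, because there is no rule for extending the index past its last observation. Forecasting a number of steps ahead rather than by date is another option in that case.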

Can anyone help me find where I went wrong?

Note: you may need an API key to access datasets from quandl.

Thanks in advance :)

...