Question

Visually, my SARIMAX forecasts appear to be off by 1 period (late), but I can't figure out why. From what I've written, it should produce a forecast for the same date/time as the testing data point that the loop is iterating over. The training data is always 1 period behind the testing index.

The RMSE between the testing data and the predictions is 0.378. Is that a bad result? I wouldn't be asking if the plot didn't look so noticeably off.
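For context on the RMSE figure, a rough sketch of one way to judge it: compare the model against a naive persistence forecast, which predicts each point as the previous actual value. The helper names (`rmse`, `naive_forecast`) and all numbers below are made up for illustration and are not taken from the data in this question. If the model's RMSE is close to the persistence RMSE, the one-step forecasts are behaving roughly like a lagged copy of the actuals, which would look like a 1-period shift on a plot.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual, predicted = list(actual), list(predicted)
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

def naive_forecast(last_training_value, actual):
    """Persistence baseline: the forecast for step t is the actual value at t-1."""
    actual = list(actual)
    return [last_training_value] + actual[:-1]

# Made-up closing prices, for illustration only
last_train = 100.0
actuals = [100.5, 100.2, 100.9, 101.3, 101.0]
model_preds = [100.1, 100.4, 100.3, 100.8, 101.2]  # hypothetical model output

baseline_rmse = rmse(actuals, naive_forecast(last_train, actuals))
model_rmse = rmse(actuals, model_preds)
print("model RMSE:    %.4f" % model_rmse)
print("baseline RMSE: %.4f" % baseline_rmse)
```

Both helpers accept any iterable of numbers, so a pandas Series such as `testing` or `predictions` could be passed directly.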

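import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.statespace.sarimax import SARIMAX

# timeframe : custom class that holds (among other things):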
#    Frequency (pandas-compatible string representing periodicity)
#    Data (pandas dataframe where cols = close, open, high, low, volume, rsi; indexes = symbol, time)
#    Sarimax (dict that holds key (symbol) => value (dict of ARIMA order tuples)) generated by earlier script
# s : symbol name (ex., 'SPY')

forecasts = 25
def forecast_data(timeframe, series):
    data = series.asfreq(timeframe.Frequency)
    data.interpolate(inplace=True)
    # Limit datasize due to processing time (some models may fail due to too few nobs!)
    data = data.tail(1000)
    horizon = len(data) - forecasts
    return data[:horizon], data[horizon:]

# Datasets (training, testing, exog)
data = timeframe.Data.loc[s].close
training, testing = forecast_data(timeframe, data)

# Am I using exog correctly? I want to incorporate RSI into the predictive model
exog = timeframe.Data.loc[s].rsi
exog_training, exog_testing = forecast_data(timeframe, exog)

# Walk-Forward Forecasting
predictions = testing.copy(deep=True)
i = 1
print("{} SARIMAX {}x{} : {} rows".format(timeframe, timeframe.Sarimax[s]['order'], timeframe.Sarimax[s]['seasonal_order'], len(training)))
for index, value in testing.items():  # Series.iteritems() was removed in pandas 2.0
    print("   Forecasting {}/{} ({} @ {})".format(i, forecasts, '%.4f' % value, index), end='\r')
    # Fit Model
    fit = SARIMAX(training, order=timeframe.Sarimax[s]['order'], seasonal_order=timeframe.Sarimax[s]['seasonal_order'], enforce_stationarity=False, enforce_invertibility=False, exog=exog_training).fit()
    # one step forecast at current testing date from past training data
    # Am I using exog correctly here?
    predictions.loc[index] = fit.forecast(exog=pd.DataFrame(exog_training.tail(1))).iloc[0]
    # move testing data into training data for the next fit + forecast
    training.loc[index] = value
    exog_training[index] = exog_testing[index]
    i += 1
print('')

# Data/Fit Comparison
plt.figure(figsize=(16, 5))
plt.xlabel("Timeframe: {}".format(timeframe))
plt.ylabel("Price")
# Trim training plot for better visual inspection
training = training[-forecasts:]
plt.ylim(bottom=min(training), top=max(training))
training.plot(label=s + " Actuals", marker='o')
predictions.plot(label=s + " Predictions", marker='o')
plt.legend(loc='upper left')
ax = plt.gca()
ax.grid(which='major', alpha=0.5, linestyle='--')
ax.grid(which='minor', alpha=0.5, linestyle=':')
plt.show()
print(fit.summary())
fit.plot_diagnostics()
plt.show()
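On the `exog` question raised in the code comments: `forecast()` on a SARIMAX fit with regressors expects `exog` values for the out-of-sample period being forecast, not the last in-sample row. Since RSI at time t is not known before the close at t, one option is to lag the regressor by one period, so that the row at t holds the RSI observed at t-1. A minimal pandas-only sketch of that alignment follows, using made-up data and a hypothetical helper `lagged_exog` that is not part of the script above.

```python
import pandas as pd

def lagged_exog(exog, periods=1):
    """Shift a regressor so the row at time t holds the value observed at t - periods."""
    return exog.shift(periods).dropna()

# Made-up RSI values on a daily index, for illustration only
idx = pd.date_range("2021-01-04", periods=6, freq="D")
rsi = pd.Series([48.0, 52.0, 55.0, 51.0, 47.0, 50.0], index=idx, name="rsi")

rsi_lag1 = lagged_exog(rsi)

# Split so that the last row plays the role of the single forecast step
exog_train = rsi_lag1.iloc[:-1]
exog_next = rsi_lag1.iloc[-1:]  # one row, indexed by the forecast timestamp

print(exog_next)
```

Under this assumption, the endogenous series would need to be trimmed to the same index as `exog_train` before fitting, and `exog_next` is the row that would be passed as `exog` to `forecast()`.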

I would have posted this on Cross Validated, but that place looks like a ghost town.

SARIMAX forecast seems to be off by 1 period every time
