Predictions with an ANN in the h2o package
26 September 2019

My ANN gives me a matrix of predictions instead of a single vector of predictions. The prediction "matrix" prob_pred looks like this:

predict         p0         p1
1       1 0.02001849 0.97998151
2       1 0.28398389 0.71601611
3       1 0.43267364 0.56732636
4       0 0.85280527 0.14719473
5       0 0.97794556 0.02205444
6       1 0.04342903 0.95657097
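To see how the three columns relate, here is a minimal sketch in base R that rebuilds the six printed rows as an ordinary data frame. The name `prob_pred_df` is an assumption made for illustration; it is not an H2O object, and the check covers only the rows shown above.

```r
# The six rows printed above, rebuilt as an ordinary data frame
# (prob_pred_df is a hypothetical name, not an H2OFrame)
prob_pred_df <- data.frame(
  predict = c(1, 1, 1, 0, 0, 1),
  p0 = c(0.02001849, 0.28398389, 0.43267364, 0.85280527, 0.97794556, 0.04342903),
  p1 = c(0.97998151, 0.71601611, 0.56732636, 0.14719473, 0.02205444, 0.95657097)
)

# p0 and p1 are complementary: each row sums to 1 (up to rounding)
row_sums <- prob_pred_df$p0 + prob_pred_df$p1
print(row_sums)

# Thresholding p1 at 0.5 gives a plain 0/1 vector
label_from_p1 <- as.integer(prob_pred_df$p1 >= 0.5)
print(label_from_p1)

# Check whether the thresholded labels agree with the predict column
print(all(label_from_p1 == prob_pred_df$predict))
```

For these six rows the thresholded labels agree with the `predict` column. That need not hold for every row: as far as I know, H2O chooses the label in `predict` for binary models with a threshold that maximizes F1 rather than a fixed 0.5, so the two can differ for probabilities near the cut-off.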

Is it correct to take the last column of the matrix (column 3) and say that if it is >= 0.5 the model predicted 1, and 0 otherwise? Here is my code:

# Splitting the dataset into the training set and test set
training_set = dataset[1:2740,]
test_set = dataset[2741:3748,]

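# Feature Scaling - very important for ANN as it reduces the computational effort
training_scaled = scale(training_set[-63])
training_set[-63] = training_scaled
# Scale the test set with the training set's means and standard deviations
test_set[-63] = scale(test_set[-63],
                      center = attr(training_scaled, "scaled:center"),
                      scale = attr(training_scaled, "scaled:scale"))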

# Fitting ANN to the Training set - h2o is used here for classification;
# neuralnet is for regressors, nnet is for models with only one hidden layer
library(h2o)
h2o.init(nthreads = -1) # -1 means use all available cores for the computation
classifier = h2o.deeplearning(y = 'Good.Bad.Stock',
                              training_frame = as.h2o(training_set),
                              activation = 'Rectifier',
                              hidden = c(30,30),
                              epochs = 100, # number of passes over the training data
                              train_samples_per_iteration = 1) # 1 = one sample per iteration (stochastic gradient descent); special values: -2 = auto-tuning, 0 = one epoch, -1 = all available data

# Predicting the Test set results
prob_pred = h2o.predict(classifier, newdata = as.h2o(test_set[-63])) # returns the predicted class plus the probabilities p0 and p1, so the next lines turn p1 into a class
prob_pred
y_pred = (prob_pred[, 3] >= 0.5) # classify as 1 when p1 >= 0.5
y_pred = as.vector(y_pred) # convert to a standard vector - an H2O frame is not accepted by table()

# Maybe also apply a summary function (if possible) to find out which independent variables are the most important ones

# Making the Confusion Matrix
cm_ann = table(test_set[, 63], y_pred)
cm_ann
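The thresholding and confusion-matrix steps above can be tried without an H2O cluster. The sketch below uses base R only; the vectors `actual` and `p1` hold made-up illustrative values, not results from the model in this post.

```r
# Hypothetical true labels and predicted probabilities of class 1
# (illustrative values only, not output of the model above)
actual <- c(1, 0, 1, 0, 0, 1, 1, 0)
p1     <- c(0.98, 0.72, 0.57, 0.15, 0.02, 0.96, 0.40, 0.30)

# Threshold at 0.5 and fix the factor levels so both classes always
# appear in the table, even if one is never predicted
predicted <- factor(as.integer(p1 >= 0.5), levels = c(0, 1))
actual_f  <- factor(actual, levels = c(0, 1))

# Rows are true classes, columns are predicted classes
cm <- table(actual = actual_f, predicted = predicted)
print(cm)

# Accuracy is the share of observations on the diagonal
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

Naming the two dimensions in `table()` makes it harder to mix up rows and columns when reading the result, and fixing the factor levels keeps the matrix 2 x 2 even when the classifier predicts only one class.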

Thanks!

...