Consider the following neural network for text classification:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers

model = Sequential()
# Input layer: 8000-dimensional document vectors
model.add(layers.Dense(2500, activation="relu", input_shape=(8000,)))
# Hidden layers, each followed by dropout for regularization
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1000, activation="relu"))
model.add(layers.Dropout(0.2))
# Output layer: one softmax unit for each of the 20 classes
model.add(layers.Dense(20, activation="softmax"))
model.summary()

model.compile(loss="categorical_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])

model.fit(np.array(vectorized_training), np.array(y_train_neralnettr),
          batch_size=2000,
          epochs=2000,
          verbose=1,
          validation_data=(np.array(vectorized_validation),
                           np.array(y_validation_neralnet)))
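The arrays passed to fit are not defined above. For context, here is a minimal sketch of how they might be produced with Keras utilities, assuming bag-of-words vectorization over an 8000-word vocabulary and 20 one-hot encoded classes; train_texts, train_labels, validation_texts, and validation_labels are hypothetical names not present in the original code.

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import to_categorical

# Hypothetical raw inputs: lists of strings plus integer labels in [0, 20)
tokenizer = Tokenizer(num_words=8000)
tokenizer.fit_on_texts(train_texts)

# Multi-hot bag-of-words matrices of shape (n_samples, 8000)
vectorized_training = tokenizer.texts_to_matrix(train_texts, mode="binary")
vectorized_validation = tokenizer.texts_to_matrix(validation_texts, mode="binary")

# One-hot targets of shape (n_samples, 20), as categorical_crossentropy expects
y_train_neralnettr = to_categorical(train_labels, num_classes=20)
y_validation_neralnet = to_categorical(validation_labels, num_classes=20)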
Here are some of its characteristics. Would it be useful to train for 2000 epochs? I have seen that some of the examples on playground.tensorflow.org need several hundred epochs to produce a decent decision boundary. (A sketch of choosing the epoch count empirically follows the training log below.)
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 2500)              20002500
_________________________________________________________________
dropout_1 (Dropout)          (None, 2500)              0
_________________________________________________________________
dense_2 (Dense)              (None, 1000)              2501000
_________________________________________________________________
dropout_2 (Dropout)          (None, 1000)              0
_________________________________________________________________
dense_3 (Dense)              (None, 20)                20020
=================================================================
Total params: 22,523,520
Trainable params: 22,523,520
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/2000
60000/60000 [==============================] - 62s 1ms/step - loss: 2.6002 - accuracy: 0.2925 - val_loss: 1.8567 - val_accuracy: 0.4805
Epoch 2/2000
60000/60000 [==============================] - 62s 1ms/step - loss: 1.5686 - accuracy: 0.5415 - val_loss: 1.6707 - val_accuracy: 0.5092
Epoch 3/2000
60000/60000 [==============================] - 62s 1ms/step - loss: 1.2384 - accuracy: 0.6291 - val_loss: 1.7539 - val_accuracy: 0.4990
Epoch 4/2000
60000/60000 [==============================] - 62s 1ms/step - loss: 1.0349 - accuracy: 0.6880 - val_loss: 1.8724 - val_accuracy: 0.4882
Epoch 5/2000
60000/60000 [==============================] - 62s 1ms/step - loss: 0.8591 - accuracy: 0.7398 - val_loss: 2.0041 - val_accuracy: 0.4804
Epoch 6/2000
60000/60000 [==============================] - 62s 1ms/step - loss: 0.6973 - accuracy: 0.7931 - val_loss: 2.1676 - val_accuracy: 0.4723
Epoch 7/2000
60000/60000 [==============================] - 63s 1ms/step - loss: 0.5416 - accuracy: 0.8449 - val_loss: 2.3530 - val_accuracy: 0.4663
Epoch 8/2000
60000/60000 [==============================] - 62s 1ms/step - loss: 0.4071 - accuracy: 0.8875 - val_loss: 2.5339 - val_accuracy: 0.4652
Epoch 9/2000
60000/60000 [==============================] - 63s 1ms/step - loss: 0.2967 - accuracy: 0.9244 - val_loss: 2.7374 - val_accuracy: 0.4582
Epoch 10/2000
60000/60000 [==============================] - 62s 1ms/step - loss: 0.2122 - accuracy: 0.9503 - val_loss: 2.9066 - val_accuracy: 0.4633
Epoch 11/2000
60000/60000 [==============================] - 63s 1ms/step - loss: 0.1530 - accuracy: 0.9663 - val_loss: 3.0738 - val_accuracy: 0.4588
Epoch 12/2000
60000/60000 [==============================] - 62s 1ms/step - loss: 0.1149 - accuracy: 0.9772 - val_loss: 3.2101 - val_accuracy: 0.4583
Epoch 13/2000
60000/60000 [==============================] - 63s 1ms/step - loss: 0.0866 - accuracy: 0.9833 - val_loss: 3.3498 - val_accuracy: 0.4551
Epoch 14/2000
60000/60000 [==============================] - 62s 1ms/step - loss: 0.0685 - accuracy: 0.9872 - val_loss: 3.4619 - val_accuracy: 0.4567
Epoch 15/2000
60000/60000 [==============================] - 62s 1ms/step - loss: 0.0573 - accuracy: 0.9898 - val_loss: 3.5577 - val_accuracy: 0.4588
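Rather than fixing 2000 epochs up front, the useful number of epochs can be found empirically by stopping once the validation loss stops improving. A minimal sketch using Keras's EarlyStopping callback, assuming the same model and arrays as above:

from tensorflow.keras.callbacks import EarlyStopping

# Stop when val_loss has not improved for 3 consecutive epochs,
# and roll the model back to the best weights seen so far
early_stop = EarlyStopping(monitor="val_loss",
                           patience=3,
                           restore_best_weights=True)

model.fit(np.array(vectorized_training), np.array(y_train_neralnettr),
          batch_size=2000,
          epochs=2000,  # upper bound only; the callback decides when to stop
          verbose=1,
          validation_data=(np.array(vectorized_validation),
                           np.array(y_validation_neralnet)),
          callbacks=[early_stop])

On the log above, such a callback would halt training around epoch 5, since val_loss bottoms out at epoch 2 and only rises afterwards.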
Would reducing the size of the training set help?
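Whether a smaller training set changes the picture can be checked directly by subsampling before fitting. A minimal sketch, assuming the arrays above; the 10000-sample subset size and the 50-epoch budget are arbitrary choices for illustration:

X_train = np.array(vectorized_training)
y_train = np.array(y_train_neralnettr)

# Draw a random subset of the training data without replacement
rng = np.random.default_rng(seed=0)
subset = rng.choice(len(X_train), size=10000, replace=False)

model.fit(X_train[subset], y_train[subset],
          batch_size=2000,
          epochs=50,  # arbitrary illustrative budget
          verbose=1,
          validation_data=(np.array(vectorized_validation),
                           np.array(y_validation_neralnet)))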