Keras model doesn't seem to load weights? - PullRequest
0 votes
/ 26 April 2020

I can get the model to train effectively. However, I'm now having a problem with loading the model and re-testing it on new data.

I was able to verify that my model works well on new data by loading the new data into the model before saving it and making a prediction on it. (See the code for an example.)

I did try to follow the Keras documentation on how to properly save and load a model (it seemed easy enough), but it doesn't appear to be working correctly for me. I would appreciate any help.
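For reference, this is the round-trip pattern I understood from the docs, boiled down to a toy model (the tiny Dense network and the shapes here are just stand-ins, not my real network):

import numpy as np
import tensorflow as tf

# Toy model purely to exercise the save/load round trip from the docs.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(2, activation='softmax'),
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')

model.save('roundtrip_check.h5')
restored = tf.keras.models.load_model('roundtrip_check.h5')

# The same input should produce identical predictions if the weights
# really made the round trip.
x = np.random.random((3, 4))
assert np.allclose(model.predict(x), restored.predict(x))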

Below is the code I use to train and test the model, as well as to save and load it.

# Assumed imports for this snippet; helper functions such as import_dataset,
# determine_if_spam, split_dataset and loadData_Tokenizer come from the
# project's own modules.
import numpy as np
import tensorflow as tf
from sklearn import metrics
from tensorflow.keras.models import Model
from tensorflow.keras.layers import (Input, Embedding, Conv1D, MaxPooling1D,
                                     Concatenate, Dropout, Flatten, Dense)


def Build_Model_CNN_Text(word_index, embeddings_index, nclasses,
                         MAX_SEQUENCE_LENGTH=500, EMBEDDING_DIM=50,
                         dropout=0.5):
    """
    Function to build a Convolutional Neural Network (CNN). Uses the
    rectified linear unit (ReLU) activation function, along with the softmax
    activation function at the end and sparse_categorical_crossentropy loss
    with the adam optimizer. The original of this code came from Swayam
    Mittal's Medium.com article "Deep Learning Techniques for Text
    Classification."
    https://medium.com/datadriveninvestor/deep-learning-techniques-for-text-
    classification-9392ca9492c7

    Parameters
    ----------
    word_index : dict
        The word index that was created from the tokenizer function.
    embeddings_index : dict
        The embedding index mapping words to vectors.
    nclasses : int
        The number of classes that are provided to the model.
    MAX_SEQUENCE_LENGTH : int
        Max length of each sequence, by default 500.
    EMBEDDING_DIM : int
        Dimension for word embedding, by default 50.
    dropout : float, optional
        Used to help prevent overfitting of the model, by default 0.5.

    Returns
    -------
    Model
        An untrained CNN model that is ready to fit data.
    """
    # Build the embedding matrix; words not found in embeddings_index keep
    # their random initialization.
    embedding_matrix = np.random.random((len(word_index) + 1, EMBEDDING_DIM))
    for word, i in word_index.items():
        embedding_vector = embeddings_index.get(word)
        if embedding_vector is not None:
            if len(embedding_matrix[i]) != len(embedding_vector):
                print("could not broadcast input array from shape",
                      str(len(embedding_matrix[i])),
                      "into shape", str(len(embedding_vector)),
                      " Please make sure your EMBEDDING_DIM matches the"
                      " dimension of the embedding (GloVe) file.")
                exit(1)
            embedding_matrix[i] = embedding_vector
    embedding_layer = Embedding(len(word_index) + 1,
                                EMBEDDING_DIM,
                                weights=[embedding_matrix],
                                input_length=MAX_SEQUENCE_LENGTH,
                                trainable=True)
    # applying a more complex convolutional approach
    convs = []
    filter_sizes = []
    layer = 5
    print("Filter  ", layer)
    for fl in range(0, layer):
        filter_sizes.append((fl + 2))
    node = 128
    sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
    embedded_sequences = embedding_layer(sequence_input)
    # Parallel convolution + pooling branches, one per filter size.
    for fsz in filter_sizes:
        l_conv = Conv1D(node, kernel_size=fsz, activation='relu')(
            embedded_sequences)
        l_pool = MaxPooling1D(5)(l_conv)
        convs.append(l_pool)
    l_merge = Concatenate(axis=1)(convs)
    l_cov1 = Conv1D(node, 5, activation='relu')(l_merge)
    l_cov1 = Dropout(dropout)(l_cov1)
    l_pool1 = MaxPooling1D(5)(l_cov1)
    l_cov2 = Conv1D(node, 5, activation='relu')(l_pool1)
    l_cov2 = Dropout(dropout)(l_cov2)
    l_pool2 = MaxPooling1D(30)(l_cov2)
    l_flat = Flatten()(l_pool2)
    l_dense = Dense(1024, activation='relu')(l_flat)
    l_dense = Dropout(dropout)(l_dense)
    l_dense = Dense(512, activation='relu')(l_dense)
    l_dense = Dropout(dropout)(l_dense)
    preds = Dense(nclasses, activation='softmax')(l_dense)
    model = Model(sequence_input, preds)
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer='adam',
                  metrics=['acc'])
    return model

def use_cnn_model():
    """
    Function used to train the CNN model. It starts by gathering the dataset
    by using the import_dataset function from the make_dataset package. The
    determine_if_spam function is then called on the dataframe and loaded into
    the 'data' variable. The data is then split and tokenized and the CNN model
    is built.
    The model is then fit with the data and number of epochs, batch_size,
    and verbosity level, returning a trained model. The trained model is then
    used to predict on the test data and the output matrix is printed to the
    terminal. The model is then saved in an .h5 format.
    """
    dirty_data = import_dataset()
    data = determine_if_spam(dirty_data)
    data = data.dropna(subset=['body'])
    split_data = split_dataset(data)
    X_train = split_data[0]
    X_test = split_data[1]
    y_train = split_data[2]
    y_test = split_data[3]
    X_train_Glove, X_test_Glove, word_index, embeddings_index = loadData_Tokenizer(
        X_train, X_test)
    model_CNN = Build_Model_CNN_Text(word_index, embeddings_index, 20)
    model_CNN.summary()
    model_CNN.fit(X_train_Glove, y_train,
                  validation_data=(X_test_Glove, y_test),
                  epochs=10,
                  batch_size=128,
                  verbose=2)
    model_CNN.save('saved_model.h5')
    predicted = model_CNN.predict(X_test_Glove)
    predicted = np.argmax(predicted, axis=1)
    print(metrics.classification_report(y_test, predicted))


    new_dirty_data = import_dataset()
    new_data = determine_if_spam(new_dirty_data)
    new_data = new_data.dropna(subset=['body'])
    new_split_data = split_dataset(new_data)
    new_X_train = new_split_data[0]
    new_X_test = new_split_data[1]
    new_y_train = new_split_data[2]
    new_y_test = new_split_data[3]
    new_X_train_Glove, new_X_test_Glove, new_word_index, \
        new_embeddings = loadData_Tokenizer(new_X_train, new_X_test)
    new_predicted = model_CNN.predict(new_X_test_Glove)
    new_predicted = np.argmax(new_predicted, axis=1)
    print("New data prediction: \n)
    print(metrics.classification_report(new_y_test, new_predicted))

def loaded_model():
"""
Function to load the trained model to use on the new data.

Returns
-------
Model
    The trained model that is saved in the saved_model.h5 file.
"""
new_model = tf.keras.models.load_model('saved_model.h5')
return new_model

def main():
    """
    Main function that is used to call all other functions.
    """
    model = loaded_model()
    dirty_data = import_dataset()
    data = determine_if_spam(dirty_data)
    data = data.dropna(subset=['body'])
    split_data = split_dataset(data)
    X_train = split_data[0]
    X_test = split_data[1]
    y_train = split_data[2]
    y_test = split_data[3]
    X_train_Glove, X_test_Glove, word_index, \
      embeddings_index = loadData_Tokenizer(X_train, X_test)
    loss, acc = model.evaluate(X_test_Glove, y_test)

    print('Restored model, accuracy: {:5.2f}%'.format(100 * acc))

    predicted = model.predict(X_test_Glove)
    predicted = np.argmax(predicted, axis=1)

    print(metrics.classification_report(y_test, predicted))
    print(metrics.confusion_matrix(y_test, predicted))
    print("Accuracy Score: ", metrics.accuracy_score(y_test,predicted))

Here are the results of calling the use_cnn_model() function:

    Trained Data
              precision    recall  f1-score   support
         0.0       0.99      1.00      0.99       851
         1.0       0.75      0.77      0.76       110
         2.0       0.33      0.23      0.27        48
         3.0       0.90      0.96      0.93       303
         4.0       0.99      0.93      0.96       306
         5.0       0.99      0.98      0.99       325
         6.0       1.00      1.00      1.00       584
         7.0       0.98      0.98      0.98       246
    accuracy                           0.96      2773
   macro avg       0.87      0.86      0.86      2773
weighted avg       0.96      0.96      0.96      2773

New data prediction
              precision    recall  f1-score   support
         0.0       0.99      1.00      0.99       832
         1.0       0.78      0.72      0.75       115
         2.0       0.37      0.33      0.35        49
         3.0       0.87      0.96      0.91       283
         4.0       0.98      0.92      0.95       337
         5.0       0.98      0.97      0.98       333
         6.0       1.00      1.00      1.00       572
         7.0       0.98      0.97      0.97       252
    accuracy                           0.96      2773
   macro avg       0.87      0.86      0.86      2773
weighted avg       0.96      0.96      0.96      2773

Here is the output of my main() function after loading the saved model:

Restored model, accuracy: 34.11%
              precision    recall  f1-score   support
         0.0       1.00      0.81      0.89      3628
         1.0       0.01      0.01      0.01       652
         2.0       0.01      0.06      0.02       205
         3.0       0.00      0.00      0.00      1375
         4.0       0.25      0.50      0.33      1387
         5.0       0.21      0.37      0.27      1177
         6.0       0.21      0.02      0.04      2190
         7.0       0.04      0.04      0.04      1661
    accuracy                           0.34     12275
   macro avg       0.22      0.23      0.20     12275
weighted avg       0.39      0.34      0.34     12275
[[2926    0    0  702    0    0    0    0]
 [   0    8    9    0  130   50   50  405]
 [   0   55   12  138    0    0    0    0]
 [   0   14   24    4  117 1109   10   97]
 [   0  185  264   92  687   14   13  132]
 [   0    1    1    2  154  431  113  475]
 [   0   79   87    8 1081  444   51  440]
 [   0  276  556  103  627   26    5   68]]
Accuracy Score:  0.3410997963340122
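
If it helps narrow things down, the only other check I can think of is comparing the raw weight arrays before saving and after loading. A sketch, assuming the trained model_CNN from use_cnn_model() above is still in scope:

import numpy as np
import tensorflow as tf

# Compare every weight array of the trained model against its restored copy.
restored = tf.keras.models.load_model('saved_model.h5')
for before, after in zip(model_CNN.get_weights(), restored.get_weights()):
    print(np.allclose(before, after))

# If any pair differs, the problem is in save/load itself; if they all
# match, the difference must come from the input pipeline, e.g. the
# tokenizer in loadData_Tokenizer() being re-fit on the new data.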

Thanks for any help with this; I realize it's probably not the easiest thing to follow. Honestly, I hope I'm just missing something simple.
