My trained model gives wrong predictions on input images - PullRequest
0 votes
/ 08 May 2019

I'm building a program that recognizes Arabic characters using a CNN in Keras, and I trained the model on a different architecture than the one the dataset creators proposed. The problem is that when I predict on the test_data included with the datasets, I get good results; however, when I try to predict on a real image that I supply, or on an image generated by a canvas (I'm building a web app), the predictions are always wrong, no matter how many images I try.

I saved and loaded the model (good accuracy and low loss). I load the input images with the OpenCV library, reshape them so they fit the model, and convert them to grayscale; then I convert the image to an array and pass it to the predict function. The output is wrong, whereas loading the labeled test_data and passing it to the model gives correct results.

So here is my code, from loading the datasets and training, to the test_data results, to the wrong results on input images.

# Imports (not shown in the original post; assuming standalone Keras, as used below)
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Dropout, GlobalAveragePooling2D, Dense
from keras.utils import to_categorical
from keras.callbacks import ModelCheckpoint

# Training letters images and labels files
letters_training_images_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Characters Dataset CSV/training images.zip"
letters_training_labels_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Characters Dataset CSV/training labels.zip"
# Testing letters images and labels files
letters_testing_images_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Characters Dataset CSV/testing images.zip"
letters_testing_labels_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Characters Dataset CSV/testing labels.zip"

# Loading dataset into dataframes
training_letters_images = pd.read_csv(letters_training_images_file_path, compression='zip', header=None)
training_letters_labels = pd.read_csv(letters_training_labels_file_path, compression='zip', header=None)
testing_letters_images = pd.read_csv(letters_testing_images_file_path, compression='zip', header=None)
testing_letters_labels = pd.read_csv(letters_testing_labels_file_path, compression='zip', header=None)


# Training digits images and labels files
digits_training_images_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Digits Dataset CSV/training images.zip"
digits_training_labels_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Digits Dataset CSV/training labels.zip"
# Testing digits images and labels files
digits_testing_images_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Digits Dataset CSV/testing images.zip"
digits_testing_labels_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Digits Dataset CSV/testing labels.zip"

# Loading dataset into dataframes
training_digits_images = pd.read_csv(digits_training_images_file_path, compression='zip', header=None)
training_digits_labels = pd.read_csv(digits_training_labels_file_path, compression='zip', header=None)
testing_digits_images = pd.read_csv(digits_testing_images_file_path, compression='zip', header=None)
testing_digits_labels = pd.read_csv(digits_testing_labels_file_path, compression='zip', header=None)

training_digits_images_scaled = training_digits_images.values.astype('float32')/255
training_digits_labels = training_digits_labels.values.astype('int32')
testing_digits_images_scaled = testing_digits_images.values.astype('float32')/255
testing_digits_labels = testing_digits_labels.values.astype('int32')

training_letters_images_scaled = training_letters_images.values.astype('float32')/255
training_letters_labels = training_letters_labels.values.astype('int32')
testing_letters_images_scaled = testing_letters_images.values.astype('float32')/255
testing_letters_labels = testing_letters_labels.values.astype('int32')

print("Training images of digits after scaling")
print(training_digits_images_scaled.shape)
training_digits_images_scaled[0:5]

print("Training images of letters after scaling")
print(training_letters_images_scaled.shape)
training_letters_images_scaled[0:5]

# one hot encoding
# number of classes = 10 (digits classes) + 28 (arabic alphabet classes)
number_of_classes = 38
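# Assumed step, not shown in the original code: the letters CSVs label the
# 28 characters 1..28 while the digits use 0..9, yet the classification
# report further down places the letters at classes 10..37, so the letter
# labels presumably get a +9 offset before encoding to avoid colliding
# with the digit classes:
training_letters_labels = training_letters_labels + 9
testing_letters_labels = testing_letters_labels + 9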
training_letters_labels_encoded = to_categorical(training_letters_labels, num_classes=number_of_classes)
testing_letters_labels_encoded = to_categorical(testing_letters_labels, num_classes=number_of_classes)
training_digits_labels_encoded = to_categorical(training_digits_labels, num_classes=number_of_classes)
testing_digits_labels_encoded = to_categorical(testing_digits_labels, num_classes=number_of_classes)


# reshape input digit images to 64x64x1
training_digits_images_scaled = training_digits_images_scaled.reshape([-1, 64, 64, 1])
testing_digits_images_scaled = testing_digits_images_scaled.reshape([-1, 64, 64, 1])

# reshape input letter images to 64x64x1
training_letters_images_scaled = training_letters_images_scaled.reshape([-1, 64, 64, 1])
testing_letters_images_scaled = testing_letters_images_scaled.reshape([-1, 64, 64, 1])

print(training_digits_images_scaled.shape, training_digits_labels_encoded.shape, testing_digits_images_scaled.shape, testing_digits_labels_encoded.shape)
print(training_letters_images_scaled.shape, training_letters_labels_encoded.shape, testing_letters_images_scaled.shape, testing_letters_labels_encoded.shape)

training_data_images = np.concatenate((training_digits_images_scaled, training_letters_images_scaled), axis=0) 
training_data_labels = np.concatenate((training_digits_labels_encoded, training_letters_labels_encoded), axis=0)
print("Total Training images are {} images of shape".format(training_data_images.shape[0]))
print(training_data_images.shape, training_data_labels.shape)


testing_data_images = np.concatenate((testing_digits_images_scaled, testing_letters_images_scaled), axis=0) 
testing_data_labels = np.concatenate((testing_digits_labels_encoded, testing_letters_labels_encoded), axis=0)
print("Total Testing images are {} images of shape".format(testing_data_images.shape[0]))
print(testing_data_images.shape, testing_data_labels.shape)

def create_model(optimizer='adam', kernel_initializer='he_normal', activation='relu'):
  # create model
  model = Sequential()
  model.add(Conv2D(filters=16, kernel_size=3, padding='same', input_shape=(64, 64, 1), kernel_initializer=kernel_initializer, activation=activation))
  model.add(BatchNormalization())
  model.add(MaxPooling2D(pool_size=2))
  model.add(Dropout(0.2))

  model.add(Conv2D(filters=32, kernel_size=3, padding='same', kernel_initializer=kernel_initializer, activation=activation))
  model.add(BatchNormalization())
  model.add(MaxPooling2D(pool_size=2))
  model.add(Dropout(0.2))

  model.add(Conv2D(filters=64, kernel_size=3, padding='same', kernel_initializer=kernel_initializer, activation=activation))
  model.add(BatchNormalization())
  model.add(MaxPooling2D(pool_size=2))
  model.add(Dropout(0.2))

  model.add(Conv2D(filters=128, kernel_size=3, padding='same', kernel_initializer=kernel_initializer, activation=activation))
  model.add(BatchNormalization())
  model.add(MaxPooling2D(pool_size=2))
  model.add(Dropout(0.2))
  model.add(GlobalAveragePooling2D())



  #Fully connected final layer
  model.add(Dense(38, activation='softmax'))

  # Compile model
  model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer=optimizer)
  return model

model = create_model()
model.summary()

model = create_model(optimizer='Adam', kernel_initializer='normal', activation='relu')


epochs = 20
batch_size = 20

checkpointer = ModelCheckpoint(filepath='weights.hdf5', verbose=1, save_best_only=True)

history = model.fit(training_data_images, training_data_labels, 
                    validation_data=(testing_data_images, testing_data_labels),
                    epochs=epochs, batch_size=batch_size, verbose=1, callbacks=[checkpointer])

Training results:

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Train on 73440 samples, validate on 13360 samples
Epoch 1/10
73440/73440 [==============================] - 52s 702us/step - loss: 0.3535 - acc: 0.9062 - val_loss: 0.2023 - val_acc: 0.9236

Epoch 00001: val_loss improved from inf to 0.20232, saving model to weights.hdf5
Epoch 2/10
73440/73440 [==============================] - 48s 658us/step - loss: 0.1068 - acc: 0.9672 - val_loss: 0.1701 - val_acc: 0.9469

Epoch 00002: val_loss improved from 0.20232 to 0.17013, saving model to weights.hdf5
Epoch 3/10
73440/73440 [==============================] - 49s 667us/step - loss: 0.0799 - acc: 0.9753 - val_loss: 0.1112 - val_acc: 0.9707

Epoch 00003: val_loss improved from 0.17013 to 0.11123, saving model to weights.hdf5
Epoch 4/10
73440/73440 [==============================] - 47s 638us/step - loss: 0.0684 - acc: 0.9786 - val_loss: 0.0715 - val_acc: 0.9800

Epoch 00004: val_loss improved from 0.11123 to 0.07150, saving model to weights.hdf5
Epoch 5/10
73440/73440 [==============================] - 48s 660us/step - loss: 0.0601 - acc: 0.9812 - val_loss: 0.2134 - val_acc: 0.9343

Epoch 00005: val_loss did not improve from 0.07150
Epoch 6/10
73440/73440 [==============================] - 47s 647us/step - loss: 0.0545 - acc: 0.9828 - val_loss: 0.0641 - val_acc: 0.9814

Epoch 00006: val_loss improved from 0.07150 to 0.06413, saving model to weights.hdf5
Epoch 7/10
73440/73440 [==============================] - 48s 655us/step - loss: 0.0490 - acc: 0.9846 - val_loss: 0.8639 - val_acc: 0.7332

Epoch 00007: val_loss did not improve from 0.06413
Epoch 8/10
73440/73440 [==============================] - 48s 660us/step - loss: 0.0472 - acc: 0.9854 - val_loss: 0.0509 - val_acc: 0.9844

Epoch 00008: val_loss improved from 0.06413 to 0.05093, saving model to weights.hdf5
Epoch 9/10
73440/73440 [==============================] - 47s 644us/step - loss: 0.0433 - acc: 0.9859 - val_loss: 0.0713 - val_acc: 0.9791

Epoch 00009: val_loss did not improve from 0.05093
Epoch 10/10
73440/73440 [==============================] - 49s 665us/step - loss: 0.0434 - acc: 0.9861 - val_loss: 0.2861 - val_acc: 0.9012

Epoch 00010: val_loss did not improve from 0.05093

And after evaluating the model on test_data:

Test accuracy: 0.9843562874251497

Test loss: 0.05093173268935584
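The evaluate call itself isn't shown in the post; it was presumably something along these lines (a minimal sketch, assuming the best checkpoint written by ModelCheckpoint is restored first):

# Assumed: restore the best checkpoint saved during training, then evaluate
# on the combined test set.
model.load_weights('weights.hdf5')
test_loss, test_acc = model.evaluate(testing_data_images, testing_data_labels, verbose=0)
print("Test accuracy:", test_acc)
print("Test loss:", test_loss)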

And now predicting the classes of test_data:

def get_predicted_classes(model, data, labels=None):
  image_predictions = model.predict(data)
  predicted_classes = np.argmax(image_predictions, axis=1)
  true_classes = np.argmax(labels, axis=1)
  return predicted_classes, true_classes

from sklearn.metrics import classification_report

def get_classification_report(y_true, y_pred):
  print(classification_report(y_true, y_pred))

y_pred, y_true = get_predicted_classes(model, testing_data_images, testing_data_labels)
get_classification_report(y_true, y_pred)

    precision    recall  f1-score   support

           0       0.98      0.99      0.99      1000
           1       0.99      0.99      0.99      1000
           2       0.98      1.00      0.99      1000
           3       1.00      0.99      0.99      1000
           4       1.00      0.99      0.99      1000
           5       0.99      0.98      0.99      1000
           6       0.99      0.99      0.99      1000
           7       1.00      0.99      1.00      1000
           8       1.00      0.99      1.00      1000
           9       1.00      0.99      0.99      1000
          10       0.99      1.00      1.00       120
          11       1.00      0.97      0.99       120
          12       0.87      0.97      0.91       120
          13       1.00      0.89      0.94       120
          14       0.98      0.99      0.98       120
          15       0.96      0.98      0.97       120
          16       0.99      0.97      0.98       120
          17       0.91      0.99      0.95       120
          18       0.94      0.91      0.92       120
          19       0.94      0.93      0.93       120
          20       0.96      0.90      0.93       120
          21       0.99      0.93      0.96       120
          22       0.99      1.00      1.00       120
          23       0.91      0.99      0.95       120
          24       0.99      0.96      0.97       120
          25       0.96      0.96      0.96       120
          26       0.95      0.96      0.95       120
          27       0.99      0.97      0.98       120
          28       0.99      0.99      0.99       120
          29       0.95      0.84      0.89       120
          30       0.84      0.97      0.90       120
          31       0.98      0.98      0.98       120
          32       0.98      1.00      0.99       120
          33       0.99      1.00      1.00       120
          34       0.96      0.90      0.93       120
          35       0.99      0.96      0.97       120
          36       0.95      0.97      0.96       120
          37       0.98      0.99      0.99       120

   micro avg       0.98      0.98      0.98     13360
   macro avg       0.97      0.97      0.97     13360
weighted avg       0.98      0.98      0.98     13360


And for the prediction on an input image:

    from scipy.misc import imread, imresize  # deprecated SciPy helpers, as used here

    x = imread('output.png', mode='L')
    x = np.invert(x)
    x = imresize(x, (64, 64))
    #x = x/255
    x = x.reshape((-1,64,64,1))

    with graphAR.as_default():
        out = modelAR.predict(x)
        #print(out)
        print(np.argmax(out, axis=1))
        response = np.array_str(np.argmax(out, axis=1))
        print(response)
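Here modelAR and graphAR are not defined in the snippet; presumably they are created once at web-app start-up with the usual Flask + Keras pattern, roughly like this sketch (assuming the full model was saved to weights.hdf5 by the checkpoint above):

    # Presumed setup, not shown in the question: load the trained model once
    # and capture the default TensorFlow graph for use inside request handlers.
    import tensorflow as tf
    from keras.models import load_model

    modelAR = load_model('weights.hdf5')
    graphAR = tf.get_default_graph()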

But the results are always wrong.

I expect correct output for the input images, for example:

https://imgur.com/Qxn8Zs3

expected prediction: alif - أ

actual result: [[0]] = sifr - 0

Some of the input images I tried:

https://imgur.com/tvqXn2d

https://imgur.com/KPOGAl2

https://imgur.com/6axcUdp

https://imgur.com/8jQ73bX

https://imgur.com/NYzsabG

1 Answer

0 votes
/ 09 May 2019

At the training stage you apply these transformations before fitting: scaling (for the images) and, by the looks of it, conversion to integers (for the labels).

training_digits_images_scaled = training_digits_images.values.astype('float32')/255
training_digits_labels = training_digits_labels.values.astype('int32')

At prediction time you must apply exactly the same transformations. In the prediction for the input_image:

# Convert to grayscale only if the training images are grayscale too.
# It's generally a good idea to train and predict on grayscale images.

x = imread('output.png', mode='L')
# Not sure why you are doing this
#x = np.invert(x)
# Same scaling as the training images. (Note: the astype('int32') in the
# training code was applied to the labels, not the images, so the image
# must stay float here; casting to int after dividing by 255 would zero it out.)
x = x.astype('float32')/255
x = x.reshape((-1,64,64,1))

## Continue with prediction function

This should work. Let me know how it goes.
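Putting it together with OpenCV, which the question says is used for loading images: a minimal end-to-end prediction sketch, assuming the checkpoint file 'weights.hdf5' from training and an input file named output.png (both names taken from the snippets above):

import cv2
import numpy as np
from keras.models import load_model

# Load the trained model saved by ModelCheckpoint during training.
model = load_model('weights.hdf5')

# Read the image as grayscale and resize it to the 64x64 input the model expects.
img = cv2.imread('output.png', cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (64, 64))

# Apply exactly the same preprocessing as the training images:
# scale to [0, 1] as float32 and reshape to (1, 64, 64, 1).
x = img.astype('float32') / 255
x = x.reshape((-1, 64, 64, 1))

pred = model.predict(x)
print(np.argmax(pred, axis=1))

If the canvas images are dark-on-light while the training data is light-on-dark, the np.invert step from the question may still be needed; the key point is that the scaling must match training exactly.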
