Validation accuracy and f1 score remain unchanged from the first epoch
0 votes / January 14, 2019

I am using ResNet50 for image classification over 5 classes. The training loss decreases across epochs and training accuracy increases, but validation accuracy stays the same and validation loss stays stuck at some high value. Is there something I am doing wrong in the code below?

I have tried different learning rates. I tried adding BatchNorm and Dropout layers. I also made sure the data I am feeding in is clean and properly structured. I specified the batch size and used shuffle=True on the validation generator, but nothing seems to help. Any help with this would be much appreciated.

from tensorflow.python.keras.applications import ResNet50
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.layers import Dense, Flatten, GlobalAveragePooling2D, BatchNormalization, Dropout
from tensorflow.python.keras.applications.resnet50 import preprocess_input
from tensorflow.python.keras.preprocessing.image import ImageDataGenerator
from tensorflow.python.keras.preprocessing.image import load_img, img_to_array
from tensorflow.python.keras import callbacks
import time
import os  # needed for os.walk below
from tensorflow.python.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from tensorflow.python.keras.models import Model
from tensorflow.python.keras import optimizers
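
The f1 metric passed to model.compile further down is a custom metric that is not shown in the question. A common backend-based implementation, assumed here for completeness (not taken from the original post), is:

from tensorflow.python.keras import backend as K

# Batch-wise F1 score; an assumed stand-in for the undefined `f1`
# referenced in model.compile below.
def f1(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    recall = true_positives / (possible_positives + K.epsilon())
    return 2 * (precision * recall) / (precision + recall + K.epsilon())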

resnet_weights_path = './clean_resnet_data/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'


data_generator = ImageDataGenerator(horizontal_flip=True,
                                    vertical_flip=True,
                                    zoom_range=0.3,
                                    rescale=1. / 255
                                   )

validation_datagen = ImageDataGenerator(rescale=1. / 255)

image_size = 512
batch_size = 16
train_generator = data_generator.flow_from_directory(
        './org_train',
        target_size=(image_size, image_size),
        #batch_size=batch_size,
        class_mode='categorical')

validation_generator = validation_datagen.flow_from_directory(
    './valid_org',
    target_size=(image_size, image_size),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical')

num_classes = len(train_generator.class_indices)
print(num_classes)

Found 1204 images belonging to 5 classes.
Found 250 images belonging to 5 classes.
5
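
For reference, preprocess_input is imported above but never used: the generators rescale by 1/255 instead, while ResNet50's ImageNet weights were trained with ResNet-specific preprocessing. If the generators were to apply ResNet50's own preprocessing, it would look like this (a sketch of the alternative, not what was actually run):

# Sketch: apply ResNet50's own preprocessing in the generators
# instead of rescale=1./255.
data_generator = ImageDataGenerator(horizontal_flip=True,
                                    vertical_flip=True,
                                    zoom_range=0.3,
                                    preprocessing_function=preprocess_input)

validation_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)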


model = Sequential()

model.add(ResNet50(include_top=False, pooling='avg', weights=resnet_weights_path))

model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(num_classes, activation='sigmoid'))

model.layers[0].trainable = False
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
resnet50 (Model)             (None, 2048)              23587712  
_________________________________________________________________
flatten (Flatten)            (None, 2048)              0         
_________________________________________________________________
dense (Dense)                (None, 512)               1049088   
_________________________________________________________________
batch_normalization (BatchNo (None, 512)               2048      
_________________________________________________________________
dropout (Dropout)            (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 256)               131328    
_________________________________________________________________
batch_normalization_1 (Batch (None, 256)               1024      
_________________________________________________________________
dense_2 (Dense)              (None, 5)                 1285      
=================================================================
Total params: 24,772,485
Trainable params: 1,183,237
Non-trainable params: 23,589,248
_________________________________________________________________
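
For comparison, the conventional output layer for single-label multiclass classification with categorical_crossentropy is softmax rather than sigmoid, so that the five predicted probabilities sum to one:

# Conventional single-label multiclass head (softmax, not sigmoid as above)
model.add(Dense(num_classes, activation='softmax'))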

count = sum([len(files) for r, d, files in os.walk("./org_train/")])
steps_in_each_epoch=int(count/batch_size) + 1
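# Note: steps_in_each_epoch is computed with batch_size=16 but never passed
# to fit_generator below. Since batch_size is commented out in
# train_generator, Keras uses its default of 32, which is why the log
# shows 38 steps per epoch (ceil(1204 / 32) = 38).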

#setting callback parameters for model saving, early stopping and reduce_lr
model_checkpoint = ModelCheckpoint('resnet50_clean_data.model',monitor='f1', 
                                   mode = 'max', save_best_only=True, verbose=2)


log_dir = './tf-log/newdata_withlr_nodrop'

tb_cb = callbacks.TensorBoard(log_dir=log_dir, histogram_freq=0)


early_stopping = EarlyStopping(monitor='val_f1', mode = 'max',patience=15, verbose=2)

reduce_lr = ReduceLROnPlateau(monitor='val_f1', mode = 'max',factor=0.5, patience=3, min_lr=0.00001, verbose=2)

cbks = [early_stopping,reduce_lr]
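# Note: model_checkpoint and tb_cb are created above but not added to cbks,
# so no checkpoints are saved and nothing is logged to TensorBoard during this run.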


from tensorflow.python.keras import optimizers

# adam
adam = optimizers.Adam(lr = 0.003)

# compile
model.compile(loss='categorical_crossentropy',optimizer=adam,metrics=["accuracy",f1])

model.fit_generator(train_generator,epochs=150,validation_data=validation_generator,callbacks=cbks)

Output:

Epoch 1/150
38/38 [==============================] - 86s 2s/step - loss: 1.5320 - acc: 0.3364 - f1: 0.3989 - val_loss: 5.5517 - val_acc: 0.2000 - val_f1: 0.2000
Epoch 2/150
38/38 [==============================] - 70s 2s/step - loss: 1.2850 - acc: 0.4567 - f1: 0.4663 - val_loss: 4.4173 - val_acc: 0.2000 - val_f1: 0.2100
Epoch 3/150
38/38 [==============================] - 73s 2s/step - loss: 1.2396 - acc: 0.4583 - f1: 0.4716 - val_loss: 4.7810 - val_acc: 0.2000 - val_f1: 0.2658
Epoch 4/150
38/38 [==============================] - 73s 2s/step - loss: 1.2147 - acc: 0.4973 - f1: 0.4902 - val_loss: 4.2491 - val_acc: 0.2000 - val_f1: 0.2667
Epoch 5/150
38/38 [==============================] - 73s 2s/step - loss: 1.1994 - acc: 0.5082 - f1: 0.4982 - val_loss: 3.5541 - val_acc: 0.2000 - val_f1: 0.2667
Epoch 6/150
38/38 [==============================] - 73s 2s/step - loss: 1.1525 - acc: 0.5284 - f1: 0.5116 - val_loss: 3.8147 - val_acc: 0.2000 - val_f1: 0.2667
Epoch 7/150
38/38 [==============================] - 73s 2s/step - loss: 1.1658 - acc: 0.5014 - f1: 0.5104 - val_loss: 3.4530 - val_acc: 0.1920 - val_f1: 0.2784
Epoch 8/150
38/38 [==============================] - 77s 2s/step - loss: 1.1181 - acc: 0.5222 - f1: 0.5137 - val_loss: 3.1350 - val_acc: 0.2000 - val_f1: 0.2083
Epoch 9/150
38/38 [==============================] - 73s 2s/step - loss: 1.0760 - acc: 0.5543 - f1: 0.5402 - val_loss: 2.5362 - val_acc: 0.2000 - val_f1: 0.2000
Epoch 10/150
37/38 [============================>.] - ETA: 1s - loss: 1.1092 - acc: 0.5331 - f1: 0.5335
Epoch 00010: ReduceLROnPlateau reducing learning rate to 0.001500000013038516.
38/38 [==============================] - 73s 2s/step - loss: 1.1155 - acc: 0.5322 - f1: 0.5329 - val_loss: 2.5705 - val_acc: 0.2000 - val_f1: 0.1993
Epoch 11/150
38/38 [==============================] - 72s 2s/step - loss: 1.0541 - acc: 0.5623 - f1: 0.5438 - val_loss: 2.2404 - val_acc: 0.2000 - val_f1: 0.2180
Epoch 12/150
38/38 [==============================] - 73s 2s/step - loss: 1.0180 - acc: 0.5630 - f1: 0.5421 - val_loss: 2.0331 - val_acc: 0.2480 - val_f1: 0.2733
Epoch 13/150
37/38 [============================>.] - ETA: 1s - loss: 1.0070 - acc: 0.5992 - f1: 0.5508
Epoch 00013: ReduceLROnPlateau reducing learning rate to 0.000750000006519258.
38/38 [==============================] - 73s 2s/step - loss: 1.0071 - acc: 0.6006 - f1: 0.5513 - val_loss: 1.9989 - val_acc: 0.2000 - val_f1: 0.2037
Epoch 14/150
38/38 [==============================] - 76s 2s/step - loss: 1.0195 - acc: 0.5523 - f1: 0.5082 - val_loss: 2.0054 - val_acc: 0.2000 - val_f1: 0.2001
Epoch 15/150
38/38 [==============================] - 73s 2s/step - loss: 1.0127 - acc: 0.5982 - f1: 0.5079 - val_loss: 2.1047 - val_acc: 0.2040 - val_f1: 0.2143
Epoch 16/150
37/38 [============================>.] - ETA: 1s - loss: 0.9787 - acc: 0.5934 - f1: 0.5173
Epoch 00016: ReduceLROnPlateau reducing learning rate to 0.000375000003259629.
38/38 [==============================] - 73s 2s/step - loss: 0.9776 - acc: 0.5918 - f1: 0.5183 - val_loss: 2.2156 - val_acc: 0.2000 - val_f1: 0.2705
Epoch 17/150
38/38 [==============================] - 73s 2s/step - loss: 0.9950 - acc: 0.5956 - f1: 0.4904 - val_loss: 2.2039 - val_acc: 0.2000 - val_f1: 0.2622
Epoch 18/150
38/38 [==============================] - 73s 2s/step - loss: 0.9451 - acc: 0.6174 - f1: 0.5297 - val_loss: 2.2094 - val_acc: 0.1960 - val_f1: 0.2689
Epoch 19/150
37/38 [============================>.] - ETA: 1s - loss: 0.9531 - acc: 0.6182 - f1: 0.5106
Epoch 00019: ReduceLROnPlateau reducing learning rate to 0.0001875000016298145.
38/38 [==============================] - 78s 2s/step - loss: 0.9566 - acc: 0.6168 - f1: 0.5065 - val_loss: 2.1872 - val_acc: 0.2080 - val_f1: 0.2652
Epoch 20/150
38/38 [==============================] - 73s 2s/step - loss: 0.9706 - acc: 0.5898 - f1: 0.4943 - val_loss: 2.1983 - val_acc: 0.2080 - val_f1: 0.2658
Epoch 21/150
38/38 [==============================] - 73s 2s/step - loss: 0.9365 - acc: 0.5958 - f1: 0.4986 - val_loss: 2.1936 - val_acc: 0.1960 - val_f1: 0.2416
Epoch 22/150
37/38 [============================>.] - ETA: 1s - loss: 0.9472 - acc: 0.6068 - f1: 0.4861
Epoch 00022: ReduceLROnPlateau reducing learning rate to 9.375000081490725e-05.
38/38 [==============================] - 73s 2s/step - loss: 0.9457 - acc: 0.6097 - f1: 0.4884 - val_loss: 2.1958 - val_acc: 0.1960 - val_f1: 0.2428
Epoch 00022: early stopping

The behavior on the validation set is confusing. I do not understand why the validation accuracy and validation f1 barely change from the first epoch onward (they stay essentially flat). What could the problem be?
