I'm trying to build a CNN to classify skin-cancer images into seven categories. I'm fairly new to CNNs and adapted a dog/cat classification example to the well-known skin-cancer dataset (HAM10000). The problem is that the accuracy is extremely low, and both the loss and the accuracy stay exactly the same across epochs. I'm not sure what the cause is; my first guess is that the number of images is too small: 436 samples for training and 109 for validation. I cut the dataset down from 10,000+ images because I'm running this on my MacBook Pro.
Script:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sys
import os
import cv2
DATA_DIR = "/Users/namefolder/PycharmProjects/skin-cancer/HAM10000_images_part_1"
metadata = pd.read_csv(os.path.join(DATA_DIR, 'HAM10000_metadata.csv'))
lesion_type_dict = {'nv': 'Melanocytic nevi',
                    'mel': 'Melanoma',
                    'bkl': 'Benign keratosis-like lesions ',
                    'bcc': 'Basal cell carcinoma',
                    'akiec': 'Actinic keratoses',
                    'vasc': 'Vascular lesions',
                    'df': 'Dermatofibroma'}
metadata['cell_type'] = metadata['dx'].map(lesion_type_dict.get)
metadata['dx_code'] = pd.Categorical(metadata['dx']).codes
# save array of image-id and diagnosis-type (categorical)
metadata = metadata[['image_id', 'dx', 'dx_type', 'dx_code']]
training_data = []
IMG_SIZE=40
# preparing training data
def creating_training_data(path):
    for img in os.listdir(path):
        try:
            img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
            new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
            for index, row in metadata.iterrows():
                if img == row['image_id'] + '.jpg':
                    try:
                        training_data.append([new_array, row['dx_code']])
                    except Exception as ee:
                        pass
        except Exception as e:
            pass
    return training_data
training_data = creating_training_data(DATA_DIR)
import random
random.shuffle(training_data)
# Splitting data into X features and Y label
X_train = []
y_train = []
for features, label in training_data:
    X_train.append(features)
    y_train.append(label)
# Reshape to (samples, height, width, channels), as expected by Keras Conv2D layers
X_train = np.array(X_train).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
# Normalize pixel values to the [0, 1] range
X_train = X_train/255.0
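At this point y_train is still a plain Python list; before fitting I turn it into a NumPy array as well (a minimal sketch, assuming y_train holds the integer dx_code labels collected above):
# Labels as a NumPy array; dx_code is an integer class id (one of the seven lesion types)
y_train = np.array(y_train)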
# model configuration
model = Sequential()
model.add(Conv2D(64, (3,3), input_shape = X_train.shape[1:]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Dense(1))
model.add(Activation("softmax"))
model.compile(loss="mean_squared_error",
              optimizer="adam",
              metrics=["accuracy"])
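The model.fit call itself is not in the snippet above; reconstructed from the log below, it is roughly the following (a sketch assuming 20 epochs and a 20% validation split, which matches the 436/109 sample counts):
# Reconstructed fit call (not part of the original snippet): 20 epochs,
# with ~20% of the data held out for validation (436 train / 109 validation samples)
history = model.fit(X_train, y_train,
                    epochs=20,
                    validation_split=0.2)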
Model fitting output:
Train on 436 samples, validate on 109 samples
Epoch 1/20
436/436 [==============================] - 1s 2ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 2/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 3/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 4/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 5/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 6/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 7/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 8/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 9/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 10/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 11/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 12/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 13/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 14/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 15/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 16/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 17/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 18/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 19/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Epoch 20/20
436/436 [==============================] - 1s 1ms/sample - loss: 11.7890 - acc: 0.0688 - val_loss: 13.6697 - val_acc: 0.0642
Model summary:
Model: "sequential_16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_30 (Conv2D) (None, 38, 38, 64) 640
_________________________________________________________________
activation_44 (Activation) (None, 38, 38, 64) 0
_________________________________________________________________
max_pooling2d_30 (MaxPooling (None, 19, 19, 64) 0
_________________________________________________________________
conv2d_31 (Conv2D) (None, 17, 17, 64) 36928
_________________________________________________________________
activation_45 (Activation) (None, 17, 17, 64) 0
_________________________________________________________________
max_pooling2d_31 (MaxPooling (None, 8, 8, 64) 0
_________________________________________________________________
flatten_14 (Flatten) (None, 4096) 0
_________________________________________________________________
dense_28 (Dense) (None, 64) 262208
_________________________________________________________________
dense_29 (Dense) (None, 1) 65
_________________________________________________________________
activation_46 (Activation) (None, 1) 0
=================================================================
Total params: 299,841
Trainable params: 299,841
Non-trainable params: 0
Could someone please advise whether this could be the cause or not? Do you see any other areas that I need to change or fix?
Thanks in advance!