I want to solve a problem where training of a denoising autoencoder is not working properly
0 votes / June 21, 2020

I want to implement a denoising autoencoder using the MNIST dataset.

I added noise to the MNIST data not by simply adding random noise to the images, but by treating each MNIST image as a spectrogram, converting it to audio, adding AWGN, and then converting it back to the original image form. I want to denoise these images.
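
For reference, the AWGN is scaled to a target SNR before it is added. Below is a minimal sketch of just that scaling step (the spectrogram-to-audio conversion and the inverse transform are omitted, so this is only an illustration, not my full pipeline):

    import numpy as np

    def add_awgn(signal, snr_db):
        # choose the noise power so that 10*log10(signal_power / noise_power) == snr_db
        sig_power = np.mean(signal ** 2)
        noise_power = sig_power / (10.0 ** (snr_db / 10.0))
        # sample white Gaussian noise with that power, same shape as the signal
        noise = np.random.normal(0.0, np.sqrt(noise_power), size=signal.shape)
        return signal + noise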

Let me show you some example images. These are MNIST images to which AWGN has been added with an SNR of 0 dB, 10 dB, and 20 dB, in order from the top.


Below is the denoising autoencoder code I used.

The code is split into convautoencoder.py and train_denoising_autoencoder.py; run train_denoising_autoencoder.py to train.

convautoencoder.py :

    from tensorflow.keras.layers import BatchNormalization
    from tensorflow.keras.layers import Conv2D
    from tensorflow.keras.layers import Conv2DTranspose
    from tensorflow.keras.layers import LeakyReLU
    from tensorflow.keras.layers import Activation
    from tensorflow.keras.layers import Flatten
    from tensorflow.keras.layers import Dense
    from tensorflow.keras.layers import Reshape
    from tensorflow.keras.layers import Input
    from tensorflow.keras.models import Model
    from tensorflow.keras import backend as K
    import numpy as np

    class ConvAutoencoder:
        @staticmethod
        def build(width, height, depth, filters=(32, 64), latentDim=16):
            # initialize the input shape to be "channels last" along with
            # the channels dimension itself
            inputShape = (height, width, depth)
            chanDim = -1

            # define the input to the encoder
            inputs = Input(shape=inputShape)
            x = inputs

            # loop over the number of filters
            for f in filters:
                # apply a CONV => RELU => BN operation
                x = Conv2D(f, (3, 3), strides=2, padding="same")(x)
                x = LeakyReLU(alpha=0.2)(x)
                x = BatchNormalization(axis=chanDim)(x)

            # flatten the network and then construct our latent vector
            volumeSize = K.int_shape(x)
            x = Flatten()(x)
            latent = Dense(latentDim)(x)

            # build the encoder model
            encoder = Model(inputs, latent, name="encoder")

            # start building the decoder model which will accept the
            # output of the encoder as its inputs
            latentInputs = Input(shape=(latentDim,))
            x = Dense(np.prod(volumeSize[1:]))(latentInputs)
            x = Reshape((volumeSize[1], volumeSize[2], volumeSize[3]))(x)

            # loop over our number of filters again, but this time in
            # reverse order
            for f in filters[::-1]:
                # apply a CONV_TRANSPOSE => RELU => BN operation
                x = Conv2DTranspose(f, (3, 3), strides=2,
                    padding="same")(x)
                x = LeakyReLU(alpha=0.2)(x)
                x = BatchNormalization(axis=chanDim)(x)

            # apply a single CONV_TRANSPOSE layer used to recover the
            # original depth of the image
            x = Conv2DTranspose(depth, (3, 3), padding="same")(x)
            outputs = Activation("sigmoid")(x)

            # build the decoder model
            decoder = Model(latentInputs, outputs, name="decoder")

            # our autoencoder is the encoder + decoder
            autoencoder = Model(inputs, decoder(encoder(inputs)),
                name="autoencoder")

            # return a 3-tuple of the encoder, decoder, and autoencoder
            return (encoder, decoder, autoencoder)
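
In case it helps, here is a quick sanity check of the shapes this architecture produces for 28x28x1 inputs (a sketch only; it uses the same ConvAutoencoder.build call as in the training script):

    from pyimagesearch.convautoencoder import ConvAutoencoder

    # encoder: 28x28x1 -> 14x14x32 -> 7x7x64 -> flatten (3136) -> 16-dim latent
    # decoder: 16-dim latent -> 7x7x64 -> 14x14x64 -> 28x28x32 -> 28x28x1 (sigmoid)
    (encoder, decoder, autoencoder) = ConvAutoencoder.build(28, 28, 1)
    encoder.summary()
    decoder.summary()
    autoencoder.summary()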

train_denoising_autoencoder.py:

    import matplotlib

    # import the necessary packages
    from pyimagesearch.convautoencoder import ConvAutoencoder
    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.datasets import mnist
    import matplotlib.pyplot as plt
    import numpy as np
    import argparse
    import cv2

    # construct the argument parse and parse the arguments
    ap = argparse.ArgumentParser()
    ap.add_argument("-s", "--samples", type=int, default=8,
        help="# number of samples to visualize when decoding")
    ap.add_argument("-o", "--output", type=str, default="output.png",
        help="path to output visualization file")
    ap.add_argument("-p", "--plot", type=str, default="plot.png",
        help="path to output plot file")
    args = vars(ap.parse_args())


    EPOCHS = 20
    BS = 64

    # load the MNIST dataset
    print("[INFO] loading MNIST dataset...")
    ((trainX, _), (testX, _)) = mnist.load_data()

    # add a channel dimension to every image in the dataset, then scale
    # the pixel intensities to the range [0, 1]
    trainX = np.expand_dims(trainX, axis=-1)
    testX = np.expand_dims(testX, axis=-1)
    trainX = trainX.astype("float32") / 255.0
    testX = testX.astype("float32") / 255.0

    np_train = np.array(trainX)
    np_test = np.array(testX)
    print(np_train.shape)
    print(np_test.shape)


    # sample noise from a random normal distribution centered at 0.5 (since
    # our images lie in the range [0, 1]) and a standard deviation of 0.5
    trainNoise = np.random.normal(loc=0.5, scale=0.5, size=trainX.shape)
    testNoise = np.random.normal(loc=0.5, scale=0.5, size=testX.shape)
    trainXNoisy = np.clip(trainX + trainNoise, 0, 1)
    testXNoisy = np.clip(testX + testNoise, 0, 1)


    # construct our convolutional autoencoder
    print("[INFO] building autoencoder...")
    (encoder, decoder, autoencoder) = ConvAutoencoder.build(28, 28, 1)
    opt = Adam(lr=1e-3)
    autoencoder.compile(loss="mse", optimizer=opt)

    # train the convolutional autoencoder
    H = autoencoder.fit(
        trainXNoisy, trainX,
        validation_data=(testXNoisy, testX),
        epochs=EPOCHS,
        batch_size=BS)


    # construct a plot that plots and saves the training history
    N = np.arange(0, EPOCHS)
    plt.style.use("ggplot")
    plt.figure()
    plt.plot(N, H.history["loss"], label="train_loss")
    plt.plot(N, H.history["val_loss"], label="val_loss")
    plt.title("Training Loss and Accuracy")
    plt.xlabel("Epoch #")
    plt.ylabel("Loss/Accuracy")
    plt.legend(loc="lower left")
    plt.savefig(args["plot"])

    # use the convolutional autoencoder to make predictions on the
    # testing images, then initialize our list of output images
    print("[INFO] making predictions...")
    decoded = autoencoder.predict(testXNoisy)
    outputs = None


    # use the --samples argument for the number of images to visualize
    n = args["samples"]
    plt.figure(num=2, figsize=(16, 3))
    for i in range(n):
        # input data
        ax = plt.subplot(2, n, i + 1)
        plt.imshow(testXNoisy[i].reshape(28, 28))
        plt.gray()
        ax.get_yaxis().set_visible(False)
        ax.get_xaxis().set_visible(False)

        # reconstructed data
        ax = plt.subplot(2, n, 1 + i + n)
        plt.imshow(decoded[i].reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
    plt.show()

The code above is the denoising autoencoder applied to MNIST images with simple random noise added.

And, as you can see, the results are very good.

Results with simply adding random noise: (images: training loss curve and denoised test samples)

The loss decreased nicely, and the output images were denoised well.

However, when I train on the AWGN images I showed above, the results are not good.

Results with AWGN noise added: (images: training loss curve and denoised test samples)

As you can see, training did not go well, and the results were not good either.

The results above are for MNIST images with AWGN added at 20 dB SNR.
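
For completeness: to train on the AWGN images I simply swap them in place of trainXNoisy / testXNoisy in the script above. A rough sketch of what that looks like, with hypothetical file names, assuming the arrays have the same (N, 28, 28, 1) shape and [0, 1] scaling as trainX / testX:

    import numpy as np

    # hypothetical file names for the pre-generated AWGN-corrupted MNIST arrays
    trainXNoisy = np.load("mnist_awgn_train_20db.npy").astype("float32")
    testXNoisy = np.load("mnist_awgn_test_20db.npy").astype("float32")

    # first thing to verify: the inputs should be in the same [0, 1] range as the targets
    print(trainXNoisy.shape, trainXNoisy.min(), trainXNoisy.max())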

I don't know why the denoising autoencoder works great when I simply feed it MNIST images with random noise as input, but does not work when I feed it MNIST images with AWGN added.

I desperately need help.

Thank you.

...