TensorFlow GPU uses GPU RAM but not the compute units? | CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
April 16, 2020

I am trying to use tensorflow-gpu 2.1.0, installed via pip.

Problem: Task Manager on Windows 10 shows almost no GPU utilization, only 2% to 5%, yet the GPU RAM is almost 100% in use. What could cause Task Manager to show that the GPU (GTX 1660 Ti) is not being used?

nvidia-smi gives me a different picture:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 445.87       Driver Version: 445.87       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 166... WDDM  | 00000000:10:00.0  On |                  N/A |
| 79%   64C    P2   109W / 130W |   5964MiB /  6144MiB |     89%      Default |
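
To compare these numbers with Task Manager programmatically rather than by eye, the status row can be parsed with a regular expression. This is only a sketch: `parse_smi_row` is a made-up helper name, and the pattern assumes the row layout printed above, which may differ between driver versions.

```python
import re

# Assumed layout: "... | <used>MiB / <total>MiB | <util>% ..." as in the row above.
ROW_PATTERN = re.compile(
    r"(?P<used>\d+)MiB\s*/\s*(?P<total>\d+)MiB\s*\|\s*(?P<util>\d+)%"
)


def parse_smi_row(row: str) -> dict:
    """Hypothetical helper: return memory use (MiB) and GPU utilization (%)."""
    match = ROW_PATTERN.search(row)
    if match is None:
        raise ValueError("row does not look like an nvidia-smi status row")
    used = int(match.group("used"))
    total = int(match.group("total"))
    return {
        "mem_used_mib": used,
        "mem_total_mib": total,
        "mem_percent": round(100 * used / total, 1),
        "gpu_util_percent": int(match.group("util")),
    }


# The status row copied from the nvidia-smi output above.
row = "| 79%   64C    P2   109W / 130W |   5964MiB /  6144MiB |     89%      Default |"
stats = parse_smi_row(row)
print(stats)
```

For the row above this reports about 97% of GPU memory in use and 89% GPU utilization.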

I am using CUDA 10.1.

TensorFlow warnings:

2020-04-16 21:07:55.541837: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-16 21:07:58.416796: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-04-16 21:07:58.450054: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:10:00.0 name: GeForce GTX 1660 Ti computeCapability: 7.5
coreClock: 1.845GHz coreCount: 24 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 268.26GiB/s
2020-04-16 21:07:58.450406: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-16 21:07:58.455452: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-16 21:07:58.459642: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-04-16 21:07:58.461515: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-04-16 21:07:58.466455: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-04-16 21:07:58.469085: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-04-16 21:07:58.479479: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-16 21:07:58.480206: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-16 21:07:58.480629: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-04-16 21:07:58.482300: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:10:00.0 name: GeForce GTX 1660 Ti computeCapability: 7.5
coreClock: 1.845GHz coreCount: 24 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 268.26GiB/s
2020-04-16 21:07:58.482677: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-16 21:07:58.482875: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-16 21:07:58.483040: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-04-16 21:07:58.483203: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-04-16 21:07:58.483355: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-04-16 21:07:58.483529: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-04-16 21:07:58.483712: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-16 21:07:58.484448: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-16 21:07:59.249742: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-16 21:07:59.250043: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0 
2020-04-16 21:07:59.250203: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N 
2020-04-16 21:07:59.251187: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4625 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:10:00.0, compute capability: 7.5)
Found 5338 images belonging to 4 classes.
Found 3554 images belonging to 4 classes.
WARNING:tensorflow:sample_weight modes were coerced from
  ...
    to  
  ['...']
WARNING:tensorflow:sample_weight modes were coerced from
  ...
    to  
  ['...']
2020-04-16 21:08:11.464027: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-16 21:08:12.081246: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-16 21:08:13.727563: W tensorflow/stream_executor/gpu/redzone_allocator.cc:312] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only
Relying on driver to perform ptx compilation. This message will be only logged once.
2020-04-16 21:08:15.806688: I tensorflow/core/profiler/lib/profiler_session.cc:225] Profiler session started.
2020-04-16 21:08:15.806850: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1259] Profiler found 1 GPUs
2020-04-16 21:08:15.808769: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cupti64_101.dll
2020-04-16 21:08:15.909368: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1307] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
2020-04-16 21:08:15.910677: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1346] function cupti_interface_->ActivityRegisterCallbacks( AllocCuptiActivityBuffer, FreeCuptiActivityBuffer)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
2020-04-16 21:08:16.092575: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1329] function cupti_interface_->EnableCallback( 0 , subscriber_, CUPTI_CB_DOMAIN_DRIVER_API, cbid)failed with error CUPTI_ERROR_INVALID_PARAMETER
2020-04-16 21:08:16.092946: I tensorflow/core/profiler/internal/gpu/device_tracer.cc:88]  GpuTracer has collected 0 callback api events and 0 activity events.
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.338369). Check your callbacks. 

I want to highlight this error: CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
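
In a long log it is easy to miss which lines are actual errors. The sketch below filters the error-level (`E`) lines and counts the CUPTI error codes they mention. It assumes the log line format seen above (timestamp, one-letter level, source file and line, message); the function name `cupti_errors` is hypothetical, and the sample text is copied from the log in this post.

```python
import re
from collections import Counter

# Assumed format: "<date> <time>: <level> <source>:<line>] <message>"
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} [\d:.]+): (?P<level>[IWEF]) "
    r"(?P<source>\S+):(?P<line>\d+)\] (?P<message>.*)$"
)
CUPTI_CODE = re.compile(r"CUPTI_ERROR_\w+")


def cupti_errors(log_text: str) -> Counter:
    """Hypothetical helper: count CUPTI error codes on error-level lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group("level") == "E":
            counts.update(CUPTI_CODE.findall(match.group("message")))
    return counts


# Four lines copied verbatim from the log above (three errors, one info).
sample = """
2020-04-16 21:08:15.909368: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1307] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
2020-04-16 21:08:15.910677: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1346] function cupti_interface_->ActivityRegisterCallbacks( AllocCuptiActivityBuffer, FreeCuptiActivityBuffer)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
2020-04-16 21:08:16.092575: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1329] function cupti_interface_->EnableCallback( 0 , subscriber_, CUPTI_CB_DOMAIN_DRIVER_API, cbid)failed with error CUPTI_ERROR_INVALID_PARAMETER
2020-04-16 21:08:16.092946: I tensorflow/core/profiler/internal/gpu/device_tracer.cc:88]  GpuTracer has collected 0 callback api events and 0 activity events.
"""
counts = cupti_errors(sample)
print(counts)
```

All three error lines come from the profiler's `cupti_tracer.cc`, not from the GPU device setup, which logs only info-level messages.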

The current code that exhibits the problem:

import argparse

from datetime import datetime
import itertools

import io
import matplotlib.pyplot as plt
import numpy as np
import sklearn.metrics

import tensorflow as tf
from tensorflow.keras import applications
from tensorflow.keras.callbacks import TensorBoard, ReduceLROnPlateau, ModelCheckpoint, EarlyStopping, LambdaCallback
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator


def plot_confusion_matrix(cm, class_names):
    """
    Returns a matplotlib figure containing the plotted confusion matrix.

    Args:
    cm (array, shape = [n, n]): a confusion matrix of integer classes
    class_names (array, shape = [n]): String names of the integer classes
    """

    figure = plt.figure(figsize=(8, 8))
    plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues)
    plt.title("Confusion matrix")
    plt.colorbar()
    tick_marks = np.arange(len(class_names))
    plt.xticks(tick_marks, class_names, rotation=45)
    plt.yticks(tick_marks, class_names)

    # Normalize the confusion matrix.
    cm = np.around(cm.astype('float') / cm.sum(axis=1)[:, np.newaxis], decimals=2)

    # Use white text if squares are dark; otherwise black.
    threshold = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        color = "white" if cm[i, j] > threshold else "black"
        plt.text(j, i, cm[i, j], horizontalalignment="center", color=color)

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    return figure


def create_resnet50(img_h: int, img_w: int, num_classes: int):
    # Build a ResNet50 backbone without its classification head; weights=None means training from scratch.
    base_model = applications.resnet50.ResNet50(weights=None, include_top=False, input_shape=(img_h, img_w, 3))

    x = base_model.output
    x = GlobalAveragePooling2D()(x)
    x = Dropout(rate=0.3)(x)
    predictions = Dense(num_classes, activation='softmax')(x)
    mdl = Model(inputs=base_model.input, outputs=predictions)
    return mdl


def plot_to_image(figure):
    """Converts the matplotlib plot specified by 'figure' to a PNG image and
    returns it. The supplied figure is closed and inaccessible after this call."""
    # Save the plot to a PNG in memory.
    buf = io.BytesIO()
    plt.savefig(buf, format='png')
    # Closing the figure prevents it from being displayed directly inside
    # the notebook.
    plt.close(figure)
    buf.seek(0)
    # Convert PNG buffer to TF image
    image = tf.image.decode_png(buf.getvalue(), channels=4)
    # Add the batch dimension
    image = tf.expand_dims(image, 0)
    return image


def log_confusion_matrix(epoch, logs):
    # Use the model to predict the values from the validation dataset.
    # create list of 256 images, labels
    itx = 256 // bch_size
    test_images, test_labels_raw = [], []
    for i in range(itx):

        tmp_img, tmp_lbs = next(val_gen)
        test_images.extend(tmp_img)
        test_labels_raw.extend(tmp_lbs)

    test_pred_raw = model.predict(np.array(test_images))
    test_pred = np.argmax(test_pred_raw, axis=1)
    test_labels = np.argmax(test_labels_raw, axis=1)

    # Calculate the confusion matrix.
    cm = sklearn.metrics.confusion_matrix(test_labels, test_pred)
    # Plot the confusion matrix, labelling the axes with the class names (the keys of class_indices).
    figure = plot_confusion_matrix(cm, class_names=list(val_gen.class_indices.keys()))
    cm_image = plot_to_image(figure)

    # Log the confusion matrix as an image summary.
    with file_writer_cm.as_default():
        tf.summary.image("Confusion Matrix", cm_image, step=epoch)


def run(train_generator, test_generator, epcs: int, mdl: Model, opt):
    # train the model
    mdl.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy', 'mse', ])

    stopper = EarlyStopping(monitor='val_loss', patience=min(epcs // 16, 10), mode='auto',
                            restore_best_weights=True)

    checker = ModelCheckpoint(monitor='val_loss', filepath='weights.{epoch:03d}.hdf5',
                              save_best_only=True, save_freq='epoch')

    shower = TensorBoard(histogram_freq=1)

    reducer = ReduceLROnPlateau(factor=0.6, patience=10, min_delta=1e-4, cooldown=10)

    cm_callback = LambdaCallback(on_epoch_end=log_confusion_matrix)

    history = mdl.fit(train_generator, epochs=epcs, verbose=0,
                        validation_data=test_generator,
                        callbacks=[stopper, checker, shower, reducer, cm_callback]
                        )

    return history


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('train_path', type=str, help='Path to the train main folder of files.')
    parser.add_argument('test_path', type=str, help='Path to the test main folder of files.')
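    # type=bool would treat every non-empty string (even 'False') as True, so parse the text explicitly.
    parser.add_argument('new_model', type=lambda s: s.lower() in ('1', 'true', 'yes'),
                        help='Create new model (true), or load from file (false).')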
    parser.add_argument('-m', '--model_path', type=str, help='path to model.')
    args = parser.parse_args()

    train_p = args.train_path
    test_p = args.test_path
    is_new = args.new_model
    model_path = args.model_path

    img_height, img_width = 214, 214

    file_writer_cm = tf.summary.create_file_writer('logs/cm')

    model = create_resnet50(img_height, img_width, num_classes=4) if is_new else load_model(model_path)
    adam = Adam(learning_rate=0.0001)

    train_datagen = ImageDataGenerator(
        rescale=1. / 255,
        horizontal_flip=True,
        vertical_flip=True,
        rotation_range=90,
        width_shift_range=0.1,
        height_shift_range=0.1,
        zoom_range=0.2
    )

    validation_datagen = ImageDataGenerator(
        rescale=1. / 255  # same scaling as the training generator
    )
    bch_size = 16

    train_gen = train_datagen.flow_from_directory(directory=train_p, target_size=(img_height, img_width),
                                                  batch_size=bch_size)
    val_gen = validation_datagen.flow_from_directory(directory=test_p, target_size=(img_height, img_width),
                                                     batch_size=bch_size)

    h = run(train_gen, val_gen, 100, model, adam)

    m_name = 'Model_resnet50_epoch{}_score{:3.2f}.hdf5'.format(100, min(h.history['val_loss']))
    model.save(m_name)
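
A side note on boolean command-line arguments such as `new_model`: argparse calls `type` on the raw string, and `bool('False')` is `True` because any non-empty string is truthy. The sketch below contrasts that with an explicit converter. `str_to_bool` is a hypothetical helper written for this illustration, not part of argparse.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Hypothetical converter: map common spellings to a real boolean."""
    lowered = value.lower()
    if lowered in ("1", "true", "yes", "y"):
        return True
    if lowered in ("0", "false", "no", "n"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


# Parser that converts the argument with the built-in bool().
naive = argparse.ArgumentParser()
naive.add_argument("new_model", type=bool)

# Parser that converts the argument with the explicit helper.
explicit = argparse.ArgumentParser()
explicit.add_argument("new_model", type=str_to_bool)

naive_value = naive.parse_args(["False"]).new_model
explicit_value = explicit.parse_args(["False"]).new_model
print(naive_value, explicit_value)
```

With `type=bool`, passing `False` on the command line still selects the "create new model" branch; the explicit converter returns `False` as intended.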

Thank you very much in advance. I really appreciate it!

...