I am trying to train a model that uses a TimeDistributed VGG16 as input to an RNN. The GPUs allocate all of their memory, but utilization stays close to 0%. Occasionally utilization jumps up, then falls right back to 0% (monitored with watch, refreshing every 0.1 s). What should I do to keep the GPUs fully utilized?
nvidia-smi output
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.01 Driver Version: 418.87.01 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... On | 00000000:00:17.0 Off | 0 |
| N/A 53C P0 62W / 300W | 15882MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-SXM2... On | 00000000:00:18.0 Off | 0 |
| N/A 46C P0 61W / 300W | 15882MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla V100-SXM2... On | 00000000:00:19.0 Off | 0 |
| N/A 47C P0 63W / 300W | 15882MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla V100-SXM2... On | 00000000:00:1A.0 Off | 0 |
| N/A 53C P0 63W / 300W | 15882MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 Tesla V100-SXM2... On | 00000000:00:1B.0 Off | 0 |
| N/A 56C P0 67W / 300W | 15882MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 5 Tesla V100-SXM2... On | 00000000:00:1C.0 Off | 0 |
| N/A 50C P0 63W / 300W | 15882MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 6 Tesla V100-SXM2... On | 00000000:00:1D.0 Off | 0 |
| N/A 47C P0 63W / 300W | 15882MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 7 Tesla V100-SXM2... On | 00000000:00:1E.0 Off | 0 |
| N/A 52C P0 67W / 300W | 15882MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 30696 C python 15871MiB |
| 1 30696 C python 15871MiB |
| 2 30696 C python 15871MiB |
| 3 30696 C python 15871MiB |
| 4 30696 C python 15871MiB |
| 5 30696 C python 15871MiB |
| 6 30696 C python 15871MiB |
| 7 30696 C python 15871MiB |
+-----------------------------------------------------------------------------+
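The near-full memory figures above are expected and say little about actual load: by default TensorFlow reserves almost all GPU memory as soon as the session starts, regardless of what the model really needs. A minimal sketch (assuming the standalone Keras backend on TensorFlow 1.x, as in this setup) of enabling on-demand allocation so the real footprint becomes visible:

# Sketch: let TensorFlow grow GPU memory on demand instead of reserving
# it all up front; this must run before the model is built.
import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))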
EDIT: Model code
# Imports needed to make the snippet runnable (Keras 2.3.1)
from keras.models import Sequential
from keras.layers import Conv2D, Dense, Flatten, GRU, MaxPool2D, TimeDistributed
from keras.utils import multi_gpu_model

model = Sequential()
model.add(TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding="same", activation="relu"), input_shape=(3, 224, 224, 3), name="Conv2D_1"))
model.add(TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_2"))
model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2)), name="MaxPool2D_1"))
model.add(TimeDistributed(Conv2D(filters=128, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_3"))
model.add(TimeDistributed(Conv2D(filters=128, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_4"))
model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2)), name="MaxPool2D_2"))
model.add(TimeDistributed(Conv2D(filters=256, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_5"))
model.add(TimeDistributed(Conv2D(filters=256, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_6"))
model.add(TimeDistributed(Conv2D(filters=256, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_7"))
model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2)), name="MaxPool2D_3"))
model.add(TimeDistributed(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_8"))
model.add(TimeDistributed(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_9"))
model.add(TimeDistributed(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_10"))
model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2)), name="MaxPool2D_4"))
model.add(TimeDistributed(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_11"))
model.add(TimeDistributed(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_12"))
model.add(TimeDistributed(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"), name="Conv2D_13"))
model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2)), name="MaxPool2D_5"))
model.add(TimeDistributed(Flatten(), name="Flatten"))
model.add(GRU(6, name="GRU"))  # renamed: "Flatten" was already used by the layer above
model.add(Dense(1, activation="sigmoid", name="Dense"))
model = multi_gpu_model(model, gpus=8, cpu_relocation=True)
model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy"])
EDIT: Keras 2.3.1, TensorFlow 1.14.0
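For context, the data pipeline and the training call are not shown above; with multi_gpu_model, an input pipeline that cannot keep eight V100s fed is a common reason for utilization sitting at 0% while memory stays fully allocated. The sketch below is only illustrative, not the original code: a hypothetical keras.utils.Sequence serving (batch, 3, 224, 224, 3) clips, fed through fit_generator with worker processes.

# Illustrative sketch only; ClipSequence, x_train and y_train are assumed
# placeholders, not taken from the question.
import numpy as np
from keras.utils import Sequence

class ClipSequence(Sequence):
    def __init__(self, clips, labels, batch_size=32):
        self.clips, self.labels, self.batch_size = clips, labels, batch_size

    def __len__(self):
        # number of batches per epoch
        return int(np.ceil(len(self.clips) / self.batch_size))

    def __getitem__(self, idx):
        # return one (inputs, targets) batch
        s = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        return self.clips[s], self.labels[s]

# Worker processes prepare batches in parallel so the GPUs are not starved:
# model.fit_generator(ClipSequence(x_train, y_train),
#                     workers=8, use_multiprocessing=True)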