Low GPU memory usage: GPU memory used is always 135 MB
July 14, 2020

I'm new to deep learning and am trying to train a CV model, Inception v3, on my Ubuntu machine. When I checked GPU usage with nvidia-smi, it showed that each process uses only 135 MB of memory, and total memory usage is about 1226 MB. I'm a bit confused and don't know what to do to increase memory usage, since I have 22 GB of memory in total but only about 1 GB is being used. Thanks in advance.

Wed Jul 15 01:11:00 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.100      Driver Version: 440.100      CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:02:00.0  On |                  N/A |
|  0%   36C    P8    19W / 280W |   1226MiB / 11176MiB |     24%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
|  0%   37C    P8    14W / 250W |    687MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1309      G   /usr/lib/xorg/Xorg                            30MiB |
|    0      1833      G   /usr/bin/gnome-shell                          50MiB |
|    0      2194      G   /usr/lib/xorg/Xorg                           245MiB |
|    0      2325      G   /usr/bin/gnome-shell                         213MiB |
|    0      7956      G   gnome-control-center                           2MiB |
|    0     11313      C   python                                       135MiB |
|    0     13401      C   python                                       135MiB |
|    0     13493      C   python                                       135MiB |
|    0     13591      C   python                                       135MiB |
|    0     14200      G   /usr/lib/firefox/firefox                       2MiB |
|    0     16053      C   ...asperzhang/anaconda3/envs/dl/bin/python   135MiB |
|    1     11313      C   python                                       135MiB |
|    1     13401      C   python                                       135MiB |
|    1     13493      C   python                                       135MiB |
|    1     13591      C   python                                       135MiB |
|    1     16053      C   ...asperzhang/anaconda3/envs/dl/bin/python   135MiB |
+-----------------------------------------------------------------------------+
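
To sanity-check the figures in the table above, it may help to add up the per-process numbers for each GPU, split by process type (`C` for compute, `G` for graphics). The following is only a minimal sketch, assuming the "Processes" table has been pasted into a string; the function name `sum_gpu_memory` and the regular expression are hypothetical helpers, not part of nvidia-smi or TensorFlow.

```python
import re
from collections import defaultdict

# Matches rows of the nvidia-smi "Processes" table, for example:
# |    0     11313      C   python                                       135MiB |
# Assumption: process names contain no spaces, as in the table above.
ROW = re.compile(r"^\|\s+(\d+)\s+(\d+)\s+([CG])\s+(\S+)\s+(\d+)MiB\s+\|$")


def sum_gpu_memory(table_text):
    """Return {gpu_index: {"C": MiB, "G": MiB}} summed over process rows."""
    totals = defaultdict(lambda: {"C": 0, "G": 0})
    for line in table_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            gpu, _pid, kind, _name, mib = match.groups()
            totals[int(gpu)][kind] += int(mib)
    return {gpu: dict(per_type) for gpu, per_type in totals.items()}


# A few rows copied from the table above, used as sample input.
sample = """\
|    0      2194      G   /usr/lib/xorg/Xorg                           245MiB |
|    0     11313      C   python                                       135MiB |
|    1     11313      C   python                                       135MiB |
|    1     13401      C   python                                       135MiB |
"""
print(sum_gpu_memory(sample))
```

Run on the full table, this should give 675 MiB of compute (type `C`) usage on each GPU, that is, five Python processes at 135 MiB each, with the rest of GPU 0's usage coming from graphics processes such as Xorg and gnome-shell.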

And here are the tf.test results:

>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-07-15 16:48:08.546935: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2600030000 Hz
2020-07-15 16:48:08.548073: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55f93e8c0410 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-15 16:48:08.548119: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-15 16:48:08.551924: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-15 16:48:08.896211: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-15 16:48:08.898816: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-15 16:48:08.899840: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55f93e936c10 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-07-15 16:48:08.899866: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1080 Ti, Compute Capability 6.1
2020-07-15 16:48:08.899879: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (1): GeForce GTX 1080 Ti, Compute Capability 6.1
2020-07-15 16:48:08.900324: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-15 16:48:08.901086: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:02:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.6575GHz coreCount: 28 deviceMemorySize: 10.91GiB deviceMemoryBandwidth: 451.17GiB/s
2020-07-15 16:48:08.901177: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-15 16:48:08.901947: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 1 with properties: 
pciBusID: 0000:03:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.721GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2020-07-15 16:48:08.902134: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.2/lib64${LD_LIBRARY_PATH:+:}:/usr/local/cuda/lib64
2020-07-15 16:48:08.904160: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-15 16:48:08.905724: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-15 16:48:08.906050: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-15 16:48:08.908137: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-15 16:48:08.909304: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-15 16:48:08.913807: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-15 16:48:08.913841: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1592] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-07-15 16:48:08.913923: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-15 16:48:08.913944: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0 1 
2020-07-15 16:48:08.913958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N Y 
2020-07-15 16:48:08.913969: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 1:   Y N 
False
>>> tf.test.is_built_with_cuda()
True
...
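
The log above returns `False` even though `is_built_with_cuda()` is `True`, and the warning lines hint at a possible reason: `libcudart.so.10.1` could not be loaded (nvidia-smi reports CUDA 10.2), after which TensorFlow prints "Skipping registering GPU devices...". To make that easier to spot in a long log, here is a minimal sketch that lists which libraries were opened and which were missing. It assumes the log has been saved as text; `summarize_libraries` is a hypothetical helper, and the patterns simply mirror the message wording seen in the log above.

```python
import re

# Patterns mirroring the dso_loader messages in the log above.
LOADED = re.compile(r"Successfully opened dynamic library (\S+)")
MISSING = re.compile(r"Could not load dynamic library '([^']+)'")


def summarize_libraries(log_text):
    """Return (loaded, missing) lists of library names found in a TensorFlow log."""
    loaded = LOADED.findall(log_text)
    missing = MISSING.findall(log_text)
    return loaded, missing


# Two shortened lines in the same shape as the log above, used as sample input.
sample_log = """\
W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file
I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
"""
loaded, missing = summarize_libraries(sample_log)
print("loaded: ", loaded)
print("missing:", missing)
```

If the `missing` list is non-empty, the 135 MiB per process may simply reflect CUDA context overhead rather than model memory, since the GPUs were not registered with TensorFlow; this is an interpretation of the log, not something it states directly.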