When I run the snippet below with python test.py:
import os
# Enable '0' or disable '-1' GPU use
# os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ['CUDA_VISIBLE_DEVICES'] = "0"
import warnings
with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=FutureWarning)
    import tensorflow as tf
config = tf.compat.v1.ConfigProto()
# config.gpu_options.visible_device_list = "0" # pylint: disable=no-member
config.gpu_options.allow_growth = True # pylint: disable=no-member
session = tf.compat.v1.Session(config=config)
# check if successfully using GPU
if tf.test.gpu_device_name():
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
else:
    print('Please install GPU version of TF')
I get the following error:
2020-04-23 13:13:15.969352: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-04-23 13:13:15.974088: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-04-23 13:13:15.990122: W tensorflow/compiler/xla/service/platform_util.cc:256] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_VALUE: invalid argument
2020-04-23 13:13:15.990240: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: no supported devices found for platform CUDA
Aborted (core dumped)
When I set os.environ['CUDA_VISIBLE_DEVICES'] = "-1" (i.e. without using the GPU), there is no error and the output is as expected, as shown below.
2020-04-23 13:18:24.911806: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-04-23 13:18:24.916849: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-04-23 13:18:24.920347: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-04-23 13:18:24.920384: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: vumacs
2020-04-23 13:18:24.920389: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: vumacs
2020-04-23 13:18:24.920456: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 440.64.0
2020-04-23 13:18:24.920482: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 440.64.0
2020-04-23 13:18:24.920489: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:310] kernel version seems to match DSO: 440.64.0
2020-04-23 13:18:24.938734: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3299990000 Hz
2020-04-23 13:18:24.939659: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4849f40 executing computations on platform Host. Devices:
2020-04-23 13:18:24.939686: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
Please install GPU version of TF
Is there a way to fix this error? I have previously used the same code with CUDA_VISIBLE_DEVICES set to 0, both in scripts and in the shell, and there were no problems. The error seems to occur when the session is created with tf.compat.v1.Session(config=config).
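For reference, the CUDA_VISIBLE_DEVICES toggle used in the script can be wrapped in a small helper. This is only a minimal sketch: the name `select_gpu` is hypothetical and not part of TensorFlow, and it does nothing beyond setting the environment variable. The variable has to be set before CUDA is initialized, so calling the helper before importing tensorflow is the safe ordering.

```python
import os


def select_gpu(device_ids):
    """Hypothetical helper: expose only the given CUDA device indices.

    An empty list hides every GPU, which matches the "-1" setting
    used in the question.
    """
    if device_ids:
        value = ",".join(str(i) for i in device_ids)
    else:
        value = "-1"
    # Must run before CUDA is initialized in this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


if __name__ == "__main__":
    # Make only GPU 0 visible, as in the failing run.
    print(select_gpu([0]))
```

Calling `select_gpu([])` reproduces the CPU-only run, and `select_gpu([0])` reproduces the failing one.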
Additional information
python: 3.6.9
tensorflow-gpu==1.14.0
protobuf==3.11.3
tensorflow-estimator==1.14.0
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
$ nvidia-smi
Thu Apr 23 13:22:06 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64       Driver Version: 440.64       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:B3:00.0 Off |                  N/A |
| 26%   28C    P8    12W / 250W |    119MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1277      G   /usr/lib/xorg/Xorg                            39MiB |
|    0      1388      G   /usr/bin/gnome-shell                          77MiB |
+-----------------------------------------------------------------------------+
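The same driver and device details can also be collected in a machine-readable form, which avoids the truncated GPU name in the table above. The sketch below is an illustration only: the function name `query_gpus` is hypothetical, and it assumes `nvidia-smi` is on the PATH (it returns an empty list otherwise). It relies on the `--query-gpu` and `--format=csv,noheader` options of `nvidia-smi`.

```python
import shutil
import subprocess


def query_gpus():
    """Hypothetical helper: return (index, name, driver_version) tuples.

    Returns an empty list if nvidia-smi is missing or fails.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=index,name,driver_version",
             "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # text mode; works on Python 3.6
        )
    except OSError:
        return []
    if result.returncode != 0:
        return []
    rows = []
    for line in result.stdout.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) == 3:
            rows.append(tuple(parts))
    return rows


if __name__ == "__main__":
    for index, name, driver in query_gpus():
        print("GPU {}: {} (driver {})".format(index, name, driver))
```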