tf.py_function does not work in tutorial code when running tensorflow-gpu 2.2 on Windows 10
July 11, 2020

I filed an issue at https://github.com/tensorflow/tensorflow/issues/41304. Any help here would be greatly appreciated.

System information

  • OS platform: Windows 10
  • TensorFlow installed from: pip
  • TensorFlow version: tensorflow-gpu 2.2
  • Python version: 3.8.3
  • CUDA/cuDNN version: CUDA 10.1
  • GPU model and memory: GeForce GTX 1080, 8 GB

Describe the current behavior

The serialized result without tf.py_function was fine. However, eager_py_func raised exceptions when I tried to get the serialized result through tf.py_function.

Describe the expected behavior

The serialized result should be the same with or without tf.py_function.

Standalone code to reproduce the issue

The issue can be reproduced with the following code (from the TensorFlow tutorial "TFRecord and tf.Example", https://www.tensorflow.org/tutorials/load_data/tfrecord). The code works in Colab on the tutorial website, but raises exceptions when run on Windows 10 from cmd or PyCharm. Here is the code:

import tensorflow as tf

def _bytes_feature(value):
  """Returns a bytes_list from a string / byte."""
  if isinstance(value, type(tf.constant(0))):
    value = value.numpy() # BytesList won't unpack a string from an EagerTensor.
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _float_feature(value):
  """Returns a float_list from a float / double."""
  return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))

def _int64_feature(value):
  """Returns an int64_list from a bool / enum / int / uint."""
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def serialize_example(feature0, feature1, feature2, feature3):
  feature = {
      'feature0': _int64_feature(feature0),
      'feature1': _int64_feature(feature1),
      'feature2': _bytes_feature(feature2),
      'feature3': _float_feature(feature3),
  }
  example_proto = tf.train.Example(features=tf.train.Features(feature=feature))
  return example_proto.SerializeToString()

def tf_serialize_example(f0,f1,f2,f3):
  tf_string = tf.py_function(
    serialize_example,
    (f0,f1,f2,f3),
    tf.string)
  return tf.reshape(tf_string, ())

result_without_py_function = serialize_example(False, 4, b'goat', 0.9876)
print("result_without_py_function =",result_without_py_function)

result_with_py_function = tf_serialize_example(False, 4, b'goat', 0.9876)
print("result_with_py_function=",result_with_py_function)

Running the code above from the Windows cmd or PyCharm raises exceptions when executing:

result_with_py_function = tf_serialize_example(False, 4, b'goat', 0.9876)

Full output, including error messages:

2020-07-10 20:36:22.730441: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-10 20:36:25.159732: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-07-10 20:36:25.298922: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.847GHz coreCount: 20 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 298.32GiB/s
2020-07-10 20:36:25.299515: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 1 with properties: 
pciBusID: 0000:02:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.847GHz coreCount: 20 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 298.32GiB/s
2020-07-10 20:36:25.299855: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-10 20:36:25.330094: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-10 20:36:25.345517: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-10 20:36:25.375238: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-10 20:36:25.392428: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-10 20:36:25.412082: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-10 20:36:25.430892: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-10 20:36:25.432276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0, 1
2020-07-10 20:36:25.432747: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-10 20:36:25.439470: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x18e19723790 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-10 20:36:25.439669: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-10 20:36:25.594336: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.847GHz coreCount: 20 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 298.32GiB/s
2020-07-10 20:36:25.594839: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 1 with properties: 
pciBusID: 0000:02:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.847GHz coreCount: 20 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 298.32GiB/s
2020-07-10 20:36:25.595149: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-10 20:36:25.595311: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-10 20:36:25.595476: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-10 20:36:25.595637: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-10 20:36:25.595793: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-10 20:36:25.595953: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-10 20:36:25.596121: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-10 20:36:25.596974: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0, 1
2020-07-10 20:36:26.501843: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-10 20:36:26.502016: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 1 
2020-07-10 20:36:26.502118: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N N 
2020-07-10 20:36:26.502217: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 1:   N N 
2020-07-10 20:36:26.503174: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6280 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-07-10 20:36:26.504471: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 6280 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080, pci bus id: 0000:02:00.0, compute capability: 6.1)
2020-07-10 20:36:26.507161: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x18e8a13a5e0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-07-10 20:36:26.507331: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1080, Compute Capability 6.1
2020-07-10 20:36:26.507460: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (1): GeForce GTX 1080, Compute Capability 6.1
result_without_py_function = b'\nR\n\x11\n\x08feature0\x12\x05\x1a\x03\n\x01\x00\n\x11\n\x08feature1\x12\x05\x1a\x03\n\x01\x04\n\x14\n\x08feature2\x12\x08\n\x06\n\x04goat\n\x14\n\x08feature3\x12\x08\x12\x06\n\x04[\xd3|?'

2020-07-10 20:36:26.534597: W tensorflow/core/framework/op_kernel.cc:1741] Invalid argument: TypeError: <tf.Tensor: shape=(), dtype=bool, numpy=False> has type <class 'tensorflow.python.framework.ops.EagerTensor'>, but expected one of: (<class 'int'>,)
Traceback (most recent call last):

  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\gen_script_ops.py", line 43, in eager_py_func
    _result = pywrap_tfe.TFE_Py_FastPathExecute(

tensorflow.python.eager.core._FallbackException: This function does not handle the case of the path where all inputs are not already EagerTensors.


During handling of the above exception, another exception occurred:


Traceback (most recent call last):

  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\script_ops.py", line 241, in __call__
    return func(device, token, args)

  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\script_ops.py", line 130, in __call__
    ret = self._func(*args)

  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 309, in wrapper
    return func(*args, **kwargs)

  File "F:/python_projects/deep_learning/workspace/programs/dlcls/tf_py_function_test.py", line 20, in serialize_example
    'feature0': _int64_feature(feature0),

  File "F:/python_projects/deep_learning/workspace/programs/dlcls/tf_py_function_test.py", line 15, in _int64_feature
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

  File "C:\Program Files\Python38\lib\site-packages\google\protobuf\internal\python_message.py", line 542, in init
    copy.extend(field_value)

  File "C:\Program Files\Python38\lib\site-packages\google\protobuf\internal\containers.py", line 282, in extend
    new_values = [self._type_checker.CheckValue(elem) for elem in elem_seq_iter]

  File "C:\Program Files\Python38\lib\site-packages\google\protobuf\internal\containers.py", line 282, in <listcomp>
    new_values = [self._type_checker.CheckValue(elem) for elem in elem_seq_iter]

  File "C:\Program Files\Python38\lib\site-packages\google\protobuf\internal\type_checkers.py", line 171, in CheckValue
    raise TypeError(message)

TypeError: <tf.Tensor: shape=(), dtype=bool, numpy=False> has type <class 'tensorflow.python.framework.ops.EagerTensor'>, but expected one of: (<class 'int'>,)


Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\gen_script_ops.py", line 43, in eager_py_func
    _result = pywrap_tfe.TFE_Py_FastPathExecute(
tensorflow.python.eager.core._FallbackException: This function does not handle the case of the path where all inputs are not already EagerTensors.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "F:/python_projects/deep_learning/workspace/programs/dlcls/tf_py_function_test.py", line 40, in <module>
    result_with_py_function = tf_serialize_example(False, 4, b'goat', 0.9876)
  File "F:/python_projects/deep_learning/workspace/programs/dlcls/tf_py_function_test.py", line 30, in tf_serialize_example
    tf_string = tf.py_function(
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\script_ops.py", line 454, in eager_py_func
    return _internal_py_func(
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\script_ops.py", line 336, in _internal_py_func
    result = gen_script_ops.eager_py_func(
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\gen_script_ops.py", line 50, in eager_py_func
    return eager_py_func_eager_fallback(
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\gen_script_ops.py", line 99, in eager_py_func_eager_fallback
    _result = _execute.execute(b"EagerPyFunc", len(Tout), inputs=_inputs_flat,
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: TypeError: <tf.Tensor: shape=(), dtype=bool, numpy=False> has type <class 'tensorflow.python.framework.ops.EagerTensor'>, but expected one of: (<class 'int'>,)
Traceback (most recent call last):

  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\gen_script_ops.py", line 43, in eager_py_func
    _result = pywrap_tfe.TFE_Py_FastPathExecute(

tensorflow.python.eager.core._FallbackException: This function does not handle the case of the path where all inputs are not already EagerTensors.


During handling of the above exception, another exception occurred:


Traceback (most recent call last):

  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\script_ops.py", line 241, in __call__
    return func(device, token, args)

  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\script_ops.py", line 130, in __call__
    ret = self._func(*args)

  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 309, in wrapper
    return func(*args, **kwargs)

  File "F:/python_projects/deep_learning/workspace/programs/dlcls/tf_py_function_test.py", line 20, in serialize_example
    'feature0': _int64_feature(feature0),

  File "F:/python_projects/deep_learning/workspace/programs/dlcls/tf_py_function_test.py", line 15, in _int64_feature
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

  File "C:\Program Files\Python38\lib\site-packages\google\protobuf\internal\python_message.py", line 542, in init
    copy.extend(field_value)

  File "C:\Program Files\Python38\lib\site-packages\google\protobuf\internal\containers.py", line 282, in extend
    new_values = [self._type_checker.CheckValue(elem) for elem in elem_seq_iter]

  File "C:\Program Files\Python38\lib\site-packages\google\protobuf\internal\containers.py", line 282, in <listcomp>
    new_values = [self._type_checker.CheckValue(elem) for elem in elem_seq_iter]

  File "C:\Program Files\Python38\lib\site-packages\google\protobuf\internal\type_checkers.py", line 171, in CheckValue
    raise TypeError(message)

TypeError: <tf.Tensor: shape=(), dtype=bool, numpy=False> has type <class 'tensorflow.python.framework.ops.EagerTensor'>, but expected one of: (<class 'int'>,)

 [Op:EagerPyFunc]

Process finished with exit code 1
...
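
A possible workaround, offered as a minimal sketch rather than a confirmed fix: the traceback shows that inside tf.py_function the arguments reach serialize_example as EagerTensor objects, and the protobuf type checker in google\protobuf\internal\type_checkers.py rejects them because it expects a plain int. Converting each value to a Python scalar before building the Feature may avoid the TypeError. The names to_python_value, int64_value, float_value, and FakeEagerTensor are hypothetical; FakeEagerTensor only imitates the .numpy() method so the sketch runs without TensorFlow, and the approach has not been verified on the setup described above.

```python
class FakeEagerTensor:
    """Hypothetical stand-in for an EagerTensor so the sketch runs without TensorFlow."""

    def __init__(self, value):
        self._value = value

    def numpy(self):
        return self._value


def to_python_value(value):
    """Unwrap anything exposing .numpy() (such as an EagerTensor) to its underlying value."""
    if hasattr(value, "numpy"):
        value = value.numpy()
    return value


def int64_value(value):
    """Return a plain Python int, the type named in the TypeError above."""
    return int(to_python_value(value))


def float_value(value):
    """Return a plain Python float for use in a float_list."""
    return float(to_python_value(value))


# Assumed usage in the question's helpers (not executed here, needs TensorFlow):
#   tf.train.Int64List(value=[int64_value(value)])
#   tf.train.FloatList(value=[float_value(value)])
print(int64_value(FakeEagerTensor(False)), int64_value(4), float_value(FakeEagerTensor(0.9876)))
```

Plain Python inputs pass through unchanged, so the same helpers would still serve the direct call serialize_example(False, 4, b'goat', 0.9876) that already works.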