TensorFlow tfrecord edit (read and write to file): OutOfRange error
0 votes
/ January 14, 2019

In src/datasets/h36m_edit.py:

with tf.Session() as sess:
    reader = tf.TFRecordReader()
    coder = ImageCoder()

    fqueue = tf.train.string_input_producer(files, num_epochs=1, shuffle=False, name="input")
    _, example_serialized = reader.read(fqueue)

    sess.run(tf.local_variables_initializer())
    sess.run(tf.global_variables_initializer())

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    fidx = 0
    total_imgs = 0
    image, image_size, label, center, fname, pose, shape, gt3d, has_smpl3d = parse_example_proto(example_serialized)

    while not coord.should_stop():
        fidx += 1
        tf_filename = out_path % fidx

        print('Starting tfrecord file %s \n' % tf_filename)
        with tf.python_io.TFRecordWriter(tf_filename) as writer:
            for i in tqdm(range(train_shards)):  # min(train_shards, image_bs.shape[0])
                image_v, image_size_v, label_v, center_v, fname_v, pose_v, shape_v, gt3d_v, has_smpl3d_v = sess.run(
                    [image, image_size, label, center, fname, pose, shape, gt3d, has_smpl3d])
                image_s = coder.encode_jpeg(image_v)
                example = convert_to_example_wmosh(image_s, fname_v, image_size_v[0], image_size_v[1],
                                                   label_v, center_v, gt3d_v, pose_v, shape_v)
                writer.write(example.SerializeToString())
                total_imgs += 1

    coord.request_stop()
    coord.join(threads)

Sometimes the inner loop appears to stop before it reaches its maximum iteration count (train_shards = 500):

100%|██████████| 500/500 [00:02<00:00, 225.07it/s]
Starting tfrecord file /home/cdeng/tf_datasets/tf_records_human36m_wjoints/train_modified/train_0011.tfrecord 

 96%|█████████▌| 478/500 [00:02<00:00, 225.58it/s]Starting tfrecord file /home/cdeng/tf_datasets/tf_records_human36m_wjoints/train_modified/train_0012.tfrecord 

100%|██████████| 500/500 [00:02<00:00, 230.37it/s]

And when it writes tfrecord file number 625, an OutOfRange error is raised. The run is expected to finish with more than 3000 tfrecord files, because the Human36m training set has 1559985 images and each tfrecord file holds 500 images. My guess is that the image queue is not being handled correctly; maybe the producer is too slow?
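
The expected file count follows directly from the two figures quoted above. A quick check (the variable names are illustrative and do not come from the script):

```python
import math

num_images = 1559985    # Human36m training images, as stated above
images_per_file = 500   # value of train_shards in the script

# Round up: a final, partially filled file still counts as a file.
expected_files = int(math.ceil(num_images / float(images_per_file)))
print(expected_files)
```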

/home/cdeng/tf_datasets/tf_records_human36m_wjoints/train_modified/train_0625.tfrecord 
 36%|███▌      | 180/500 [00:00<00:01, 221.50it/s]2019-01-13 22:47:40.946736: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: FIFOQueue '_0_input' is closed and has insufficient elements (requested 1, current size 0)
     [[Node: ReaderReadV2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReaderV2, input)]]
2019-01-13 22:47:40.946816: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: FIFOQueue '_0_input' is closed and has insufficient elements (requested 1, current size 0)
     [[Node: ReaderReadV2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReaderV2, input)]]

Traceback (most recent call last):
  File "/home/cdeng/star_repos/hmr/src/datasets/h36m_edit.py", line 233, in <module>
    [image, image_size, label, center, fname, pose, shape, gt3d, has_smpl3d])
  File "/home/cdeng/.virtualenvs/hmr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/cdeng/.virtualenvs/hmr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/cdeng/.virtualenvs/hmr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/home/cdeng/.virtualenvs/hmr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_0_input' is closed and has insufficient elements (requested 1, current size 0)
     [[Node: ReaderReadV2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReaderV2, input)]]
     [[Node: ParseSingleExample/ParseExample/ParseExample/_21 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_52_ParseSingleExample/ParseExample/ParseExample", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Caused by op u'ReaderReadV2', defined at:
  File "/home/cdeng/star_repos/hmr/src/datasets/h36m_edit.py", line 204, in <module>
    _, example_serialized = reader.read(fqueue)
  File "/home/cdeng/.virtualenvs/hmr/local/lib/python2.7/site-packages/tensorflow/python/ops/io_ops.py", line 194, in read
    return gen_io_ops._reader_read_v2(self._reader_ref, queue_ref, name=name)
  File "/home/cdeng/.virtualenvs/hmr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 423, in _reader_read_v2
    queue_handle=queue_handle, name=name)
  File "/home/cdeng/.virtualenvs/hmr/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/cdeng/.virtualenvs/hmr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/cdeng/.virtualenvs/hmr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

OutOfRangeError (see above for traceback): FIFOQueue '_0_input' is closed and has insufficient elements (requested 1, current size 0)
     [[Node: ReaderReadV2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReaderV2, input)]]
     [[Node: ParseSingleExample/ParseExample/ParseExample/_21 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_52_ParseSingleExample/ParseExample/ParseExample", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]


Process finished with exit code 1

1 Answer

0 votes
/ January 14, 2019

Problem solved. Two comments:

  1. The tqdm progress display can be wrong if the loop runs too fast.
  2. When the queue is empty at the end, an OutOfRange error is thrown; it is recommended to add exception handling, as suggested in QueueRunner.
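
A minimal sketch of point 2, not the exact fix that was applied: the read-and-write loop is wrapped so that the end-of-input exception ends the loop cleanly instead of crashing the script. The helper `write_shards`, its parameters, and the in-memory `ListWriter` are hypothetical names used only for illustration. In the script from the question, `end_of_input` would be `tf.errors.OutOfRangeError`, `read_next` would wrap the `sess.run(...)` call plus the conversion to a serialized example, and `open_writer` would return a `tf.python_io.TFRecordWriter`.

```python
def write_shards(read_next, open_writer, shard_size, end_of_input):
    """Copy records into shards until `read_next` signals end of input.

    read_next:    callable returning one serialized record; raises
                  `end_of_input` once the source is exhausted.
    open_writer:  callable taking a 1-based shard index and returning a
                  context manager that has a `write(record)` method.
    Returns (number of shards opened, total records written).
    """
    shard_idx = 0
    total = 0
    try:
        # Read one record ahead so a shard is opened only if data remains.
        record = read_next()
        while True:
            shard_idx += 1
            with open_writer(shard_idx) as writer:
                for _ in range(shard_size):
                    writer.write(record)
                    total += 1
                    record = read_next()
    except end_of_input:
        # Expected at the end: the source is empty. Leaving the `with`
        # block has already closed the last (possibly partial) shard.
        pass
    return shard_idx, total


class ListWriter(object):
    """In-memory stand-in for a record writer (demo only)."""

    def __init__(self):
        self.records = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not swallow exceptions

    def write(self, record):
        self.records.append(record)


# Demo with a plain iterator; StopIteration plays the role of OutOfRangeError.
source = iter([b"a", b"b", b"c", b"d", b"e"])
writers = {}


def open_writer(idx):
    writers[idx] = ListWriter()
    return writers[idx]


n_shards, n_records = write_shards(
    lambda: next(source), open_writer, 2, StopIteration)
print(n_shards, n_records)
```

Reading one record ahead means a new output file is opened only when there is something to put in it, so no empty trailing file is created. With queue runners, `coord.request_stop()` and `coord.join(threads)` would typically go in a `finally` clause around the call, so the threads are shut down whether the loop ends normally or not.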
...