I am trying to train a Mask RCNN model based on the official Mask RCNN implementation provided here: tensorflow/models.
Below are the steps I followed:
- Created tfrecords for training and validation. I verified encoding and decoding of the tfrecords, and they work fine.
- Set up the configuration file as shown below:
# my_maskrcnn.yaml
train:
  train_file_pattern: "data/<dataset_name>/train/tfrecords/train.tfrecord-*"
  batch_size: 2
eval:
  eval_file_pattern: "data/<data_set_name>/val/tfrecords/val.tfrecord-*"
  batch_size: 2
predict:
  batch_size: 2
architecture:
  num_classes: 2
maskrcnn_parser:
  output_size: [512, 512]
- Added models/official to PYTHONPATH.
- Ran the command for a single GPU, as given in the documentation:
python path/to/models/official/vision/detection/main.py \
--strategy_type=one_device \
--model_dir=models_mask_rcnn \
--mode=train \
--config_file="mymaskrcnn.yaml" \
--model=mask_rcnn
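The tfrecord check in the first step only covers the encode/decode round trip. A slightly stricter per-example consistency check could look like the sketch below. It is not part of tensorflow/models: the function name `check_example`, the plain-dict layout, and the keys `boxes`, `classes`, and `masks` are all hypothetical, and the rule that class id 0 is background (so valid foreground ids are 1 up to `num_classes - 1`) is an assumption about how `num_classes: 2` is interpreted.

```python
from typing import Dict, List, Sequence


def check_example(example: Dict[str, Sequence], num_classes: int) -> List[str]:
    """Return human-readable problems found in one decoded example.

    `example` is a hypothetical plain-dict view of a decoded record with
    keys 'boxes' ([ymin, xmin, ymax, xmax] per instance), 'classes' and
    'masks'; the real decoder output may be organised differently.
    """
    problems = []
    boxes, classes, masks = example["boxes"], example["classes"], example["masks"]

    # Every instance should have exactly one box, one class id and one mask.
    if not (len(boxes) == len(classes) == len(masks)):
        problems.append(
            f"count mismatch: {len(boxes)} boxes, {len(classes)} classes, {len(masks)} masks"
        )

    # Assumption: id 0 is background, so foreground ids lie in [1, num_classes).
    for i, class_id in enumerate(classes):
        if not 1 <= class_id < num_classes:
            problems.append(f"instance {i}: class id {class_id} outside [1, {num_classes})")

    # Boxes with zero or negative height/width are flagged as degenerate.
    for i, (ymin, xmin, ymax, xmax) in enumerate(boxes):
        if ymax <= ymin or xmax <= xmin:
            problems.append(f"instance {i}: degenerate box {[ymin, xmin, ymax, xmax]}")

    return problems


# Made-up examples, only to exercise the checks.
good = {"boxes": [[0.1, 0.1, 0.5, 0.5]], "classes": [1], "masks": ["mask-0"]}
bad = {
    "boxes": [[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.2, 0.6]],
    "classes": [1, 2],
    "masks": ["mask-0"],
}

print(check_example(good, num_classes=2))
for problem in check_example(bad, num_classes=2):
    print(problem)
```

Running something like this over every decoded record would at least rule out mismatched instance counts or unexpected class ids in the data.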
But I get the following error:
Traceback (most recent call last):
  File ".../models/official/vision/detection/main.py", line 255, in <module>
    app.run(main)
  File ".../lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File ".../lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File ".../models/official/vision/detection/main.py", line 250, in main
    run()
  File ".../models/official/vision/detection/main.py", line 244, in run
    callbacks=callbacks)
  File ".../models/official/vision/detection/main.py", line 130, in run_executor
    save_config=True)
  File ".../models/official/modeling/training/distributed_executor.py", line 482, in train
    tf.convert_to_tensor(num_steps, dtype=tf.int32))
  File ".../lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 688, in __call__
    result = self._call(*args, **kwds)
  File ".../lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 741, in _call
    return self._stateless_fn(*args, **kwds)
  File ".../lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2407, in __call__
    return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
  File ".../lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1655, in _filtered_call
    self.captured_inputs)
  File ".../lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1732, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File ".../lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 598, in call
    ctx=ctx)
  File ".../lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: indices[1,63] = [1, -1] does not index into param shape [2,100,112,112]
     [[{{node while/body/_241/while/maskrcnn/tf_op_layer_sample_and_crop_foreground_masks/GatherNd_4/sample_and_crop_foreground_masks/GatherNd_4}}]]
  (1) Invalid argument: indices[1,63] = [1, -1] does not index into param shape [2,100,112,112]
     [[{{node while/body/_241/while/maskrcnn/tf_op_layer_sample_and_crop_foreground_masks/GatherNd_4/sample_and_crop_foreground_masks/GatherNd_4}}]]
     [[while/body/_241/while/AddN_40/_3885]]
0 successful operations.
1 derived errors ignored. [Op:__inference_train_step_89096]
Function call stack:
train_step -> train_step
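Taken literally, the message says that a GatherNd op received the index pair [1, -1] for a tensor of shape [2, 100, 112, 112]. The first component (1) is a valid position on the axis of size 2, but the second (-1) is not a valid position on the axis of size 100, since GatherNd-style indexing does not wrap negative values. The sketch below only restates that arithmetic in plain Python. The helper `out_of_range_axes` is hypothetical and not part of TensorFlow or tensorflow/models, and whether the -1 comes from padding, an unmatched instance, or something else in the data is not something the message alone can confirm.

```python
from typing import List, Sequence, Tuple


def out_of_range_axes(
    index: Sequence[int], param_shape: Sequence[int]
) -> List[Tuple[int, int, int]]:
    """Return (axis, value, axis_size) for each component of `index` that
    falls outside [0, axis_size).

    The index addresses the leading len(index) axes of the params tensor,
    and negative values are treated as out of range (no wrap-around).
    """
    problems = []
    for axis, value in enumerate(index):
        size = param_shape[axis]
        if not 0 <= value < size:
            problems.append((axis, value, size))
    return problems


# Values copied from the error message above.
reported_index = [1, -1]
param_shape = [2, 100, 112, 112]

problems = out_of_range_axes(reported_index, param_shape)
for axis, value, size in problems:
    print(f"axis {axis}: index {value} is outside [0, {size})")
```

So the failing component is the one on the axis of size 100, not the batch axis, which suggests looking at how per-image instance indices are produced rather than at the batch size.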
Where am I going wrong?