Я пытаюсь выполнить этот сценарий , используя run_ner.py
, но все, что я пытался продолжить точную настройку с контрольной точки, не удалось. Есть идеи?
Я запускаю его с помощью Google Colab. Далее я запускаю содержимое ячейки:
%cd "/content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27"
%pip install .
%pip install --upgrade .
%pip install seqeval
from fastai import *
from transformers import *
%cd "/content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner"
!python "/content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/run_ner.py" --data_dir ./ \
--model_type bert \
--labels ./labels.txt \
--model_name_or_path "/content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/bert-base-256/checkpoint-10000" \
--output_dir "/content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/bert-base-256/check" \
--max_seq_length "256" \
--num_train_epochs "5" \
--per_gpu_train_batch_size "4" \
--save_steps "10000" \
--seed "1" \
--do_train --do_eval --do_predict
Как вы можете видеть, я уже пытался заменить значение параметра model_name_or_path (которое было "bert-base-cased") каталогом контрольных точек, но произошло несколько ошибок, запрашивая правильное название модели и отсутствующие файлы.
04/28/2020 15:16:36 - INFO - transformers.tokenization_utils - Model name '/content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/bert-base-256/checkpoint-10000' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, bert-base-finnish-cased-v1, bert-base-finnish-uncased-v1, bert-base-dutch-cased). Assuming '/content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/bert-base-256/checkpoint-10000' is a path, a model identifier, or url to a directory containing tokenizer files.
04/28/2020 15:16:36 - INFO - transformers.tokenization_utils - Didn't find file /content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/bert-base-256/checkpoint-10000/vocab.txt. We won't load it.
04/28/2020 15:16:36 - INFO - transformers.tokenization_utils - Didn't find file /content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/bert-base-256/checkpoint-10000/added_tokens.json. We won't load it.
04/28/2020 15:16:36 - INFO - transformers.tokenization_utils - Didn't find file /content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/bert-base-256/checkpoint-10000/special_tokens_map.json. We won't load it.
04/28/2020 15:16:36 - INFO - transformers.tokenization_utils - Didn't find file /content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/bert-base-256/checkpoint-10000/tokenizer_config.json. We won't load it.
Traceback (most recent call last):
File "/content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/run_ner.py", line 290, in <module>
main()
File "/content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/run_ner.py", line 149, in main
use_fast=model_args.use_fast,
File "/usr/local/lib/python3.6/dist-packages/transformers/tokenization_auto.py", line 197, in from_pretrained
return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py", line 868, in from_pretrained
return cls._from_pretrained(*inputs, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py", line 971, in _from_pretrained
list(cls.vocab_files_names.values()),
OSError: Model name '/content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/bert-base-256/checkpoint-10000' was not found in tokenizers model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, bert-base-finnish-cased-v1, bert-base-finnish-uncased-v1, bert-base-dutch-cased). We assumed '/content/drive/My Drive/Colab Notebooks/NER/Batteria/transformers-master_2020_04_27/examples/ner/bert-base-256/checkpoint-10000' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.txt'] but couldn't find such vocabulary files at this path or url.
Заранее спасибо.