I'm trying to train a Transformer network using Tensor2Tensor. I'm adapting the Cloud Poetry example to fit my own problem, kt_problem, in which I map floating-point sequences to floating-point sequences instead of sentences to sentences.
I adapted the generate_data() and generate_samples() functions according to the scattered specifications for using your own data with tensor2tensor (e.g. the data generation README, line 174 of the Problem class, etc.). They are as follows:
import numpy as np
import pandas as pd

from tensor2tensor.data_generators import generator_utils


def generate_samples(self, data_dir, tmp_dir, train):
    # Each sample maps one row of float features to one row of float targets.
    features = pd.read_csv("data/kt/features.csv", dtype=np.float64)
    targets = pd.read_csv("data/kt/targets.csv", dtype=np.float64)
    for i in range(len(features) - 1):
        yield {
            "inputs": list(features.iloc[i]),
            "targets": list(targets.iloc[i]),
        }


def generate_data(self, data_dir, tmp_dir, task_id=-1):
    # Serialize the samples into sharded, shuffled train/dev files.
    generator_utils.generate_dataset_and_shuffle(
        self.generate_samples(data_dir, tmp_dir, 1),
        self.training_filepaths(data_dir, 4, False),
        self.generate_samples(data_dir, tmp_dir, 0),
        self.dev_filepaths(data_dir, 3, False))
both defined in my KTProblem class.
After making this change, I can successfully run
%%bash
PROBLEM='kt_problem'   # my own problem, for which I've defined a class
DATA_DIR=./t2t_data
TMP_DIR=$DATA_DIR/tmp
t2t-datagen \
  --t2t_usr_dir=./kt/trainer \
  --problem=$PROBLEM \
  --data_dir=$DATA_DIR \
  --tmp_dir=$TMP_DIR
and it generates a bunch of train and dev files. But when I try to train the transformer with this code,
%%bash
PROBLEM='kt_problem'
DATA_DIR=./t2t_data
OUTDIR=./trained_model
t2t-trainer \
  --data_dir=$DATA_DIR \
  --t2t_usr_dir=./kt/trainer \
  --problem=$PROBLEM \
  --model=transformer \
  --hparams_set=transformer_kt \
  --output_dir=$OUTDIR \
  --job-dir=$OUTDIR \
  --train_steps=10
it throws the following error:
ValueError: x has to be a floating point tensor since it's going to be scaled. Got a <dtype: 'int32'> tensor instead.
As you can see in generate_samples(), the generated data are np.float64, so I'm confident my inputs should not be int32.
The stack trace (posted below) is very long, and I've gone through each of the listed lines, checking the type of the inputs to see where this int32 input comes into the picture, but I can't find it. I want to know (1) why, if my inputs are floats, how and where they become ints, but mainly (2) how, in general, does one debug code like this? So far my approach has been to put print statements right before each line in the stack trace, but that feels like a very naive way to debug. Would it be better to use VS Code, or what is the lesson I need to learn here for the case where a library (tensor2tensor, in this case) doesn't behave the way I expect, but I don't want to have to learn intimately what every function in the stack trace does?
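For instance, would post-mortem debugging be the more standard approach here, i.e. calling the trainer from Python and dropping straight into the frame that raised the error instead of instrumenting every level of the stack with prints? A rough sketch of what I have in mind (the flag values just mirror my bash cell above; I haven't verified this is the recommended way to invoke t2t_trainer programmatically):

import pdb
import sys

import tensorflow as tf
from tensor2tensor.bin import t2t_trainer

# Mirror the flags from the t2t-trainer invocation above.
sys.argv = [
    "t2t-trainer",
    "--data_dir=./t2t_data",
    "--t2t_usr_dir=./kt/trainer",
    "--problem=kt_problem",
    "--model=transformer",
    "--hparams_set=transformer_kt",
    "--output_dir=./trained_model",
    "--train_steps=10",
]

try:
    tf.app.run(t2t_trainer.main)
except Exception:
    # Drops into the frame where the ValueError was raised, so I can
    # inspect x, x.dtype, hparams, etc. interactively.
    pdb.post_mortem()

Is something like that what experienced users actually do, or is there a better workflow?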
Stack trace:
INFO:tensorflow:Importing user module trainer from path /home/crytting/kt/kt
WARNING:tensorflow:From /home/crytting/kt/tensor2tensor/tensor2tensor/utils/trainer_lib.py:240: RunConfig.__init__ (from tensorflow.contrib.learn.python.learn.estimators.run_config) is deprecated and will be removed in a future version.
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.
INFO:tensorflow:Configuring DataParallelism to replicate the model.
INFO:tensorflow:schedule=continuous_train_and_eval
INFO:tensorflow:worker_gpu=1
INFO:tensorflow:sync=False
WARNING:tensorflow:Schedule=continuous_train_and_eval. Assuming that training is running on a single machine.
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:ps_devices: ['gpu:0']
INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f04151caba8>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_train_distribute': None, '_eval_distribute': None, '_device_fn': None, '_tf_config': gpu_options {
per_process_gpu_memory_fraction: 1.0
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': None, '_log_step_count_steps': 100, '_protocol': None, '_session_config': gpu_options {
per_process_gpu_memory_fraction: 0.95
}
allow_soft_placement: true
graph_options {
optimizer_options {
global_jit_level: OFF
}
}
isolate_session_state: true
, '_save_checkpoints_steps': 1000, '_keep_checkpoint_max': 20, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': './trained_model', 'use_tpu': False, 't2t_device_info': {'num_async_replicas': 1}, 'data_parallelism': <tensor2tensor.utils.expert_utils.Parallelism object at 0x7f0464512dd8>}
WARNING:tensorflow:Estimator's model_fn (<function T2TModel.make_estimator_model_fn.<locals>.wrapping_model_fn at 0x7f0414891e18>) includes params argument, but params are not passed to Estimator.
WARNING:tensorflow:ValidationMonitor only works with --schedule=train_and_evaluate
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps 1000 or save_checkpoints_secs None.
WARNING:tensorflow:From /home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
INFO:tensorflow:Reading data files from ./t2t_data/kt_problem-train*
INFO:tensorflow:partition: 0 num_data_files: 4
WARNING:tensorflow:From /home/crytting/kt/tensor2tensor/tensor2tensor/utils/data_reader.py:275: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and:
`tf.data.TFRecordDataset(path)`
WARNING:tensorflow:From /home/crytting/kt/tensor2tensor/tensor2tensor/utils/data_reader.py:37: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:Shapes are not fully defined. Assuming batch_size means tokens.
WARNING:tensorflow:From /home/crytting/kt/tensor2tensor/tensor2tensor/utils/data_reader.py:233: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'train'
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Building model body
WARNING:tensorflow:From /home/crytting/kt/tensor2tensor/tensor2tensor/models/transformer.py:156: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Traceback (most recent call last):
File "/home/crytting/anaconda3/envs/kt/bin/t2t-trainer", line 33, in <module>
tf.app.run()
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/home/crytting/anaconda3/envs/kt/bin/t2t-trainer", line 28, in main
t2t_trainer.main(argv)
File "/home/crytting/kt/tensor2tensor/tensor2tensor/bin/t2t_trainer.py", line 400, in main
execute_schedule(exp)
File "/home/crytting/kt/tensor2tensor/tensor2tensor/bin/t2t_trainer.py", line 356, in execute_schedule
getattr(exp, FLAGS.schedule)()
File "/home/crytting/kt/tensor2tensor/tensor2tensor/utils/trainer_lib.py", line 400, in continuous_train_and_eval
self._eval_spec)
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 471, in train_and_evaluate
return executor.run()
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 611, in run
return self.run_local()
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 712, in run_local
saving_listeners=saving_listeners)
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1155, in _train_model_default
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1112, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/home/crytting/kt/tensor2tensor/tensor2tensor/utils/t2t_model.py", line 1414, in wrapping_model_fn
use_tpu=use_tpu)
File "/home/crytting/kt/tensor2tensor/tensor2tensor/utils/t2t_model.py", line 1477, in estimator_model_fn
logits, losses_dict = model(features) # pylint: disable=not-callable
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow/python/layers/base.py", line 530, in __call__
outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 554, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/home/crytting/kt/tensor2tensor/tensor2tensor/utils/t2t_model.py", line 323, in call
sharded_logits, losses = self.model_fn_sharded(sharded_features)
File "/home/crytting/kt/tensor2tensor/tensor2tensor/utils/t2t_model.py", line 400, in model_fn_sharded
sharded_logits, sharded_losses = dp(self.model_fn, datashard_to_features)
File "/home/crytting/kt/tensor2tensor/tensor2tensor/utils/expert_utils.py", line 231, in __call__
outputs.append(fns[i](*my_args[i], **my_kwargs[i]))
File "/home/crytting/kt/tensor2tensor/tensor2tensor/utils/t2t_model.py", line 428, in model_fn
body_out = self.body(transformed_features)
File "/home/crytting/kt/tensor2tensor/tensor2tensor/models/transformer.py", line 280, in body
**decode_kwargs
File "/home/crytting/kt/tensor2tensor/tensor2tensor/models/transformer.py", line 217, in decode
**kwargs)
File "/home/crytting/kt/tensor2tensor/tensor2tensor/models/transformer.py", line 156, in transformer_decode
1.0 - hparams.layer_prepostprocess_dropout)
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow/python/ops/nn_ops.py", line 2979, in dropout
return dropout_v2(x, rate, noise_shape=noise_shape, seed=seed, name=name)
File "/home/crytting/anaconda3/envs/kt/lib/python3.7/site-packages/tensorflow/python/ops/nn_ops.py", line 3021, in dropout_v2
" be scaled. Got a %s tensor instead." % x.dtype)
ValueError: x has to be a floating point tensor since it's going to be scaled. Got a <dtype: 'int32'> tensor instead.