I am trying to train an LSTM model on my dataset using AUC as the metric. I define this metric as a function that wraps scikit-learn's roc_auc_score function. Here is my code for it:
import tensorflow as tf
from sklearn.metrics import roc_auc_score

def auc_metric(y_true, y_pred):
    # Wrap scikit-learn's roc_auc_score so Keras can call it as a metric.
    return tf.py_function(roc_auc_score, (y_true, y_pred), tf.double)
I got this from this Stack Overflow answer: { link }
And this is my model architecture:
Layer (type)                    Output Shape         Param #     Connected to
input_seq_total_text_data (Inpu [(None, 400)]        0
Emb_text_data (Embedding)       (None, 400, 100)     4869100     input_seq_total_text_data[0][0]
input_state (InputLayer)        [(None, 1)]          0
input_grade_cat (InputLayer)    [(None, 1)]          0
input_clean_cat (InputLayer)    [(None, 1)]          0
input_clean_subcat (InputLayer) [(None, 1)]          0
input_prefix (InputLayer)       [(None, 1)]          0
essay_LSTM (LSTM)               (None, 100)          80400       Emb_text_data[0][0]
Emb_state (Embedding)           (None, 1, 13)        676         input_state[0][0]
Emb_grade_cat (Embedding)       (None, 1, 2)         20          input_grade_cat[0][0]
Emb_category (Embedding)        (None, 1, 4)         64          input_clean_cat[0][0]
Emb_clean_subcats (Embedding)   (None, 1, 4)         64          input_clean_subcat[0][0]
Emb_prefix (Embedding)          (None, 1, 1)         6           input_prefix[0][0]
numeric_values (InputLayer)     [(None, 2)]          0
flatten (Flatten)               (None, 100)          0           essay_LSTM[0][0]
flatten_1 (Flatten)             (None, 13)           0           Emb_state[0][0]
flatten_2 (Flatten)             (None, 2)            0           Emb_grade_cat[0][0]
flatten_3 (Flatten)             (None, 4)            0           Emb_category[0][0]
flatten_4 (Flatten)             (None, 4)            0           Emb_clean_subcats[0][0]
flatten_5 (Flatten)             (None, 1)            0           Emb_prefix[0][0]
numeric_dense (Dense)           (None, 4)            12          numeric_values[0][0]
Concat (Concatenate)            (None, 128)          0           flatten[0][0]
                                                                 flatten_1[0][0]
                                                                 flatten_2[0][0]
                                                                 flatten_3[0][0]
                                                                 flatten_4[0][0]
                                                                 flatten_5[0][0]
                                                                 numeric_dense[0][0]
dense_1 (Dense)                 (None, 64)           8256        Concat[0][0]
dropout (Dropout)               (None, 64)           0           dense_1[0][0]
dense_2 (Dense)                 (None, 32)           2080        dropout[0][0]
dense (Dense)                   (None, 1)            33          dense_2[0][0]
Total params: 4,960,711
Trainable params: 91,611
Non-trainable params: 4,869,100
When I fit the model, it runs for one epoch and then throws the following error:
Epoch 1/15
53504/53531 [============================>.] - ETA: 0s - loss: nan - auc_metric: 0.4999
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-28-55549e10a2a7> in <module>
2 model.fit(x = [tokenized_essay_train, tokenized_state_train, tokenized_grade_cat_train,
3 tokenized_clean_cat_train, tokenized_clean_subcats_train, tokenized_prefix_train, x_train[['price', 'teacher_number_of_previously_posted_projects']]], y = y_train, batch_size = batch_size, epochs = epochs, verbose = 1,
----> 4 validation_split= 0.3)
~/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
778 validation_steps=validation_steps,
779 validation_freq=validation_freq,
--> 780 steps_name='steps_per_epoch')
781
782 def evaluate(self,
~/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/keras/engine/training_arrays.py in model_iteration(model, inputs, targets, sample_weights, batch_size, epochs, verbose, callbacks, val_inputs, val_targets, val_sample_weights, shuffle, initial_epoch, steps_per_epoch, validation_steps, validation_freq, mode, validation_in_fit, prepared_feed_values_from_dataset, steps_name, **kwargs)
361
362 # Get outputs.
--> 363 batch_outs = f(ins_batch)
364 if not isinstance(batch_outs, list):
365 batch_outs = [batch_outs]
~/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/keras/backend.py in __call__(self, inputs)
3290
3291 fetched = self._callable_fn(*array_vals,
-> 3292 run_metadata=self.run_metadata)
3293 self._call_fetch_callbacks(fetched[-len(self._fetches):])
3294 output_structure = nest.pack_sequence_as(
~/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
1456 ret = tf_session.TF_SessionRunCallable(self._session._session,
1457 self._handle, args,
-> 1458 run_metadata_ptr)
1459 if run_metadata:
1460 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
Traceback (most recent call last):
File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/ops/script_ops.py", line 207, in __call__
return func(device, token, args)
File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/ops/script_ops.py", line 109, in __call__
ret = self._func(*args)
File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/metrics/_ranking.py", line 369, in roc_auc_score
y_score = check_array(y_score, ensure_2d=False)
File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/utils/validation.py", line 578, in check_array
allow_nan=force_all_finite == 'allow-nan')
File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/utils/validation.py", line 60, in _assert_all_finite
msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
[[{{node metrics/auc_metric/EagerPyFunc}}]]
[[metrics/auc_metric/Identity/_195]]
(1) Invalid argument: ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
Traceback (most recent call last):
File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/ops/script_ops.py", line 207, in __call__
return func(device, token, args)
File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/ops/script_ops.py", line 109, in __call__
ret = self._func(*args)
File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/metrics/_ranking.py", line 369, in roc_auc_score
y_score = check_array(y_score, ensure_2d=False)
File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/utils/validation.py", line 578, in check_array
allow_nan=force_all_finite == 'allow-nan')
File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/utils/validation.py", line 60, in _assert_all_finite
msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
[[{{node metrics/auc_metric/EagerPyFunc}}]]
0 successful operations.
0 derived errors ignored.
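For reference, the traceback ends inside the input validation of roc_auc_score. A minimal sketch, independent of my model and using made-up toy arrays (y_true and y_score here are assumptions, not my real data), that triggers the same kind of ValueError when a score is NaN:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy labels and scores; one score is deliberately NaN.
y_true = np.array([0, 1, 0, 1])
y_score = np.array([0.1, 0.9, np.nan, 0.4], dtype=np.float32)

try:
    roc_auc_score(y_true, y_score)
    raised = False
except ValueError as err:
    # scikit-learn rejects non-finite inputs before computing the AUC.
    raised = True
    print("ValueError:", err)
```

The exact wording of the message may differ between scikit-learn versions.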
I have checked my dataset and all the embeddings I built: there are no missing or NaN values and no infinite values, and the inputs and outputs of every layer are float32.
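The kind of check described above can be sketched roughly as follows. This is only an illustration: report_non_finite is a hypothetical helper name, and the toy arrays stand in for my real inputs (for example the price column and the labels).

```python
import numpy as np

def report_non_finite(named_arrays):
    """Return {name: number of NaN/inf entries} for each array (hypothetical helper)."""
    report = {}
    for name, values in named_arrays.items():
        arr = np.asarray(values, dtype=np.float64)
        # np.isfinite is False for NaN, +inf and -inf.
        report[name] = int(np.count_nonzero(~np.isfinite(arr)))
    return report

# Toy stand-ins; the real input arrays would be passed in the same way.
example = {
    "price": np.array([1.0, 2.5, np.nan]),
    "labels": np.array([0, 1, 1]),
}
result = report_non_finite(example)
print(result)
```

A check like this only covers the data fed to the model; it says nothing about values produced inside the network during training.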