Invalid argument: ValueError: Input contains NaN, infinity or a value too large for dtype('float32') - PullRequest
0 votes
/ March 24, 2020

I am trying to train an LSTM model on my dataset using AUC as the metric. I define this metric as a function that wraps scikit-learn's roc_auc_score function. Here is my code for it:

import tensorflow as tf
from sklearn.metrics import roc_auc_score

def auc_metric(y_true, y_pred):
    # Wrap scikit-learn's roc_auc_score so Keras can call it as a metric.
    return tf.py_function(roc_auc_score, (y_true, y_pred), tf.double)

I got this from this Stack Overflow answer: { link }

And this is my model architecture:

Layer (type)                     Output Shape        Param #    Connected to
input_seq_total_text_data (Inpu  [(None, 400)]       0
Emb_text_data (Embedding)        (None, 400, 100)    4869100    input_seq_total_text_data[0][0]
input_state (InputLayer)         [(None, 1)]         0
input_grade_cat (InputLayer)     [(None, 1)]         0
input_clean_cat (InputLayer)     [(None, 1)]         0
input_clean_subcat (InputLayer)  [(None, 1)]         0
input_prefix (InputLayer)        [(None, 1)]         0
essay_LSTM (LSTM)                (None, 100)         80400      Emb_text_data[0][0]
Emb_state (Embedding)            (None, 1, 13)       676        input_state[0][0]
Emb_grade_cat (Embedding)        (None, 1, 2)        20         input_grade_cat[0][0]
Emb_category (Embedding)         (None, 1, 4)        64         input_clean_cat[0][0]
Emb_clean_subcats (Embedding)    (None, 1, 4)        64         input_clean_subcat[0][0]
Emb_prefix (Embedding)           (None, 1, 1)        6          input_prefix[0][0]
numeric_values (InputLayer)      [(None, 2)]         0
flatten (Flatten)                (None, 100)         0          essay_LSTM[0][0]
flatten_1 (Flatten)              (None, 13)          0          Emb_state[0][0]
flatten_2 (Flatten)              (None, 2)           0          Emb_grade_cat[0][0]
flatten_3 (Flatten)              (None, 4)           0          Emb_category[0][0]
flatten_4 (Flatten)              (None, 4)           0          Emb_clean_subcats[0][0]
flatten_5 (Flatten)              (None, 1)           0          Emb_prefix[0][0]
numeric_dense (Dense)            (None, 4)           12         numeric_values[0][0]
Concat (Concatenate)             (None, 128)         0          flatten[0][0]
                                                                flatten_1[0][0]
                                                                flatten_2[0][0]
                                                                flatten_3[0][0]
                                                                flatten_4[0][0]
                                                                flatten_5[0][0]
                                                                numeric_dense[0][0]
dense_1 (Dense)                  (None, 64)          8256       Concat[0][0]
dropout (Dropout)                (None, 64)          0          dense_1[0][0]
dense_2 (Dense)                  (None, 32)          2080       dropout[0][0]
dense (Dense)                    (None, 1)           33         dense_2[0][0]

Total params: 4,960,711
Trainable params: 91,611
Non-trainable params: 4,869,100


When I fit the model, it runs for one epoch and then throws the following error message:

    Epoch 1/15
53504/53531 [============================>.] - ETA: 0s - loss: nan - auc_metric: 0.4999
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-28-55549e10a2a7> in <module>
      2 model.fit(x = [tokenized_essay_train, tokenized_state_train, tokenized_grade_cat_train,
      3                tokenized_clean_cat_train, tokenized_clean_subcats_train, tokenized_prefix_train, x_train[['price', 'teacher_number_of_previously_posted_projects']]], y = y_train, batch_size = batch_size, epochs = epochs, verbose = 1,
----> 4           validation_split= 0.3)

~/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
    778           validation_steps=validation_steps,
    779           validation_freq=validation_freq,
--> 780           steps_name='steps_per_epoch')
    781 
    782   def evaluate(self,

~/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/keras/engine/training_arrays.py in model_iteration(model, inputs, targets, sample_weights, batch_size, epochs, verbose, callbacks, val_inputs, val_targets, val_sample_weights, shuffle, initial_epoch, steps_per_epoch, validation_steps, validation_freq, mode, validation_in_fit, prepared_feed_values_from_dataset, steps_name, **kwargs)
    361 
    362         # Get outputs.
--> 363         batch_outs = f(ins_batch)
    364         if not isinstance(batch_outs, list):
    365           batch_outs = [batch_outs]

~/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/keras/backend.py in __call__(self, inputs)
   3290 
   3291     fetched = self._callable_fn(*array_vals,
-> 3292                                 run_metadata=self.run_metadata)
   3293     self._call_fetch_callbacks(fetched[-len(self._fetches):])
   3294     output_structure = nest.pack_sequence_as(

~/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
   1456         ret = tf_session.TF_SessionRunCallable(self._session._session,
   1457                                                self._handle, args,
-> 1458                                                run_metadata_ptr)
   1459         if run_metadata:
   1460           proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
Traceback (most recent call last):

  File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/ops/script_ops.py", line 207, in __call__
    return func(device, token, args)

  File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/ops/script_ops.py", line 109, in __call__
    ret = self._func(*args)

  File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/metrics/_ranking.py", line 369, in roc_auc_score
    y_score = check_array(y_score, ensure_2d=False)

  File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/utils/validation.py", line 578, in check_array
    allow_nan=force_all_finite == 'allow-nan')

  File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/utils/validation.py", line 60, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').


     [[{{node metrics/auc_metric/EagerPyFunc}}]]
     [[metrics/auc_metric/Identity/_195]]
  (1) Invalid argument: ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
Traceback (most recent call last):

  File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/ops/script_ops.py", line 207, in __call__
    return func(device, token, args)

  File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/tensorflow/python/ops/script_ops.py", line 109, in __call__
    ret = self._func(*args)

  File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/metrics/_ranking.py", line 369, in roc_auc_score
    y_score = check_array(y_score, ensure_2d=False)

  File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/utils/validation.py", line 578, in check_array
    allow_nan=force_all_finite == 'allow-nan')

  File "/home/aman/anaconda3/envs/deep_learning/lib/python3.7/site-packages/sklearn/utils/validation.py", line 60, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').


     [[{{node metrics/auc_metric/EagerPyFunc}}]]
0 successful operations.
0 derived errors ignored.

I have checked my dataset as well as all the embeddings I created, and there are no missing or NaN values and no infinite values. All inputs and outputs of every layer are float32.
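For reference, the kind of finiteness check described above can be sketched as follows. This is a minimal, dependency-free illustration and not the exact code that was run on the dataset: the helper names `count_non_finite` and `report_non_finite` are hypothetical, and the toy values only stand in for the real columns. Because the progress line in the log reports `loss: nan`, it may also be worth running the same check on the model's predictions, since those are what the metric function receives as `y_pred`.

```python
import math


def count_non_finite(values):
    """Count entries of a flat iterable of numbers that are NaN or +/-inf."""
    return sum(1 for v in values if not math.isfinite(v))


def report_non_finite(named_columns):
    """Map each column name to its number of non-finite entries.

    Columns that contain only finite values are left out of the report.
    """
    report = {}
    for name, values in named_columns.items():
        bad = count_non_finite(values)
        if bad:
            report[name] = bad
    return report


# Hypothetical toy data standing in for the real features and predictions.
columns = {
    "price": [12.5, 300.0, 47.99],
    "teacher_number_of_previously_posted_projects": [0.0, 3.0, 11.0],
    "predictions": [0.42, float("nan"), float("inf")],
}

print(report_non_finite(columns))
```

An empty report would mean every checked column is finite. A non-empty entry for the predictions would suggest that the NaN comes from training itself rather than from the input data.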
