Keras experimental TextVectorization layer raises an error on the tensor shape
/ May 5, 2020

Good morning. I am trying to use the Keras TextVectorization layer and, following the code from the examples, I get an "index out of range" error when the shape of the tensor is checked. The shape of my tensor is exactly the same as in the example; the only difference is that my dataset is a TensorSliceDataset rather than a DatasetV1Adapter. The error is raised by the adapt method, and you can find the Colab notebook here. This is the traceback:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-15-d75b8891387b> in <module>()
----> 1 model = build_model(txt_ds)
      2 model.fit(training_dataset,
      3                       batch_size=10,
      4                       epochs=10)
      5 #trained = train_model(model, training_dataset, 10)

3 frames
<ipython-input-8-58a05e2a5fa5> in build_model(text_dataset)
     13     # dataset to create the vocabulary. You don't have to batch, but for large
     14     # datasets this means we're not keeping spare copies of the dataset in memory.
---> 15     vectorize_layer.adapt(text_dataset)
     16     # The first layer in our model is the vectorization layer. After this layer,
     17     # we have a tensor of shape (batch_size, features) containing TF-IDF features.

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/preprocessing/text_vectorization.py in adapt(self, data, reset_state)
    396       if shape.rank == 1:
    397         data = data.map(lambda tensor: array_ops.expand_dims(tensor, -1))
--> 398       self.build(dataset_ops.get_legacy_output_shapes(data))
    399       preprocessed_inputs = data.map(self._preprocess)
    400     else:

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/preprocessing/text_vectorization.py in build(self, input_shape)
    526     # expression to evaluate to False instead of True if the shape is undefined;
    527     # the expression needs to evaluate to True in that case.
--> 528     if self._split is not None and not input_shape[1] == 1:  # pylint: disable=g-comparison-negation
    529       raise RuntimeError(
    530           "When using TextVectorization to tokenize strings, the first "

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_shape.py in __getitem__(self, key)
    868       else:
    869         if self._v2_behavior:
--> 870           return self._dims[key].value
    871         else:
    872           return self._dims[key]

IndexError: list index out of range

Thanks!
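
A possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis: adapt only adds a trailing axis when the element shape has rank 1, and build then indexes input_shape[1]. If each dataset element is a scalar string (rank 0), which is what tf.data.Dataset.from_tensor_slices yields for a flat list of strings, the shape has no dimension 1 and the lookup fails. The model below uses plain tuples in place of TensorShape and does not import TensorFlow; the function names adapt_shape and build_check are hypothetical, and the assumption that the notebook's dataset is unbatched is not verified against the notebook.

```python
# Pure-Python model of the shape handling visible in the traceback.
# Tuples stand in for TensorShape; None stands for an unknown batch dimension.
# adapt_shape and build_check are hypothetical names, not TensorFlow APIs.

def adapt_shape(element_shape):
    """Mimic lines 396-398 of adapt(): only rank-1 shapes get a trailing axis."""
    if len(element_shape) == 1:
        element_shape = element_shape + (1,)  # like expand_dims(tensor, -1)
    return element_shape

def build_check(input_shape, split=True):
    """Mimic line 528 of build(): True means the RuntimeError branch is taken."""
    return split and not input_shape[1] == 1

unbatched = ()      # assumed: each element is a scalar string (rank 0)
batched = (None,)   # assumed: element shape after calling .batch(n)

try:
    build_check(adapt_shape(unbatched))
    unbatched_result = "ok"
except IndexError as exc:
    # A rank-0 shape has no index 1, matching the failure in the traceback.
    unbatched_result = "IndexError: " + str(exc)

# (None,) becomes (None, 1), so the check passes without raising.
batched_result = build_check(adapt_shape(batched))

print(unbatched_result)
print(batched_result)
```

If this reading applies, batching the dataset before the call, for example vectorize_layer.adapt(text_dataset.batch(64)) with an arbitrary batch size, would give adapt a rank-1 element shape to work with.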

...