Using Tokenizer with TextLineDataset#map
April 22, 2020

I'm trying to use TextLineDataset and Tokenizer together, but I run into problems when combining them through #map. I get an error saying that a Tensor cannot be iterated over. I think I understand that #texts_to_sequences is being run against the graph representation rather than against actual strings. Is that correct? Is it even possible to combine these two pieces? Does it make sense to combine them?
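One way to check that suspicion is to look at the mode in which `Dataset.map` runs the mapped function. The sketch below is not part of the original example: `inspect_element` and `trace_log` are hypothetical names, and it assumes TensorFlow 2.x, where `map` traces the function into a graph and passes it a symbolic tensor rather than a Python string.

```python
import tensorflow as tf

# Hypothetical log, filled in while Dataset.map traces the function.
trace_log = []

def inspect_element(sentence):
    # This body runs at trace time; record whether eager mode is active here.
    trace_log.append(tf.executing_eagerly())
    return sentence

data = tf.data.Dataset.from_tensor_slices(["no alarms", "and no surprises"])
mapped = data.map(inspect_element)

# The recorded values are False: the function body ran in graph mode,
# so plain Python iteration over `sentence` is not available there.
print(trace_log)
```

That would be consistent with the error: `texts_to_sequences` runs a Python `for` loop over its argument, which a graph-mode tensor does not support.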

from keras.preprocessing.text import Tokenizer
import tensorflow as tf

# Example used from
# https://stackoverflow.com/a/51203923/998092

docs = ["A heart that",
        "full up like",
        "a landfill",
        "no surprises",
        "and no alarms",
        "a job that slowly",
        "Bruises that",
        "You look so",
        "tired happy",
        "no alarms",
        "and no surprises"]

T = Tokenizer()
T.fit_on_texts(docs)

def encode(sentence):
  return T.texts_to_sequences(sentence)

data = tf.data.TextLineDataset.from_tensor_slices(docs)
encoded_data = data.map(encode)

print("result for test 1:\n%s" %(data))

Minimal example in Colab

This results in:

WARNING:tensorflow:Entity <bound method Tokenizer.texts_to_sequences_generator of <keras_preprocessing.text.Tokenizer object at 0x7f9089f5de48>> appears to be a generator function. It will not be converted by AutoGraph.
WARNING: Entity <bound method Tokenizer.texts_to_sequences_generator of <keras_preprocessing.text.Tokenizer object at 0x7f9089f5de48>> appears to be a generator function. It will not be converted by AutoGraph.

---------------------------------------------------------------------------

OperatorNotAllowedInGraphError            Traceback (most recent call last)

<ipython-input-8-46695a877229> in <module>()
     25 
     26 data = tf.data.TextLineDataset.from_tensor_slices(docs)
---> 27 encoded_data = data.map(encode)
     28 
     29 print("result for test 1:\n%s" %(data))

10 frames

/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py in wrapper(*args, **kwargs)
    263       except Exception as e:  # pylint:disable=broad-except
    264         if hasattr(e, 'ag_error_metadata'):
--> 265           raise e.ag_error_metadata.to_exception(e)
    266         else:
    267           raise

OperatorNotAllowedInGraphError: in user code:

    <ipython-input-8-46695a877229>:24 encode  *
        return T.texts_to_sequences(sentence)
    /usr/local/lib/python3.6/dist-packages/keras_preprocessing/text.py:279 texts_to_sequences  *
        return list(self.texts_to_sequences_generator(texts))
    /usr/local/lib/python3.6/dist-packages/keras_preprocessing/text.py:298 texts_to_sequences_generator  **
        for text in texts:
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:561 __iter__
        self._disallow_iteration()
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:557 _disallow_iteration
        self._disallow_in_graph_mode("iterating over `tf.Tensor`")
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:537 _disallow_in_graph_mode
        " this function with @tf.function.".format(task))

    OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
...
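A possible workaround, sketched under the assumption of TensorFlow 2.x with the legacy Keras `Tokenizer` still reachable as `tf.keras.preprocessing.text.Tokenizer`: wrap the tokenizer call in `tf.py_function`, so that it runs eagerly on a concrete value instead of on the symbolic tensor that `map` provides. The helper names `encode_eager` and `encode` and the shortened `docs` list are illustrative and not taken from the original post.

```python
import tensorflow as tf

# Assumption: the legacy Keras Tokenizer is available under tf.keras.
Tokenizer = tf.keras.preprocessing.text.Tokenizer

docs = ["no alarms", "and no surprises"]

tokenizer = Tokenizer()
tokenizer.fit_on_texts(docs)

def encode_eager(sentence):
    # Inside tf.py_function the argument is an eager tensor, so .numpy() works.
    text = sentence.numpy().decode("utf-8")
    # texts_to_sequences expects a list of texts; take the single result.
    ids = tokenizer.texts_to_sequences([text])[0]
    return tf.constant(ids, dtype=tf.int64)

def encode(sentence):
    # Bridge from the traced graph to Python code executed eagerly.
    ids = tf.py_function(func=encode_eager, inp=[sentence], Tout=tf.int64)
    # py_function loses static shape information; declare a 1-D result.
    ids.set_shape([None])
    return ids

data = tf.data.Dataset.from_tensor_slices(docs)
encoded_data = data.map(encode)

# Collect the variable-length id sequences as plain Python lists.
encoded = [ids.numpy().tolist() for ids in encoded_data]
print(encoded)
```

The trade-off is that `tf.py_function` executes in the Python interpreter, so this step is not serialized with the graph and does not parallelize as well as native ops. When the corpus fits in memory, tokenizing and padding the texts up front and only then building the dataset may be the simpler route.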