RuntimeError: Expected hidden[0] size (2, 10, 100), got (2, 5, 100); terminate called without an active exception

I'm running this in Dask (Python) and only get the error below when I use a large dataset. Searching for answers, I can't find anything related to Dask, and the solutions don't seem to apply to my problem. I've seen several answers suggesting to set batch_first=True when initializing the LSTM, but I don't know how to do that, since I'm not using PyTorch directly.
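
For what it's worth, here is my understanding of what those answers mean in plain PyTorch. batch_first=True is an argument of torch.nn.LSTM that changes the expected input/output layout from (seq_len, batch, features) to (batch, seq_len, features); it has to be set when the LSTM is constructed, which in my case happens inside stanfordnlp's seq2seq model rather than in my own code. Illustrative sketch only (the sizes mirror the error message, not stanfordnlp's real configuration):

import torch
import torch.nn as nn

# batch_first=True: input/output tensors are (batch, seq_len, features)
lstm = nn.LSTM(input_size=50, hidden_size=100, num_layers=2, batch_first=True)

x = torch.randn(10, 7, 50)      # 10 sequences of length 7 with 50 features each
h0 = torch.zeros(2, 10, 100)    # hidden state is (num_layers, batch, hidden_size)
c0 = torch.zeros(2, 10, 100)    # regardless of batch_first
out, (hn, cn) = lstm(x, (h0, c0))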

import dask.dataframe as dd
from dask.diagnostics import ProgressBar
import stanfordnlp

# stanfordnlp pipeline: tokenizer, multi-word token expansion, lemmatizer, POS tagger
nlp = stanfordnlp.Pipeline(processors='tokenize,mwt,lemma,pos', lang='en')

# df and column are defined earlier in the script (not shown here)
ddf = dd.from_pandas(df, npartitions=4)

# run the pipeline on every text in the column
ddf['tokens'] = ddf[column].apply(lambda text: nlp(text),
                                  meta=(column, 'object'))
with ProgressBar():
    df = ddf.compute()
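
If it helps narrow things down, the same computation can also be run on Dask's single-threaded scheduler, so that only one call into the stanfordnlp pipeline is ever active at a time (a diagnostic sketch; scheduler='synchronous' is a standard compute() option, the rest is the code above):

with ProgressBar():
    # execute every partition in the main thread, one task at a time,
    # instead of Dask's default thread pool
    df = ddf.compute(scheduler='synchronous')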

The error:

Traceback (most recent call last):
  File "posfinder.py", line 137, in <module>
    POS_TAGGED = find_pos(DATA, COLUMN, WANTED_POS)
  File "posfinder.py", line 36, in find_pos
    df = ddf.compute()
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/base.py", line 175, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/base.py", line 446, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/threaded.py", line 82, in get
    **kwargs
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/local.py", line 491, in get_async
    raise_exception(exc, tb)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/compatibility.py", line 130, in reraise
    raise exc
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/local.py", line 233, in execute_task
    result = _execute_task(task, data)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/core.py", line 119, in _execute_task
    return func(*args2)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/optimization.py", line 1059, in __call__
    return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/core.py", line 149, in get
    result = _execute_task(task, cache)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/core.py", line 119, in _execute_task
    return func(*args2)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/compatibility.py", line 107, in apply
    return func(*args, **kwargs)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/dataframe/core.py", line 4826, in apply_and_enforce
    df = func(*args, **kwargs)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/dask/utils.py", line 854, in __call__
    return getattr(obj, self.method)(*args, **kwargs)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/pandas/core/series.py", line 3591, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/_libs/lib.pyx", line 2217, in pandas._libs.lib.map_infer
  File "posfinder.py", line 33, in <lambda>
    ddf['tokens'] = ddf['Message'].apply(lambda text: nlp(text),
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/stanfordnlp/pipeline/core.py", line 176, in __call__
    self.process(doc)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/stanfordnlp/pipeline/core.py", line 170, in process
    self.processors[processor_name].process(doc)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/stanfordnlp/pipeline/lemma_processor.py", line 66, in process
    ps, es = self.trainer.predict(b, self.config['beam_size'])
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/stanfordnlp/models/lemma/trainer.py", line 88, in predict
    preds, edit_logits = self.model.predict(src, src_mask, pos=pos, beam_size=beam_size)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/stanfordnlp/models/common/seq2seq_model.py", line 172, in predict
    h_in, (hn, cn) = self.encode(enc_inputs, src_lens)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/stanfordnlp/models/common/seq2seq_model.py", line 116, in encode
    packed_h_in, (hn, cn) = self.encoder(packed_inputs, (self.h0, self.c0))
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 562, in forward
    return self.forward_packed(input, hx)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 554, in forward_packed
    output, hidden = self.forward_impl(input, hx, batch_sizes, max_batch_size, sorted_indices)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 523, in forward_impl
    self.check_forward_args(input, hx, batch_sizes)
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 500, in check_forward_args
    'Expected hidden[0] size {}, got {}')
  File "/home/bertil/anaconda3/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 166, in check_hidden_size
    raise RuntimeError(msg.format(expected_hidden_size, tuple(hx.size())))
RuntimeError: Expected hidden[0] size (2, 10, 100), got (2, 5, 100)
terminate called without an active exception
Aborted (core dumped)
...