The following code ran without errors on the PyTorch nightly build (1.5.0.dev20200206), but after I installed the stable 1.5 build, the RNN forward method defined below started raising an error:
def forward(self, sequence):
    print('Sequence shape:', sequence.shape)
    sequence = sequence.clone().view(len(sequence), 1, -1)
    print("flattened shape: ", sequence.shape)
    lstm_out, hidden = self.lstm(sequence, self.hidden)
    print(lstm_out.shape)
    out_space = self.hidden2out(lstm_out[:, -1])
    self.hidden = hidden
    print("hiddens")
    print(hidden[0].shape)
    print(hidden[1].shape)
    print(" out_space: ", out_space.shape)
    out_scores = torch.sigmoid(out_space)
    print("out_scores: ", out_scores.shape)
    out = out_scores.squeeze()
    print(out.shape)
    return out
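For reference, the shapes printed below imply roughly the following layer definitions (my reconstruction, not the actual code; that is in the gist linked at the bottom):

import torch
import torch.nn as nn

# Hypothetical reconstruction inferred from the shapes printed below:
# the input [200, 19, 62] is flattened to [200, 1, 19*62=1178],
# lstm_out is [200, 1, 8] and both hidden states are [1, 1, 8],
# so hidden_size=8 with a single layer; out_space [200, 1] implies Linear(8, 1).
class Net(nn.Module):  # class name is a placeholder
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=19 * 62, hidden_size=8, num_layers=1)
        self.hidden2out = nn.Linear(8, 1)
        self.hidden = (torch.zeros(1, 1, 8), torch.zeros(1, 1, 8))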
I added the clone() call to keep view() from mutating memory in place, and switched to explicitly distinct (non-aliasing) variable assignments. However, I still get the following error:
Sequence shape: torch.Size([200, 19, 62])
flattened shape: torch.Size([200, 1, 1178])
torch.Size([200, 1, 8])
hiddens
torch.Size([1, 1, 8])
torch.Size([1, 1, 8])
out_space: torch.Size([200, 1])
out_scores: torch.Size([200, 1])
torch.Size([200])
Warning: Error detected in AddmmBackward. Traceback of forward call that caused the error:
  File "main.py", line 240, in <module>
    main_loop(args)
  File "main.py", line 115, in main_loop
    train.run(args)
  File "/data/learnedbloomfilter/python/classifier/train.py", line 519, in run
    args.log_every,
  File "/data/learnedbloomfilter/python/classifier/train.py", line 88, in train
    predictions = model(features)
  File "/data/miniconda3/envs/lbf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/learnedbloomfilter/python/classifier/embedding_lstm.py", line 65, in forward
    sequence, self.hidden
  File "/data/miniconda3/envs/lbf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/miniconda3/envs/lbf/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 570, in forward
    self.dropout, self.training, self.bidirectional, self.batch_first)
(print_stack at /opt/conda/conda-bld/pytorch_1587428190859/work/torch/csrc/autograd/python_anomaly_mode.cpp:60)
Traceback (most recent call last):
  File "main.py", line 240, in <module>
    main_loop(args)
  File "main.py", line 115, in main_loop
    train.run(args)
  File "/data/learnedbloomfilter/python/classifier/train.py", line 519, in run
    args.log_every,
  File "/data/learnedbloomfilter/python/classifier/train.py", line 97, in train
    loss.backward(retain_graph=True)
  File "/data/miniconda3/envs/lbf/lib/python3.7/site-packages/torch/tensor.py", line 198, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/data/miniconda3/envs/lbf/lib/python3.7/site-packages/torch/autograd/__init__.py", line 100, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [8, 32]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
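To decode the "is at version 2; expected version 1" part, I reproduced autograd's version-counter check in isolation (a toy snippet, unrelated to my model; the shapes were only picked so the saved transpose matches the [8, 32] in my error):

import torch
import torch.nn.functional as F

x = torch.randn(1, 8, requires_grad=True)   # stands in for a hidden state that carries grad
w = torch.randn(32, 8, requires_grad=True)  # same shape as weight_hh_l0 when hidden_size=8

out = F.linear(x, w, torch.zeros(32))  # AddmmBackward; autograd saves w.t() ("output 0 of TBackward")
loss = out.sum()
print(w._version)   # 0
with torch.no_grad():
    w.add_(1.0)     # in-place update -- also exactly what optimizer.step() does to parameters
print(w._version)   # 1
loss.backward()     # RuntimeError: ... [torch.FloatTensor [8, 32]] ... modified by an inplace operation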
I have isolated the error to forward(), but I can't find the intermediate tensor [torch.FloatTensor [8, 32]] that seems to be causing the problem (none of the tensor shapes in my forward method match it, so it must come from inside the LSTM's own forward()). I am running on CPU only, not CUDA.
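One suspicion: with hidden_size=8, the LSTM's weight_hh_l0 has shape [32, 8], so the [8, 32] tensor could be its transpose, which autograd saves for the matmul against the hidden state. optimizer.step() updates that weight in place, and since I carry self.hidden across batches and call loss.backward(retain_graph=True), the next backward may revisit the previous iteration's graph after the weights have changed. A variant that detaches the hidden state between batches (just a sketch of what I could try, not verified to fix this):

def forward(self, sequence):
    sequence = sequence.clone().view(len(sequence), 1, -1)
    lstm_out, hidden = self.lstm(sequence, self.hidden)
    # Detach so the next iteration does not hold on to the previous
    # graph, whose weights optimizer.step() has since modified in place.
    self.hidden = tuple(h.detach() for h in hidden)
    out_space = self.hidden2out(lstm_out[:, -1])
    return torch.sigmoid(out_space).squeeze()

With the hidden state detached, retain_graph=True would presumably no longer be needed either.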
The rest of the RNN code is here: https://gist.github.com/yaatehr/aac21cae05b24101f2369c97cfecb47b
Thanks!