I am using a decoder with attention, and my code works when the decoder input is Input(shape=(None,)), but I don't know why
February 24, 2020

Below you can see that I declared Input(shape=(None,)) for the decoder, but I want to understand how this works during training and testing.

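# Assumes tf.keras; vocab_size, y_vocab_size and input_matrix are defined elsewhere.
from tensorflow.keras.layers import Input, Embedding, LSTM, Concatenate, TimeDistributed, Dense
from tensorflow.keras.models import Model
# AttentionLayer is a custom layer (not part of Keras) and must be imported separately.

emb_dim = 300

# Encoder: fixed-length input of 60 token ids, frozen pretrained embeddings
encoder_input = Input(shape=(60,))
x1 = Embedding(vocab_size, emb_dim, weights=[input_matrix], trainable=False)(encoder_input)
e_lstm_out, e_hidden_out, e_cell_out = LSTM(32, return_sequences=True, return_state=True, dropout=0.4)(x1)

# Decoder: the time dimension is left as None, so any sequence length is accepted
decoder_input = Input(shape=(None,))

decoder_embedding_layer = Embedding(y_vocab_size, emb_dim, trainable=True)
decoder_embedding = decoder_embedding_layer(decoder_input)

# The decoder LSTM starts from the encoder's final hidden and cell states
decoder_lstm = LSTM(32, return_sequences=True, return_state=True, dropout=0.4)
d_lstm_out, d_hidden_out, d_cell_out = decoder_lstm(decoder_embedding, initial_state=[e_hidden_out, e_cell_out])

attention_layer = AttentionLayer(name='attention_layer')
attention_out, attention_states = attention_layer([e_lstm_out, d_lstm_out])

# Concatenate the decoder LSTM output with the attention output
concat = Concatenate(axis=-1, name='concat_layer')([d_lstm_out, attention_out])

# Dense softmax layer applied at every decoder time step
decoder_dense = TimeDistributed(Dense(y_vocab_size, activation='softmax'))
decoder_dense_outputs = decoder_dense(concat)

# Define the model
model = Model([encoder_input, decoder_input], decoder_dense_outputs)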
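
One way to see why a None time dimension can work: a recurrent layer reuses the same weights at every time step, so nothing in it is tied to a particular sequence length. The sketch below is a toy in pure Python, not Keras. The names `rnn_step` and `run_sequence` and the weight values are made up for illustration. It compares feeding a whole sequence in one call (as is typical during training) with feeding it one step at a time while carrying the state forward (as is typical for step-by-step decoding at test time).

```python
import math


def rnn_step(x, h, w_x, w_h):
    """One step of a toy scalar recurrent cell: h' = tanh(w_x * x + w_h * h)."""
    return math.tanh(w_x * x + w_h * h)


def run_sequence(xs, h0, w_x, w_h):
    """Apply the same cell to every time step; works for any len(xs)."""
    h = h0
    outputs = []
    for x in xs:
        h = rnn_step(x, h, w_x, w_h)
        outputs.append(h)
    return outputs, h


# Hypothetical fixed weights; their number does not depend on sequence length.
W_X, W_H = 0.5, -0.3

# "Training" style: the whole sequence is passed in a single call.
full_inputs = [1.0, 2.0, 3.0, 4.0]
train_outputs, train_state = run_sequence(full_inputs, 0.0, W_X, W_H)

# "Inference" style: one time step per call, carrying the state by hand.
state = 0.0
step_outputs = []
for x in full_inputs:
    out, state = run_sequence([x], state, W_X, W_H)
    step_outputs.append(out[0])

print(step_outputs == train_outputs, state == train_state)
```

The same function handles a length-4 input and a length-1 input with identical weights, which is the property a `None` time dimension relies on. Whether the custom AttentionLayer also handles arbitrary lengths depends on how it is implemented.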


The model summary is below; you can see that input_2 (InputLayer) has output shape [(None, None)].

Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 60)]         0                                            
input_2 (InputLayer)            [(None, None)]       0                                            
embedding (Embedding)           (None, 60, 300)      5206500     input_1[0][0]                    
embedding_1 (Embedding)         (None, None, 300)    6503400     input_2[0][0]                    
lstm (LSTM)                     [(None, 60, 32), (No 42624       embedding[0][0]                  
lstm_1 (LSTM)                   [(None, None, 32), ( 42624       embedding_1[0][0]                
attention_layer (AttentionLayer ((None, None, 32), ( 2080        lstm[0][0]                       
concat_layer (Concatenate)      (None, None, 64)     0           lstm_1[0][0]                     
time_distributed (TimeDistribut (None, None, 21678)  1409070     concat_layer[0][0]

Total params: 13,206,298
Trainable params: 7,999,798
Non-trainable params: 5,206,500
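
The parameter counts in this summary can be reproduced with the standard formulas for Embedding, LSTM and Dense layers, and none of those formulas involve the sequence length. The sketch below is a rough arithmetic check. The helper names are hypothetical, the encoder vocabulary size is inferred from 5206500 / 300 rather than given in the post, and the attention layer's 2080 is copied from the summary because that layer is custom.

```python
def embedding_params(vocab, dim):
    # One dim-sized vector per vocabulary entry
    return vocab * dim


def lstm_params(input_dim, units):
    # 4 gates, each with an input kernel, a recurrent kernel and a bias
    return 4 * ((input_dim + units) * units + units)


def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit
    return (input_dim + 1) * units


EMB_DIM, UNITS = 300, 32
Y_VOCAB = 21678                # last dimension of the time_distributed output
X_VOCAB = 5206500 // EMB_DIM   # inferred from the embedding row, not stated
ATTENTION = 2080               # copied from the summary (custom layer)

counts = {
    "embedding": embedding_params(X_VOCAB, EMB_DIM),
    "embedding_1": embedding_params(Y_VOCAB, EMB_DIM),
    "lstm": lstm_params(EMB_DIM, UNITS),
    "lstm_1": lstm_params(EMB_DIM, UNITS),
    "time_distributed": dense_params(2 * UNITS, Y_VOCAB),
}
total = sum(counts.values()) + ATTENTION

print(counts)
print(total)
```

The time dimension (60 for the encoder, None for the decoder) appears nowhere in these formulas, so the weights are fully defined even when the decoder length is unknown at build time.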
