Hello. I have trouble understanding a certain line in the decoder forward method. This one: https://github.com/mozilla/TTS/blob/924d6ad4e55e7763e61933bf3542dae1e892c369/layers/tacotron.py#L405-L417
```python
memory_input, attention_rnn_hidden, decoder_rnn_hiddens, \
    current_context_vec, attention, attention_cum = self._init_states(inputs)
while True:
    if t > 0:
        if memory is None:
            new_memory = outputs[-1]
        else:
            new_memory = memory[t - 1]
        # Queuing if memory size defined else use previous prediction only.
        if self.memory_size > 0:
            memory_input = torch.cat(
                [memory_input[:, self.r * self.memory_dim:].clone(), new_memory],
                dim=-1)
        else:
            memory_input = new_memory
```
So my question is: what does this concatenation do? We have `new_memory` of shape (B, r*memory_dim), and at step t=1, when we first enter this branch, the `memory_input` we are concatenating with has shape (B, memory_size*memory_dim). The slice `memory_input[:, self.r * self.memory_dim:]` drops the first r*memory_dim columns, so when `memory_size` is left at its default of `memory_size = r`, this line literally concatenates `new_memory` with an empty tensor. The only situation in which it actually concatenates with something is when `memory_size` is higher than `r`, am I right? Shouldn't the condition be changed to `if self.memory_size > r` or something like that in this case?
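To make my point concrete, here is a tiny sketch of just the slicing and concatenation, with toy values for B, memory_dim, and r (not the actual model code):

```python
import torch

B, memory_dim, r = 2, 4, 5

# Case 1: the default, memory_size == r
memory_size = r
memory_input = torch.zeros(B, memory_size * memory_dim)
new_memory = torch.ones(B, r * memory_dim)  # stands in for the last prediction

kept = memory_input[:, r * memory_dim:]  # shape (B, 0) -- an empty slice
memory_input = torch.cat([kept, new_memory], dim=-1)
# memory_input is now identical to new_memory; the old frames contribute nothing

# Case 2: memory_size > r, where the slice actually keeps older frames
memory_size = 2 * r
memory_input2 = torch.zeros(B, memory_size * memory_dim)
kept2 = memory_input2[:, r * memory_dim:]  # shape (B, (memory_size - r) * memory_dim)
memory_input2 = torch.cat([kept2, new_memory], dim=-1)
# memory_input2 keeps shape (B, memory_size * memory_dim): a sliding queue
```

In case 1, `kept` has zero columns, so the `torch.cat` is a no-op around `new_memory`, which is exactly the degenerate behavior I am asking about.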
Maybe I am missing something, but I have been stuck on this line for more than half an hour already and I cannot see a different interpretation. I would be glad for clarification.