Shape mismatch for mel_input and decoder_output

@erogol
I was trying out the latest branch with the sound norm, and for some odd reason the mel_input and decoder_output shapes don't match. The mel_input is exactly 3.5x longer than the decoder_output along the time axis (357 vs. 102 frames in one batch, 231 vs. 66 in another). Any ideas on how to fix this?

RuntimeError: input and target shapes do not match: input [32 x 102 x 80], target [32 x 357 x 80] at /pytorch/aten/src/THCUNN/generic/MSECriterion.cu:12
RuntimeError: input and target shapes do not match: input [32 x 66 x 80], target [32 x 231 x 80] at /pytorch/aten/src/THCUNN/generic/MSECriterion.cu:12

Never mind, figured it out. The problem was the “r” value set in gradual training: I can’t use r=7. I’ll post here if I figure out why it caused the problem.
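
One possible reading of the numbers in the two errors, offered as a rough sketch and not a confirmed diagnosis: both target lengths (357 and 231) are multiples of 7, and dividing them by 7 gives 51 and 33 decoder steps. The output lengths (102 and 66) are exactly 2 frames per step for those step counts, and 7 / 2 = 3.5. That would be consistent with the batches being padded for r=7 while the decoder was still emitting 2 frames per step. The helper name implied_frames_per_step and the assumption that the target length is padded to a multiple of r are mine, not something taken from the TTS code.

```python
def implied_frames_per_step(target_len, output_len, r_data):
    """Estimate how many frames per step the decoder appears to emit.

    Assumption (not verified against the TTS code): target_len was padded
    to a multiple of r_data, so the decoder ran target_len // r_data steps.
    """
    if target_len % r_data != 0:
        raise ValueError("target length is not a multiple of r_data")
    steps = target_len // r_data
    if output_len % steps != 0:
        raise ValueError("output length is not a whole number of frames per step")
    return output_len // steps


# (target_len, output_len) pairs copied from the two RuntimeErrors above
for target_len, output_len in [(357, 102), (231, 66)]:
    r_model = implied_frames_per_step(target_len, output_len, r_data=7)
    print(f"target={target_len} output={output_len} implied decoder r={r_model}")
```

If that reading is right, it would be worth checking whether the r used to build the batches and the r the decoder is actually running with are updated at the same point in the gradual training schedule.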