@erogol
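I was trying out the latest branch with the sound norm, and for some reason the shapes of mel_input and decoder_output don’t match. mel_input is exactly 3.5x longer than decoder_output along the frame axis (357 vs. 102 frames in one batch, 231 vs. 66 in another); the batch size (32) and the number of mel bins (80) agree. Any ideas on how to fix this?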
RuntimeError: input and target shapes do not match: input [32 x 102 x 80], target [32 x 357 x 80] at /pytorch/aten/src/THCUNN/generic/MSECriterion.cu:12
RuntimeError: input and target shapes do not match: input [32 x 66 x 80], target [32 x 231 x 80] at /pytorch/aten/src/THCUNN/generic/MSECriterion.cu:12
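For reference, here is a small sanity check on the shapes from the two errors above. It is only a sketch: it assumes the layout is [batch, frames, mel_bins], and `time_axis_ratio` is a helper name made up for this post, not something from the repo. It confirms that the mismatch is the same constant factor in both batches, which to me looks more like a systematic frame-rate difference than a padding problem, though that is a guess.

```python
from fractions import Fraction


def time_axis_ratio(input_shape, target_shape):
    """Return target_frames / input_frames as an exact fraction.

    Assumes both shapes are laid out as [batch, frames, mel_bins]
    (an assumption read off the error messages, not from the code).
    """
    if input_shape[0] != target_shape[0] or input_shape[2] != target_shape[2]:
        raise ValueError("batch or mel dimensions differ, not only the frame count")
    return Fraction(target_shape[1], input_shape[1])


# Shapes copied from the two RuntimeError messages:
# "input" is the decoder output, "target" is mel_input.
errors = [
    ((32, 102, 80), (32, 357, 80)),
    ((32, 66, 80), (32, 231, 80)),
]

ratios = [time_axis_ratio(inp, tgt) for inp, tgt in errors]
for ratio in ratios:
    print(ratio, float(ratio))
```

Both batches give the same ratio of 7/2, so the factor does not depend on the utterance length.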