Size mismatch for decoder.stopnet.1.linear_layer.weight: copying a param with shape torch.Size([1, 1584]) from checkpoint, the shape in current model is torch.Size([1, 1104])

Hi,
I trained Tacotron2 on a private dataset.
I am using the pretrained WaveRNN model.
While testing the trained Tacotron2 model, it throws the error below:

Using model: Tacotron2
Setting up Audio Processor…
| > bits:None
| > sample_rate:22050
| > num_mels:80
| > min_level_db:-100
| > frame_shift_ms:12.5
| > frame_length_ms:50
| > ref_level_db:20
| > num_freq:1025
| > power:1.5
| > preemphasis:0.98
| > griffin_lim_iters:60
| > signal_norm:True
| > symmetric_norm:False
| > mel_fmin:0.0
| > mel_fmax:8000.0
| > max_norm:1.0
| > clip_norm:True
| > do_trim_silence:True
| > n_fft:2048
| > hop_length:275
| > win_length:1102
Traceback (most recent call last):
File "test.py", line 66, in <module>
model.load_state_dict(cp['model'])
File "/home/ubuntu/drive_a/mayank/Test/vin_test_3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Tacotron2:
Missing key(s) in state_dict: “encoder.convolutions.0.net.0.weight”, “encoder.convolutions.0.net.0.bias”, “encoder.convolutions.0.net.1.weight”, “encoder.convolutions.0.net.1.bias”, “encoder.convolutions.0.net.1.running_mean”, “encoder.convolutions.0.net.1.running_var”, “encoder.convolutions.1.net.0.weight”, “encoder.convolutions.1.net.0.bias”, “encoder.convolutions.1.net.1.weight”, “encoder.convolutions.1.net.1.bias”, “encoder.convolutions.1.net.1.running_mean”, “encoder.convolutions.1.net.1.running_var”, “encoder.convolutions.2.net.0.weight”, “encoder.convolutions.2.net.0.bias”, “encoder.convolutions.2.net.1.weight”, “encoder.convolutions.2.net.1.bias”, “encoder.convolutions.2.net.1.running_mean”, “encoder.convolutions.2.net.1.running_var”, “decoder.prenet.layers.0.linear_layer.weight”, “decoder.prenet.layers.0.bn.weight”, “decoder.prenet.layers.0.bn.bias”, “decoder.prenet.layers.0.bn.running_mean”, “decoder.prenet.layers.0.bn.running_var”, “decoder.prenet.layers.1.linear_layer.weight”, “decoder.prenet.layers.1.bn.weight”, “decoder.prenet.layers.1.bn.bias”, “decoder.prenet.layers.1.bn.running_mean”, “decoder.prenet.layers.1.bn.running_var”, “decoder.attention_layer.query_layer.linear_layer.weight”, “decoder.attention_layer.inputs_layer.linear_layer.weight”, “decoder.attention_layer.v.linear_layer.weight”, “decoder.attention_layer.v.linear_layer.bias”, “decoder.attention_layer.location_layer.location_conv.weight”, “decoder.attention_layer.location_layer.location_dense.linear_layer.weight”, “decoder.attention_rnn_init.weight”, “decoder.go_frame_init.weight”, “decoder.decoder_rnn_inits.weight”, “postnet.convolutions.0.net.0.weight”, “postnet.convolutions.0.net.0.bias”, “postnet.convolutions.0.net.1.weight”, “postnet.convolutions.0.net.1.bias”, “postnet.convolutions.0.net.1.running_mean”, “postnet.convolutions.0.net.1.running_var”, “postnet.convolutions.1.net.0.weight”, “postnet.convolutions.1.net.0.bias”, “postnet.convolutions.1.net.1.weight”, “postnet.convolutions.1.net.1.bias”, “postnet.convolutions.1.net.1.running_mean”, “postnet.convolutions.1.net.1.running_var”, “postnet.convolutions.2.net.0.weight”, “postnet.convolutions.2.net.0.bias”, “postnet.convolutions.2.net.1.weight”, “postnet.convolutions.2.net.1.bias”, “postnet.convolutions.2.net.1.running_mean”, “postnet.convolutions.2.net.1.running_var”, “postnet.convolutions.3.net.0.weight”, “postnet.convolutions.3.net.0.bias”, “postnet.convolutions.3.net.1.weight”, “postnet.convolutions.3.net.1.bias”, “postnet.convolutions.3.net.1.running_mean”, “postnet.convolutions.3.net.1.running_var”, “postnet.convolutions.4.net.0.weight”, “postnet.convolutions.4.net.0.bias”, “postnet.convolutions.4.net.1.weight”, “postnet.convolutions.4.net.1.bias”, “postnet.convolutions.4.net.1.running_mean”, “postnet.convolutions.4.net.1.running_var”.
Unexpected key(s) in state_dict: “coarse_decoder.prenet.linear_layers.0.linear_layer.weight”, “coarse_decoder.prenet.linear_layers.1.linear_layer.weight”, “coarse_decoder.attention_rnn.weight_ih”, “coarse_decoder.attention_rnn.weight_hh”, “coarse_decoder.attention_rnn.bias_ih”, “coarse_decoder.attention_rnn.bias_hh”, “coarse_decoder.attention.query_layer.linear_layer.weight”, “coarse_decoder.attention.inputs_layer.linear_layer.weight”, “coarse_decoder.attention.v.linear_layer.weight”, “coarse_decoder.attention.v.linear_layer.bias”, “coarse_decoder.attention.location_layer.location_conv1d.weight”, “coarse_decoder.attention.location_layer.location_dense.linear_layer.weight”, “coarse_decoder.decoder_rnn.weight_ih”, “coarse_decoder.decoder_rnn.weight_hh”, “coarse_decoder.decoder_rnn.bias_ih”, “coarse_decoder.decoder_rnn.bias_hh”, “coarse_decoder.linear_projection.linear_layer.weight”, “coarse_decoder.linear_projection.linear_layer.bias”, “coarse_decoder.stopnet.1.linear_layer.weight”, “coarse_decoder.stopnet.1.linear_layer.bias”, “encoder.convolutions.0.convolution1d.weight”, “encoder.convolutions.0.convolution1d.bias”, “encoder.convolutions.0.batch_normalization.weight”, “encoder.convolutions.0.batch_normalization.bias”, “encoder.convolutions.0.batch_normalization.running_mean”, “encoder.convolutions.0.batch_normalization.running_var”, “encoder.convolutions.0.batch_normalization.num_batches_tracked”, “encoder.convolutions.1.convolution1d.weight”, “encoder.convolutions.1.convolution1d.bias”, “encoder.convolutions.1.batch_normalization.weight”, “encoder.convolutions.1.batch_normalization.bias”, “encoder.convolutions.1.batch_normalization.running_mean”, “encoder.convolutions.1.batch_normalization.running_var”, “encoder.convolutions.1.batch_normalization.num_batches_tracked”, “encoder.convolutions.2.convolution1d.weight”, “encoder.convolutions.2.convolution1d.bias”, “encoder.convolutions.2.batch_normalization.weight”, “encoder.convolutions.2.batch_normalization.bias”, “encoder.convolutions.2.batch_normalization.running_mean”, “encoder.convolutions.2.batch_normalization.running_var”, “encoder.convolutions.2.batch_normalization.num_batches_tracked”, “decoder.attention.query_layer.linear_layer.weight”, “decoder.attention.inputs_layer.linear_layer.weight”, “decoder.attention.v.linear_layer.weight”, “decoder.attention.v.linear_layer.bias”, “decoder.attention.location_layer.location_conv1d.weight”, “decoder.attention.location_layer.location_dense.linear_layer.weight”, “decoder.prenet.linear_layers.0.linear_layer.weight”, “decoder.prenet.linear_layers.1.linear_layer.weight”, “postnet.convolutions.0.convolution1d.weight”, “postnet.convolutions.0.convolution1d.bias”, “postnet.convolutions.0.batch_normalization.weight”, “postnet.convolutions.0.batch_normalization.bias”, “postnet.convolutions.0.batch_normalization.running_mean”, “postnet.convolutions.0.batch_normalization.running_var”, “postnet.convolutions.0.batch_normalization.num_batches_tracked”, “postnet.convolutions.1.convolution1d.weight”, “postnet.convolutions.1.convolution1d.bias”, “postnet.convolutions.1.batch_normalization.weight”, “postnet.convolutions.1.batch_normalization.bias”, “postnet.convolutions.1.batch_normalization.running_mean”, “postnet.convolutions.1.batch_normalization.running_var”, “postnet.convolutions.1.batch_normalization.num_batches_tracked”, “postnet.convolutions.2.convolution1d.weight”, “postnet.convolutions.2.convolution1d.bias”, “postnet.convolutions.2.batch_normalization.weight”, 
“postnet.convolutions.2.batch_normalization.bias”, “postnet.convolutions.2.batch_normalization.running_mean”, “postnet.convolutions.2.batch_normalization.running_var”, “postnet.convolutions.2.batch_normalization.num_batches_tracked”, “postnet.convolutions.3.convolution1d.weight”, “postnet.convolutions.3.convolution1d.bias”, “postnet.convolutions.3.batch_normalization.weight”, “postnet.convolutions.3.batch_normalization.bias”, “postnet.convolutions.3.batch_normalization.running_mean”, “postnet.convolutions.3.batch_normalization.running_var”, “postnet.convolutions.3.batch_normalization.num_batches_tracked”, “postnet.convolutions.4.convolution1d.weight”, “postnet.convolutions.4.convolution1d.bias”, “postnet.convolutions.4.batch_normalization.weight”, “postnet.convolutions.4.batch_normalization.bias”, “postnet.convolutions.4.batch_normalization.running_mean”, “postnet.convolutions.4.batch_normalization.running_var”, “postnet.convolutions.4.batch_normalization.num_batches_tracked”.
size mismatch for embedding.weight: copying a param with shape torch.Size([129, 512]) from checkpoint, the shape in current model is torch.Size([62, 512]).
size mismatch for decoder.linear_projection.linear_layer.weight: copying a param with shape torch.Size([560, 1536]) from checkpoint, the shape in current model is torch.Size([80, 1536]).
size mismatch for decoder.linear_projection.linear_layer.bias: copying a param with shape torch.Size([560]) from checkpoint, the shape in current model is torch.Size([80]).
size mismatch for decoder.stopnet.1.linear_layer.weight: copying a param with shape torch.Size([1, 1584]) from checkpoint, the shape in current model is torch.Size([1, 1104]).

The model sizes don’t match, so you are not using the same code base. Check branches, commits and configs.
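If it helps to pin down exactly what differs, here is a minimal diagnostic sketch (the checkpoint path is a placeholder, and `model` has to be built exactly as in your test.py) that lists the missing, unexpected, and shape-mismatched parameters:

```python
import torch


def diff_checkpoint(model, checkpoint_path):
    """Compare a saved checkpoint against the currently built model."""
    cp = torch.load(checkpoint_path, map_location="cpu")
    ckpt_state = cp["model"]
    model_state = model.state_dict()

    for key in model_state:
        if key not in ckpt_state:
            print("missing in checkpoint:", key)
    for key, tensor in ckpt_state.items():
        if key not in model_state:
            print("unexpected in checkpoint:", key)
        elif tuple(tensor.shape) != tuple(model_state[key].shape):
            print("shape mismatch", key,
                  tuple(tensor.shape), "vs", tuple(model_state[key].shape))


# Example call -- `model` built exactly as in test.py, path is a placeholder:
# diff_checkpoint(model, "best_model.pth.tar")
```

For instance, `embedding.weight` having 129 rows in the checkpoint but 62 in your model suggests a different character/phoneme set, and 560 vs. 80 output rows in `decoder.linear_projection` suggests a different `r` (frames per decoder step).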

You need to use the right commit given with the pretrained model (if you are using one of our released models).
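For example, after checking out the commit listed with the released model and rebuilding `model` from the matching config, a quick sanity check before running the rest of test.py might look like this (the file name is a placeholder):

```python
import torch

cp = torch.load("best_model.pth.tar", map_location="cpu")  # placeholder file name
print("checkpoint trained with", cp["model"]["embedding.weight"].shape[0], "symbols")

# `model` must be built exactly as in test.py, on the commit that matches the checkpoint.
model.load_state_dict(cp["model"])
model.eval()
print("checkpoint loaded cleanly at step", cp.get("step", "unknown"))
```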