I am looking to use the released Multi-Speaker-Tacotron2 model for a project, but I am unsure which speaker encoder was used to generate the training data for the model.
Is it the one that is downloaded in the sample notebook and used to clone a voice, or is it the released Speaker Encoder model? Or is it a different encoder I can find somewhere else?
(I suppose this question could also be rephrased as: which encoder generated the speakers.json file for the released model?)
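One rough way to answer this empirically, if no one confirms the provenance: embed an utterance from a known training speaker with each candidate encoder and check which encoder's output is closest to that speaker's stored vector. This is only a sketch under assumptions: it assumes `speakers.json` maps speaker IDs to embedding vectors (the toy dict below stands in for it), and `closest_speaker` is a hypothetical helper, not part of the repo's API.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def closest_speaker(candidate_embedding, speakers):
    """Return (speaker_id, similarity) for the stored embedding
    most similar to the candidate embedding."""
    return max(
        ((sid, cosine_similarity(candidate_embedding, emb))
         for sid, emb in speakers.items()),
        key=lambda pair: pair[1],
    )

# Toy stand-in for speakers.json: speaker ID -> embedding vector.
# (Assumed format; real embeddings are much higher-dimensional.)
speakers = {
    "p225": [0.9, 0.1, 0.0],
    "p226": [0.1, 0.8, 0.2],
}

# Pretend this came from embedding a p225 utterance with a candidate encoder.
candidate = [0.85, 0.15, 0.05]
best_id, score = closest_speaker(candidate, speakers)
print(best_id)  # an encoder consistent with speakers.json should land near its own speaker
```

If the released Speaker Encoder places known speakers' utterances close to their `speakers.json` entries (high cosine similarity) while the notebook's encoder does not, that would suggest the former produced the file.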