I have been training a Tacotron2 model on the World English Bible dataset for over 25K steps now, and the TestSentences show progress (TestSentence_3.zip, 2.0 MB). However, my synthesized audio is completely different and very noisy (He_is_your_father.zip, 2.8 MB).
If anyone could shed some light on why the synthesized audio is of much poorer quality, I'd be grateful!
You need to tune the audio parameters for this dataset specifically. It's a male voice, and the default values will not work for it. Use the CheckSpectrograms notebook to find the right values.
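As a rough illustration of why the speaker's voice matters here, the sketch below estimates the fundamental frequency of a single voiced frame with a plain autocorrelation search. It is not taken from the CheckSpectrograms notebook; the function name `estimate_f0` and the default search range of 50 to 400 Hz are assumptions made for this example. A low estimate would suggest that the lower frequency bound of the spectrogram settings needs to sit below it, which is the kind of value the notebook helps you check by ear.

```python
import math


def estimate_f0(samples, sample_rate, fmin=50.0, fmax=400.0):
    """Crude pitch estimate (Hz) for one voiced frame via autocorrelation.

    Hypothetical helper for illustration only; real speech needs voicing
    detection and averaging over many frames.
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)  # longest period
    if min_lag < 1 or max_lag <= min_lag:
        raise ValueError("frame too short for the requested pitch range")

    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation: similarity of the frame to itself
        # shifted by `lag` samples.
        score = sum(samples[n] * samples[n + lag]
                    for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


if __name__ == "__main__":
    # Synthetic 110 Hz tone standing in for a low male voice (assumed value).
    sr = 8000
    tone = [math.sin(2 * math.pi * 110 * n / sr) for n in range(800)]
    print(f"estimated f0: {estimate_f0(tone, sr):.1f} Hz")
```

To try it on the real dataset you would load a short voiced segment from one of the recordings instead of the synthetic tone. Treat the result only as a starting point and confirm the final values by listening to the reconstructed audio in the notebook.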