I’m trying to train a custom voice with 24 hours of audiobook recordings and the associated transcriptions, in LJSpeech format for the dataloader (I’ve posted a sample on my GitHub: https://github.com/dubreuia/hosting/tree/master/mozilla-tts/custom-dataset-sample). I first checked my data with the AnalyzeDataset and CheckSpectrograms notebooks, and it looks good to me.
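For anyone who wants to rule out a formatting problem before looking at the model, here is a minimal sketch of a structural check on an LJSpeech-style dataset. It is not what the notebooks do. It assumes the usual LJSpeech layout: a pipe-separated `metadata.csv` whose first field is the file ID and whose second field is the transcription, plus a `wavs/` directory holding `<id>.wav`. The function name `check_ljspeech_metadata` and its parameters are made up for illustration.

```python
from pathlib import Path


def check_ljspeech_metadata(root, metadata_name="metadata.csv", wav_dir="wavs"):
    """Return a list of problems found in an LJSpeech-style dataset.

    Assumed layout (not verified against any particular loader):
    root/metadata.csv with lines like "id|text|normalized text",
    and root/wavs/<id>.wav for every id.
    """
    root = Path(root)
    problems = []
    with open(root / metadata_name, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\r\n")
            if not line:
                continue  # ignore blank lines
            fields = line.split("|")
            if len(fields) < 2:
                problems.append(f"line {line_no}: expected at least 2 fields, got {len(fields)}")
                continue
            file_id, text = fields[0], fields[1]
            if not text.strip():
                problems.append(f"line {line_no}: empty transcription for '{file_id}'")
            if not (root / wav_dir / f"{file_id}.wav").is_file():
                problems.append(f"line {line_no}: missing wav file for '{file_id}'")
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_dataset.py /path/to/dataset
    for problem in check_ljspeech_metadata(sys.argv[1]):
        print(problem)
```

An empty result only means the files line up with the metadata. It says nothing about audio quality, silence, or sample rate, which are the things the spectrogram notebook is meant to show.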
After 60K iterations (see screenshots), the model still produces nothing usable: the audio is blank or a faint hum. I’ve continued training to 100K iterations to no avail. When I train on LJSpeech instead, I get good results at 100K, as expected.
It is probably similar to the thread “Custom voice - TTS not learning”.
Does someone have an idea why my dataset wouldn’t train?