Untrained Vocoder Question

I am running the first phase of training with code that I cloned from the TTS recipes (here), using the first code snippet below.

    # change the GPU id if needed
    CUDA_VISIBLE_DEVICES="0" python TTS/train.py --config_path model_config.json
    # train vocoder ...
    CUDA_VISIBLE_DEVICES="0" python TTS/vocoder/train.py --config_path vocoder_config.json

Although I haven’t trained the vocoder yet, the tts_model/test_audios folder already contains test audio for the sample sentences I entered. How is this possible without a trained vocoder? I am using a 3.5GB dataset, and training is now at 180K steps on a single GPU. The audio still sounds robotic; will it get better once the vocoder is trained?

If anyone knows the answer and can share it, I would be very grateful.

Your model’s output is already being synthesized with a classic algorithm called Griffin-Lim, which reconstructs a waveform from the spectrogram without any trained vocoder. It doesn’t sound great, but you can hear what is being said.

Yes, a vocoder will make it sound much more natural.


Thank you so much, @othiele. Another question:

Do I have to wait until the decoder training has finished before starting the vocoder training?

Normally the vocoder takes mel spectrograms from the decoder as input, so as far as I understand, the vocoder should be trained after the decoder. But the vocoder training code runs without a trained decoder. Would that be a valid training? In other words, does it make sense to train the decoder on one computer and the vocoder on another at the same time?

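Both are separate. Vocoder training typically uses mel spectrograms computed directly from the recordings, so it does not need the decoder’s output. You can train a vocoder now if you want to, or try one that erogol has already made public. That one might not work well for anything other than English.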
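
Since the two trainings are independent, they could also be started side by side on one machine instead of two. Below is a rough sketch of that, not a tested recipe: it assumes two GPUs with ids 0 and 1, reuses the two commands and config files from the first post, and the `launch` helper, the `DRY_RUN` switch, and the log file names are made up for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: start TTS and vocoder training independently, one per GPU.
# Assumptions (not from the thread): GPU ids 0 and 1, the log file names,
# the hypothetical launch helper and DRY_RUN switch.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to really launch.

launch() {
    gpu="$1"
    log="$2"
    shift 2
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Print the command instead of running it.
        echo "CUDA_VISIBLE_DEVICES=$gpu $*"
    else
        # Run in the background and send all output to the log file.
        CUDA_VISIBLE_DEVICES="$gpu" "$@" > "$log" 2>&1 &
    fi
}

launch 0 tts_train.log python TTS/train.py --config_path model_config.json
launch 1 vocoder_train.log python TTS/vocoder/train.py --config_path vocoder_config.json

# Block until any background jobs have finished (returns at once in dry-run mode).
wait
```

With only one GPU, running the two commands one after the other, as in the first post, is probably the safer choice, since both jobs would otherwise compete for the same memory.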
