Training 2 New Custom Datasets with TTS-recipes, need suggestions for inference/synthesis

Zain_Mujahid · April 12, 2021, 10:32am

Hi, we have recorded two datasets (male and female) for the Urdu language. Both datasets are in LJSpeech format, with a total length of 10 hours each and a sample rate of 48 kHz. I previously trained a simple Mozilla Tacotron (Griffin-Lim) model using distributed training for up to 10k epochs and got good results for the male voice; however, the female voice didn't come out well. Maybe it's because of the recording style of the data, I don't know.
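For anyone wanting to sanity-check a dataset like this before training, here is a minimal sketch. It assumes the usual LJSpeech layout (a pipe-delimited `metadata.csv` with no header, next to a `wavs/` folder of PCM WAV files); the function name `check_ljspeech_dataset` and the example path are hypothetical, not part of any recipe.

```python
import csv
import wave
from collections import Counter
from pathlib import Path


def check_ljspeech_dataset(root):
    """Summarise an LJSpeech-style dataset: clip count, total hours, sample rates."""
    root = Path(root)
    sample_rates = Counter()
    total_seconds = 0.0
    missing = []

    # Assumed layout: metadata.csv rows look like  id|text|normalized text
    with open(root / "metadata.csv", encoding="utf-8", newline="") as f:
        for row in csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE):
            if not row:
                continue  # skip blank lines
            wav_path = root / "wavs" / f"{row[0]}.wav"
            if not wav_path.exists():
                missing.append(row[0])
                continue
            # The stdlib wave module reads uncompressed PCM WAV headers
            with wave.open(str(wav_path), "rb") as w:
                sample_rates[w.getframerate()] += 1
                total_seconds += w.getnframes() / w.getframerate()

    return {
        "clips": sum(sample_rates.values()),
        "hours": total_seconds / 3600,
        "sample_rates": dict(sample_rates),
        "missing": missing,
    }


if __name__ == "__main__":
    # Hypothetical path; point this at the dataset root
    print(check_ljspeech_dataset("urdu_female"))
```

If the reported rate is 48000 while the training config expects a different `sample_rate` (the original LJSpeech corpus is 22050 Hz), either the config or the audio would likely need to be changed so the two match.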

Now I want to try my luck with the recipe at https://github.com/coqui-ai/TTS-recipes/tree/master/LJSpeech/DoubleDecoderConsistency and would like to know: is there a notebook I can use for voice synthesis once I have the trained models from it? I am looking to improve the prosody and intonation of the voice.

Thank you.
Regards,
Zain

Saad_Raza · January 27, 2022, 6:47am

Aoa Zain, can you share the dataset you collected with me? I am also trying to develop an Urdu TTS model but cannot find sufficient data online. I can provide my email if needed.

Thank you,
Saad

Zain_Mujahid · January 28, 2022, 10:10pm

W.S. Saad,

The datasets are private and cannot be shared. My apologies.