Is it possible to train a TTS model in a custom language (Latin) with only a couple of hours of good-quality training data?

Rohan_Garg · July 23, 2020, 1:02am

I am completely new to ML and TTS, and I am trying to learn how to build text-to-speech. For now I have maybe 20-30 phrases, but I plan to train it further as I get more data. Is this feasible in a custom language? My current plan is:

Creating an LJSpeech-style dataset with my 20-30 phrases
Skipping the custom alphabet, since I plan on using phonemes for training
Writing a text cleaner
How do I figure out how to set the parameters?
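
For reference, the dataset and text-cleaner steps above might look roughly like the sketch below. It assumes the usual LJSpeech layout (a `wavs/` folder plus a pipe-delimited `metadata.csv` with `id|raw text|normalized text`). The names `clean_latin` and `write_metadata`, the clip ID, and the choice to strip macrons are all hypothetical choices of mine, not something any particular TTS toolkit requires.

```python
import re
import unicodedata
from pathlib import Path


def clean_latin(text: str) -> str:
    """Hypothetical Latin cleaner: lowercase, drop diacritics, collapse whitespace.

    Assumption: macrons are stripped (so vowel length is lost). If your
    phonemizer can use vowel length, you may prefer to keep them.
    """
    # Decompose accented letters into base letter + combining mark.
    text = unicodedata.normalize("NFD", text.lower())
    # Drop the combining marks (macrons, breves, accents).
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace anything that is not a basic letter or punctuation with a space.
    text = re.sub(r"[^a-z .,;:!?'-]", " ", text)
    # Collapse runs of whitespace.
    return re.sub(r"\s+", " ", text).strip()


def write_metadata(dataset_dir, phrases):
    """Write an LJSpeech-style metadata.csv for (clip_id, raw_text) pairs.

    Each clip is expected to be recorded as <dataset_dir>/wavs/<clip_id>.wav.
    """
    dataset_dir = Path(dataset_dir)
    (dataset_dir / "wavs").mkdir(parents=True, exist_ok=True)
    with open(dataset_dir / "metadata.csv", "w", encoding="utf-8", newline="") as f:
        for clip_id, raw in phrases:
            raw = raw.strip()
            if "|" in clip_id or "|" in raw:
                # The file is pipe-delimited without quoting, so "|" would corrupt it.
                raise ValueError(f"'|' is not allowed in clip {clip_id!r}")
            f.write(f"{clip_id}|{raw}|{clean_latin(raw)}\n")


if __name__ == "__main__":
    print(clean_latin("Ārma virumque canō!"))  # arma virumque cano!
    write_metadata("latin_dataset", [("latin_0001", "Ārma virumque canō!")])
```

Keeping the raw text in the second column and the cleaned text in the third means I can change the cleaner later without re-typing the transcripts.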

georroussos · July 23, 2020, 6:52am

20-30 phrases is way too few. You would need at least 100 samples and a very good model pretrained on the same language.

Rohan_Garg · July 23, 2020, 7:06am

Would I be able to use 100 Latin phrases with the LJSpeech model, or can that only be used with English?

georroussos · July 23, 2020, 7:17am

Probably not, but you can try.

LucasRotsen · July 23, 2020, 4:34pm

I found the strategy used in "Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis" very interesting as a way to improve Tacotron’s data efficiency. As long as you can find text and audio corpora for Latin, it can be a good starting point.