I love the Mozilla TTS library. Previously I’ve used the 185k-iteration model, but now I want to train my own. Following @erogol’s suggestion in https://github.com/mozilla/TTS/issues/165, I had a voice actor record 18 hours of audio as WAV files. The clips have been annotated, and each is one or two sentences long.
My question is: how do I actually use the TTS library to train on my corpus? Can someone point me to the right approach? I’m guessing it’s not as simple as pointing to a corpus manifest, but I could be wrong. I couldn’t find anything about how to set up my workspace for training, so I’d appreciate any tips on training a new model with Mozilla TTS.