I fully understand that the model is incomplete. However, I would like to try generating audio with one of the pre-trained models. The problem is that I have never used a model like this (although I played around with other TTS systems back when I was still on Windows), so I have absolutely no idea how to actually use the darn thing. Is there an easy way to just use the pre-trained model?
Additionally, if I understand correctly, the models take ASCII input. Does anyone have a good down-converter to go from SSML to ASCII? Or is there a Python API for generating speech?
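To make the SSML question concrete, the crude down-conversion I have in mind would look roughly like the Python sketch below. It only uses the standard library; `ssml_to_ascii` is a name I made up, not part of any TTS package, and it simply throws away all markup (pauses, emphasis, and so on) instead of translating it.

```python
import unicodedata
import xml.etree.ElementTree as ET


def ssml_to_ascii(ssml: str) -> str:
    """Hypothetical helper: strip SSML markup and return plain ASCII text.

    Assumes the input is well-formed XML. All SSML semantics (breaks,
    emphasis, say-as, ...) are discarded, not converted.
    """
    root = ET.fromstring(ssml)
    # itertext() yields the text inside every element, in document order.
    # Note: adjacent elements with no whitespace between them get merged.
    text = "".join(root.itertext())
    # Decompose accented characters (e.g. "é" becomes "e" plus an accent),
    # then drop everything that is not ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Collapse runs of whitespace left behind by removed tags.
    return " ".join(text.split())


if __name__ == "__main__":
    sample = '<speak>Hello <emphasis>world</emphasis>, <break time="500ms"/> again.</speak>'
    print(ssml_to_ascii(sample))
```

Something smarter would presumably map things like `<break>` to punctuation, which is why I am asking whether a proper converter already exists.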
If that is not currently possible, that's fine; I would just like to know whether there is a way to get notified when it reaches that point.