How to use the TTS models

Black-Kitsune-Gold-Tail · July 10, 2019, 1:05am

I fully understand that the model is incomplete. However, I want to try using one of the pre-trained models for generating audio. The issue is that, as an individual who has never used a model like this (although I played around with other TTS systems while I was still on Windows), I have absolutely no idea how to actually use the darn thing. I was wondering if there is an easy way to just use the pre-trained model.

Additionally, since (if I understand right) the models take ASCII input, does anyone have a good down-converter to go from SSML to ASCII? Or is there a Python API to generate speech directly?

If it’s not possible to do it currently, that’s fine; I would just like to know if there is a way to get notified when it reaches that point.
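On the SSML question, a minimal sketch of a down-converter using only the Python standard library might look like the following. It assumes the input is well-formed SSML and that the model only wants plain ASCII text, as guessed above; the function name `ssml_to_ascii` is made up for illustration and is not part of Mozilla TTS. The conversion is lossy: pauses, emphasis and prosody hints are simply discarded.

```python
import re
import unicodedata
import xml.etree.ElementTree as ET


def ssml_to_ascii(ssml: str) -> str:
    """Strip SSML markup and return plain ASCII text (hypothetical helper)."""
    # Parse the SSML as XML; this raises ParseError on malformed input.
    root = ET.fromstring(ssml)
    # Concatenate all text nodes, ignoring the tags and their attributes.
    text = "".join(root.itertext())
    # Decompose accented characters, then drop anything outside ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Collapse the whitespace left behind by removed tags.
    return re.sub(r"\s+", " ", text).strip()


example = (
    '<speak>Hello <break time="500ms"/> '
    '<emphasis level="strong">café</emphasis> world.</speak>'
)
print(ssml_to_ascii(example))
```

Tags that substitute spoken text, such as `<sub alias="...">` or `<say-as>`, are not interpreted here; only the literal text content survives.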

nmstoker · July 17, 2019, 1:05pm

Hello @Black-Kitsune-Gold-Tail, welcome to the forum.

With the README being smartened up a bit recently it might not be so immediately obvious how to do what you’re asking, but luckily the info is there, so here are a few pointers:

You need to download a trained model: there are links to those here (this used to be on the README): Released Models · mozilla/TTS Wiki · GitHub
Depending on which of Tacotron or Tacotron2 you want to try, I’d go with one of the last two in the table
Then refer to the methods for testing a model here:
Training and Testing · mozilla/TTS Wiki · GitHub

If you want some continuous use, I’d suggest trying the Demo server.

With these pointers you’ll still need to do a bit of digging around. If you’re not happy setting up Python environments or looking through code and GitHub issues, you might struggle, but it should be fairly straightforward if you’re not a complete beginner. Best of luck!

paul4 · July 18, 2019, 9:08pm

I’m trying to get one of the pre-trained models going, in the same spirit as the OP, without much luck. I’ve tried a few different configurations to no avail. I set up a project that should be a good starting point for a reproducible nvidia-docker build that does not depend on a local configuration, and was hoping to get some feedback on getting it to work without producing runtime errors: https://github.com/iGoog/TTSbuild

zephyr · October 29, 2019, 6:42pm

I found the Colab notebook in the corresponding GitHub issue here to be helpful.
