Any plans for SSML, prosody control; GST

vcjobacc · September 24, 2019, 10:44am

Good afternoon!

Are there any plans to support SSML? It would be great to be able to adjust pauses, emphasize specific parts of a sentence, and so on.
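
To make the request concrete, here is a minimal sketch of the kind of markup I have in mind. The element names (`break`, `emphasis`, `prosody`) and their attributes come from the W3C SSML specification; the helper `build_ssml` is hypothetical and only builds the string. Nothing here is an existing mozilla/TTS API.

```python
import xml.etree.ElementTree as ET


def build_ssml() -> str:
    """Build a small SSML document with a pause, emphasis, and a prosody change.

    Hypothetical helper for illustration only; it is not part of mozilla/TTS.
    """
    speak = ET.Element("speak")
    speak.text = "Good afternoon. "

    # Insert an explicit half-second pause.
    pause = ET.SubElement(speak, "break", {"time": "500ms"})
    pause.tail = " This part is "

    # Stress a single word.
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = "really"
    emphasis.tail = " important, "

    # Slow down and raise the pitch of the closing phrase.
    prosody = ET.SubElement(speak, "prosody", {"rate": "slow", "pitch": "+2st"})
    prosody.text = "and this part is spoken slowly."

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml())
```

Ideally the synthesizer would accept a string like this in place of plain text.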

The GST model is supposed to encode the specific prosody style of the target audio, right? You have added the TacotronGST model, but how is it used? How do I provide the target audio, and what are the requirements for it (e.g., must the speaker be the same, must the target audio be roughly the same length as the synthesized one, and so on)?
BTW, I still can't get either the GST or the Tacotron2 model to learn (GitHub issue: https://github.com/mozilla/TTS/issues/287).
