Best model + vocoder combination for realistic speech generation

JohnnyWobble · March 11, 2021, 12:09am

Hi, I’m looking to implement speech generation in a project. The goal is to generate speech that sounds a lot like a regular human. Does anyone know the best model and vocoder combination to achieve this?

dkreutz · March 11, 2021, 1:05pm

For the German “Thorsten” dataset, we are seeing good results with a TTS model that has DDC (Double Decoder Consistency) enabled, combined with a WaveGrad vocoder model. This still depends heavily on the quality of your dataset and your intended use case, though.