First of all, thank you very much for contributing the code.
I used this project to train a very good Mandarin model.
But there is a problem: whenever I synthesize audio from a short text, the audio ends with a long silence.
When I am not using WaveRNN, I can set do_trim_silence=True to enable VAD (in utils/synthesis.py, line 81).
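To illustrate the kind of trimming I mean, here is a minimal sketch. It is not the project's actual code: the function name `trim_trailing_silence`, the frame length, and the amplitude threshold are all my own assumptions. It cuts trailing low-amplitude frames from a waveform given as a sequence of float samples:

```python
def trim_trailing_silence(wav, threshold=0.01, frame_len=256, keep_frames=2):
    """Return wav with trailing low-amplitude frames removed.

    wav: sequence of float samples, roughly in [-1, 1].
    threshold: peak amplitude below which a frame counts as silent (assumed value).
    frame_len: number of samples per frame (assumed value).
    keep_frames: silent frames to keep after the last non-silent frame.
    """
    n = len(wav)
    end = 0  # index just past the last non-silent frame
    for start in range(0, n, frame_len):
        frame = wav[start:start + frame_len]
        if max(abs(s) for s in frame) >= threshold:
            end = min(start + frame_len, n)
    if end == 0:
        return wav[:0]  # nothing above the threshold
    # Keep a short tail so the last sound is not cut off abruptly.
    end = min(end + keep_frames * frame_len, n)
    return wav[:end]
```

Something like this could be applied to the vocoder output after synthesis, but it only removes the silence afterwards; it does not avoid the time spent synthesizing it.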
But when I use WaveRNN as the vocoder, this method doesn’t work.
So is there any way to stop the extra synthesis?