BorisHudson
(drfinkus@gmail.com)
June 9, 2020, 9:42pm
1
While looking into CorentinJ’s SV2TTS implementation, I came across a comment where he mentions that SV2TTS is actually implemented in Mozilla TTS.
Specifically, he mentions that @erogol used parts of his code for the implementation in Mozilla TTS:
"Last I checked, erogol had a lot of features from different papers implemented, including sv2tts. In fact he’s even copied some code from my repo."
However, I cannot find any reference to an SV2TTS implementation in the Mozilla TTS repo.
Does anyone know more about whether SV2TTS is currently supported?
erogol
(Egolge)
June 9, 2020, 9:54pm
2
Yes, we have a speaker encoder you can use for that, but I DID NOT copy his code.
BorisHudson
(drfinkus@gmail.com)
June 9, 2020, 10:35pm
3
"Yes we have speaker encoder"
Thanks for your fast reply!
It looks like the speaker encoder is an implementation of the paper "Generalized End-to-End Loss for Speaker Verification" (arXiv:1710.10467).
However, I was referring to the SV2TTS paper instead. Is that something you support?
erogol
(Egolge)
June 9, 2020, 10:48pm
4
We don’t have a direct implementation yet, but @edresson is actively working on it. You can check his fork:
Deep learning for Text to Speech
BorisHudson
(drfinkus@gmail.com)
June 9, 2020, 11:12pm
5
Great, thanks! I see that the last commit on that repo was a year ago, so if you want some help to get restarted, just let me know @edresson1
This is the active branch.
Edresson has a notebook that extracts speaker embeddings using CorentinJ’s encoder so they can be used for training with Edresson’s repo. You can check it out; it works nicely.
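For anyone following along, the workflow described above (precompute speaker embeddings with an external encoder, then use them as conditioning during TTS training) can be sketched roughly like this. This is only an illustration, not the code from Edresson’s notebook or repo: `speaker_embedding`, `build_speaker_table`, the toy 3-dimensional vectors, and the average-then-normalize step are all assumptions made for the example. A real speaker encoder would produce the per-utterance vectors from audio.

```python
import math
from typing import Dict, List

Vector = List[float]


def l2_normalize(vec: Vector) -> Vector:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def speaker_embedding(utterance_embeddings: List[Vector]) -> Vector:
    """Average per-utterance embeddings into one unit-length speaker vector.

    Hypothetical helper: averaging then normalizing is one common choice,
    assumed here for illustration only.
    """
    if not utterance_embeddings:
        raise ValueError("need at least one utterance embedding")
    dim = len(utterance_embeddings[0])
    if any(len(e) != dim for e in utterance_embeddings):
        raise ValueError("all embeddings must have the same dimension")
    count = len(utterance_embeddings)
    mean = [sum(e[i] for e in utterance_embeddings) / count for i in range(dim)]
    return l2_normalize(mean)


def build_speaker_table(per_speaker: Dict[str, List[Vector]]) -> Dict[str, Vector]:
    """Map each speaker ID to its averaged embedding for later lookup in training."""
    return {spk: speaker_embedding(embs) for spk, embs in per_speaker.items()}


# Toy stand-ins for encoder outputs (real embeddings are much larger).
toy_embeddings = {
    "speaker_a": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "speaker_b": [[0.0, 0.0, 2.0]],
}
table = build_speaker_table(toy_embeddings)
```

During training, the TTS model would then look up `table[speaker_id]` for each sample and use that vector as conditioning. How the vector is fed into the model depends on the repo you use.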