SV2TTS support

BorisHudson · June 9, 2020, 9:42pm

While looking into CorentinJ’s SV2TTS implementation, I came across a comment where he mentions SV2TTS is actually implemented in Mozilla TTS.

Specifically he mentions that @erogol used parts of his code for implementation in Mozilla TTS:

Last I checked, erogol had a lot of features from different papers implemented, including sv2tts. In fact he’s even copied some code from my repo.

However, I cannot find any reference to an SV2TTS implementation in the Mozilla TTS repo.

Does anyone know more about whether SV2TTS is currently supported?

erogol · June 9, 2020, 9:54pm

Yes, we have a speaker encoder you can use for that, but I DID NOT copy his code.

BorisHudson · June 9, 2020, 10:35pm

Yes, we have a speaker encoder

Thanks for your fast reply!

It looks like the speaker encoder is an implementation of this paper: [1710.10467] Generalized End-to-End Loss for Speaker Verification

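However, I was referring to the SV2TTS paper instead: [1806.04558] Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis. Is that something you support?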

erogol · June 9, 2020, 10:48pm

We don’t have a direct implementation yet, but @edresson is actively working on it. You can check his fork.

BorisHudson · June 9, 2020, 11:12pm

Great, thanks! I see that the last commit on that repo was a year ago, so if you want some help to get restarted, just let me know @edresson1

sanjaesc · June 10, 2020, 7:31am

This is the active branch.

georroussos · June 10, 2020, 9:09am

Edresson has a notebook that can be used to extract embeddings with CorentinJ’s encoder and then use them for training (with Edresson’s repo). You can check it out; it works nicely.
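
For anyone following along, the workflow described above (embed each utterance with a speaker encoder, then keep one embedding per speaker to condition training on) can be sketched roughly as below. This is only a minimal sketch under stated assumptions, not the code from the notebook: `fake_embed_utterance` is a hypothetical stand-in for a real encoder call, the file paths are made up, and the JSON layout is an illustrative choice rather than the format any particular repo expects.

```python
import json
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def l2_normalize(vec: Sequence[float]) -> Vector:
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def speaker_embedding(utterance_embeddings: Sequence[Sequence[float]]) -> Vector:
    """Average the L2-normalized utterance embeddings, then renormalize."""
    if not utterance_embeddings:
        raise ValueError("need at least one utterance embedding")
    normalized = [l2_normalize(e) for e in utterance_embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def build_speaker_table(
    wavs_by_speaker: Dict[str, List[str]],
    embed_utterance: Callable[[str], Vector],
) -> Dict[str, Vector]:
    """Map each speaker name to a single embedding built from its utterances."""
    return {
        speaker: speaker_embedding([embed_utterance(path) for path in paths])
        for speaker, paths in wavs_by_speaker.items()
    }


def fake_embed_utterance(path: str) -> Vector:
    """HYPOTHETICAL stand-in for a real speaker encoder.

    A real encoder would load the audio at `path` and return a fixed-size
    vector; here we derive a deterministic 4-dimensional vector from the
    path string so the sketch runs without any model or audio files.
    """
    seed = sum(ord(c) for c in path)
    return [float((seed * (i + 1)) % 7 + 1) for i in range(4)]


if __name__ == "__main__":
    # Made-up paths, for illustration only.
    table = build_speaker_table(
        {"spk_a": ["a/1.wav", "a/2.wav"], "spk_b": ["b/1.wav"]},
        fake_embed_utterance,
    )
    print(json.dumps(table, indent=2))
```

To try this with a real model, you would replace `fake_embed_utterance` with a function that runs your encoder on the audio file and returns its output as a list of floats; the averaging step stays the same.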