Train Multispeaker Dataset + WaveRNN

Hello everyone, I created a custom German dataset by extracting audio files from a game called Gothic. I’ve successfully trained a model on one speaker using the repo from fatchord. Here you can hear samples from the main hero of the game.

Any tips on how to go about training a multi-speaker model? Would I have to split each speaker into a separate folder? Is there a good entry point for reading up on this topic?
I’m also interested in training a vocoder. Would it be possible to train a universal vocoder specifically for this dataset?

Thanks for the repository and all the work you put into it.
Best regards.


Sounds great. Is the game State of Mind? (Just curious)

TTS already supports multi-speaker training. The best way is to format your dataset like LibriTTS and use the formatter that is already there.
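For reference, LibriTTS lays audio out as `<subset>/<speaker>/<chapter>/` folders, with a per-utterance `.normalized.txt` transcript and a per-chapter `.trans.tsv` table. Here is a minimal sketch of arranging a custom dataset that way; the `metadata` list, paths, and speaker names are hypothetical stand-ins for whatever your extraction step produces, so double-check the exact file names and columns against the libri_tts formatter in the repo:

```python
import shutil
from pathlib import Path

# Hypothetical input produced by your own extraction step:
# (speaker_id, utterance_id, path_to_wav, transcript) tuples.
metadata = [
    ("hero", "000001", "raw/hero_000001.wav", "Guten Tag."),
    ("diego", "000001", "raw/diego_000001.wav", "Willkommen im Lager."),
]

root = Path("gothic-tts/train-clean")  # mirrors a LibriTTS subset folder
chapter = "000000"  # one dummy "chapter" per speaker is enough here

for speaker, utt, wav, text in metadata:
    # LibriTTS layout: <subset>/<speaker>/<chapter>/<speaker>_<chapter>_<utt>.wav
    out_dir = root / speaker / chapter
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = f"{speaker}_{chapter}_{utt}"
    shutil.copy(wav, out_dir / f"{stem}.wav")
    # Per-utterance transcript, as in LibriTTS.
    (out_dir / f"{stem}.normalized.txt").write_text(text, encoding="utf-8")
    # Per-chapter table: <utt_id>\t<original text>\t<normalized text>.
    with open(out_dir / f"{speaker}_{chapter}.trans.tsv", "a", encoding="utf-8") as f:
        f.write(f"{stem}\t{text}\t{text}\n")
```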

Regarding the vocoder: we already trained and released a universal vocoder in the WaveRNN repo, https://github.com/erogol/WaveRNN. It works nicely across different speakers and languages.
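If you want to try it locally, here is a minimal inference sketch assuming fatchord's WaveRNN interface, which that fork builds on. The constructor values below are fatchord's LJSpeech defaults used purely as placeholders, and the module path, checkpoint name, and mel file are assumptions; the real hparams ship alongside the released universal checkpoint, so check the repo's README and notebooks first:

```python
import numpy as np
import torch
from models.fatchord_version import WaveRNN  # module path as in fatchord's repo

# Placeholder hparams (fatchord's LJSpeech defaults) -- replace them with
# the values published with the universal checkpoint.
model = WaveRNN(rnn_dims=512, fc_dims=512, bits=9, pad=2,
                upsample_factors=(5, 5, 11), feat_dims=80,
                compute_dims=128, res_out_dims=128, res_blocks=10,
                hop_length=275, sample_rate=22050, mode='MOL')

# Hypothetical checkpoint name; some versions save the raw state_dict,
# others wrap it in a dict under a "model" key.
state = torch.load("universal_vocoder.pyt", map_location="cpu")
model.load_state_dict(state.get("model", state))
model.eval()

# Hypothetical mel spectrogram file, shaped (n_mels, T) before batching.
mel = torch.from_numpy(np.load("sample_mel.npy")).unsqueeze(0)

# generate() vocodes the mel into a waveform and writes it to disk;
# batched generation trades a little quality for a large speedup.
with torch.no_grad():
    model.generate(mel, "sample_out.wav", batched=True,
                   target=11000, overlap=550, mu_law=False)
```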