TTS + MelNet: can we try it?

MelNet Paper
MelNet Pytorch Code
MelNet Voice example

Awesome. The voice sounds more realistic.

Here are my questions:

  1. How do we build a MelNet model like this WaveNet MoL model?
  2. If I replace WaveRNN with the MelNet code in TTS, which part should I start with, and what needs to be done?

I am a beginner at all this, so I am asking based on my current understanding.
I am researching it continuously and will ask more questions about MelNet.

Train your Tacotron2 until you get decent output. Then use erogol’s WaveRNN fork:

First, generate mel spectrograms with the notebook (ExtractTTSpectrogram.ipynb).

Then add the data paths to the config.

Finally, train WaveRNN.
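
For the config step, here is a minimal sketch in Python, assuming the vocoder config is a plain JSON file. The key names `data_path` and `mel_path`, the `batch_size` entry, and the directory names are hypothetical placeholders, not necessarily the fields the fork actually uses; check the config.json shipped with the fork for the real ones.

```python
import json
import os
import tempfile


def set_data_paths(config_path, wav_dir, mel_dir):
    """Load a JSON config, point it at the data folders, and write it back.

    NOTE: "data_path" and "mel_path" are hypothetical key names used for
    illustration; use whatever keys your vocoder's config.json defines.
    """
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)

    config["data_path"] = wav_dir   # folder with the training wav files
    config["mel_path"] = mel_dir    # folder with the extracted mel spectrograms

    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=4)
    return config


# Usage on a throwaway config so the sketch runs without a real checkout.
# The "batch_size" entry is also just a placeholder.
demo_dir = tempfile.mkdtemp()
demo_config = os.path.join(demo_dir, "config.json")
with open(demo_config, "w", encoding="utf-8") as f:
    json.dump({"batch_size": 32}, f)

updated = set_data_paths(demo_config, "data/wavs", "data/mels")
```

Existing entries in the file are left untouched; only the two path entries are added or overwritten.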


I think we can try the idea of predicting spectrograms incrementally at different scales, regardless of MelNet.

Also, please correct me if I'm wrong, but MelNet uses a vocoder like Griffin-Lim (GL). Therefore, I don’t see what it gains as an alternative to WaveRNN.
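
To make the "different scales" idea concrete, here is a rough sketch in plain Python. It is only loosely modeled on a coarse-to-fine ordering and is not the exact procedure from the MelNet paper: a spectrogram (a list of time frames, each a list of frequency bins) is repeatedly split into even and odd entries, alternating between the time and frequency axes. All function names are made up for illustration, and the "prediction" of each finer tier is replaced by the ground truth, so the code shows only the ordering, not a model.

```python
def weave(a, b):
    """Interleave two lists: a[0], b[0], a[1], b[1], ... (a may be one longer)."""
    out = []
    for i, item in enumerate(a):
        out.append(item)
        if i < len(b):
            out.append(b[i])
    return out


def split(spec, axis):
    """Split into (even, odd) entries; axis 0 = time frames, axis 1 = frequency bins."""
    if axis == 0:
        return spec[0::2], spec[1::2]
    return [row[0::2] for row in spec], [row[1::2] for row in spec]


def interleave(even, odd, axis):
    """Inverse of split along the same axis."""
    if axis == 0:
        return weave(even, odd)
    return [weave(e, o) for e, o in zip(even, odd)]


def make_tiers(spec, num_splits):
    """Return the coarsest tier plus the detail tiers, ordered coarse to fine."""
    details = []
    current = spec
    for step in range(num_splits):
        axis = step % 2  # alternate between time and frequency
        current, detail = split(current, axis)
        details.append((axis, detail))
    return current, details[::-1]


def rebuild(coarsest, details):
    """Grow the spectrogram back, one tier at a time.

    A generative model would predict each detail tier conditioned on
    `current`; here the true detail is used instead.
    """
    current = coarsest
    for axis, detail in details:
        current = interleave(current, detail, axis)
    return current


# Toy 4x4 "spectrogram": value = 10 * frame index + bin index.
spec = [[10 * t + f for f in range(4)] for t in range(4)]
coarsest, details = make_tiers(spec, num_splits=2)
restored = rebuild(coarsest, details)
```

In this toy run the coarsest tier holds every second frame and every second bin, and `rebuild` recovers the full grid by adding the frequency detail first and the time detail second.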


Thanks @alchemi5t. So we need to train MelNet the same way, using ExtractTTSpectrogram.ipynb.

"Add the data paths to config."
That is fine, but are the config.json parameters the same? Or what do we need to change for the MelNet architecture?

Sorry, I will start working on it. If I have any doubts, I will ask here.

Thanks @alchemi5t @erogol

I don’t think it’s that straightforward.

For training WaveRNN, you can find discussions on the config file here:

Like erogol said, I don’t think it’s an alternative vocoder either.
Also, config files and neural network implementations are not implicitly connected. You could possibly change everything about it depending on how you implement the pipeline.

Regardless, if you’re looking for a vocoder, erogol’s fork of WaveRNN should work well with Mozilla TTS’s Tacotron2.

Good luck!

P.S. I might be wrong about MelNet. Reading the paper atm.


Thank you so much @alchemi5t :)

Then I will research how the pipeline differs between WaveRNN and MelNet. As you said, "You could possibly change everything about it depending on how you implement the pipeline."

Thanks :)