Loss spikes and does not go down on LJ Speech


I am trying to train Tacotron from scratch on LJ Speech, using the config from the current master.

Some time after the scheduled reduction of r from 7 to 5 at 50k steps, all of the losses spike and then stay flat.
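For readers unfamiliar with this kind of schedule, it can be pictured as a step-indexed lookup. The following is only a minimal sketch under stated assumptions: the `[start_step, r, batch_size]` layout, the batch sizes, and the helper name `r_at_step` are hypothetical and not taken from the repository. Only the switch from r=7 to r=5 at 50k steps comes from this thread.

```python
from bisect import bisect_right

# Hypothetical schedule entries: [start_step, r, batch_size].
# Only the 7 -> 5 switch at 50k steps is from the thread;
# the batch sizes are placeholders.
SCHEDULE = [
    [0, 7, 32],
    [50_000, 5, 32],
]


def r_at_step(step, schedule=SCHEDULE):
    """Return (r, batch_size) from the last entry whose start_step <= step."""
    starts = [entry[0] for entry in schedule]
    idx = bisect_right(starts, step) - 1
    if idx < 0:
        raise ValueError("step precedes the first schedule entry")
    _, r, batch_size = schedule[idx]
    return r, batch_size
```

Comparing the step at which the spike begins with the step at which the lookup changes its result is one way to check whether the two actually coincide.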

I’m not a machine learning expert, and would appreciate any help.

Trying to bump the topic, as it did not get any attention.

@erogol was there an attempt to build a model from scratch with new additions since the last model was released?

Also, what set of hyperparameters should I use if I want to disable the newly added behaviors, such as the spectral and SSIM losses, and keep only mixed precision training? I see that the GA and stopnet loss weights were adjusted as well, and simply setting the former to 0 does not lead to model convergence.
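To make the question concrete, the kind of change meant here could be sketched as below. This is an illustration only: every key name and value in `example_config` is a placeholder, and `override_config` is a hypothetical helper. The real key names have to be read from the config on master.

```python
import json


def override_config(config, overrides):
    """Return a copy of config with overrides applied; reject unknown keys."""
    unknown = [key for key in overrides if key not in config]
    if unknown:
        raise KeyError(f"keys not present in config: {unknown}")
    updated = dict(config)
    updated.update(overrides)
    return updated


# Placeholder config: these key names and values are assumptions,
# not the actual fields of the repository's config.
example_config = {
    "mixed_precision": False,
    "spec_loss_weight": 0.25,
    "ssim_loss_weight": 0.25,
}

# Turn the extra losses off and keep only mixed precision on.
new_config = override_config(
    example_config,
    {"mixed_precision": True, "spec_loss_weight": 0.0, "ssim_loss_weight": 0.0},
)
print(json.dumps(new_config, indent=2))
```

Rejecting unknown keys is deliberate: a misspelled weight name would otherwise be added silently and the original loss would stay enabled.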

I am not sure what good values would be for your specific problem, but I suggest you start from https://github.com/erogol/TTS_recipes and edit one of the recipes for your dataset. It’d give you a good start.

The graphs that I’ve provided come from training on vanilla LJSpeech, which is the dataset I’m using right now.

My goal, though, is to fine-tune an LJSpeech model on my custom dataset. Mixed precision would help me reach a batch size of 64 on my GTX 1070, but it was added only recently. The released model was trained on a much older revision, so I decided to re-train it on LJSpeech from scratch on the current master.

I was wondering whether there have been any newer experiments with LJSpeech that yielded good results, and whether the current config reflects them.

The best config is always the one that comes with the latest released model. But depending on your technical level, I’d still suggest starting with a recipe.