Loss spikes and does not go down on LJ Speech

Daniil_Balenko · December 8, 2020, 8:51am

Hi,

I am trying to train Tacotron from scratch on LJ Speech, using the config from current master.

Some time after the scheduled reduction of r from 7 to 5 at 50k steps, losses across the board spike and then stay flat.

I’m not a machine learning expert, and would appreciate any help.
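
For readers lining up a loss curve against the schedule: r is the reduction factor, i.e. the number of spectrogram frames the decoder predicts per step, and a gradual schedule lowers it at fixed step boundaries. Below is a minimal sketch of such a lookup. The function name `scheduled_r` and the `(start_step, r)` pair format are assumptions made for illustration, not the repository's actual scheduler, and the schedule only mirrors the 7 to 5 switch at 50k steps described above.

```python
def scheduled_r(schedule, step):
    """Return the reduction factor r in effect at a given training step.

    `schedule` is a list of (start_step, r) pairs sorted by start_step.
    Hypothetical helper for illustration only.
    """
    current = schedule[0][1]
    for start_step, r in schedule:
        if step >= start_step:
            current = r
    return current


# Illustrative schedule matching the post: r = 7 from step 0, r = 5 from step 50k.
SCHEDULE = [(0, 7), (50_000, 5)]

for step in (0, 49_999, 50_000, 60_000):
    print(step, scheduled_r(SCHEDULE, step))
```

Logging this value next to the losses makes it easy to check whether a spike coincides with a schedule boundary or starts some time after it.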

Daniil_Balenko · December 22, 2020, 9:15pm

Trying to bump the topic, as it did not get any attention.

@erogol, has anyone tried to train a model from scratch with the new additions since the last model was released?

Also, which hyperparameters should I use if I want to disable the newly added behaviors, such as the spectral and SSIM losses, and use only mixed precision training? I see that the GA and stopnet loss weights were adjusted as well, and simply setting the GA weight to 0 does not lead to model convergence.
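
To make the question concrete, here is a rough sketch of the kind of override meant here: zero the weights of the newer loss terms, turn mixed precision on, and leave everything else alone. The key names (`decoder_ssim_alpha`, `postnet_ssim_alpha`, `decoder_diff_spec_alpha`, `postnet_diff_spec_alpha`, `ga_alpha`, `mixed_precision`) are assumptions about how config.json names these settings and should be checked against the actual file. The values in `base` are placeholders, not recommended settings, and `with_overrides` is a hypothetical helper.

```python
import copy


def with_overrides(config, overrides):
    """Return a copy of `config` with `overrides` applied.

    Raises KeyError for keys that are absent from `config`, so a misspelled
    or renamed setting is caught instead of being silently added.
    """
    unknown = [key for key in overrides if key not in config]
    if unknown:
        raise KeyError(f"keys not present in config: {unknown}")
    new_config = copy.deepcopy(config)
    new_config.update(overrides)
    return new_config


# Stand-in for a loaded config.json. Key names are assumed, values are placeholders.
base = {
    "mixed_precision": False,
    "decoder_ssim_alpha": 0.5,
    "postnet_ssim_alpha": 0.25,
    "decoder_diff_spec_alpha": 0.25,
    "postnet_diff_spec_alpha": 0.25,
    "ga_alpha": 5.0,
}

# Disable only the SSIM and spectral-difference terms; the GA weight is untouched.
tuned = with_overrides(base, {
    "mixed_precision": True,
    "decoder_ssim_alpha": 0.0,
    "postnet_ssim_alpha": 0.0,
    "decoder_diff_spec_alpha": 0.0,
    "postnet_diff_spec_alpha": 0.0,
})
print(tuned)
```

Whether a run with these terms zeroed converges is exactly the open question; the sketch only shows the shape of the change.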

erogol · December 22, 2020, 11:41pm

I am not sure what good values are for your specific problem, but I suggest you start from https://github.com/erogol/TTS_recipes and edit one of the recipes for your dataset. It’d give you a good start.

Daniil_Balenko · December 23, 2020, 12:04am

The graphs that I’ve provided come from training on vanilla LJSpeech, which is the dataset I’m using right now.

My goal, though, is to fine-tune an LJSpeech model on my custom dataset. Mixed precision would let me fit a batch size of 64 on my GTX 1070, but it was added only recently. The released model was trained on a much older revision, so I decided to retrain it on LJSpeech from scratch on the current master.

I was wondering whether there were any newer experiments with LJSpeech that yielded good results, and whether the current config reflects them.

erogol · December 23, 2020, 10:54am

The best config is always the one shipped with the latest released model. But depending on your technical level, I’d still suggest starting with the recipe.