Assigning weights to certain words while training DeepSpeech Model

Hm, you should be able to get a bit more out of it, especially if the entities occur more often.

Try a higher batch size if your GPU can handle it: 4, 8 or higher

Use a learning rate of 0.0001

Definitely use a higher dropout of 0.25-0.4
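For reference, those three suggestions map onto DeepSpeech.py training flags roughly like this (a sketch, not a full recipe — the CSV paths are placeholders for your own data):

```
python3 DeepSpeech.py \
  --train_files train.csv --dev_files dev.csv --test_files test.csv \
  --train_batch_size 8 \
  --learning_rate 0.0001 \
  --dropout_rate 0.3 \
  --checkpoint_dir checkpoints/
```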

Hi, a higher batch size results in OOM errors for me; the maximum I could go was, unfortunately, 1. :confused: Just curious, how would a higher batch size lead to better results?

For the learning rate, I was using 0.00005, but it led to overfitting and hence early stopping at epoch 25, so I was thinking I should lower the learning rate further?

Yep, will try a dropout rate of 0.3.

A bigger batch size doesn’t make the results better, only training much faster; you are right there.

A lower learning rate just means training takes longer; 15 epochs could easily be enough for your material.

You need something in that range for dropout.

A bigger batch size can lead to better training results by more closely approximating the non-stochastic loss over the entire dataset. The smaller the batch size, the smaller you’ll have to make your learning rate to avoid weird samples/batches throwing the entire model off too badly.
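To make that concrete, here’s a tiny NumPy sketch (illustrative numbers only, nothing DeepSpeech-specific) showing how the spread of a mini-batch gradient estimate shrinks as the batch grows:

```python
import numpy as np

rng = np.random.default_rng(0)
true_grad = 1.0   # stand-in for the gradient of the full-dataset loss
noise_std = 5.0   # per-sample noise around that true gradient

for batch_size in (1, 4, 16, 64):
    # Each mini-batch gradient is the mean of `batch_size` noisy per-sample gradients;
    # simulate 10,000 such batches and measure how much the estimates scatter.
    batch_grads = true_grad + noise_std * rng.standard_normal((10000, batch_size))
    estimates = batch_grads.mean(axis=1)
    print(batch_size, estimates.std())  # spread shrinks roughly like 1/sqrt(batch_size)
```

With batch size 1, every update is driven by a single, possibly weird sample, which is why the learning rate has to be smaller to compensate.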

Thank you for the explanation and advice.

@othiele would you recommend retraining the model from scratch, or continuing from the 25 epochs I left off at with the new parameters you’ve suggested?

Also, any idea why increasing the number of sentences used to build the LM led to a poorer entity rate?

I would restart from scratch and keep the logs with the losses etc.; this will help you compare it to future runs, even though it will take a while with batch size 1.
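If it helps, simply teeing the console output into one file per run is enough to keep those logs around (file names here are just examples):

```
python3 DeepSpeech.py --train_files train.csv --dev_files dev.csv \
  --test_files test.csv 2>&1 | tee run1_batch1.log
```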

Scorers are a game of probabilities between words. A scorer tries to find the most likely word combination for the letters coming from the neural net. So check the raw output without the scorer for a defined test set and compare it to the scorer output; usually you’ll find some sort of pattern. Maybe it’s something special within Maly?
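One way to run that comparison with the Python client (a sketch — the model, scorer and WAV file names are placeholders, and it assumes 16-bit mono audio at the model’s sample rate):

```python
import wave

import numpy as np
from deepspeech import Model

# Placeholder file names -- substitute your own acoustic model, scorer and audio.
model = Model("output_graph.pbmm")

with wave.open("sample.wav", "rb") as w:
    audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

# Raw output: beam search over the acoustic model alone, no language model
print("raw:   ", model.stt(audio))

# Scorer output: the same audio decoded with the external scorer enabled
model.enableExternalScorer("kenlm.scorer")
print("scorer:", model.stt(audio))
```

Running this over your test set and diffing the two transcripts should surface the pattern in where the scorer is rewriting your entities.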