Assigning weights to certain words while training DeepSpeech Model

Hm, you should be able to get a bit more out of it, especially if the entities occur more often.

Try a higher batch size if your GPU can handle it: 4, 8 or higher

Use a learning rate of 0.0001

Definitely use a higher dropout of 0.25-0.4
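For reference, those three suggestions map onto DeepSpeech.py training flags roughly like this (a sketch, not a full recipe — the CSV paths are placeholders for your own data):

```
python3 DeepSpeech.py \
  --train_files train.csv --dev_files dev.csv --test_files test.csv \
  --train_batch_size 8 \
  --learning_rate 0.0001 \
  --dropout_rate 0.3 \
  --checkpoint_dir checkpoints/
```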

Hi, a higher batch size results in OOM errors for me; the maximum I could go was, unfortunately, 1. :confused: Just curious, how would a higher batch size lead to better results?

For the learning rate, I was using 0.00005, but it led to overfitting and hence early stopping at epoch 25, so I was thinking I should lower the learning rate further?

Yep, will try a dropout rate of 0.3.

A bigger batch size doesn’t make the results better, only training much faster; you are right there.

A lower learning rate just means training takes longer; 15 epochs could easily be enough for your material.

You need something in that range for dropout.

A bigger batch size can lead to better training results by more closely approximating the non-stochastic loss over the entire dataset. The smaller the batch size, the smaller you’ll have to make your learning rate to avoid weird samples/batches throwing the entire model off too badly.
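To make that concrete, here’s a tiny NumPy sketch (illustrative numbers only, nothing DeepSpeech-specific) showing how the spread of a mini-batch gradient estimate shrinks as the batch grows:

```python
import numpy as np

rng = np.random.default_rng(0)
true_grad = 1.0   # stand-in for the gradient of the full-dataset loss
noise_std = 5.0   # per-sample noise around that true gradient

for batch_size in (1, 4, 16, 64):
    # Each mini-batch gradient is the mean of `batch_size` noisy per-sample gradients;
    # simulate 10,000 such batches and measure how much the estimates scatter.
    batch_grads = true_grad + noise_std * rng.standard_normal((10000, batch_size))
    estimates = batch_grads.mean(axis=1)
    print(batch_size, estimates.std())  # spread shrinks roughly like 1/sqrt(batch_size)
```

With batch size 1, every update is driven by a single, possibly weird sample, which is why the learning rate has to be smaller to compensate.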

Thank you for the explanation and advice.

@othiele would you recommend retraining the model from scratch, or continuing from the 25 epochs I left off at with the new parameters you’ve suggested?

Also, any idea why increasing the number of sentences used to build the LM led to a poorer entity rate?

I would restart from scratch and keep the logs with the losses etc.; this will help you compare it to future runs, even though it will take a while with batch size 1.
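If it helps, simply teeing the console output into one file per run is enough to keep those logs around (file names here are just examples):

```
python3 DeepSpeech.py --train_files train.csv --dev_files dev.csv \
  --test_files test.csv 2>&1 | tee run1_batch1.log
```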

Scorers are a game of probabilities between words. A scorer tries to find the most likely word combination for the letters coming from the neural net. So check the raw output without the scorer for a defined test set and compare it to the scorer output; usually you’ll find some sort of pattern. Maybe it’s something special within Maly?
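One way to run that comparison with the Python client (a sketch — the model, scorer and WAV file names are placeholders, and it assumes 16-bit mono audio at the model’s sample rate):

```python
import wave

import numpy as np
from deepspeech import Model

# Placeholder file names -- substitute your own acoustic model, scorer and audio.
model = Model("output_graph.pbmm")

with wave.open("sample.wav", "rb") as w:
    audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

# Raw output: beam search over the acoustic model alone, no language model
print("raw:   ", model.stt(audio))

# Scorer output: the same audio decoded with the external scorer enabled
model.enableExternalScorer("kenlm.scorer")
print("scorer:", model.stt(audio))
```

Running this over your test set and diffing the two transcripts should surface the pattern in where the scorer is rewriting your entities.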