Language Model Tuning

Hello,

Can anyone recommend sources of information for tuning the language model? Specifically, the order of the language model (the -o flag) is set to 5-grams in Kenneth Heafield's KenLM documentation, but I can't find much guidance on when it's recommended to change it.

If I’m transcribing shorter utterances, should I set this order value to a lower number?

If I'm fine-tuning on a new domain-specific audio set, should I keep the original LM provided by Mozilla or create my own?

I'm not expecting anyone to solve my problem, but aside from lecture slides on chain-rule probabilities and language models, I can't find many resources for this last step. Thanks.

Hello!

About your questions:

"If I'm fine-tuning on a new domain-specific audio set, should I keep the original LM provided by Mozilla or create my own?"

A: If you can, build your own from text where the domain-specific words occur … that gives the best results. Of course, "general" sentences are also needed.

Length of n-grams … I recommend you test them for your case… shorter sentences can do well with shorter n-grams, but in my case, for example, 2-grams were too short to capture context… (My case: speech-to-text for phone conversations.)
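If it helps, here is a minimal sketch of how you might compare orders empirically: build LMs at a few orders and score a held-out, in-domain text by perplexity. It assumes KenLM's lmplz binary is on your PATH and the kenlm Python package is installed; corpus.txt and dev.txt are placeholder file names.

```python
# Hypothetical sketch: compare KenLM orders by dev-set perplexity.
# Assumes lmplz is on PATH and the `kenlm` Python package is installed;
# corpus.txt / dev.txt are placeholders for your own data.
import subprocess
import kenlm

dev_sentences = [line.strip() for line in open("dev.txt") if line.strip()]

for order in (2, 3, 4, 5):
    arpa = f"lm_o{order}.arpa"
    # Build an n-gram LM of the given order from the training text.
    with open("corpus.txt") as src, open(arpa, "w") as dst:
        subprocess.run(["lmplz", "-o", str(order)], stdin=src, stdout=dst, check=True)
    model = kenlm.Model(arpa)
    # Lower mean perplexity on held-out, in-domain text is better.
    ppl = sum(model.perplexity(s) for s in dev_sentences) / len(dev_sentences)
    print(f"order={order}: mean dev perplexity {ppl:.1f}")
```

Perplexity is only a proxy for WER, so it's worth confirming the best order end-to-end on the decoder as well.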

Hope this helps!

Thank you! I’ll run some experiments with that in mind.

@tuttlebr, there is also the option of trying something like this (called interpolation):

Final LM = W1 * Mozilla LM + W2 * Customized Domain-Specific LM
(Note: W1 + W2 = 1)

I am trying to build something along these lines. SRILM supports this pretty nicely. The tricky part is how to tune W1 and W2… So the pipeline is: CTC output, weights, LM, and then fine-tuning the weights. Has anyone worked on the same?
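For what it's worth, here is a rough sketch of one way to tune the weights: linearly interpolate per-token probabilities from the two models and grid-search W1 against dev-set perplexity. It assumes the kenlm Python package; the model and data paths are placeholders. Once you've picked a weight, SRILM's `ngram -lm general.lm -mix-lm domain.lm -lambda W1 -write-lm mixed.lm` can write out the merged model.

```python
# Hypothetical sketch: grid-search the interpolation weight W1 on
# dev-set perplexity. Assumes the `kenlm` Python package; the model
# and data paths below are placeholders.
import math
import kenlm

general = kenlm.Model("mozilla_lm.binary")        # placeholder path
domain = kenlm.Model("domain_lm.binary")          # placeholder path
dev = [line.strip() for line in open("dev.txt") if line.strip()]

def mixed_log10_prob(sentence, w1):
    """Per-token linear interpolation: P = W1*P_general + W2*P_domain."""
    total, n = 0.0, 0
    pairs = zip(general.full_scores(sentence), domain.full_scores(sentence))
    for (lp_g, _, _), (lp_d, _, _) in pairs:
        p = w1 * 10.0 ** lp_g + (1.0 - w1) * 10.0 ** lp_d
        total += math.log10(p)
        n += 1
    return total, n

best = None
for w1 in [i / 10 for i in range(11)]:  # W1 in {0.0, 0.1, ..., 1.0}
    log_prob, tokens = 0.0, 0
    for s in dev:
        lp, n = mixed_log10_prob(s, w1)
        log_prob += lp
        tokens += n
    ppl = 10.0 ** (-log_prob / tokens)  # mixture perplexity on dev
    print(f"W1={w1:.1f}: perplexity {ppl:.1f}")
    if best is None or ppl < best[1]:
        best = (w1, ppl)
print("best W1:", best[0])
```

Note this optimizes perplexity, not the final WER; to fine-tune the weights against the CTC output directly you would have to re-run the decoder for each candidate weight.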

Any updates would be really helpful.

Thank you, @sayantangangs.911! I have gained a slight improvement in WER by using a domain-specific language model where the -o value was the average word count per sample in my training-data utterances.
WER: 19.75% to 19.1%
1:1 Match: 69% to 71.3%
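In case anyone wants to replicate that heuristic, this is roughly how I compute the average word count from a DeepSpeech-style training CSV (assuming a transcript column; train.csv is a placeholder name):

```python
# Minimal sketch: choose the LM order from the average utterance length.
# Assumes a DeepSpeech-style training CSV with a `transcript` column;
# train.csv is a placeholder name.
import csv

with open("train.csv", newline="") as f:
    lengths = [len(row["transcript"].split()) for row in csv.DictReader(f)]

avg_words = sum(lengths) / len(lengths)
print(f"average words per utterance: {avg_words:.1f}")
print(f"suggested -o (rounded): {round(avg_words)}")
```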

I am also beginning to gradually increase the lm_alpha parameter and have seen some improvement on a holdout set of examples that previously had a very high WER.

WER: 19.1% to 18.28%
1:1 Match: 71.3% to 72.1%

Can anyone elaborate on how they have modified lm_alpha in their own models?

Hey @tuttlebr, where are you changing lm_alpha? I mean, isn't it meant to be set only during training? Once the model is ready, how and where should lm_alpha be changed? Are you directly changing the values in the client.py file?

I don't believe the language model is used in training, only inference. I wrote my own script, inspired by the examples on GitHub, which allows you to modify this parameter. You can also modify the beam width and the lm_beta (word insertion) parameter. Alternatively, you can change these through util/flags.py via the --lm_alpha and --lm_beta command-line flags.
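For reference, the core of my script looks roughly like this. It assumes DeepSpeech 0.6.x, where the Python API is Model(model_path, beam_width) plus enableDecoderWithLM(lm_path, trie_path, lm_alpha, lm_beta); the paths, holdout data, and alpha grid are placeholders.

```python
# Rough sketch of an lm_alpha sweep at inference time. Assumes the
# DeepSpeech 0.6.x Python bindings; all paths/data are placeholders.
import wave
import numpy as np
from deepspeech import Model

BEAM_WIDTH = 500
LM_BETA = 1.85  # held fixed here; sweep it the same way if needed

def read_wav(path):
    # DeepSpeech expects 16-bit PCM at the model's sample rate.
    with wave.open(path, "rb") as w:
        return np.frombuffer(w.readframes(w.getnframes()), np.int16)

def wer(ref, hyp):
    """Word error rate via edit distance over word sequences."""
    r, h = ref.split(), hyp.split()
    d = np.zeros((len(r) + 1, len(h) + 1), dtype=int)
    d[:, 0] = np.arange(len(r) + 1)
    d[0, :] = np.arange(len(h) + 1)
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1,
                          d[i - 1, j - 1] + cost)
    return d[len(r), len(h)] / max(len(r), 1)

holdout = [("sample1.wav", "expected transcript one"),
           ("sample2.wav", "expected transcript two")]  # placeholder data

for lm_alpha in (0.5, 0.75, 1.0, 1.25, 1.5):
    ds = Model("output_graph.pbmm", BEAM_WIDTH)
    ds.enableDecoderWithLM("lm.binary", "trie", lm_alpha, LM_BETA)
    scores = [wer(ref, ds.stt(read_wav(path))) for path, ref in holdout]
    print(f"lm_alpha={lm_alpha}: mean WER {sum(scores) / len(scores):.3f}")
```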

I understand. Since I'm currently just testing things out, I'm using DeepSpeech's command line, and to effect a change I'm having to edit exactly this parameter.

Regarding the effect of the LM in training: while it's indeed not used there, I felt the lm_alpha and word-insertion parameters would affect it to some extent. But I understand it's used more in the decoding phase.

An important point, though: could someone share some production-level guidance on LMs?

Thanks a lot @tuttlebr