Can anyone recommend sources of information for tuning the language model? Specifically, there is the language model order (-o), which is set to 5-grams on Kenneth Heafield's (KenLM) website, but I can't find much on when it's recommended to change this.
If I’m transcribing shorter utterances, should I set this order value to a lower number?
If I’m fine tuning to a new domain specific audio set, should I keep the original LM provided by Mozilla or create my own?
I’m not expecting anyone to solve my problem but besides lecture PowerPoints on chain probabilities and language models, I can’t find many resources for this last step. Thanks.
“If I’m fine tuning to a new domain specific audio set, should I keep the original LM provided by Mozilla or create my own?”
A: If you can, build your own LM from text where the domain-specific words occur; that gives the best results. Of course, "general sentences" are also needed.
Length of n-grams: I recommend you test this for your own case. Shorter sentences can do well with shorter n-grams, but in my case (speech-to-text on phone conversations) 2-grams were too short to capture context.
@tuttlebr there is also the option of trying something like this (called interpolation):
Final LM = W1*Mozilla LM+W2*Customized Domain Specific LM
(Note: W1+W2 = 1)
I am working on something along these lines; SRILM supports interpolation pretty nicely. The open question is how to tune W1 and W2. So the pipeline is: CTC output, decoder weights, LM, and then fine-tuning the weights. Has anyone worked on the same?
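To make the idea concrete, here is a toy sketch of linear interpolation and of tuning W1 by minimizing perplexity on held-out domain text. Real tooling (e.g. SRILM) does this over full n-gram models; the unigram distributions and word lists below are made up purely for illustration.

```python
import math

def interpolate(p_general, p_domain, w1):
    """Linearly interpolate two word-probability dicts (W2 = 1 - W1)."""
    words = set(p_general) | set(p_domain)
    return {w: w1 * p_general.get(w, 0.0) + (1 - w1) * p_domain.get(w, 0.0)
            for w in words}

def perplexity(model, heldout, floor=1e-10):
    """Perplexity of a held-out word sequence under a unigram model."""
    log_sum = sum(math.log(max(model.get(w, 0.0), floor)) for w in heldout)
    return math.exp(-log_sum / len(heldout))

def tune_weight(p_general, p_domain, heldout, steps=21):
    """Grid-search W1 in [0, 1] for the lowest held-out perplexity."""
    best_w1, best_ppl = 0.0, float("inf")
    for i in range(steps):
        w1 = i / (steps - 1)
        ppl = perplexity(interpolate(p_general, p_domain, w1), heldout)
        if ppl < best_ppl:
            best_w1, best_ppl = w1, ppl
    return best_w1, best_ppl

# Toy distributions: the held-out text is domain-heavy, so tuning
# should push W1 (the general-LM weight) toward the low end.
p_general = {"the": 0.5, "call": 0.1, "weather": 0.4}
p_domain = {"the": 0.4, "call": 0.5, "invoice": 0.1}
heldout = ["the", "call", "the", "invoice", "call"]
w1, ppl = tune_weight(p_general, p_domain, heldout)
print(w1, ppl)
```

The same held-out-optimization idea carries over to real interpolation tools; only the probability model and the search procedure get more sophisticated.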
Thank you, @sayantangangs.911! I have gained a slight improvement in WER by using a domain-specific language model where the -o value was the average word count per sample in my training-data utterances.
WER 19.75% to 19.1%
1:1 Match 69% to 71.3%
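For anyone wanting to try the same heuristic, here is a small sketch for computing the average word count per transcript. The CSV layout (a "transcript" column) is an assumption based on DeepSpeech-style training CSVs; adjust to your own data format.

```python
import csv

def average_words_per_sample(csv_path, column="transcript"):
    """Mean word count over all transcripts in a training CSV."""
    counts = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts.append(len(row[column].split()))
    return sum(counts) / len(counts)
```

Rounding the result gives a candidate -o value to compare against the default 5-gram order.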
I am beginning to gradually increase the lm_alpha parameter as well and have seen some improvement on a holdout of examples which previously had a very high WER.
WER 19.1% to 18.28%
1:1 Match 71.3% to 72.1%
Can anyone elaborate on how they have tuned lm_alpha in their own models?
Hey @tuttlebr, where are you changing lm_alpha? I mean, isn't it meant to be set during training? Once the model is ready, how and where should lm_alpha be changed? Are you changing the values directly in the client.py file?
I don't believe the language model is used in training, only in inference. I wrote my own script, inspired by the examples on GitHub, which allows you to modify this parameter. You can also modify the beam width and the lm_beta (word insertion) parameter. Alternatively, you can just change these via the --lm_alpha and --lm_beta flags defined in util/flags.py.
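In the spirit of that script, here is a minimal sketch of tuning lm_alpha and lm_beta by grid search. `evaluate_wer` is a hypothetical stand-in: in a real setup it would run the decoder with the given alpha/beta over a held-out set and return the WER; here it is stubbed with a toy surface so the sketch runs.

```python
import itertools

def evaluate_wer(lm_alpha, lm_beta):
    # Hypothetical stub: replace with a real decode-and-score pass.
    # This toy surface has its minimum near alpha=0.75, beta=1.85.
    return (lm_alpha - 0.75) ** 2 + (lm_beta - 1.85) ** 2 + 0.18

def grid_search(alphas, betas):
    """Return the (alpha, beta) pair with the lowest held-out WER."""
    best = min(itertools.product(alphas, betas),
               key=lambda ab: evaluate_wer(*ab))
    return best, evaluate_wer(*best)

alphas = [a / 4 for a in range(0, 9)]  # 0.0 .. 2.0 in steps of 0.25
betas = [b / 4 for b in range(0, 9)]
(best_alpha, best_beta), wer = grid_search(alphas, betas)
print(best_alpha, best_beta, wer)
```

Grid search is slow but transparent; once you have a coarse region, a finer sweep (or random search) around it is usually enough.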
I understand. Since I'm currently just testing things out, I'm using DeepSpeech's command line, and to effect a change I have to modify exactly this parameter.
Regarding the effect of the LM in training: while it's indeed not used there, I felt lm_alpha and the word-insertion parameter would affect it to some extent. But I understand it matters more in the decoding phase.
One important point, though: could someone share some production-level guidance on LMs?