Language Model for DeepSpeech

Hello,
I used a KenLM language model, built from vocabulary.txt (the text of the transcriptions) for the Nepali language, to build a DeepSpeech model, and I built another model without using any language model.
The latter does not seem to work at all.

Does this mean DeepSpeech uses the language model during training as well?
If so, can I use a different language model for inference than the one used for training?

The language model is not used during training.

Then why do the two models, trained with and without a language model, infer differently during testing?

I said it’s not used during training, but it is used for inference (that’s what it’s for), which includes the final test epoch at the end of training. You can re-use the same checkpoint/model with and without the language model, or with different language models. The training phase itself does not depend on the LM.

  1. Trained with the Nepali language model (path to the language model given while training)
    with_lm

  2. Trained without a language model (path not given… It might have used the default lm.binary)
    without_lm

I used a language model for inference in both of the tests above.

The language model is not the cause of the discrepancy; something else is different in your training procedure. Like I said, the LM is not used during training.

Okay. Thanks for the information.

Set alpha and beta to 0 so that any loaded LM won’t affect your results. You can’t have the default LM loaded for Nepali. @spygaurad

Do I need to retrain with the alpha and beta parameters set to 0, @alchemi5t,
or should I set them during testing/inference?

I could also simply not provide a language model at inference if I don’t want my LM to affect the result.

You don’t need to retrain it. Just set them to 0 during inference.
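
To illustrate why setting both weights to 0 removes the LM's influence, here is a minimal sketch. It assumes the decoder ranks candidate transcripts by acoustic log-probability plus alpha times the LM log-probability plus beta times the word count. The candidates and all numbers are made up for illustration, and `combined_score` and `best` are hypothetical helpers, not part of DeepSpeech's API.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    text: str
    acoustic_logp: float  # log-probability from the acoustic model (made-up value)
    lm_logp: float        # log-probability from the language model (made-up value)


def combined_score(c: Candidate, alpha: float, beta: float) -> float:
    """Acoustic score plus weighted LM score plus a per-word bonus."""
    word_count = len(c.text.split())
    return c.acoustic_logp + alpha * c.lm_logp + beta * word_count


def best(candidates: List[Candidate], alpha: float, beta: float) -> str:
    """Return the text of the highest-scoring candidate."""
    return max(candidates, key=lambda c: combined_score(c, alpha, beta)).text


# Two hypothetical transcripts for the same audio, from the same trained model.
candidates = [
    Candidate("recognize speech", acoustic_logp=-4.2, lm_logp=-3.0),
    Candidate("wreck a nice beach", acoustic_logp=-4.0, lm_logp=-9.0),
]

# alpha = beta = 0: only the acoustic score decides the ranking.
print(best(candidates, alpha=0.0, beta=0.0))

# Non-zero weights (illustrative values): the LM can change the ranking.
print(best(candidates, alpha=0.75, beta=1.85))
```

The acoustic scores are identical in both calls, which mirrors the point above: the trained checkpoint stays the same, and only the decoding step changes.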

Thanks for the information, @alchemi5t.