Hi!
I’ve gone through the documentation on creating an external scorer so that we can use DeepSpeech for our projects, and I have a few questions:
- In DeepSpeech 0.6.1, the language model files used were `lm.binary` and the `trie` file. As far as I know, the CTC decoder uses beam search and queries the language model to score candidate next words. How does the new scorer work? I see the `lm_alpha` and `lm_beta` parameters, but no beam size…
- `generate_lm.py` generates the `lm.binary` file and a top-k words file. What exactly is the top-k words file, and how is it used in generating the scorer package and in the transcription process overall?
- The default `lm_alpha` and `lm_beta` values have changed in the newest release. How exactly were these values obtained through tuning?
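For context on the first question, here is my current mental model of how `lm_alpha` and `lm_beta` enter the beam-search score — the function name and the exact combination rule are my assumptions, not the actual ctcdecode implementation, so please correct me if this is wrong:

```python
import math

def combined_score(acoustic_log_prob, lm_log_prob, word_count,
                   lm_alpha, lm_beta):
    # Assumed scoring rule: acoustic log-probability plus an LM term
    # weighted by lm_alpha, plus a word-insertion bonus lm_beta.
    return acoustic_log_prob + lm_alpha * lm_log_prob + lm_beta * word_count

# Toy example: identical acoustic scores, LM prefers the first hypothesis.
hyp_a = combined_score(-4.0, math.log(0.02), 3, lm_alpha=0.93, lm_beta=1.18)
hyp_b = combined_score(-4.0, math.log(0.001), 3, lm_alpha=0.93, lm_beta=1.18)
assert hyp_a > hyp_b  # higher LM probability wins at equal acoustic score
```

If that picture is roughly right, the beam size would be an independent decoder knob (how many hypotheses are kept per step) rather than part of the scorer itself, which is why I was surprised not to see it next to `lm_alpha`/`lm_beta`.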
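On the second question, my guess is that the top-k words file simply holds the k most frequent words of the training corpus, along the lines of this sketch (my assumption, not the actual `generate_lm.py` logic):

```python
from collections import Counter

def top_k_words(corpus_lines, k):
    # Count word frequencies across the corpus and keep the k most
    # common words -- my guess at what the top-k file contains.
    counts = Counter(word for line in corpus_lines for word in line.split())
    return [word for word, _ in counts.most_common(k)]

corpus = ["the cat sat", "the cat ran", "the dog"]
print(top_k_words(corpus, 2))  # -> ['the', 'cat']
```

What I’d like to understand is what the decoder or the scorer package does with that vocabulary afterwards.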
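On the third question, I’d guess the tuning was some search over `(lm_alpha, lm_beta)` pairs that minimizes word error rate on a dev set — something like the grid search below, where `evaluate_wer` is a hypothetical stand-in for decoding the dev set with those settings:

```python
import itertools

def tune_alpha_beta(evaluate_wer, alphas, betas):
    # Try every (alpha, beta) pair and keep the one with the lowest
    # word error rate. `evaluate_wer` is a hypothetical callback that
    # would decode a dev set with the given settings.
    return min(itertools.product(alphas, betas),
               key=lambda ab: evaluate_wer(*ab))

# Toy stand-in: pretend WER is minimized near alpha=0.9, beta=1.2.
fake_wer = lambda a, b: (a - 0.9) ** 2 + (b - 1.2) ** 2
alpha, beta = tune_alpha_beta(fake_wer, [0.5, 0.7, 0.9, 1.1], [0.8, 1.2, 1.6])
print(alpha, beta)  # -> 0.9 1.2
```

Was it something like this, or a smarter optimizer? And on which dataset?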
Thanks in advance!