How does the scorer in DeepSpeech 0.7 work?

Hi!

I’ve gone through the documentation on creating an external scorer so that we can use DeepSpeech for our projects, and I have a few questions:

  1. In DeepSpeech 0.6.1, the language model consisted of the ‘lm.binary’ and ‘trie’ files, and as far as I know the CTC decoder uses beam search to assess the probability of the next word using the language model. How does the new scorer work? I see the lm_alpha and lm_beta parameters, but no beam size…

  2. The generate_lm.py script generates the lm.binary file and a top-k words file. What exactly is the top-k words file, and how is it used when generating the scorer package and in the actual transcription process overall?

  3. The default lm_alpha and lm_beta values have changed in the newest release. How exactly were these values arrived at via tuning?
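To clarify what I’m asking in (1): my current understanding is that each beam-search hypothesis is ranked by something like acoustic score + lm_alpha · LM score + lm_beta · word count. Here’s a toy sketch of that formula (the `scored` helper and all the numbers are made up for illustration, not DeepSpeech code):

```python
def scored(ctc_log_prob, lm_log_prob, word_count, alpha, beta):
    # My understanding of the combined score used to rank hypotheses:
    # acoustic (CTC) log-probability, plus the LM log-probability
    # weighted by alpha, plus a per-word insertion bonus beta.
    return ctc_log_prob + alpha * lm_log_prob + beta * word_count

# Two hypothetical hypotheses with identical acoustic scores:
# the LM strongly prefers the first, the second has one more word.
a = scored(-5.0, lm_log_prob=-2.0, word_count=3, alpha=0.93, beta=1.18)
b = scored(-5.0, lm_log_prob=-6.0, word_count=4, alpha=0.93, beta=1.18)
print(a > b)  # the LM term outweighs the extra word bonus here
```

Is that roughly how the scorer plugs into the decoder, and is the beam size just fixed elsewhere?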
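And for (3), I’m guessing the tuning was some kind of search over (alpha, beta) pairs that minimizes word error rate on a dev set. A toy sketch of that idea (the `tune` helper and the `fake_wer` stand-in are hypothetical, just to show the flow):

```python
import itertools

def tune(alphas, betas, evaluate_wer):
    # Exhaustive grid search: score every (alpha, beta) pair on a
    # held-out dev set and keep the pair with the lowest WER.
    # evaluate_wer is a placeholder for a function that decodes the
    # dev set with the given weights and returns the resulting WER.
    return min(itertools.product(alphas, betas),
               key=lambda ab: evaluate_wer(*ab))

# Toy stand-in for a real WER evaluation, just to show the flow:
fake_wer = lambda a, b: (a - 0.93) ** 2 + (b - 1.18) ** 2
print(tune([0.5, 0.93, 1.5], [0.5, 1.18, 2.0], fake_wer))  # -> (0.93, 1.18)
```

Was it something like this, or a smarter search than a plain grid?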

Thanks in advance!