Hello,
After some first experiments with DeepSpeech, I now want to improve recognition with my own language model and my own scorer. Can you help me understand what the parameters of “generate_lm.py” mean and what I can change with them?
I have a few ideas, but I didn't find an explanation for all of the parameters. It would be great if some of you could help me.
Here is the call, with my guesses for some of the parameters:
--input_txt phrases.txt
All words from "phrases.txt" are valid words to recognize.
--output_dir .
Directory where the LM is saved.
--top_k 500000
--kenlm_bins …/…/…/kenlm/build/bin
Maybe just the path to the "bin" folder of KenLM?
--arpa_order 5
--max_arpa_memory "85%"
--arpa_prune "0|0|1"
--binary_a_bits 255
--binary_q_bits 8
--binary_type trie
--discount_fallback
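To make it clear how I put the options above together, here is a small sketch that assembles them into one call and prints it (it does not run anything). It assumes the flags are written with two plain hyphens and that the script is started with "python3"; the helper name build_command is just something I made up for illustration, and the KenLM path is copied as written above.

```python
import shlex

# Options exactly as listed in the post; the KenLM path is left as written there.
options = {
    "--input_txt": "phrases.txt",
    "--output_dir": ".",
    "--top_k": "500000",
    "--kenlm_bins": "…/…/…/kenlm/build/bin",
    "--arpa_order": "5",
    "--max_arpa_memory": "85%",
    "--arpa_prune": "0|0|1",
    "--binary_a_bits": "255",
    "--binary_q_bits": "8",
    "--binary_type": "trie",
}


def build_command(opts, discount_fallback=True):
    """Hypothetical helper: turn the option mapping into an argument list."""
    # Assumption: the script is started with python3 from its own directory.
    cmd = ["python3", "generate_lm.py"]
    for flag, value in opts.items():
        cmd += [flag, value]
    # --discount_fallback is a switch without a value, so it is appended alone.
    if discount_fallback:
        cmd.append("--discount_fallback")
    return cmd


command = build_command(options)
# shlex.join quotes values such as 0|0|1 so the shell does not treat | as a pipe.
print(shlex.join(command))
```

Keeping the arguments in a list means values like "0|0|1" and "85%" are passed unchanged; the quoting only matters when the line is pasted into a shell.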
Thanks in advance!