I’m using the pre-trained DeepSpeech v0.3 model for my use case, and I’m trying to build the language model from a Wikipedia dump. These are the steps I’m running:
- …/kenlm/build/bin/lmplz -o 4 -T /home/sayantan < wiki_dump.txt > lm_new.arpa
- …/kenlm/build/bin/build_binary -T /home/sayantan trie lm_new.arpa lm_new.binary
- ./DeepSpeech/native_client/generate_trie ./wiki_model/alphabet_new.txt ./wiki_model/lm_new.binary ./wiki_model/trie_new
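For reference, here are the same three steps as one script. The KENLM/DS paths and file names are assumptions on my part (adjust them to your layout); note that lmplz reads the corpus on stdin, and build_binary takes its options before the data-structure type:

```shell
#!/usr/bin/env bash
# Sketch of the three steps above, with assumed paths — adjust to your setup.
set -eu
KENLM="$HOME/kenlm/build/bin"   # assumed KenLM build directory
DS="$HOME/DeepSpeech"           # assumed DeepSpeech checkout

if [ -x "$KENLM/lmplz" ]; then
  # lmplz reads the corpus on stdin and writes the ARPA model to stdout.
  "$KENLM/lmplz" -o 4 -T /tmp < wiki_dump.txt > lm_new.arpa

  # build_binary expects options (like -T) before the data-structure type.
  "$KENLM/build_binary" -T /tmp trie lm_new.arpa lm_new.binary

  # The alphabet here must match the one the acoustic model was trained with.
  "$DS/native_client/generate_trie" alphabet_new.txt lm_new.binary trie_new
fi
```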
One thing I did is change the alphabet.txt file slightly. Does that cause problems in decoding? (It probably does.)
With the LM enabled I get gibberish output; without the LM the output isn’t perfect, but it’s far better.
Can anyone confirm whether the modified alphabet file is the culprit, or whether something is missing in the steps above?
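In case it helps to diagnose, here is a quick check I can run to see whether the edited alphabet still covers the corpus (file names are assumptions; the point is that any corpus character absent from alphabet.txt will confuse the trie/decoder):

```shell
# List every distinct character in the corpus, then report the ones that are
# missing from the alphabet file (one character per line in both files).
# wiki_dump.txt and alphabet_new.txt are placeholder names for my files.
corpus=wiki_dump.txt
alphabet=alphabet_new.txt

grep -o . "$corpus" | sort -u > corpus_chars.txt
sort -u "$alphabet" > alphabet_chars.txt

# Characters present in the corpus but not in the alphabet:
comm -23 corpus_chars.txt alphabet_chars.txt > missing_chars.txt
cat missing_chars.txt
```

If that prints anything, the LM corpus and the alphabet disagree, which would match the gibberish symptom.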