How can I add a custom vocab.txt and build a language model (lm.binary and trie) for the pretrained model v0.2.0?
I took the pretrained model's vocab.txt (data/lm/vocab.txt), added my own vocabulary in the same format, and then started converting it into an LM and trie.
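For reference, this is roughly how I merged the two files; my_vocab.txt is just a placeholder name for my own list:

```bash
# Merge the pretrained vocabulary with my own lines (my_vocab.txt is a
# placeholder). sort -u also drops duplicate lines, which seems relevant
# since the lmplz error below says to try deduplicating the input.
cat data/lm/vocab.txt my_vocab.txt | sort -u > vocab.txt
```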
Result: it throws this error:
```
.../.../new_native_client/kenlm/build/bin/lmplz -o 5 <vocab.txt >lm.arpa

=== 1/5 Counting and sorting n-grams ===
Reading /home/dell/Music/12-09-2018/DeepSpeech/data/own_lmm/vocab.txt
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Unigram tokens 974571 types 973693
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:11684316 2:641120832 3:1202101632 4:1923362432 5:2804903936
/home/dell/Music/12-09-2018/DeepSpeech/new_native_client/kenlm/lm/builder/adjust_counts.cc:52 in void lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const lm::builder::DiscountConfig&) threw BadDiscountException because `s.n[j] == 0'.
Could not calculate Kneser-Ney discounts for 1-grams with adjusted count 3 because we didn't observe any 1-grams with adjusted count 2; Is this small or artificial data?
Try deduplicating the input. To override this error for e.g. a class-based model, rerun with --discount_fallback

.../.../new_native_client/kenlm/build/bin/build_binary lm.arpa lm.binary
Reading lm.arpa
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
End of file Byte: 0
```
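Going by the message itself, the two ways forward seem to be deduplicating the input or rerunning with --discount_fallback. A sketch of both (untested, with the kenlm path as a placeholder):

```bash
# KENLM points at the kenlm build directory (placeholder path).
KENLM=new_native_client/kenlm/build/bin

# Option 1: deduplicate the input first. Repeated lines distort the
# adjusted counts that Kneser-Ney discount estimation relies on.
sort -u vocab.txt > vocab.dedup.txt
"$KENLM"/lmplz -o 5 <vocab.dedup.txt >lm.arpa

# Option 2: keep the data as-is and use fallback discounts, as the
# error message itself suggests.
"$KENLM"/lmplz -o 5 --discount_fallback <vocab.txt >lm.arpa

# build_binary only makes sense once lm.arpa was written successfully;
# the "End of file Byte: 0" above just means lm.arpa came out empty.
"$KENLM"/build_binary lm.arpa lm.binary
```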
I searched for this same issue, but the solutions I found were not clear.
How can I combine the vocabularies and build a custom LM and trie, sir?
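To state what I think the remaining step is: once lm.binary builds, my understanding is that the trie for DeepSpeech v0.2.0 comes from the generate_trie tool in native_client. The invocation below is what I have seen referenced for 0.2.0 and may differ in other releases, so please correct me if it is wrong:

```bash
# Assumed generate_trie invocation for DeepSpeech v0.2.0; the argument
# list changed between releases, so this may need adjusting.
./generate_trie data/alphabet.txt lm.binary vocab.txt trie
```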