Hi, I was reading different topics about dropped spaces in the output, and now I want to train a model without a language model.
There’s no way to do it without modifying the code. You can replace the decode_with_lm call with tf.nn.ctc_beam_search_decoder, and then the WER reports won’t use the language model when decoding.
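To make "beam search decoding without a language model" concrete, here is a minimal pure-Python sketch of CTC prefix beam search that ranks candidates by acoustic probability only. It is not the code of tf.nn.ctc_beam_search_decoder or of DeepSpeech; the names ctc_beam_search and logsumexp are hypothetical, and it assumes the blank label is the last class index and that the input holds per-frame log-probabilities.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) for a few values."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_beam_search(log_probs, alphabet, beam_width=8):
    """Decode with CTC prefix beam search and no language model.

    log_probs: list of frames, each a list of len(alphabet) + 1
               log-probabilities (assumption: blank is the last index).
    """
    blank = len(alphabet)
    # prefix -> (log prob of paths ending in blank, ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        nxt = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for c, lp in enumerate(frame):
                if c == blank:
                    # Blank keeps the prefix unchanged.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (logsumexp(b, p_b + lp, p_nb + lp), nb)
                    continue
                new_prefix = prefix + (c,)
                b, nb = nxt[new_prefix]
                if prefix and prefix[-1] == c:
                    # A repeated label extends the prefix only after a blank...
                    nxt[new_prefix] = (b, logsumexp(nb, p_b + lp))
                    # ...otherwise it collapses into the same prefix.
                    sb, snb = nxt[prefix]
                    nxt[prefix] = (sb, logsumexp(snb, p_nb + lp))
                else:
                    nxt[new_prefix] = (b, logsumexp(nb, p_b + lp, p_nb + lp))
        # Keep the most probable prefixes; only acoustic scores are used.
        ranked = sorted(nxt.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_width])
    best = max(beams.items(), key=lambda kv: logsumexp(*kv[1]))[0]
    return "".join(alphabet[i] for i in best)
```

Because nothing here consults a word list or an n-gram model, the output can contain any character sequence the acoustic model prefers, including misspellings an LM would normally correct.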
Did I get it right that the LM is only used for decoding after training? When I included my own LM and had a typo in the path to the lm.binary file, training finished (all epochs) and I only got the error afterwards. I thought the LM was also involved in training somehow. Was that wrong?
That is correct, the LM is only used for the test epoch, it does not influence training.
I can’t find the decode_with_lm call. Is there an easy way to avoid using the language model in the newest version, 0.5.0, as well?
If you don’t specify the language model arguments on the command-line it won’t use the language model.
Actually, it’ll just use the default one in data/lm. I think we don’t have an easy way to disable that, you have to comment out the loading of the Scorer in evaluate.py.
Won’t setting lm_alpha and lm_beta to 0 ignore the LM? @lissyx suggested this would work.
Are you sure?
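For context on the lm_alpha / lm_beta suggestion: a common way to combine an acoustic score with an LM during beam search is a weighted sum, where one weight scales the LM log-probability and the other rewards each word. The sketch below assumes that form; this thread does not confirm that DeepSpeech scores candidates exactly this way, and the function name, candidate sentences, and numbers are made up for illustration. It also says nothing about whether the trie still restricts the vocabulary when both weights are 0.

```python
def combined_score(acoustic_logp, lm_logp, word_count, lm_alpha, lm_beta):
    """Assumed scoring form: acoustic + alpha * LM + beta * word count."""
    return acoustic_logp + lm_alpha * lm_logp + lm_beta * word_count


# Hypothetical candidates: (text, acoustic log-prob, LM log-prob, word count)
candidates = [
    ("i sea the see", -4.0, -12.0, 4),
    ("i see the sea", -4.5, -6.0, 4),
]


def pick(lm_alpha, lm_beta):
    """Return the candidate text with the highest combined score."""
    return max(
        candidates,
        key=lambda c: combined_score(c[1], c[2], c[3], lm_alpha, lm_beta),
    )[0]


with_lm = pick(lm_alpha=0.75, lm_beta=1.85)  # illustrative weights
without_lm = pick(lm_alpha=0.0, lm_beta=0.0)
```

Under this assumed formula, zero weights make the ranking depend on the acoustic score alone, which is why the suggestion is plausible. It is still worth verifying against the actual decoder.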
I trained a non-English model and tested it without passing the LM parameters, and it outputs text in my language without any problem. I think this must mean the language model is disabled, because otherwise non-English letters and words would be removed from the results. Right?
Try removing or renaming the data/lm folder and test again, please!
evaluate.py unconditionally loads an LM/trie for scoring; if you don’t pass the flags, it’ll use the default paths in data/lm. This is pretty easy to test, and you already suggested an effective way to do it.
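The rename test can be made reversible with a small helper. This is only a sketch: lm_dir_hidden and the ".disabled" suffix are hypothetical names, and the default path data/lm is taken from the posts above. It temporarily renames the folder and restores it afterwards, even if the test run fails.

```python
import contextlib
from pathlib import Path


@contextlib.contextmanager
def lm_dir_hidden(lm_dir="data/lm", suffix=".disabled"):
    """Temporarily rename the LM folder so default LM files cannot be found."""
    src = Path(lm_dir)
    dst = src.with_name(src.name + suffix)
    if dst.exists():
        raise FileExistsError(f"{dst} already exists; refusing to overwrite it")
    src.rename(dst)
    try:
        yield dst
    finally:
        # Always put the folder back, even if the test run raised an error.
        dst.rename(src)
```

Usage would look like `with lm_dir_hidden(): ...`, running your usual test command inside the block. If the run then fails with an error about a missing LM or trie file, the default files were being loaded all along.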