Usage instructions for lm_optimizer

I was reading the Scorer page and built my own scorer! However, it does not recognize anything :frowning: So I decided to tune my alpha and beta values using the lm_optimizer script, but I cannot figure out how to use it. There are various FLAGS defined in the script, but it's not clear how to actually run it.

I tried this for example:

python ../../lm_optimizer.py --test_files=vocab-5000.txt --alphabet_config_path=../alphabet.txt --scorer=kenlm.scorer

(here kenlm.scorer and vocab-5000.txt are generated by me)

But it exits with error code 1 and message:

/home/gt/otherrepos/DeepSpeech/venv/lib/python3.7/site-packages/pandas/compat/__init__.py:117: UserWarning: Could not import the lzma module. Your installed Python is incomplete. Attempting to use lzma compression will result in a RuntimeError.
  warnings.warn(msg)
swig/python detected a memory leak of type 'Alphabet *', no destructor found.
swig/python detected a memory leak of type 'Alphabet *', no destructor found.
I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
E All initialization methods failed (['best', 'last']).

I'm pretty sure this is because I got the arguments wrong. Could anyone maybe show how a sample command is run?

The arguments are mostly the same as for evaluate.py: it expects a validation dataset CSV file for --test_files (not a vocabulary file), as well as a valid --checkpoint_dir pointing to a trained model checkpoint.
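For reference, a hypothetical invocation might look like the sketch below. All paths and file names are placeholders, and the flag names follow the ones already used in this thread; check the FLAGS defined in lm_optimizer.py and evaluate.py for the exact set your version expects:

```shell
# Sketch of an lm_optimizer.py run -- every path here is a placeholder.
# --test_files must point to a dataset CSV (wav_filename,wav_filesize,transcript),
# not a vocabulary file, and --checkpoint_dir must contain a trained checkpoint,
# otherwise initialization fails with "All initialization methods failed".
python lm_optimizer.py \
  --test_files dev.csv \
  --checkpoint_dir /path/to/checkpoints \
  --alphabet_config_path alphabet.txt \
  --scorer kenlm.scorer
```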

Alright, thanks @reuben! I had not seen evaluate.py earlier; hence the confusion. I will look into it now.