Problems trying to run model inference from DS repo clone (without pip install)

I have a clone (technically a git subtree) of the DeepSpeech repo in my_project_dir/vendor/DeepSpeech. I’ve cloned the repo rather than pip-installing DeepSpeech because I want to retrain a model (see https://github.com/mozilla/DeepSpeech/issues/2219).

It’s unclear to me from the readme and code what the recommended way to run inference on my test-set files is. The latest thing I tried (and the one I was most confident in) was the command below. Note that I’m running the DS pretrained model directly, not my own retrained checkpoint (though I have produced a few successfully), to narrow the scope of possible causes of error:

python evaluate.py --model deepspeech-0.5.1-models/output_graph.pbmm --alphabet deepspeech-0.5.1-models/alphabet.txt --lm deepspeech-0.5.1-models/lm.binary --trie deepspeech-0.5.1-models/trie --test_files /home/mepstein/voice_to_text/data/ds_csvs/test.csv

^ from my DS venv from path ~/my_project_dir/vendor/DeepSpeech.

The error I’m getting is:

Loading the LM will be faster if you build a binary file.
Reading data/lm/lm.binary
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
terminate called after throwing an instance of 'lm::FormatLoadException'
what(): ../kenlm/lm/read_arpa.cc:65 in void lm::ReadARPACounts(util::FilePiece&, std::vector<long unsigned int>&) threw FormatLoadException.
first non-empty line was "version https://git-lfs.github.com/spec/v1" not \data\.
Byte: 43
Aborted (core dumped)

I think I’ve seen advice that this message is caused by not having git-lfs installed, but I do have it installed.

Also, the log note that it’s looking for the binary file in data/lm/lm.binary was interesting to me, since (based on the readme) I always pass the lm.binary file in deepspeech-0.5.1-models.

So then I checked if the two lm.binary files are identical:

(voice_to_text) mepstein@pop-os:~/voice_to_text/vendor/DeepSpeech$ diff deepspeech-0.5.1-models/lm.binary data/lm/lm.binary

and got output: Binary files deepspeech-0.5.1-models/lm.binary and data/lm/lm.binary differ

So two questions - 1) what is the recommended way to run speech-to-text inference when using vendor’d/cloned DeepSpeech repo instead of pip install, and 2) which of those two (different) lm.binary files should I be using? They both come from DeepSpeech, I did not produce either of them.

Thanks!

Max

You’re right: it means git-lfs somehow did not work properly. You need to make sure it’s properly visible to git, and maybe run some extra git lfs commands to fetch the models.
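For anyone hitting the same error, a quick way to confirm this diagnosis is to check whether the file on disk is a Git LFS pointer (a small text stub) rather than the real binary. The sketch below is only an illustration: the helper name is_lfs_pointer is made up for this post, and the prefix it looks for is the exact string KenLM quoted in the error above.

```python
# Prefix taken verbatim from the "first non-empty line was ..." part of the error.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts like a Git LFS pointer stub.

    Hypothetical helper for illustration; it only inspects the first bytes
    and does not validate the rest of the pointer file.
    """
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


if __name__ == "__main__":
    import os

    # Paths as they appear in this thread, relative to the DeepSpeech checkout.
    for candidate in ("data/lm/lm.binary", "deepspeech-0.5.1-models/lm.binary"):
        if os.path.exists(candidate):
            kind = "LFS pointer stub" if is_lfs_pointer(candidate) else "real content"
            print(candidate, "->", kind)
```

If data/lm/lm.binary is reported as a pointer stub, the LFS content was never downloaded for that file, which would also explain why diff says the two lm.binary files differ.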

evaluate.py is the closest way, but it’s not really intended for running inference.
Indeed pip install deepspeech is the best way to run inference.

In the described case, you should use the one packaged with the 0.5.1 model; they belong together. Mixing different ones can produce unexpected results.


Thanks for the prompt reply, lissyx! Ok I’ll dig some more into what’s going on with git-lfs on my machine.

Re:

"evaluate.py is the closest way, but it’s not really intended for running inference.
Indeed pip install deepspeech is the best way to run inference."

So I originally had pip-installed DeepSpeech(-gpu) instead of cloning the repo, but then turned to cloning based on the advice here (given my use case, which is that I need to (re)train the English model): https://github.com/mozilla/DeepSpeech/issues/2219

So, given that I need to both A) (re)train the English model on my data, and B) run inference using those retrained models on my test-set audio files, is the recommended approach to both pip-install DeepSpeech and clone the repo?

That seems somewhat unorthodox to me, though maybe it’s what’s recommended here? Or if there’s another recommended setup/approach for my use case (needing to retrain models and then run inference on them), what would it be?

Thanks again!
Max

If you need to run against the test set, evaluate.py is a good fit.

Thanks again lissyx.

Update on my git-lfs issue in case it’s helpful to anyone else:

I hosed my vendor/DeepSpeech folder/subtree and tried re-adding it as a git subtree again with the following command:
(voice_to_text) mepstein@pop-os:~/voice_to_text$ git subtree add --prefix vendor https://github.com/mozilla/DeepSpeech.git master --squash
(git subtree is git’s solution for vendoring an external repo as a dependency inside your own git repo, without taking on the complexity of git-submodule)

but got the following git-lfs error:

Downloading vendor/data/lm/lm.binary (1.8 GB)
Error downloading object: vendor/data/lm/lm.binary (e1fa680): Smudge error: Error downloading vendor/data/lm/lm.binary (e1fa6801b25912a3625f67e0f6cafcdacb24033be9fad5fa272152a0828d7193): batch request: missing protocol: ""

Errors logged to /home/mepstein/voice_to_text/.git/lfs/logs/20190706T101705.161005699.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: vendor/data/lm/lm.binary: smudge filter lfs failed

From some googling, the fundamental issue seems to be that “Git LFS currently does not support subtrees, and this is not currently an item on our ROADMAP.”

So I’m going to move forward by populating my vendor/DeepSpeech/ subfolder with a manually downloaded and unzipped copy of the DeepSpeech repo. For anyone considering doing this but nervous about reproducibility when manually downloading a repo as a vendored dependency: GitHub will let you download a snapshot of a repo as of any commit.
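To make the snapshot reproducible, it helps to record the exact commit and build the download URL from it. The sketch below assumes GitHub’s usual archive URL layout (github.com/OWNER/REPO/archive/COMMIT.zip); the helper name snapshot_url is made up for this post, and the commit value in the example is a placeholder, not a real DeepSpeech commit.

```python
def snapshot_url(owner, repo, commit, fmt="zip"):
    """Build the URL of a GitHub source snapshot for one commit.

    Assumes the archive URL pattern github.com/<owner>/<repo>/archive/<commit>.<fmt>;
    this is a hypothetical helper, not part of any DeepSpeech tooling.
    """
    if fmt not in ("zip", "tar.gz"):
        raise ValueError("fmt must be 'zip' or 'tar.gz'")
    return "https://github.com/{}/{}/archive/{}.{}".format(owner, repo, commit, fmt)


if __name__ == "__main__":
    # "<commit-sha>" is a placeholder; substitute the commit you want to pin.
    print(snapshot_url("mozilla", "DeepSpeech", "<commit-sha>"))
```

Keeping the pinned commit in a small text file next to the vendored folder makes it easy to see later which snapshot was used.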

This will likely not embed the LM that git-lfs packages. git submodule should work, however.

Thanks for the warning lissyx. In general I try to avoid using git submodule. I think I have a couple other reasonable ways forward to get the LM though so no worries.

One question I do still have though - why did DeepSpeech even try to load the lm file from data/lm/lm.binary when I called evaluate.py with --lm deepspeech-0.5.1-models/lm.binary? (my first message in this thread shows the command I called, and that the error shows DeepSpeech tried to load the other lm.binary file)

Hm ok I think I’ve found the answer to my last question?

From https://github.com/mozilla/DeepSpeech/blob/master/util/flags.py, it looks like the cli param for specifying the lm model to use is lm_binary_path, not lm?

That’s the param that seems to flow to main(_) and evaluate() in evaluate.py, as well as to main(_) and do_single_file_inference() in DeepSpeech.py.
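Here is a small sketch of why an unrecognized --lm would end up loading data/lm/lm.binary. It uses argparse as a stand-in for the real flag parser, and it assumes that the training-side parser silently skips flags it does not know (which is what the log above suggests). The default for lm_binary_path is the path shown in the error log; the default for lm_trie_path is my assumption.

```python
import argparse


def build_training_like_parser():
    """Stand-in for the training-side flags; NOT the real DeepSpeech parser."""
    # allow_abbrev=False so that "--lm" is not treated as a prefix of "--lm_binary_path".
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--lm_binary_path", default="data/lm/lm.binary")
    parser.add_argument("--lm_trie_path", default="data/lm/trie")  # assumed default
    return parser


def parse_training_flags(argv):
    """Parse argv, returning (known flags, leftover tokens that were ignored)."""
    return build_training_like_parser().parse_known_args(argv)


if __name__ == "__main__":
    # Native-client style flag: not recognized, so the default path is kept.
    known, ignored = parse_training_flags(["--lm", "deepspeech-0.5.1-models/lm.binary"])
    print(known.lm_binary_path, ignored)

    # Training-style flag: recognized, so the path is actually used.
    known, ignored = parse_training_flags(
        ["--lm_binary_path", "deepspeech-0.5.1-models/lm.binary"]
    )
    print(known.lm_binary_path, ignored)
```

In the first call the value passed with --lm lands in the ignored list and the parser falls back to data/lm/lm.binary, which matches the behaviour I saw.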

But the main DeepSpeech readme references using --lm models/lm.binary to specify the lm, at least for the ./deepspeech binary (which I’m assuming runs DeepSpeech.py, entering from __main__)?

Is the readme out of date in this regard, or am I making an incorrect assumption somewhere here?

You are mixing CLI arguments from the inference code-base with the training code.

Thanks for the quick response lissyx.

Although the readme example using --lm param
(deepspeech --model models/output_graph.pbmm --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio my_audio_file.wav)

and the code I tried, which seems to have ignored my --lm param
(python evaluate.py --model deepspeech-0.5.1-models/output_graph.pbmm --alphabet deepspeech-0.5.1-models/alphabet.txt --lm deepspeech-0.5.1-models/lm.binary --trie deepspeech-0.5.1-models/trie --test_files /home/mepstein/voice_to_text/data/ds_csvs/test.csv)

are both inference tasks. Neither of those is training a model, right?

Except evaluate.py is, because it re-uses training infra.

ok thanks.

In case it’s helpful to anyone else, I think the ultimate source of my confusion was the difference between the CLI params for the native client (which the ./deepspeech command shown in the readme is) and those for calling the Python scripts, e.g. using DeepSpeech.py or evaluate.py with cli args, with the DeepSpeech repo as a vendored dependency/sub-repo.

I was assuming they were the same set of cli args, defined in DeepSpeech/util/config.py and DeepSpeech/util/flags.py. And those reference e.g. cli args for lm_binary_path, lm_trie_path, etc, but not for lm and trie.

But the lm and trie cli args are in the set of cli args defined for the native client at DeepSpeech/native_client/args.h.
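To keep the two sets of names straight, here is a minimal sketch that rewrites native-client style flags into the training-side names mentioned in this thread. The mapping covers only --lm and --trie, pairing each with the flag name from util/flags.py; the helper name translate_flags is made up for this post, and any other flag (including --model and --alphabet) is passed through unchanged because I have not checked its training-side equivalent.

```python
# Native-client flag -> training-side flag, limited to the names discussed above.
NATIVE_TO_TRAINING = {
    "--lm": "--lm_binary_path",
    "--trie": "--lm_trie_path",
}


def translate_flags(argv):
    """Rename known native-client flags; leave every other token untouched.

    Hypothetical helper for illustration. It only handles the space-separated
    "--flag value" form, not "--flag=value".
    """
    return [NATIVE_TO_TRAINING.get(token, token) for token in argv]


if __name__ == "__main__":
    native_style = [
        "--lm", "deepspeech-0.5.1-models/lm.binary",
        "--trie", "deepspeech-0.5.1-models/trie",
        "--test_files", "/home/mepstein/voice_to_text/data/ds_csvs/test.csv",
    ]
    print(" ".join(translate_flags(native_style)))
```

This only renames flags; it does not check that the resulting command is what evaluate.py expects, so treat the output as a starting point.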