The model inference carried out with the installed deepspeech package and the pre-trained model deepspeech-0.3.0-models.tar.gz gives a different result from running DeepSpeech.py with the deepspeech-0.3.0-checkpoint.tar.gz checkpoint files when only model inference is done.
It might be because the deepspeech-0.3.0-checkpoint.tar.gz provided on the Releases page contains files named model.v0.2.0. Could you please clarify the above?
Can you share more information on that? v0.3.0 was not a version touching the model itself; it should have been a 0.2.1, but we made backward-incompatible changes to the inference code and preferred to bump the version.
DeepSpeech.py uses a default beam width of 1024, while the clients use 512. It could be just that.
Are all the other model parameters the same? Do the exported model and the checkpoint files in release 0.3.0 denote the same model configuration, and were they trained on the same datasets?
Running the code from the checkpoint:
python3 -u DeepSpeech.py --checkpoint_dir 'deepspeech-0.3.0-checkpoint' --one_shot_infer 'data/ldc93s1/LDC93S1.wav' --train 0 --test 0
Inference: she had acuteness water
Running the installed package without language models:
deepspeech --model models/output_graph.pbmm --alphabet models/alphabet.txt --audio 'data/ldc93s1/LDC93S1.wav'
Inference: she hadered uc sut and greasy washwar or year
Running with the language model:
deepspeech --model models/output_graph.pbmm --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio 'data/ldc93s1/LDC93S1.wav'
Inference: she had ereducsutandgreasywashwaroryear
Please let me know about the above.
I’ve already told you there was no change in training between v0.2.0 and v0.3.0.
Thanks @lissyx for the information.
@reuben I executed the following command with a beam width of 512 using the checkpoint files.
python3 -u DeepSpeech.py --checkpoint_dir 'deepspeech-0.3.0-checkpoint' --train 0 --test 0 --beam_width 512 --one_shot_infer 'data/ldc93s1/LDC93S1.wav'
Inference: she had acuteness of ear
When I run the following on the same audio file, I am still getting a different output:
deepspeech --model models/output_graph.pbmm --alphabet models/alphabet.txt --audio 'data/ldc93s1/LDC93S1.wav'
Inference: she hadered uc sut and greasy washwar or year
Am I missing something in the arguments for running the model from the checkpoint?
Looks like you’re using an incompatible model (output_graph.pbmm). If you’re using the v0.3 checkpoint/model, you should also use the v0.3 code/client.
As mentioned in the README, I used the following command to get the model files, which seem to be the same version as the checkpoint.
wget -O - https://github.com/mozilla/DeepSpeech/releases/download/v0.3.0/deepspeech-0.3.0-models.tar.gz | tar xvfz -
And you’re also using the native_client package from that same page?
I installed the DeepSpeech wheel via the following
pip3 install deepspeech
And DeepSpeech.py is also at v0.3.0?
I did a
git clone https://github.com/mozilla/DeepSpeech
Should I use the v0.3.0 one from the Releases page?
Just do git checkout v0.3.0
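To make the effect of that concrete, here is the checkout-a-release-tag pattern demonstrated on a throwaway local repo (a sketch; for the actual fix, only `git checkout v0.3.0` needs to be run inside the tree cloned from https://github.com/mozilla/DeepSpeech):

```shell
# Demonstration: pin a working tree to a release tag on a throwaway repo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.email=you@example.com -c user.name=you commit -q --allow-empty -m "release commit"
git tag v0.3.0
git -c user.email=you@example.com -c user.name=you commit -q --allow-empty -m "later work on master"
git checkout -q v0.3.0   # detached HEAD at the release tag
git describe --tags      # prints: v0.3.0
```

After the checkout, DeepSpeech.py in the working tree matches the released model, checkpoint, and client.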
Thanks a lot @reuben. It solved my problem!
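For anyone hitting the same mismatch: a quick way to sanity-check that the installed wheel matches the tag you checked out is to compare versions programmatically. This is only a sketch, assuming Python 3.9+ and that `deepspeech` is the distribution name installed by `pip3 install deepspeech`:

```python
# Sketch: check that an installed package's version matches a git release
# tag, so the Python training code and the inference client stay in sync.
from importlib.metadata import version, PackageNotFoundError


def matches_release(package: str, tag: str) -> bool:
    """Return True if the installed package version equals the tag.

    A leading 'v', as in git tags like v0.3.0, is ignored.
    """
    try:
        installed = version(package)
    except PackageNotFoundError:
        return False  # package not installed at all
    return installed == tag.removeprefix("v")


# Usage: compare the wheel against the repo tag from `git describe --tags`.
print(matches_release("deepspeech", "v0.3.0"))
```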