Can you give us access to chunk15.wav
to test too?
Looking at chunk15.wav, two things jump out at me:
- The accent is Indian; however, our model was trained on an American accent. Performance will be worse on an Indian accent, so some fine-tuning of the model should be done.
- I am getting different results than you are with 0.4.0 ("i can dig and people of minutes for a god in perfecting latin poem notepaper"), which suggests your 0.4.0 setup is amiss.
I already tried fine-tuning the DeepSpeech 0.3.0 model with this data, and it exported a model after three epochs. The exported model's inference result was null.
Then I just stopped fine-tuning, assuming that fine-tuning a model means training a new model on our data only, rather than adding new knowledge to the existing model (no knowledge carried over into the new model).
…
I downloaded the pretrained DeepSpeech 0.4.0
model from github release and used it with the below command as usual.
~$ deepspeech --model models/output_graph.pbmm --audio /mnt/c/users/karthikeyan/downloads/chunk15.wav --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie
Is there any other way to run DeepSpeech inference and get output like yours…?! Do I need to add any extra parameters or anything else…?!
Let's start at the start.
How did you install deepspeech 0.4.0? Did you install it in a fresh virtual environment as suggested in the documentation[1]?
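For reference, the fresh-environment install path the documentation[1] describes looks roughly like this (the environment name and the version pin are my own choices, not from the docs):

```shell
# Sketch of the documented install path: a fresh virtual environment,
# then the deepspeech package from PyPI (no requirements.txt involved).
python3 -m venv deepspeech-venv        # environment name is arbitrary
source deepspeech-venv/bin/activate    # prompt gains a (deepspeech-venv) prefix
# then, inside the activated environment:
#   pip install deepspeech==0.4.0      # pin to the release being debugged
```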
I downloaded the pretrained DeepSpeech 0.4.0
models from the releases, using this:
wget https://github.com/mozilla/DeepSpeech/releases/download/v0.4.0/deepspeech-0.4.0-models.tar.gz
tar xvfz deepspeech-0.4.0-models.tar.gz
…
and extracted it into a directory, created a virtual environment, installed requirement.txt… started inference…
@karthikeyank I'm sorry to insist again, but please stop removing the versions from the output; it's really important when we do that kind of debugging.
I'm sorry @lissyx, I don't know what you are talking about… I think I didn't mention the version of the models in some places (I have corrected them).
When you run inference, the TensorFlow and DeepSpeech build versions are written in the output. We need those, as well as the exact model files. There might be small divergences with high impact.
@karthikeyank Could you use the instructions I referenced[1] to install the Python package? Just so we are starting at the same point. (In particular, it mentions nothing about "install requirement.txt".)
Yes… I followed the same method, and the version of the DeepSpeech Python package is DeepSpeech: v0.4.0-0-g48ad711
…
I installed all the Python packages mentioned in requirements.txt in the virtual environment…
I tried again and it produces the following output now:
(deep4.0) userk@PSSHSRDT034:~/DeepSpeechPro/native_client4.0$ deepspeech --model models/output_graph.pbmm --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio /mnt/c/users/karthikeyan/Downloads/chunk15.wav
Loading model from file models/output_graph.pbmm
TensorFlow: v1.12.0-10-ge232881
DeepSpeech: v0.4.0-0-g48ad711
2019-01-08 16:21:49.623456: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.042s.
Loading language model from files models/lm.binary models/trie
Loaded language model in 0.269s.
Running inference.
i can do and picture of minutes for eating for the saintly in bloom no deterrent
Inference took 21.940s for 6.060s audio file.
The instructions I referenced do not mention requirement.txt. Could you please follow the instructions I referenced? Otherwise it's hard/impossible for us to debug.
Okay, now I have created a new virtual environment and installed deepspeech…
done…
and this is the output…
(newenv) userk@PSSHSRDT034:~/pycodes$ deepspeech --model /home/userk/DeepSpeechPro/native_client4.0/models/output_graph.pbmm --alphabet /home/userk/DeepSpeechPro/native_client4.0/models/alphabet.txt --lm /home/userk/DeepSpeechPro/native_client4.0/models/lm.binary --trie /home/userk/DeepSpeechPro/native_client4.0/models/trie --audio /mnt/c/users/karthikeyan/Downloads/chunk15.wav
Loading model from file
/home/userk/DeepSpeechPro/native_client4.0/models/output_graph.pbmm
TensorFlow: v1.12.0-10-ge232881
DeepSpeech: v0.4.0-0-g48ad711
2019-01-08 16:45:21.728660: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.0149s.
Loading language model from files /home/userk/DeepSpeechPro/native_client4.0/models/lm.binary /home/userk/DeepSpeechPro/native_client4.0/models/trie
Loaded language model in 0.294s.
Running inference.
i can do and picture of minutes for eating for the saintly in bloom no deterrent
Inference took 3.389s for 6.060s audio file.
This is different from my result, which seems odd. The entire process should be deterministic.
Could you also indicate how you created your virtual environment?
Also, as you have at least two models you've been using, our release model and your fine-tuned model, could you check to make sure that the model you are using is indeed the release model and not your fine-tuned one?
For example you could download the release model again or compute the md5 hash of the model. On my machine I have:
kdavis-19htdh:models kdavis$ md5 output_graph.pbmm
MD5 (output_graph.pbmm) = a3cafcb87fcf09d38ce58cbc41e5c681
Similarly for the language model and trie
kdavis-19htdh:models kdavis$ md5 lm.binary
MD5 (lm.binary) = 5f762eecdc4c4cc2068dc1a84ec57873
kdavis-19htdh:models kdavis$ md5 trie
MD5 (trie) = 182f72835a19800a3564b5da75ffc526
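On Linux the command is md5sum rather than macOS's md5; the output format and the -c verification mode look like this (demonstrated on a throwaway file, since model paths vary):

```shell
# md5sum is the Linux counterpart of macOS `md5`; it prints
# "<hash>  <file>" and can verify expected hashes with -c.
printf 'hello' > demo.bin
md5sum demo.bin
# Verify against an expected value (two spaces between hash and name):
echo "5d41402abc4b2a76b9719d911017c592  demo.bin" | md5sum -c -
# For the release files the pattern is the same:
#   md5sum output_graph.pbmm lm.binary trie
```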
I created the virtual environment with the following commands:
sudo apt-get install python3-venv
python -m venv newenv
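One thing worth checking with those commands (names here match the ones above): after activation, python should resolve inside the environment and report the python3 it was created from.

```shell
# Confirm the environment was created with python3 and that, once
# activated, `python` resolves inside it rather than to the system one.
python3 -m venv newenv
. newenv/bin/activate
command -v python    # expected: .../newenv/bin/python
python --version     # should match the python3 used to create the env
```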
Nope. Actually the fine-tuned model was producing null values, so I'm not using it… I'm using DeepSpeech 0.3.0 and DeepSpeech 0.4.0…
The md5 hash of the DeepSpeech 0.4.0 model is
a3cafcb87fcf09d38ce58cbc41e5c681 output_graph.pbmm
The md5 hash of the language model is
5f762eecdc4c4cc2068dc1a84ec57873 lm.binary
The md5 hash of the trie file is
182f72835a19800a3564b5da75ffc526 trie
Are you sure you are using python3? What does
python --version
give?
python --version
Python 3.5.2