Hello everyone,
I installed DeepSpeech with
pip3 install deepspeech
and downloaded the pre-trained model with
wget -c https://github.com/mozilla/DeepSpeech/releases/download/v0.4.1/deepspeech-0.4.1-models.tar.gz
To check whether it works, I tried both self-made and publicly available WAV files, e.g. man1_wb.wav from https://github.com/EN10/DeepSpeech
The output is always the same, and it never contains a transcript:
jh186076@MUSJH186076-933 DeepSpeech (master) $ deepspeech --model models/output_graph.pbmm --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio man1_wb.wav
Loading model from file models/output_graph.pbmm
TensorFlow: v1.12.0-10-ge232881c5a
DeepSpeech: v0.4.1-0-g0e40db6
2019-04-29 14:37:41.623082: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.0102s.
Loading language model from files models/lm.binary models/trie
Loaded language model in 1.68s.
Running inference.
Inference took 3.477s for 8.208s audio file.
What am I doing wrong? Any help or advice on how to investigate this behaviour further would be much appreciated.
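One thing I could check first is the format of the input file. Below is a minimal sketch using only Python's standard wave module. It assumes the released 0.4.1 model expects 16 kHz, mono, 16-bit PCM audio; describe_wav and format_mismatches are hypothetical helper names of my own, not part of DeepSpeech.

```python
import sys
import wave

# Assumption: the released 0.4.1 model expects 16 kHz, mono, 16-bit PCM input.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 16000}


def describe_wav(source):
    """Return basic format parameters of a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate if rate else 0.0,
        }


def format_mismatches(info, expected=EXPECTED):
    """List the fields whose value differs from the expected format."""
    return [key for key, value in expected.items() if info[key] != value]


# Only runs when a file path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    info = describe_wav(sys.argv[1])
    print(info)
    problems = format_mismatches(info)
    if problems:
        print("Differs from the assumed input format in:", ", ".join(problems))
    else:
        print("Format matches the assumed 16 kHz mono 16-bit PCM input.")
```

Saved under a name such as check_wav.py (hypothetical), it would be run as python3 check_wav.py man1_wb.wav. If the file already matches the assumed format, the audio format is probably not the cause and the problem lies elsewhere.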