After training with the `run-ldc93s1.sh` test run, I cannot use `output_graph.pb` to run inference.
Steps taken:

- Install git-lfs
- Clone the repo
- Set up virtual environment and install requirements (`pip install deepspeech`)
- Install native client to target directory `native_client` using taskcluster
- Make a directory `ld-model` to put the `output_graph.pb` into
- Delete previous checkpoints for `ldc93s1` (from previous experiments)
- Run a modified version of `run-ldc93s1.sh`. The only changes are adding `--export_dir ld-model` and changing the epochs to 5 (for brevity here)
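For reference, the steps above roughly as commands. The repo URL and the `util/taskcluster.py` helper are my assumptions from the DeepSpeech README; adjust paths for your own machine.

```shell
# Rough sketch of the setup steps above; one-time steps shown commented out.
set -e

# 1. git-lfs and the repo
#    sudo apt install git-lfs && git lfs install
#    git clone https://github.com/mozilla/DeepSpeech.git && cd DeepSpeech

# 2. virtual environment and requirements
#    python3 -m venv venv && . venv/bin/activate
#    pip install -r requirements.txt
#    pip install deepspeech

# 3. native client into ./native_client via the taskcluster helper
#    python util/taskcluster.py --target native_client

# 4. directory for the exported frozen graph
mkdir -p ld-model

# 5. clear old ldc93s1 checkpoints (path as printed by run-ldc93s1.sh)
rm -rf "$HOME/.local/share/deepspeech/ldc93s1"

# 6. run the modified script: --export_dir ld-model, --epoch 5
#    ./bin/run-ldc93s1.sh
```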
The training output looks like this:
(venv) thuselem DeepSpeech $ ./bin/run-ldc93s1.sh
+ [ ! -f DeepSpeech.py ]
+ [ ! -f data/ldc93s1/ldc93s1.csv ]
+ [ -d ]
+ python -c from xdg import BaseDirectory as xdg; print(xdg.save_data_path("deepspeech/ldc93s1"))
+ checkpoint_dir=/home/thuselem/.local/share/deepspeech/ldc93s1
+ python -u DeepSpeech.py --train_files data/ldc93s1/ldc93s1.csv --dev_files data/ldc93s1/ldc93s1.csv --test_files data/ldc93s1/ldc93s1.csv --train_batch_size 1 --dev_batch_size 1 --test_batch_size 1 --n_hidden 494 --epoch 5 --checkpoint_dir /home/thuselem/.local/share/deepspeech/ldc93s1 --export_dir ld-model
/home/thuselem/development/deepspeech-training/DeepSpeech/venv/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
/home/thuselem/development/deepspeech-training/DeepSpeech/venv/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
/home/thuselem/development/deepspeech-training/DeepSpeech/util/audio.py:17: DeepSpeechDeprecationWarning: DeepSpeech Python bindings could not be imported, resorting to slower code to compute audio features. Refer to README.md for instructions on how to install (or build) the DeepSpeech Python bindings.
category=DeepSpeechDeprecationWarning)
W Parameter --validation_step needs to be >0 for early stopping to work
I STARTING Optimization
I Training epoch 0...
I Training of Epoch 0 - loss: 341.650848
100% (1 of 1) |#######################################| Elapsed Time: 0:00:01 Time: 0:00:01
I Training epoch 1...
I Training of Epoch 1 - loss: 161.463470
100% (1 of 1) |#######################################| Elapsed Time: 0:00:01 Time: 0:00:01
I Training epoch 2...
I Training of Epoch 2 - loss: 166.970291
100% (1 of 1) |#######################################| Elapsed Time: 0:00:01 Time: 0:00:01
I Training epoch 3...
I Training of Epoch 3 - loss: 168.173965
100% (1 of 1) |#######################################| Elapsed Time: 0:00:01 Time: 0:00:01
I Training epoch 4...
I Training of Epoch 4 - loss: 142.360580
I FINISHED Optimization - training time: 0:00:06
100% (1 of 1) |#######################################| Elapsed Time: 0:00:01 Time: 0:00:01
I Testing epoch 5...
I Test of Epoch 5 - WER: 1.000000, loss: 143.72097778320312, mean edit distance: 0.846154
I --------------------------------------------------------------------------------
I WER: 1.000000, loss: 143.720978, mean edit distance: 0.846154
I - src: "she had your dark suit in greasy wash water all year"
I - res: "he says "
I --------------------------------------------------------------------------------
I Exporting the model...
Converted 12 variables to const ops.
100% (1 of 1) |#######################################| Elapsed Time: 0:00:00 ETA: 00:00:00
Following training, I attempt to run inference:
deepspeech ld-model/output_graph.pb data/ldc93s1/LDC93S1.wav data/alphabet.txt data/lm/lm.binary data/lm/trie
with the following output:
Loading model from file ld-model/output_graph.pb
2018-08-20 15:29:53.358417: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Not found: Op type not registered 'BlockLSTM' in binary running on archyton. Make sure the Op and Kernel are registered in the binary running in this process.
Loaded model in 0.027s.
Loading language model from files data/lm/lm.binary data/lm/trie
Loaded language model in 0.958s.
Running inference.
Segmentation fault (core dumped)
I have been able to run inference on my machine (Ubuntu 18.04, no GPU of note) using the pre-trained model, and I get the same warning about AVX2 FMA there. The 'BlockLSTM' not-found error only shows up with the model from my test run.
Suggestions appreciated. I have not found much about BlockLSTM that has been helpful to me (admittedly, I am new to TensorFlow).
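One crude check I can run myself: op type names are stored as plain strings inside the frozen GraphDef, so grepping the `.pb` (not a proper protobuf parse, just a string search) shows whether the exported model contains the op at all:

```shell
# Count lines in the frozen graph containing the BlockLSTM op name.
# -a treats the binary .pb as text, -c prints only the match count.
if [ -f ld-model/output_graph.pb ]; then
    grep -ac 'BlockLSTM' ld-model/output_graph.pb
fi
```

A non-zero count would at least confirm that the exported graph itself contains the op that the deepspeech binary says is not registered.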