[Solved] Unable to train from frozen model - "Requested return_element 'lstm_fused_cell/kernel:0' not found in graph_def."

I’m trying to train from the frozen pre-trained model and keep running into this error. I previously had it working, but I had to set up my VM again and can’t get it to work the second time around. I’m running the Azure Data Science VM with an NVIDIA Tesla P40 GPU. CUDA and cuDNN come pre-installed.

Setup:

  • Install Python3.6
  • Install git-lfs
  • git clone https://github.com/mozilla/DeepSpeech
  • cd DeepSpeech
  • wget -O - https://github.com/mozilla/DeepSpeech/releases/download/v0.1.1/deepspeech-0.1.1-models.tar.gz | tar xvfz -
  • pip install virtualenv
  • virtualenv -p python3.6 $HOME/tmp/deepspeech-venv/
  • source $HOME/tmp/deepspeech-venv/bin/activate
  • pip3 install deepspeech-gpu
  • pip3 install -r requirements.txt
  • python3.6 util/taskcluster.py --arch gpu --target ~/DeepSpeech/native_client
  • pip3 uninstall tensorflow
  • pip3 install 'tensorflow-gpu==1.6.0'

With this setup, I am able to run

./bin/run-ldc93s1.sh

and train the model without any issues. However, when I attempt to train from the frozen model like so:

python3.6 DeepSpeech.py --n_hidden 2048 --initialize_from_frozen_model models/output_graph.pb --checkpoint_dir fine_tuning_checkpoints --epoch 10 --train_files ~/DeepSpeech/data/train/Data.csv --dev_files ~/DeepSpeech/data/dev/Data.csv --test_files ~/DeepSpeech/data/test/Data.csv --learning_rate 0.0001

I get the following output:

/home/dwhettam/tmp/deepspeech-venv/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/home/dwhettam/tmp/deepspeech-venv/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/home/dwhettam/DeepSpeech/util/audio.py:17: DeepSpeechDeprecationWarning: DeepSpeech Python bindings could not be imported, resorting to slower code to compute audio features. Refer to README.md for instructions on how to install (or build) the DeepSpeech Python bindings.
  category=DeepSpeechDeprecationWarning)
W Parameter --validation_step needs to be >0 for early stopping to work
Traceback (most recent call last):
  File "/home/dwhettam/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 764, in import_graph_def
    ret.append(name_to_op[operation_name].outputs[output_index])
KeyError: 'lstm_fused_cell/kernel'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "DeepSpeech.py", line 1945, in <module>
    tf.app.run(main)
  File "/home/dwhettam/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "DeepSpeech.py", line 1901, in main
    train()
  File "DeepSpeech.py", line 1566, in train
    var_tensors = tf.import_graph_def(graph_def, return_elements=var_names)
  File "/home/dwhettam/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
    return func(*args, **kwargs)
  File "/home/dwhettam/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 767, in import_graph_def
    'Requested return_element %r not found in graph_def.' % name)
ValueError: Requested return_element 'lstm_fused_cell/kernel:0' not found in graph_def.
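The traceback shows tf.import_graph_def failing because the op behind the requested tensor name is not present in the frozen graph. One way to confirm this before launching training is to list the op names in the .pb file and compare them with the names the training code asks for. The following is only a rough sketch: the helper names missing_return_elements and load_node_names are made up for illustration, and load_node_names assumes a TensorFlow 1.x install where tf.GraphDef is available.

```python
def missing_return_elements(node_names, requested):
    """Return the requested tensor names whose producing op is absent.

    A tensor name such as 'lstm_fused_cell/kernel:0' refers to output 0
    of the op named 'lstm_fused_cell/kernel', so only the part before the
    final ':' is looked up among the graph's node names.
    """
    available = set(node_names)
    missing = []
    for tensor_name in requested:
        op_name = tensor_name.rsplit(":", 1)[0]
        if op_name not in available:
            missing.append(tensor_name)
    return missing


def load_node_names(pb_path):
    """Read op names from a frozen GraphDef file (assumes TensorFlow 1.x)."""
    # Imported lazily so missing_return_elements works without TensorFlow.
    import tensorflow as tf

    graph_def = tf.GraphDef()
    with open(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    return [node.name for node in graph_def.node]
```

For example, missing_return_elements(load_node_names("models/output_graph.pb"), ["lstm_fused_cell/kernel:0"]) would return a non-empty list if the frozen model was exported from an architecture without that variable, which is what the ValueError above suggests.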

I can’t find any posts on here with the same error message, so any help would be much appreciated. Thanks!


Fix (https://github.com/mozilla/DeepSpeech/issues/1503):

  • After cloning the repo, check out commit e00bfd0

git checkout e00bfd0

  • Use client v0.2.0a8

python3 util/taskcluster.py --arch gpu --branch v0.2.0-alpha.8 --target native_client

As documented in https://github.com/mozilla/DeepSpeech/issues/1503, you cannot train on top of the 0.1.1 model with 0.2.0-alpha.9 or later, because the architecture of the model changed.
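If you want to guard a setup script against this mismatch, the rule above can be expressed as a simple version comparison. This is an illustrative sketch only: the function names and the set of accepted tag formats are assumptions, and the cutoff is taken from the statement above rather than from anything in the DeepSpeech code.

```python
import re

# First version reported in issue #1503 as incompatible with the 0.1.1 model
# (0.2.0-alpha.9), encoded as (major, minor, patch, alpha number).
FIRST_INCOMPATIBLE = (0, 2, 0, 9)


def parse_version(tag):
    """Parse tags like 'v0.2.0-alpha.8', '0.2.0a8' or '0.1.1' into a tuple."""
    match = re.match(r"^v?(\d+)\.(\d+)\.(\d+)(?:(?:-alpha[.-]|a)(\d+))?$", tag)
    if match is None:
        raise ValueError("unrecognised version tag: %r" % tag)
    major, minor, patch, alpha = match.groups()
    # A final release sorts after all of its alpha pre-releases.
    alpha_rank = int(alpha) if alpha is not None else float("inf")
    return (int(major), int(minor), int(patch), alpha_rank)


def can_use_0_1_1_model(tag):
    """True if this code version predates the architecture change."""
    return parse_version(tag) < FIRST_INCOMPATIBLE
```

With this, can_use_0_1_1_model("v0.2.0-alpha.8") is True, while can_use_0_1_1_model("0.2.0-alpha.9") is False.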

Yep, just spotted that issue last night. Thanks! I’ll update my post with the fix.